Helmut Wollmersdorfer [blogs.perl.org]

Unicode is 20++ years old and still a problem

By Helmut Wollmersdorfer on June 17, 2013 1:16 PM

Just did a quick hack to read out product data from an old shop site and import it into a new one:

- wget -r
- File::Find
- Mojo::DOM for parsing
- Text::CSV::Slurp for the result

After running for 11 minutes over 14K pages, I got a bad surprise:

One file had non-ASCII characters in its name, and File::Find does not work in character mode: it hands back file names as raw bytes rather than decoded character strings. I had forgotten about this, and Text::CSV::Slurp crashed.
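A minimal sketch of the workaround: decode each name that File::Find returns before passing it on. It assumes the file names on disk are UTF-8 encoded. The helper names `decode_filename` and `find_files_decoded` are made up for illustration and are not part of File::Find.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Find;

# Assumption: file names on disk are UTF-8 encoded. Adjust for other systems.
my $FS_ENCODING = 'UTF-8';

# Hypothetical helper: turn a byte-string file name into a character string.
sub decode_filename {
    my ($bytes) = @_;
    return decode($FS_ENCODING, $bytes);
}

# Hypothetical helper: collect all regular files below $dir,
# returning their names as decoded character strings.
sub find_files_decoded {
    my ($dir) = @_;
    my @files;
    find(
        {
            no_chdir => 1,
            wanted   => sub {
                # $File::Find::name holds the raw bytes from the OS,
                # so the file test uses the undecoded name.
                return unless -f $File::Find::name;
                push @files, decode_filename($File::Find::name);
            },
        },
        $dir,
    );
    return @files;
}
```

The decoded names are only for use inside the program. To open a file, still use the original bytes, and when writing the names out (for example into the CSV), set an encoding layer on the output handle, such as `binmode($fh, ':encoding(UTF-8)')`.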

Why the hell do so many CPAN modules still ignore Unicode?
