Stupid CPAN Tricks

I’ve uploaded my slides for "9 Perl modules you should have in your tool belt" to the MadMongers file archive.

[From my blog.]

4 Comments

Oops, I meant to leave this comment here, rather than on the post where the talk was announced:

Those are all cool and nifty modules, but one of them screamed to me as an
outlier: Text::Unidecode.

tchrist can say it better than I ever could:

http://stackoverflow.com/a/6163129/40468

"Code that assumes you can remove diacritics to get at base ASCII letters is
evil, still, broken, brain-damaged, wrong, and justification for capital
punishment."

"Code that tries to reduce Unicode to ASCII is not merely wrong, its
perpetrator should never be allowed to work in programming again. Period. I’m
not even positive they should even be allowed to see again, since it obviously
hasn’t done them much good so far."

So, this module isn't a "useful CPAN trick"; it actively spreads
disinformation and bad practices about handling Unicode text.

Did you even read Text::Unidecode's POD? It doesn't claim to cover all cases or even to "reduce Unicode to ASCII". Its goal is to show Unicode characters on a non-Unicode display as best it can. Would you prefer your users to see '????????' instead, or to see nothing at all "because it's impossible to do things 100% correctly anyway"?

Sure, the module gets things wrong some of the time, but it's still damn useful and cool. I wouldn't execute Sean; I'd buy him a beer instead. :-)
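
For readers who haven't used the module, here is a minimal sketch of the trade-off being argued about. The helper name `ascii_for_display` is hypothetical, and the sketch assumes Text::Unidecode may or may not be installed: it uses the module's `unidecode()` when available and otherwise falls back to core Encode, which substitutes '?' for every character it cannot represent.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return a version of $text that is safe to show
# on an ASCII-only display.
sub ascii_for_display {
    my ($text) = @_;

    # Preferred path: Text::Unidecode's transliteration, if installed.
    if (eval { require Text::Unidecode; 1 }) {
        return Text::Unidecode::unidecode($text);
    }

    # Fallback: plain ASCII encoding. With no CHECK argument, Encode
    # replaces each unrepresentable character with '?'.
    return encode('ascii', $text);
}

# Two CJK characters (the example string used in the module's POD).
print ascii_for_display("\x{5317}\x{4EB0}"), "\n";
```

With only the fallback available, the two characters come out as "??", which is the '????????' experience described above. With Text::Unidecode installed, the reader gets a rough romanization instead: lossy, but readable.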

That's the problem with the slides from a talk: you miss all the important discussion around the minimal parts that you see.

"Useful", like "best", doesn't have an absolute definition that everyone shares. It's based on context.

The module itself is quite clear about what it's doing, and calls itself a tool "of last resort". It's not spreading any disinformation. The author is fully aware of what he's doing and why. He says "It's better than nothing!", meaning "there's nothing that Text::Unidecode's algorithm is better than".

The bad practice is to not do your research. :)

About JT Smith

My little part in the greater Perl world.