Stupid CPAN Tricks
At MadMongers tomorrow night we’ll be covering a bunch of CPAN modules that people probably don’t know about, but that are super useful.
[From my blog.]
Those are all cool and nifty modules, but one of them screamed to me as an
outlier: Text::Unidecode.
tchrist can say it better than I ever could:
http://stackoverflow.com/a/6163129/40468
"Code that assumes you can remove diacritics to get at base ASCII letters is
evil, still, broken, brain-damaged, wrong, and justification for capital
punishment."
"Code that tries to reduce Unicode to ASCII is not merely wrong, its
perpetrator should never be allowed to work in programming again. Period. I’m
not even positive they should even be allowed to see again, since it obviously
hasn’t done them much good so far."
So, this module isn't a "useful CPAN trick"; it actively spreads
disinformation and bad practices about handling Unicode text.
Ether: There is nothing wrong with what you said. However, you are missing one critical point: pragmatism beats perfection every single time. In this case my web site takes in addresses in UTF-8, but a shipping web service I'm using can't handle UTF-8, or even the full ASCII range. It only accepts standard alphabetic characters without accents. So I needed to convert UTF-8 and accented characters into standard alphanumeric characters to keep this web service from crashing on me.
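For anyone wondering what that conversion might look like, here is a minimal sketch using `unidecode` from Text::Unidecode. The helper name `ascii_for_shipping`, the sample address, and the choice to keep only letters, digits, and spaces are assumptions made for illustration; the real shipping service's allowed character set isn't spelled out here. The result is lossy, so it would make sense to keep the original UTF-8 address in storage and send only the flattened copy to the service.

```perl
use strict;
use warnings;
use utf8;                          # this source file contains UTF-8 literals
use Text::Unidecode qw(unidecode);

# Hypothetical helper: reduce an address line to the plain alphanumeric
# subset a legacy shipping API is assumed to accept. Input must already be
# decoded characters (e.g. via Encode::decode('UTF-8', $bytes)), not raw bytes.
sub ascii_for_shipping {
    my ($text) = @_;
    my $ascii = unidecode($text);      # best-effort ASCII transliteration
    $ascii =~ s/[^A-Za-z0-9 ]+/ /g;    # anything else becomes a space
    $ascii =~ s/ +/ /g;                # collapse runs of spaces
    $ascii =~ s/^ +| +$//g;            # trim both ends
    return $ascii;
}

my $original = "Champs-Élysées 12";    # made-up sample address
print ascii_for_shipping($original), "\n";
```

Punctuation such as hyphens is replaced by a space rather than dropped, so words that were separated stay separated.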