I like reading the Perl Weekly Challenge even if I rarely participate.
It struck me that task 1 of Week 20 asked "to accept a string from command line and split it on change of character" - but every solution that I read in the recap looked for runs of the same character instead of the literal interpretation of the challenge.
Then I found it was a dead end for me...
$ perl -E 'say join " ",split /(.)(?!\1)/,scalar <>'
ABBCDEEF
A B B C D E E …
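For contrast, here is a minimal sketch of the run-matching approach, the "runs of the same character" reading rather than a literal split. It is not copied from any particular recap solution, and the helper name runs_of is hypothetical.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: return the runs of identical characters in a string.
sub runs_of {
    my ($str) = @_;
    my @runs;
    # (.) captures one character and \2* extends over its repeats;
    # the outer group, $1, is the whole run.
    push @runs, $1 while $str =~ /((.)\2*)/g;
    return @runs;
}

say join ' ', runs_of('ABBCDEEF');   # A BB C D EE F
```

This matches the pieces themselves instead of asking split to find the boundaries between them, which sidesteps the extra fields that a capturing group adds to split's return list.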
I think of text, grammars that parse the text, and the syntax trees (data) generated by a parser as a triangle. Most of the time in computerland, people doing something with this triangle are interested in converting a text into a tree using a parser.
Every once in a while I need to write a parser. My first serious parser was in the late '90s, for a radio station that needed to convert a large plain-text weekly email into tables of venues and shows. For that, I used a version of Bison (yacc) that allowed me to write actions in Perl.
I recently had to do some quick search/replace on a bundle of Windows XML files. They are all encoded as UTF-16LE, with the Windows \r\n line endings encoded as 0D 00 0A 00.
Perl can handle UTF-16LE just fine, and it handles CRLF endings on Windows out of the box, but the default CRLF translation happens too close to the filehandle, on the wrong side of the Unicode translation. The fix is to use the PerlIO layers :raw:encoding(UTF-16LE):crlf - the ":raw" prevents the default CRLF translation from happening at the byte level, the UTF secti…
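A minimal sketch of reading through that layer stack, assuming a modern Perl. The sample content, the temporary file, and the helper name read_utf16le_lines are all hypothetical, made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Build a small UTF-16LE sample with CRLF endings (hypothetical content).
# It is written as raw bytes so no layer alters it on the way out.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} encode('UTF-16LE', "<a>old</a>\r\n<b>old</b>\r\n");
close $tmp or die "Cannot close $path: $!";

# Hypothetical helper: read a UTF-16LE file with CRLF endings into lines.
sub read_utf16le_lines {
    my ($file) = @_;
    # :raw                 no byte-level CRLF translation
    # :encoding(UTF-16LE)  decode bytes into characters
    # :crlf                translate CRLF to "\n" at the character level
    open my $in, '<:raw:encoding(UTF-16LE):crlf', $file
        or die "Cannot open $file: $!";
    my @lines = <$in>;
    close $in;
    return @lines;
}

my @lines = read_utf16le_lines($path);
s/old/new/ for @lines;   # the quick search/replace
print @lines;
```

Each line arrives ending in a plain "\n", so ordinary regexes and chomp behave as they would on any text file.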