Splitting on a change in Perl6

I had more thoughts about splitting on character changes, partly thanks to a mailing list thread, which led to more questions.

As a prelude, here is a regex that splits when the string changes to the letter B:

# If the cursor is before B, and is after a character, and is not after B, split

> say "ABBBCDEEF".split(/<?before (B) {} ><?after . ><!after B >/).perl

Splitting on a change, Challenge 20 Task 1

I like reading the Perl Weekly Challenge even if I rarely participate. It struck me that task 1 of Week 20 asked "to accept a string from command line and split it on change of character" - but every solution that I read in the recap looked for runs of the same character instead of the literal interpretation of the challenge.

Then I found, it was a dead end for me...

$ perl -E 'say join " ",split /(.)(?!\1)/,scalar <>'
ABBCDEEF
 A B B  C  D E E …
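For comparison, the run-based reading that the recap solutions took can be sketched in Perl 5 roughly like this. The sub name `runs` is made up for illustration, and this is the grouping approach, not the literal split-on-change attempted above.

```perl
use strict;
use warnings;

# Collect maximal runs of a repeated character.
# (Hypothetical helper name, for illustration only.)
sub runs {
    my ($s) = @_;
    my @runs;
    # $1 is the whole run; $2 is its first character; \2* matches the repeats.
    push @runs, $1 while $s =~ /((.)\2*)/g;
    return @runs;
}

print join(" ", runs("ABBCDEEF")), "\n";   # prints "A BB C D EE F"
```

Because `.` does not match a newline by default, a trailing newline read from standard input is simply left out of the runs.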

Apropos proto: Perl6.c multi thoughts

Multi routines are pretty neat, but seem incomplete to me. Some background: one can compute factorials this way:


multi fac(0) { 1 }
multi fac(Int $n where 1..Inf) { $n * fac( $n-1 ) }
say fac(4); # 24

Now what if we want to pass our recursive-multi-sub "fac" as a callback?

given &fac -> $some_fun { say "some_fun(4)=",$some_fun(4) }

Now... what about defining an anonymous multi-sub?

Text, Grammar, Tree- the 3

I think of text, grammars that parse the text, and the syntax trees (data) generated by a parser as a triangle. Most of the time in computerland, people doing something with this triangle are interested in converting a text into a tree using a parser.

Every once in a while I need to write a parser. My first serious parser was in the late 90's, for a radio station that needed each week to convert a large plain-text weekly email to tables of venues and shows. For that, I used a version of Bison (yacc) that allowed me to write actions in Perl.

UTF-16 and Windows CRLF, oh my

I recently had to do some quick search/replace on a bundle of Windows XML files. They are all encoded as UTF-16LE, with the Windows \r\n line endings encoded as 0D 00 0A 00.

Perl can handle UTF-16LE just fine, and it handles CRLF endings on Windows out of the box, but the default CRLF translation happens too close to the filehandle, on the wrong side of the Unicode translation. The fix is to use the PerlIO layers :raw:encoding(UTF-16LE):crlf. The ":raw" prevents the default CRLF translation from happening at the byte level, the UTF secti…
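A minimal sketch of reading with that layer stack, assuming a small hand-built sample file. The sub name `read_utf16le_crlf` and the temp file contents are illustrative only and are not the XML files mentioned above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a UTF-16LE file with CRLF endings into lines ending in a plain "\n".
# (Hypothetical helper name, for illustration only.)
sub read_utf16le_crlf {
    my ($path) = @_;
    # :raw first, then decode UTF-16LE, then translate CRLF on decoded text.
    open my $in, '<:raw:encoding(UTF-16LE):crlf', $path
        or die "Cannot open $path: $!";
    my @lines = <$in>;
    close $in;
    return @lines;
}

# Assumed sample data: "ab" CRLF "c" CRLF written as raw UTF-16LE bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "a\0b\0\r\0\n\0c\0\r\0\n\0";
close $tmp;

my @lines = read_utf16le_crlf($path);
print scalar(@lines), " lines read\n";
```

The order of the layers is the point: the CRLF layer sits above the decoding layer, so it sees characters instead of the 0D 00 0A 00 byte sequence.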