September 2010 Archives

Reading META.yml when it's not UTF-8

Part of the 3% of the distributions I couldn't index with MyCPAN had encoding issues. YAML is supposed to be UTF-8, but I don't always get UTF-8 when I generate a META.yml for distributions that don't have one. I guess I could do the work to poke around in MakeMaker, etc., to convert all the values before I generate the META.yml, but um, no. Not only that, not all of the META.yml files already in the dists are UTF-8. Remember, however, this is a very small part of BackPAN: about 700 distributions out of 140,000 (or about 1/7th of my problem cases).

A couple hundred distros have Makefile.PL files encoded as Latin-1 in a way that matters. If the content isn't collapsible to ASCII, the META.yml ends up with Latin-1 in it. Some YAML parsers refuse to deal with that.

I'm not particularly satisfied with this solution: I assume the file is UTF-8, which is mostly true, and if the YAML loader barfs on it, I try to load it as Latin-1 and convert it.

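# Assume UTF-8 first; fall back to Latin-1 if the YAML loader rejects it
sub _load_meta_yml { $_[0]->_try_utf8( $_[1] ) || $_[0]->_try_latin1( $_[1] ) }

sub _try_utf8 { $_[0]->_load_yaml( $_[0]->_load_file( 'utf8', $_[1] ) ) }

sub _try_latin1 {
    require Encode;
    # from_to() converts the octets in place, from Latin-1 to UTF-8
    Encode::from_to( my $utf8 = $_[0]->_load_file( 'bytes', $_[1] ), 'latin1', 'utf8' );
    $_[0]->_load_yaml( $utf8 );
    }

# Slurp the whole file through the named I/O layer ('utf8' or 'bytes')
sub _load_file {
    $logger->debug( "Trying to load $_[2] as $_[1]" );
    open my $f, "<:$_[1]", $_[2] or do {
        $logger->error( "Could not open $_[2]: $!" );
        return;
        };
    local $/;
    my $content = <$f>;
    return $content;
    }

# Parse the YAML text; log the failure under the calling sub's name
sub _load_yaml {
    require YAML::Syck;
    my( $caller ) = ( caller(1) )[3];
    my $yaml = eval { YAML::Syck::Load( $_[1] ) } or
        $logger->error( "$caller: $@" );
    $yaml;
    }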
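
The fallback above depends on the YAML loader rejecting input that isn't UTF-8. As a minimal sketch of another way to make the same decision, assuming you would rather settle the encoding before parsing, you can try a strict UTF-8 decode with Encode and fall back to Latin-1 only when that fails. The name decode_meta_bytes is hypothetical and is not part of MyCPAN.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of MyCPAN): take the raw octets of a
# META.yml and return the decoded text plus the encoding that worked.
sub decode_meta_bytes {
    my( $octets ) = @_;

    # FB_CROAK makes decode() die on malformed UTF-8; LEAVE_SRC keeps
    # decode() from modifying $octets in place.
    my $text = eval {
        Encode::decode( 'UTF-8', $octets,
            Encode::FB_CROAK() | Encode::LEAVE_SRC() )
        };
    return ( $text, 'UTF-8' ) if defined $text;

    # Every octet sequence is valid Latin-1, so this is the last resort.
    return ( Encode::decode( 'ISO-8859-1', $octets ), 'ISO-8859-1' );
    }

# A lone 0xE9 octet is malformed UTF-8, so this falls back to Latin-1
my( $text, $encoding ) = decode_meta_bytes( "author: Andr\xE9\n" );
print "decoded as $encoding\n";
```

This only tells you that the octets are not valid UTF-8. Guessing Latin-1 for everything else is the same assumption the code above makes.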

I liked YAML::XS for a bit, but it has a problem with the utf8 pragma that messed up some other stuff I was handling. I don't quite understand it, but LibYAML seems to be fine if everything is always UTF-8, and not so fine otherwise.

MyCPAN indexes 97% of BackPAN

A history of Perl variables

I was curious when various Perl variables showed up, so I started diving through perlvar and perl*delta. Ignoring those that were already there in Perl 4, I have a draft list so far. It's a bit dodgy because some of the variables existed before they were documented, but I'm really interested in the point where they became supported variables (so I also don't care about blead versions):

Does anyone have any corrections or predictions for 5.14? :)

perl 5.12.0 - 5.20.0
-----------

--none--

perl 5.10.0
-----------
${^PREMATCH}
${^MATCH}
${^POSTMATCH}
%+
%-
${^WIN32_SLOPPY_STAT}
${^WARNING_BITS}
${^RE_TRIE_MAXBUF}
${^RE_DEBUG_FLAGS}

perl 5.8.9
-----------
${^CHILD_ERROR_NATIVE}
${^UTF8CACHE}

perl 5.8.8
-----------
${^UTF8LOCALE}

perl 5.8.2 ???
-----------
${^ENCODING}
${^OPEN}
${^UNICODE}

perl 5.8.0
-----------
$^N
${^TAINT}

perl 5.6
-----------
$^C
$^V
@-
@+
%^H

perl 5.005
-----------
%!
$^R

perl 5.004
-----------
$^M
$^S
$^A ???

perl 5.003
-----------
$^E
$^H
$^O
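
For anyone who hasn't used the 5.10 entries in the list above, here is a short sketch of the match variables and %+ in action. The sub name match_parts and the sample string are made up for illustration.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper: match a "name-version" token and report what the
# 5.10 match variables hold afterward.
sub match_parts {
    my( $line ) = @_;

    # The /p flag asks perl to populate ${^PREMATCH}, ${^MATCH},
    # and ${^POSTMATCH} for this match
    return unless $line =~ m/(?<name>[a-z]+)-(?<version>[\d.]+)/p;

    my %parts = (
        prematch  => ${^PREMATCH},
        match     => ${^MATCH},
        postmatch => ${^POSTMATCH},
        # %+ holds the named captures from the last successful match
        name      => $+{name},
        version   => $+{version},
        );
    return \%parts;
    }

my $parts = match_parts( 'indexed perl-5.10.0 today' );
say "$_: [$parts->{$_}]" for sort keys %$parts;
```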

What non-Perl books do you recommend to Perlers?

I'm overhauling the perlbook documentation and moving the book list from perlfaq2 into it. Besides updating the references, I'd like to include a short section on non-Perl (technical) books that are useful to the Perl programmer. So far I have Jon Bentley's Programming Pearls, but that's an easy one.

What else is there? What other books do you think Perlers should read to help them be better Perl programmers?

Just to head off all the posts I know are coming, Lord of the Rings might help you understand the perl source code, but it's not going in perlbook.

Okay, maybe it is.

How can I troubleshoot my Perl CGI script?

A while ago I moved my "How can I troubleshoot my Perl CGI script?" guide to StackOverflow. I'm just getting around to telling everyone about it because it was pretty far down on my to-do list.

I think this has almost pushed the old location on SourceForge out of the googlebrain, but it wouldn't hurt for people to link to it in a blog post, tweet, whatever to encourage Google to find this one. Someday SourceForge will disappear and we won't have to worry about it anymore. How is it even still alive? StackOverflow has pretty good googlejuice though, maybe because Google likes StackOverflow.

Since it's on StackOverflow, this also means that I'm basically letting go of it. StackOverflow encourages people to revise the questions and answers of others to improve them, and I've given it wiki status to encourage that even more. Take a look, see what I've left out (or left in), what's new and exciting (or old and boring).

Even if you don't (or can't) edit it just yet, I'd appreciate any comments on how to bring it up to date. Maybe another StackOverflow user can make the changes if I'm too busy.

Also, sadly, the only thing keeping the bad Perl info out of StackOverflow is a small band of knowledgeable Perlers patrolling the answers (Sinan used my summer absences to pass me as the highest rated Perl user there). If you're looking for a way to promote Perl in a useful way (and you actually know Perl), consider helping out. Providing good answers, voting on good answers (and against bad answers), and refining other answers helps the entire world.

An index for The Perl Journal articles

A couple of years ago I put together a list of The Perl Journal articles I could find on the Dr. Dobb's website. They changed some of their URLs, so I updated those to avoid all of the redirects and in the process found several more articles. My TPJ index is on PerlMonks. You can see some of the beginnings of popular projects, such as Moose, in some of the articles.

About brian d foy

I'm the author of Mastering Perl, and the co-author of Learning Perl (6th Edition), Intermediate Perl, Programming Perl (4th Edition) and Effective Perl Programming (2nd Edition).