I have a couple hundred thousand YAML files that I created with plain ol' YAML.pm. I created them just before I realized how slow YAML.pm is, but I have the files already. Processing them with YAML.pm is really, really slow, so I wanted to see how much faster the other YAML modules might be.
My problem, which Google doesn't know much about (yet), is that the faster parsers complain "block sequence entries are not allowed in this context" when I try to parse these files, while YAML.pm (really old, but pure Perl) and YAML::Syck (deprecated, uses YAML 1.0) don't. YAML::XS is based on libyaml, an implementation that actually conforms to the YAML 1.1 specification. I didn't create the files with YAML::XS, though, so I have lines like:
cpplast: -
cppminus: -
In YAML 1.1, those lines should instead be something like:
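Presumably something like this, with the bare dash quoted so the parser sees a plain scalar instead of the start of a block sequence:

cpplast: '-'
cppminus: '-'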
In May I decided to stop using Debian's perl 5.10.1 in favor of a 5.13.1 built with perlbrew, with CPAN modules built with cpanminus. It's been great; here's how I did it.
Before switching over I ignored Debian's perl library packages, and installed everything with cpanm into /usr/local. But since I wanted to use the new post-5.10 features of Perl I thought I might as well replace all of it and use a newer perl.
Both of these demonstrate a slightly dubious mechanism for manipulating the dzil-generated Makefile.PL. You might want Dist::Zilla::Plugin::MakeMaker::Awesome instead.
Joe McMahon will be talking about Hudson on June 22nd at 7pm, at the office of Mother Jones.
"Continuous integration" sounds like a great idea: you automatically run your build on every checkin, so you know very soon after you've committed if you make a mistake or checked in a bug. However, like any
properly lazy Perl programmer, the last thing you want to do is write more code; you want to take advantage of work that's already done: that's Hudson.
Hudson is a continuous integration server that's easy to set up, customize, and use. Unlike other similar Java-based tools, Hudson is language-agnostic, even well-integrated with other tools.For Perl
projects, with a little assistance from CPAN, it's easy to set up and use for Perl projects. We'll look at a sample setup that covers most of the bases, including a few pointers on making it easy to build and track things
under Hudson, and finish up with a look at using Hudson to get your team involved - even enjoying - continuous integration.
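As a taste of the CPAN assistance involved, one common trick is to have the test suite emit JUnit-style XML that Hudson understands. A minimal sketch using TAP::Harness::JUnit (the file name and options here are assumptions, not something from the talk) might be:

use TAP::Harness::JUnit;

# Run the test suite and write JUnit-compatible XML that Hudson can pick up.
my $harness = TAP::Harness::JUnit->new({
    xmlfile => 'junit_output.xml',   # assumed output file name
    merge   => 1,                    # fold test STDERR into the report
});
$harness->runtests(glob 't/*.t');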
Lately I've been playing around with the OTRS ticketing system on one of my servers. OTRS is written in Perl and is typically run as a CGI, mod_perl, or FastCGI app. Usually I'd map it as a FastCGI app on Nginx and start the two FastCGI servers via an init.d script (one for the customer helpdesk and another for the management console).
But this time I wanted to give Plack a try.
I'm new to Plack and PSGI, but I can't wait to start moving my apps to this badass middleware framework.
Plack comes with two CGI wrapper modules, Plack::App::WrapCGI and Plack::App::CGIBin. WrapCGI seems the most appropriate for my needs. Apparently it even precompiles CGIs using CGI::Compile, for added performance.
So I wrote a little app.psgi in the /opt/otrs directory:
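A minimal sketch of such an app.psgi, assuming the stock OTRS CGI scripts live under /opt/otrs/bin/cgi-bin/ (paths vary by install), could look like this:

use Plack::Builder;
use Plack::App::WrapCGI;

builder {
    # Agent/management console (assumed script path; adjust to your OTRS layout)
    mount "/index.pl" => Plack::App::WrapCGI->new(
        script => "/opt/otrs/bin/cgi-bin/index.pl",
    )->to_app;

    # Customer helpdesk (assumed script path)
    mount "/customer.pl" => Plack::App::WrapCGI->new(
        script => "/opt/otrs/bin/cgi-bin/customer.pl",
    )->to_app;
};

Running it is then just plackup /opt/otrs/app.psgi; serving OTRS's static files is left to Nginx in this sketch.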
From my perspective, Perl's grep syntax doesn't help you make it fast when searching for exact values. Let me explain. While it is not that common, what do you do when trying to grep for an exact string in an array?
Perl makes using a pattern match in grep easy:
@selected = grep { /^foo$/ } @list;
You can argue about the usability, but trust me, every once in a while some strange constructs get really useful. Unfortunately, the above expression is not efficient. If you replace it with
@selected = grep { $_ eq "foo" } @list;
you get code that runs about twice as fast (check the bottom for Benchmark results).
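A minimal sketch of the kind of comparison involved, using Benchmark's cmpthese with a made-up word list:

use Benchmark qw(cmpthese);

my @list = ('foo', map { "word$_" } 1 .. 1_000);

# Run each version for at least 2 CPU seconds and compare the rates.
cmpthese(-2, {
    regex => sub { my @s = grep { /^foo$/ }     @list },
    eq    => sub { my @s = grep { $_ eq "foo" } @list },
});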
Following the idea of split, which accepts a plain string and uses it for the splitting, I think grep could accept a plain string as well (at least in the grep EXPR,LIST form):
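Presumably something like this (hypothetical semantics; today this form just evaluates "foo" as an always-true expression):

@selected = grep "foo", @list;   # proposed meaning: keep elements equal to "foo"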
Tim Bunce's Devel::NYTProf has a bunch of improvements in version 4.00, which was released yesterday.
The compatibility problem with Devel::Declare code like Method::Signatures::Simple that I previously blogged about has been fixed. It can now profile inside string evals, and more.
Update: Tim Bunce now has a posting about NYTProf 4.00 on his blog.
After blogging about my small patch to Data::Dump, I contacted Gisle Aas. He was quite responsive and eventually came up with a new release (1.16) of Data::Dump containing the cool new filter feature. My previous example, converted to use the new feature, becomes:
$ perl -MData::Dump=dumpf -MDateTime -e'dumpf(DateTime->now, sub { my ($ctx, $oref) = @_; return unless $ctx->class eq "DateTime"; {dump=>qq([$oref])} })'
[2010-06-09T12:22:58]
This filter mechanism is quite generic and allows you to do other tricks like switching classes, adding comments, and ignoring/hiding hash keys. The interface is also pleasant to work with, although starting with this release the "no OO interface" motto should perhaps be changed to "just a little bit of OO interface" :-)
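For instance, a filter that adds a comment and hides an attribute could look roughly like this (my own sketch of the filter interface; the hidden key is just for illustration):

use Data::Dump qw(dumpf);
use DateTime;

dumpf(DateTime->now, sub {
    my ($ctx, $oref) = @_;
    return unless $ctx->class eq "DateTime";
    return {
        comment   => "a DateTime object",    # prepended as a comment in the output
        hide_keys => ["formatter"],          # don't show this attribute
    };
});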
Aren't we glad that stable and established modules like this are still actively maintained and getting new features?
I now have my copy of the new edition of Effective Perl Programming. I'm halfway through another book, have a wedding on the 20th, and have some other major personal (good) news, so I won't be able to review it right away. However, a few quick notes:
Pros
Covers 5.12.
A full chapter on Unicode
Good discussion of testing
Cons
Recommends SQLite for a test database (I used to like this idea myself).
I don't think the testing points are that serious; testing needs a far more in-depth treatment than a book like this can possibly give it. I also noticed it had nice things to say about Moose but didn't use it in the examples. I think this was the right decision, but I wish it weren't.
And among the Web sites it recommends, blogs.perl.org is listed, but use.perl.org is not. Rather interesting.
In any event, those were just a few things I noticed while flipping through the book. I'll have a better description later. For now, suffice it to say that it looks very, very good. The few places I've taken the time to read so far are well thought out and show lots of experience.
If you read my last post, I was wondering why Moose doesn't accept array or hash references as default values that could be cloned, and instead requires a code reference to create a new array/hash.
I decided to benchmark the two approaches. The results were... surprising:
With these results, I think the correct behavior is the one already present in Moose: complaining about the default value being an array reference and suggesting an alternative:
References are not allowed as default values, you must wrap the default of 'a' in a CODE reference (ex: sub { [] } and not [])
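A rough sketch of the kind of comparison involved, pitting a coderef default against cloning a shared prototype with Clone (details assumed, results omitted):

use Benchmark qw(cmpthese);
use Clone qw(clone);

my $prototype = [qw(Bob Alice Tim)];

# Compare building a fresh default arrayref each time via a coderef
# versus deep-copying a shared prototype reference.
cmpthese(-2, {
    coderef => sub { my $ref = sub { [qw(Bob Alice Tim)] }->() },
    clone   => sub { my $ref = clone($prototype) },
});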
I've wanted to write a reasonably useful Perl 6 module for a while, and I finally realised that the Viterbi algorithm would be a pretty simple place to start (hopefully it'll be useful as well).
There's a module with the same name on CPAN, but I've based the code on a Common Lisp version I wrote for school a while back. At the moment the module is pretty basic, and doesn't support assigning probabilities to unseen data for example. The module is available on GitHub.
A more advanced version will support computing the sum of the log-probabilities rather than the product, smoothing and unobserved data, and the option of mixing a role into domain objects so that they can be passed directly to the decode method from client code.
So, I finally got around to doing what I was threatening to do at Copenhagen and created a tool to automate the building and configuration of perls for CPAN Testing.
Smokebrew is now available on CPAN and, after some smoking of my own dog food, appears to work really, really well.
It doesn't do environment tweaking to switch between the installed perls, for instance.
The goal has been to build, install and configure various versions of perl for CPAN Testing.
The configuration is dealt with by App::SmokeBrew::Plugin modules. These are objects that Smokebrew calls after it has successfully built and installed a particular version of perl. Smokebrew comes with two plugins, Null and CPANPLUS::YACSmoke. The former does no configuration; the latter configures the installation for CPAN Testing with CPANPLUS::YACSmoke.
I now officially declare my participation in the Iron Man Perl blogging contest, after I spoke with a blonde in a skirt.
But why did I meet Matt? Because I'm in Schorndorf, of course. It's the German Perl Workshop, and my talks went well. Both of them I will give in Pisa too, in English, and longer. The Rebol talk especially needs a lot more time, I found out. My testing talk was very well received. Fine. The Kephra talk will come tomorrow. All the slides are online now.
I started to look at Moose today. Better late than never. I never liked OO programming, and Perl's core OO was enough for me, so I am kind of forcing myself to do it.
To start with, I am rereading a presentation Yuval gave at the Portuguese Perl Workshop 2008. I know things have probably changed, and there is probably nicer syntax for some of this by now.
For the moment, I did not like the way object properties (attributes) are initialized with an array reference:
default => sub { [qw(Bob Alice Tim)] },
I would really prefer it without the sub. Not sure (yet) if it is required or not, but I will find out very soon...
...ah! Got it, I think. If no sub were supplied, all the objects would get a reference to the same array. Hence the lazy option and the anonymous function. Well, couldn't that be solved with Clone?
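For context, the full attribute declaration around that default looks roughly like this (the package and attribute names are mine, just for illustration):

package Team;
use Moose;

has 'players' => (
    is      => 'rw',
    isa     => 'ArrayRef[Str]',
    default => sub { [qw(Bob Alice Tim)] },   # each object gets its own fresh arrayref
);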
Recently I've finished rewriting and fixing up Test::SFTP. Versions 1.05 and 1.06 were released, and they mark the API change and the switch to Test::Builder and Net::SFTP::Foreign.
Test::SFTP is now something I consider "good" and definitely "usable." There's another fix waiting for the next stable release of Net::SFTP::Foreign for some output redirection cleanups. Despite that, it's still good to use.
We all have that code that we put aside completely. At first you assume that you'll get to it sometime; then, as time goes by, you slowly let go of that idea. Small changes or fixes become tedious and annoying, you hate touching it, and at some point you realize you'll probably never work on it again.
There is something joyous in going over that old code and fixing it up, correcting it, finally getting the job done. There's something very motivating in seeing something through to the end.
Perl is a great language. Not just because it is flexible, has a great community, has Larry Wall as its creator, or has CPAN. Perl is a great language because it is multi-paradigm. That is, you can write Perl code in different ways, resembling imperative languages, functional languages, or object-oriented languages (yes, we can argue about this division, but that is not relevant at the moment).
What I want to say here is that I love to write functional programming lines in Perl. Consider, for instance, the task of reading a table (tab-separated lines) with name, weight, and height, and adding a new column with the BMI.
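A sketch in that spirit, assuming the weight is in kilograms, the height in metres, and the table arrives on standard input:

# A small functional-style "push": returns a new arrayref instead of mutating.
sub append { my ($aref, @items) = @_; [ @$aref, @items ] }

print join("\t", @$_), "\n"
    for map { append($_, sprintf("%.2f", $_->[1] / ($_->[2] ** 2))) }   # add the BMI column
        map { [ split /\t/ ] }                                          # name, weight, height
        map { my $line = $_; chomp $line; $line } <>;                   # non-destructive chomp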
I confess I cheated by defining a new function. Unfortunately, push is not functional. Substitution is not functional at the moment either (a non-destructive version is being prepared for 5.13). I'm not sure if a functional push would help. Probably not...
There's news with the latest version of Marpa (0.102000).
Marpa now parses grammars with right-recursions in linear time (O(n)).
(Marpa already handled left-recursion in linear time.) This means that Marpa is now O(n) for all LR-regular grammars. LR-regular means LR with infinite lookahead using regular expressions.
That's a big class of grammars. It obviously includes all the LR(k) grammars, and therefore everything parsed by Yapp, yacc, and bison. LR-regular grammars also include everything parseable by recursive descent, PEGs, and other LL(k) grammars. LR-regular definitely includes all regular expressions.
Marpa's O(n) behavior has another nice feature: when it does not parse in O(n) time, it still parses. Some parser generators always parse quickly, because when they can't parse quickly, they don't parse at all. Marpa will parse anything you can write in BNF, even highly ambiguous grammars, and the absolute worst case is cubic (O(n**3)).