April 2010 Archives

Why Perl 6 is different

Let’s be honest. Perl 5, Python, Ruby, they’re almost the same. There are some differences, but when you compare them with C, Java, Haskell or some such they suddenly feel rather superficial. They are suitable or unsuitable for pretty much the same tasks, occupying a niche that Perl pioneered: that of high manipulexity and whipuptitude.

They each operate at the same abstraction level. Even if a language is lacking a feature that the others have, it’s easily implemented using other constructs. There are plenty of valid reasons to prefer one over the other (taste, library availability, programmer availability), but they all offer the same power. Perl 6 is going to change that.

Perl 6, like Perl 5, Ruby and Python, steals a lot from other languages. As you may expect, it steals too many things to mention from Perl 5. It steals chained comparisons from Python and objects from Smalltalk (Squeak’s traits in particular should be mentioned). It thankfully steals nothing from PHP.

It has been said that Ruby is Smalltalk with a perly syntax. Perl 6 extends that idea: Perl 6 is Lisp with a perly syntax.

Perl 5 is already more lispy than most outsiders realize, but Perl 6 takes that to a new level. It is built around a MOP and multimethods. Lists are quite important in it (though their semantics are more like Haskell than Lisp in being lazy). And it has macros.

Macros are one of Lisp’s most powerful features. They give the programmer the power to mold Lisp into any shape he wants. In Lisp, macros are possible because of its uniform syntax of trees of symbols. In Perl 6, macros are made possible by its key innovation, one of the features it didn’t steal from another language: rules and grammars.

Regular expressions are the feature other languages have stolen from Perl most often. It’s ironic that even Python, whose community is most critical of Perl’s syntax, took its ugliest feature in essentially unmodified form. (For an overview of all that is wrong with it, see the introduction of Apocalypse 5.) It’s no coincidence that this part of the language has been redesigned from the ground up. In doing so, Larry profoundly changed the language. To quote A5: “Regular expressions are our servants or slaves”; in Perl 6 he emancipated rules and grammars to first class citizens.

Rules are like regexps, but also like methods/functions. Grammars are like classes for rules. This makes rules vastly more powerful. So powerful in fact that Perl 6’s syntax itself is defined in rules and thus in Perl 6 itself.

I can’t overstate how profound I think that change is. I think it’s no less profound than when Lisp made functions first class citizens. For the first time it will be possible to have all the metaprogramming power Lisp has without having to compromise on having syntax. Regardless of the success of Perl 6 (though obviously I do hope it will be successful), I predict that this feature will be its long term contribution to language design.

threads::lite and the chameneoses

The Challenge

Last week on Stack Overflow I came across an interesting challenge. Since my new module threads::lite seems to have stabilized enough for such a task, I decided to port the Erlang submission to it (while using some helper routines from the Perl submission). The porting was a fairly straightforward process that resulted in a pleasantly readable program (especially when compared to the other entry).

Then I ran it. It ran almost 6 times as slow as the other perl program!

Reason enough to profile it and see what was going on. Renodino's comment proved to be quite accurate: Storable and locking seemed to be the main culprits.

To tackle the first issue I added a simple but effective feature: if a message contains only simple elements (no references or undefined values), it is packed instead of frozen.
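As a rough illustration of that idea, here is a pure-Perl sketch. It is not the actual threads::lite code: `encode_message` and `decode_message` are hypothetical helpers, and the one-byte tag and the length-prefixed `pack` template are assumptions made for the example. It also assumes byte strings, and packed values come back as plain strings.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical helpers for illustration only; not the threads::lite internals.
sub encode_message {
    my @items = @_;
    if (grep { !defined $_ || ref $_ } @items) {
        # Complex message (references or undef): fall back to Storable.
        return 'S' . freeze(\@items);
    }
    # Simple message: each element becomes a length-prefixed byte string.
    return 'P' . pack('(N/a*)*', @items);
}

sub decode_message {
    my ($data) = @_;
    my $tag  = substr $data, 0, 1;
    my $body = substr $data, 1;
    # The tag byte records which encoding the sender chose.
    return $tag eq 'P' ? unpack('(N/a*)*', $body) : @{ thaw($body) };
}

my @message = decode_message(encode_message('meet', 42, 'blue'));
print "@message\n";
```

The point of the tag byte is that the receiver never has to guess: only messages that really need Storable pay for it.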

So I changed my script to use only simple messages. The result: it was suddenly 3 times as fast (edit: compared to my first version). The lesson was simple: avoid expensive serialization steps when you don't need them. Storable is fast, but not fast enough for this kind of task.

But there is more to gain. The message passing routines consume lots of CPU. Run-times seem to fluctuate quite a bit and all profilers seem to report unreliable results (except NYTProf, which simply panicked), making profiling rather hard. So far it seems the costly part is blocking for input when receiving on an empty queue. Unfortunately that scenario is quite common in this benchmark. I don't really know how to solve that, though; so far my attempts at finding something faster have been less than successful. This proves to be an interesting challenge.

What does this mean?

Any sufficiently complicated concurrent program in another language contains an ad hoc informally-specified bug-ridden slow implementation of half of Erlang.

Not that much, actually. The results from this benchmark aren't all that relevant for most real applications. It's a benchmark for synchronization primitives in a very tight loop. It shows how fast you can communicate a single byte of data between threads, but that's not very representative of real programs. Message passing is not a synchronization primitive, but a high level abstraction built on top of them.

It is telling that the Erlang entry is actually two orders of magnitude slower than the fastest C entry, despite Erlang's reputation as highly capable at multi-threading. In my opinion, the Erlang score is the one to benchmark against, not the C one. It's way more representative of real programs than the latter. That still means I'm not doing that well (Erlang is an order of magnitude faster), but it does put things in perspective.

As for the speed, sending a message appears to cost 7-10 µs on my laptop. I expect receiving a message costs the same when it doesn't have to block, and it seems to cost about 30-60 µs when it has to block. It's not that slow, but if you want to call it millions of times you will notice the impact: at 7-10 µs each, a million sends alone take 7 to 10 seconds.

But there's something more important here. The Erlang entry isn't only the shortest, but also the most readable solution on the list. I'm an experienced Perl programmer, but I find the Perl entry hard to comprehend; I can't really tell if it's doing the locking correctly by reading the code. The C entry is even worse (though that's mostly because it's bizarrely optimized). I'm not nearly as fluent in Erlang (in fact I've only worked with it on one project), but I could immediately understand what it was doing and why that would work.

That's the sort of thing I aim to recreate.

Enlightening perl's documentation (you too can help!)

Enlightened perl programmers often complain about how outdated most online perl manuals are. Truth is, some of the official documentation is quite outdated too. Perl ships with a lot of documentation, and some of it is old and badly in need of maintenance.

For example, a quick ack through the documentation showed about 250 cases of open used with a glob as its first argument (e.g. open FOO;), even one in perlstyle! I think everyone agrees that in 2010 that's no longer a good example. I don't think it has a place outside of pe…

About Leon Timmermans