Moving my blog from use.perl.org

I have moved my blog here from use.perl.org (Mark Leighton Fisher).

Smart match versus hash deathmatch

A couple of days ago, I posted about my answer to a Stackoverflow question asking about the speed of the smart match operator. Smart matches looking for a scalar in an array can short-circuit, so they are pretty speedy.

The question, however, is whether this sort of smart match is faster than a hash lookup. Yes it is, and no it isn't. I've updated my Stackoverflow answer with additional benchmarks and a new plot.

Smart matches are faster if you have to create the hash, but slower if you already have the hash.

There's a middle ground in between that I don't care to find. For some number of searches of the hash, the cost of creating the hash amortizes enough that it's faster than a smart match.

It depends on what you are doing, but that's the rub with every benchmark. The numbers aren't the answers to your real question, in this case "Which technique should I use?". They only support a decision once you add context.
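The trade-off can be sketched with the core Benchmark module. This is only an illustration, not the benchmark from the Stackoverflow answer: the array size, the needle, and the sub names are assumptions, and List::Util's short-circuiting any() stands in for the smart match operator, which newer perls deprecate.

```perl
use strict;
use warnings;
use List::Util qw(any);
use Benchmark qw(cmpthese);

# Hypothetical data: 10,000 strings, with the match in the middle.
my @haystack = map { "item$_" } 1 .. 10_000;
my $needle   = 'item5000';

# Linear scan that stops at the first hit (stand-in for the smart match).
sub scan_lookup { return any { $_ eq $needle } @haystack }

# Pay for building the hash on every search, then look the value up.
sub fresh_hash_lookup {
    my %seen = map { $_ => 1 } @haystack;
    return exists $seen{$needle};
}

# Reuse a hash that was built once, outside the timed code.
my %prebuilt = map { $_ => 1 } @haystack;
sub prebuilt_hash_lookup { return exists $prebuilt{$needle} }

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    scan          => \&scan_lookup,
    fresh_hash    => \&fresh_hash_lookup,
    prebuilt_hash => \&prebuilt_hash_lookup,
});
```

The numbers will vary with the machine, the array size, and where the match sits, which is exactly why they need context before they support a decision.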

Storable: "freeze" versus "nfreeze"

I was doing a code review and discovered that one of our developers wrote code using Storable's freeze() function. This turned out to be a bug because we store objects in memcache with nfreeze() instead. Storable's docs have only this to say about nfreeze():

If you wish to send out the frozen scalar to another machine, use "nfreeze" instead to get a portable image.

Since people generally use freeze() instead, I decided to dig around and figure out what was going on. After all, if nfreeze() is portable, there must be a price to pay, right?
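As a starting point for that kind of digging, here is a minimal sketch that freezes the same structure both ways and thaws each image again. The payload is a made-up example, and the script only prints the image sizes rather than claiming what they should be.

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze thaw);

# Hypothetical payload, just for illustration.
my $data = { name => 'example', values => [ 1, 2, 3 ] };

my $native   = freeze($data);     # native byte order
my $portable = nfreeze($data);    # network byte order, the "portable image"

# thaw() reads either kind of image.
my $from_native   = thaw($native);
my $from_portable = thaw($portable);

printf "freeze:  %d bytes\n", length $native;
printf "nfreeze: %d bytes\n", length $portable;
print "round trip ok\n"
    if $from_native->{name} eq 'example' && $from_portable->{values}[2] == 3;
```

Wrapping the two calls in a Benchmark run would be the next step towards measuring any price paid for portability.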

Fun with recursive anonymous subroutines

I'm doing lots of work with representing stuff stored in the file system as trees at the moment as part of my toolkit for open source qualitative research software.

One of the things I need to do (for making reports) is to transform this:

 [ [qw/foo bar/],
   [qw/a b /],
   [qw/x y/], ];

into this tree structure:

 {
   'foo' => {
       'some_data' => 'lvl0',
       'children' => {'a' => {
           'some_data' => 'lvl1',
           'children' => { 'y' => 'leaf', 'x' => 'leaf' } },
                      'b' => {
                          'some_data' => 'lvl1',
                          'children' => {
                              'y' => 'leaf', 'x' => 'leaf' }}}}};

Since this is a nice golf problem, I thought I'd ask on IRC whether a better hacker than me felt like taking a look at it. ribasushi++ obviously had a little procrastination time available and wrote me a nice solution which I needed to make into a closure via a recursive subref:
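The original solution is not reproduced in this excerpt. The following is only one possible sketch of the technique (a recursive anonymous sub that closes over the input), not ribasushi's code; the variable names are assumptions.

```perl
use strict;
use warnings;

# The list of levels from the example above.
my $levels = [ [qw/foo bar/], [qw/a b/], [qw/x y/] ];

# Declare the variable first so the anonymous sub can refer to itself.
my $build;
$build = sub {
    my ($depth) = @_;
    my %node;
    for my $name ( @{ $levels->[$depth] } ) {
        # The last level holds plain leaves; every other level nests the next one.
        $node{$name} = $depth == $#$levels
            ? 'leaf'
            : { some_data => "lvl$depth", children => $build->( $depth + 1 ) };
    }
    return \%node;
};

my $tree = $build->(0);

# The sub refers to $build and $build holds the sub, so break the cycle when done.
undef $build;
```

Note that this builds a subtree for every name on the first level, so 'bar' gets the same structure as 'foo'.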

The Pearl Metaphor

After "What Weird Al and Larry have in common" and "some thoughts about Pearls" comes here the showdown of our little trilogy about the meaning of the name of our favorite language.

Some people asked me why I don't use more words to explain terms like binah, and why I don't give more links. I'll try to do a bit more of that this time. Some Jews may even say that it's not good to talk about such things in the open at all, but I prefer to orient myself by the Baal Shem Tov, who said otherwise. To some this may be completely over the top, but on the other hand you might not easily come across this kind of information. :)

How fast is Perl's smart match?

Karel Bílek on StackOverflow wondered if the smart match operator was smartly searching. We know it's smart about what it should do, but is it also smart in how it does it? In this case, is it smart about finding scalars in an array?

I benchmarked three important cases: the match is at the beginning of the array, the end of the array, and in the middle of the array. Before you look at my answer on StackOverflow though, write down what you think the answer should be. Done? Okay, now you can peek.

Perl 101: avoid "elsif"

We had some code which looked (sort of) like this:

local::lib and perlbrew

Because I seem to be doing this a lot at the moment, here’s my quick-start to local::lib and perlbrew … the saner way to run perl!

DBIx-Class and database schemas in PostgreSQL

Database schemas are a little like packages in Perl: they provide namespaces. If you have a database with dozens, or even hundreds, of tables, you'd really like to divide them into logical groups.

In PostgreSQL you do it like this:

CREATE SCHEMA <db_schema>;
SET search_path TO <db_schema>;

If you don't create a schema, all your stuff goes into the default schema public.

DBIx::Class knows about db schemas, but not enough to make them work out of the box. Or at least it seems that way. Here's how I did it.

First (well, after creating the database with the db schemas itself, but that's an exercise left to the reader), I created the DBIC classes for the tables with the excellent tool dbicdump (it's installed together with DBIx::Class::Schema::Loader). dbicdump creates the class structure right below your current directory. So I started with cd lib/ and then:

Nice joke in thread about booking.com looking for Perl hackers

It seems that booking.com is looking for Perl programmers, which is discussed here: http://news.ycombinator.com/item?id=1784399

This thread contains a very nice joke (IMO):
16 points by mmaunder 2 days ago

http://jobs.perl.org/ While Erlang and Haskell may get you laid, Perl remains the glue of the web.

21 points by mustpax 2 days ago

Not that I don't know but for the other readers here, how would one get laid with Erlang or Haskell?

12 points by blackdog 2 days ago

quickly and in parallel.

Server Deployment Packaging (2)

Last week I posted about my current experiments in deploying perl applications to our CentOS 5 servers, or rather the first steps of building a perl package along with the required modules.

I am just starting work on testing this all through, when suddenly one of the blocks to using the current stable perl (i.e. 5.12.2) has disappeared: TryCatch is now supported on 5.12.x.

So, although I have some current tests running, I am just in the process of modifying a few parts of the build scripting (mainly down to me missing a couple of local modules from the build), and then a new version based on current stable perl will hit the build systems.

PostgreSQL Conference West 2010

I will be attending the PostgreSQL Conference West 2010 at the Sir Francis Drake hotel in San Francisco from November 2nd to 4th. I'm waiting to hear back from the travel agency to see when I'm flying out of and arriving back at D/FW.

Converting Complex SVN Repositories to Git - Part 2

Initial Import into Git

Creating a mirror

SVN is slow, and git-svn is slower. The amount of network traffic needed by SVN makes everything slow, especially since git-svn needs to walk the history multiple times. Even if I made no mistakes and only had to run the import once, having a local copy of the repository makes the process much faster. svnsync will do this for us:

# create repository
svnadmin create svn-mirror
# svn won't let us change revision properties without a hook in place
echo '#!/bin/sh' > svn-mirror/hooks/pre-revprop-change && chmod +x svn-mirror/hooks/pre-revprop-change
# do the actual sync
svnsync init file://$PWD/svn-mirror http://dev.catalyst.perl.org/repos/bast/
svnsync sync file://$PWD/svn-mirror

Importing with git-svn

Next, we have to import it with git-svn:

Moving to Amsterdam to work for Booking.com

I thought I'd note here, as well as on my blog, that I'll be moving to Amsterdam tomorrow to work for Booking.com. I'm looking forward to the new challenges and getting settled in a new city, as well as meeting and working with some new people.

Excel::Writer::XLSX


I have released a new module to CPAN for writing Excel files in the 2007 XLSX format: Excel::Writer::XLSX

It uses the Spreadsheet::WriteExcel interface but is in a different namespace for reasons of maintainability.

Not all of the features of Spreadsheet::WriteExcel are supported but they will be in time.

The main advantage of the XLSX format over the XLS format for the end user is that it allows 1,048,576 rows x 16,384 columns, if you can see that as an advantage.

From a development point of view the main advantage is that the XLSX format is XML based and as such is much easier to debug and test than the XLS binary format.

It has become increasingly difficult to carve out the time required to add new features to Spreadsheet::WriteExcel. Even something as seemingly innocuous as adding trendlines to charts could take up to a month of reverse engineering, debugging, testing and implementation.

Hopefully the XLSX format will allow for faster, easier test driven development and may entice in some other contributors.

Announcing Marpa 0.200000

Marpa is now at 0.200000. Following a standard rhetoric of version numbers, this indicates that it's an official release and a major step forward, but still alpha. Marpa is a general BNF parser generator -- it parses from any grammar that you can write in BNF. It's based on Earley's algorithm, but incorporates recent advances, so that it runs in linear time for all those grammars parseable by yacc or recursive descent.

The big news with Marpa 0.200000 is Marpa's 3rd generation evaluator. The previous version of Marpa had two evaluators -- one fast, but only good for producing a single parse result, the other capable of dealing with ambiguous grammars, but slower. The 3rd generation has a single evaluator which combines the best of both. Not the least advantage of this change is that it simplifies the documentation and the interface.

Converting Complex SVN Repositories to Git - Part 1

In May and June, I worked on converting the DBIx::Class repository from SVN to Git. I’ve had a number of people ask me to describe the process and show the code I used to do so. I had been somewhat busy with various projects, including working on the web client for The Lacuna Expanse, but I’ve finally had some time to write up a bit about it. The code I used to make the conversion is on my github account, although not in a form meant for reuse.

RHEL and perl

At Jobindex we use Red Hat Enterprise Linux. The OS is very stable and we feel that Red Hat is doing a lot of good stuff for Linux and OSS in general.

When it comes to perl, however, RHEL currently ships with version 5.8.8, which causes a bit of frustration. Some CPAN modules won't install, and it seems like the people who write modules for CPAN don't really care about our (good) old perl version.

At YAPC::EU several of the speakers recommended installing perl ourselves instead of relying on the OS version.

We have now decided to follow this recommendation. At the same time we will also start using git to manage perl and the installed modules to keep the versions in testing and production in sync. This way we will also avoid messing with RPM packages.

So now I am looking forward to getting my hands on perl 5.12.2 :)

Disable a global RT scrip for some queues

We're using RT quite a lot. Today I needed to disable some 'Scrips' in one queue, while keeping them in all other queues. Unfortunately, RT does not support this out of the box. While there is a plugin that seems to implement that feature (RT-Extension-QueueDeactivatedScrips), I decided to fix this without touching RT's innards.

The trick is to add a 'Custom condition' to the global Scrip, which returns false for the relevant queue:

return $self->TicketObj->QueueObj->Name ne 'babilu::support'

Unfortunately, this is not enough (and it took me some testing and manic clicking through RT to figure this out). You also need to change the 'Condition' from whatever the Scrip is using to 'User Defined', and then test for the condition yourself:

Padre 0.72 has been released.

For those of you who may notice such things, you might see that Padre's version number has jumped two numbers since the last stable release.

This development cycle we introduced a new versioning system whereby the odd number 0.71 was the development version with 0.72 the stable release version.

The reason we have done this is to accommodate development and changes to the plugin subsystem during the development cycle. We'll track how this goes and make changes where they need to be made.

For much of this release, and I'll have to admit quite a few before, I've really been off doing other things, but I follow the irc logs for #padre to keep an eye on how things are developing within Padre.

That said, when it came time to roll out this release, the number of fixes and improvements in the Changes file blew me away. Admittedly it was a longer than normal period between releases, but still, there are some serious fixes in 0.72.

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. It is written in Perl, with a graphic design donated by Six Apart, Ltd.