January 2010 Archives

Test::Class::Most

I'm really tired of boilerplate. In fact, I hate it so much that I can't stand when I write this:

package Some::Test::Class;
use strict;
use warnings;
use base 'My::Test::Class';
use Test::More;
use Test::Exception;

Of course, you already know about Test::Most and Modern::Perl, so you could reduce it to this:

package Some::Test::Class;
use base 'My::Test::Class';
use Modern::Perl;
use Test::Most;

But that's still boilerplate. So here's what I've just uploaded:

package Some::Test::Class;
use Test::Class::Most parent => 'My::Test::Class';

That gets you strict, warnings, all test functions from Test::Most and, if you have 5.10 or better, all the modern features of Modern::Perl. It (reluctantly) supports multiple inheritance (pass an arrayref of class names as the value to 'parent') and if you don't want the modern features, you can do this:

package Some::Test::Class;
use Test::Class::Most parent => 'My::Test::Class', feature => 0;

To create your own Test::Class base class (which inherits directly from Test::Class), just don't specify an import list:

package My::Test::Class;
use Test::Class::Most;

The documentation includes links to my Test::Class tutorials for those not familiar with it:

  1. Organizing your test suites
  2. Reusing test code
  3. Making Test::Class easier to use
  4. Test control methods
  5. Final tips and summary

Test::Class::Most is not quite as pretty as Piers Cawley's lovely Test::Class::Sugar, but it has far less magic. I think the docs are clear, but if they confuse you, see the Test::Class tests I ship with it for a good example. Hopefully it will make your test classes much more pleasant to write.

Testing with PostgreSQL

I've been working on a personal project lately and I decided that, amongst other things, I was going to use PostgreSQL. Some of you may recall that I had an interesting testing strategy for MySQL. The basic idea is that I don't want to tear down and rebuild the database for every test. Truncating a table is generally much faster than dropping and recreating it. However, if I leave the database up, how do I guarantee it's always in a pristine state? One way is to use transactions and always roll them back at the end of a test. That means, amongst other things, that I can't easily test "commit". You can make it work with nested transactions (if your database supports them), but "rollback" can cause issues.

There's also the problem that by breaking "commit", you're altering the behavior of your code somewhat. Plus, if you have more than one process, unless you can share the database handle, separate processes can't see what's happening in another's transaction.

My strategy is not one that everyone is comfortable with, but I prefer to track the changes to the database and simply truncate tables which have changed, possibly restoring the "static" data which some tables need to have when the app is launched. Making this work with PostgreSQL really helped me to relearn a lot of things I had forgotten about this excellent database. Here's the full code, with some interesting goodies you may not have expected (plus some hacks I need to fix at some point).
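As a rough illustration of the bookkeeping described above, here is a minimal sketch in plain Perl. The package name Hypothetical::DirtyTables, its methods and the sample tables are all assumptions made for this example and are not the code linked above. It only builds the SQL strings and leaves running them (via DBI, say) to the caller.

```perl
use strict;
use warnings;

{
    # Hypothetical helper: remembers which tables a test touched.
    package Hypothetical::DirtyTables;

    sub new {
        my ( $class, %args ) = @_;
        return bless {
            changed => {},
            static  => $args{static} || {},   # table => [ SQL restoring static rows ]
        }, $class;
    }

    # Record that one or more tables were modified.
    sub mark_changed {
        my ( $self, @tables ) = @_;
        $self->{changed}{$_} = 1 for @tables;
        return $self;
    }

    # Return the SQL needed to get back to a pristine state,
    # then forget the recorded changes.
    sub cleanup_sql {
        my $self   = shift;
        my @tables = sort keys %{ $self->{changed} };
        return unless @tables;

        my @quoted;
        push @quoted, qq{"$_"} for @tables;
        my @sql = ( 'TRUNCATE TABLE ' . join( ', ', @quoted ) );

        # Reload the static rows only for the tables we just emptied.
        push @sql, @{ $self->{static}{$_} || [] } for @tables;

        $self->{changed} = {};
        return @sql;
    }
}

# Example data (assumed): "roles" needs one static row after every cleanup.
my $tracker = Hypothetical::DirtyTables->new(
    static => { roles => [q{INSERT INTO roles (name) VALUES ('admin')}] },
);
$tracker->mark_changed(qw/users roles/);
my @statements = $tracker->cleanup_sql;
print "$_;\n" for @statements;
```

Untouched tables are never truncated, which is where the time saving over a full rebuild comes from. How the changes get recorded in the first place (triggers, a wrapper around the database handle, or something else) is deliberately left open here.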

Perl en France?

(Pardonnez mon mauvais français). Ma belle-mère (presque) visite Londres cette semaine. Elle aimerait que Leïla (ma fiancée) habiter en France et me demanda s'il serait difficile pour moi de trouver un emploi en France. J'avais l'impression que Perl n'est pas très populaire en France et je lui ai expliqué que, sans parler couramment le français, il serait difficile pour moi d'obtenir un emploi en France dans un autre langage que Perl. Pour Perl, il semble qu'il ya très peu d'emplois à Perl en France. Est-ce vrai?

Nous n'avons aucune intention de quitter le Royaume-Uni, mais je suis curieux de connaître l'état de Perl en France.


My future mother-in-law is visiting this week. She wants Leila to move back to France and asked me if it would be difficult for me to find a job in France. I was under the impression that Perl is not very popular in France and found myself trying to explain that without speaking French fluently, it would be difficult for me to get a job in France in a programming language other than Perl. As for Perl, it seems that there are very few Perl jobs in France. Is this true?

We are not planning on leaving the UK, but I am curious about the state of Perl in France.

Dear Lazyweb: 3D Graphing Software?

Assuming I have an undirected graph of 3D points, what (free) Javascript or Flash software would you recommend for plotting them (along with their connections) in a browser? I vaguely need an end result like this 3d scatter plot, but with fewer (87) points and a similar number of connections.

It would be nice if it was cross-browser compatible and didn't require a Flash authoring tool :)

Roles without Moose?

I'm on a new team at the BBC. On the previous team, PIPs, we gathered BBC programme data for television and radio. The rest of the BBC could use PIPs to pull schedules, get information about Doctor Who (note, that's "Doctor", not "Dr."!) or understand how a radio programme is broken down into segments which might be rebroadcast on a different programme. The work was complex, but fun. If our system went down, large parts of the BBC wouldn't be able to update their programme data.

On the new team, Dynamite, it's a different story. If we go down, large parts of the BBC's online presence go down. Ever visit www.bbc.co.uk/iplayer/? That's ours. Given that the BBC is one of the most heavily trafficked web sites in the world, we have to worry about performance. We count milliseconds. As a result, the team I'm on now doesn't use Moose. Ah, but you tell me Moose is fast now! Yes, Moose is fast and most of its performance issues are in the startup, not in the runtime. I'll agree with you on this, but look at this benchmark:

#!/usr/bin/env perl

{
    package Foo::Moose;
    use Moose;
    has bar => (is => 'rw');
    __PACKAGE__->meta->make_immutable;
}
{
    package Foo::Manual;
    sub new { bless {} => shift }
    sub bar {
        my $self = shift;
        return $self->{bar} unless @_;
        $self->{bar} = shift;
    }
}
my $foo1 = Foo::Moose->new;
sub moose {
    $foo1->bar(32);
    my $x = $foo1->bar;
}
my $foo = Foo::Manual->new;
sub manual {
    $foo->bar(32);
    my $x = $foo->bar;
}
use Benchmark 'timethese';

print "Testing Perl $]\n";
timethese(
    1_500_000,
    {
        moose  => \&moose,
        manual => \&manual,
    }
);

Sample output:

Testing Perl 5.010001
Benchmark: timing 1500000 iterations of manual, moose...
    manual:  2 wallclock secs ( 1.86 usr +  0.00 sys =  1.86 CPU) @ 806451.61/s (n=1500000)
     moose:  1 wallclock secs ( 1.93 usr +  0.00 sys =  1.93 CPU) @ 777202.07/s (n=1500000)

No matter how many times I run this, we see the manual output only a hair faster than Moose. Of course, we had to avoid constructing the object in this benchmark. Otherwise, we see that object construction in Moose is slow:

Benchmark: timing 1500000 iterations of manual, moose...
    manual:  5 wallclock secs ( 4.43 usr +  0.01 sys =  4.44 CPU) @ 337837.84/s (n=1500000)
     moose:  6 wallclock secs ( 7.40 usr +  0.00 sys =  7.40 CPU) @ 202702.70/s (n=1500000)

(Look at the @$num/s figures).

That's not fair, though, because you construct an object once and then do lots of things with it. That being said, Moose offers so many benefits that our tiny, tiny performance hit is worth it, isn't it? Look at the original code and you'll see that we're not really taking advantage of Moose, so let's add a type check.

{
    package Foo::Moose;
    use Moose;
    has bar => (is => 'rw', isa => 'Int');
    __PACKAGE__->meta->make_immutable;
}

And the benchmark:

Benchmark: timing 1500000 iterations of manual, moose...
    manual:  1 wallclock secs ( 1.88 usr +  0.00 sys =  1.88 CPU) @ 797872.34/s (n=1500000)
     moose:  6 wallclock secs ( 5.14 usr +  0.00 sys =  5.14 CPU) @ 291828.79/s (n=1500000)

Oops. If we actually try to take advantage of the features of Moose, we still take a serious performance hit. For most people this won't matter. Ah, but you argue that I should have that type checking and you're right, but in reality, much Perl code deep in a system doesn't have type checking. But let's add a quick check of our own, just to be more fair.

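    use Carp 'croak';    # croak() is exported by Carp

    sub bar {
        my $self = shift;
        return $self->{bar} unless @_;
        # =~ binds tighter than +, so the "0+" never touched the value being matched
        croak "Need int, not ($_[0])" unless $_[0] =~ /^\d+$/;
        $self->{bar} = shift;
    }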

That's not a great check, but it's better than many people provide. Here's the benchmark:

Benchmark: timing 1500000 iterations of manual, moose...
    manual:  2 wallclock secs ( 3.35 usr +  0.00 sys =  3.35 CPU) @ 447761.19/s (n=1500000)
     moose:  4 wallclock secs ( 5.20 usr +  0.00 sys =  5.20 CPU) @ 288461.54/s (n=1500000)

Again, with carefully crafted code, we can outperform Moose, but we still don't get Moose's flexibility. This will matter to very few people and unless you have a very clear reason, don't skip Moose just for this. Regrettably, our millisecond response times mean that we have a problem.

That problem, in this case, is multiple inheritance. As with many codebases that evolve over time, lots of programmers have had a chance to "improve" the system and I'm seeing a lot of multiple inheritance. I'm seeing classes which have five parents! Running Class::Sniff over them is showing quite a few issues and it's clear from even a cursory examination that this MI is for sharing behavior.

Sharing behavior is exactly what roles are for. So if we're concerned about the overhead of Moose, what options do we have? I've deprecated Class::Trait. Is it time to resurrect (and benchmark) it? Mouse seemed promising, but initial benchmarks with the above code showed its getters/setters running slightly slower than Moose's getters/setters! We can take a performance hit on load, but on runtime, we have to be careful.

Maybe we need a very lightweight:

use role 
  'Does::Serialization',
  'Does::TitleSearch',
  'Does::IdMatching' => { excludes => 'some_method' };

No runtime application would be allowed. There would be no introspection beyond DOES. Multiple "use role" in the same class would fail (this solves a few problems I won't go into now). No Moose, Mouse or anything else would be required. Better suggestions are welcome. I'll guess that I could use Moose without accessors and without inlining constructors and take advantage of roles that way. Sounds better, but more benchmarking is needed.
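To make the idea a little more concrete, here is a minimal sketch of compile-time role composition in plain Perl with no Moose or Mouse. Everything in it is an assumption for illustration: the apply_roles helper, the Episode class and the method bodies are hypothetical, and a real "use role" pragma would do this work from its import method instead of an explicit BEGIN block.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the subs of each role package into $class.
# A role may be followed by an options hashref: { excludes => 'name' } or [ ... ].
sub apply_roles {
    my ( $class, @spec ) = @_;
    my ( %provided, %does );
    no strict 'refs';
    while (@spec) {
        my $role = shift @spec;
        my $opts = ref( $spec[0] ) eq 'HASH' ? shift @spec : {};
        my $ex   = $opts->{excludes};
        my @ex   = ref($ex) ? @$ex : defined($ex) ? ($ex) : ();
        my %skip;
        $skip{$_} = 1 for @ex;

        for my $name ( sort keys %{"${role}::"} ) {
            next if $skip{$name};
            my $code = *{"${role}::$name"}{CODE} or next;
            die "Method conflict: '$name' is in both $provided{$name} and $role\n"
              if $provided{$name};
            next if defined &{"${class}::$name"};    # the class's own method wins
            *{"${class}::$name"} = $code;
            $provided{$name} = $role;
        }
        $does{$role} = 1;
    }

    # The only introspection offered: DOES.
    *{"${class}::DOES"} = sub {
        my ( $self, $what ) = @_;
        return $does{$what} || $self->isa($what) ? 1 : 0;
    };
    return;
}

{
    package Does::Serialization;
    sub serialize {
        my $self = shift;
        return join ',', map { $_ . '=' . $self->{$_} } sort keys %$self;
    }
}
{
    package Does::IdMatching;
    sub matches_id  { $_[0]{id} == $_[1] }
    sub some_method {'unwanted'}
}
{
    package Episode;    # hypothetical consumer class
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }

    # Composition happens once, at compile time.
    BEGIN {
        main::apply_roles(
            __PACKAGE__,
            'Does::Serialization',
            'Does::IdMatching' => { excludes => 'some_method' },
        );
    }
}

my $episode = Episode->new( id => 7 );
print $episode->serialize, "\n";
print "can match ids\n" if $episode->matches_id(7);
```

Because the methods are plain subs copied into the class's symbol table, a method call costs the same as a hand-written one. The price is that conflicts are only detected by name, and anything cleverer (required methods, aliasing) would have to be added by hand.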

Post!

I really wish more people would post here. I don't want this to be Ovid + the rest of the Perl community. I feel bad about the number of posts I make because I feel like I'm almost dominating this blog.

In other news, I've discovered that using GraphViz and star charts is harder than I thought.

Things on my "not" todo list

A silly Facebook conversation inspired the following:

package UNIVERSAL::Don't; # Because putting things in UNIVERSAL is fun!'

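sub don::t {
    my ( $proto, $method ) = @_;
    # ref() returns '' (which is defined) for a class name, so use ||, not //
    my $class = ref $proto || $proto;
    my $code  = $proto->can($method) or return;
    no strict 'refs';
    no warnings 'redefine';
    *{"$class\::$method"} = sub {};    # replace the method with a no-op
    return $code;
}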

Just think of how much fun you could have with that ;)

And even though the single quote mark package separator is deprecated, it will be at least a decade before a more strongly worded deprecation notice is added. This, oddly, was inspired by the fact that today at work, I wrote the following package: Don't::Put::Me::In::t::lib:Unless::You::Want::To::Drag::The::Test::Suite::To::A::Screaming::Halt.

And for some reason, this blog is cutting off the rest of that package name, so here it is, broken up: Don't::Put::Me::In::t::lib:Unless::You::Want:: To::Drag::The::Test::Suite::To::A::Screaming::Halt

Committed to Testing (by accident)

I recently moved to a new BBC team (if we go down, iPlayer goes down too) and am getting used to working in a different environment with different tools. One issue is that certain tools I'm used to working with are not available (Test::Most being the most noticeable) and getting them added to all environments is a bit bureaucratic (all things considered, shipping new tools should be treated with more caution at many shops than it tends to be). Thus, when I accidentally merged to trunk with a test which contained Test::Most, I was embarrassed.

That's when the rather obvious answer occurred to me. I don't want to accidentally commit use of that module, but it's easy to do. Thus, here's how I do it safely:

BEGIN {
  eval <<'  END';
  use Carp::Always;
  use Test::Most 'die';
  END
}

I won't be happy if I commit that, but at least it won't break anyone's build. It just won't alter the behavior for those who don't have these modules installed.
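The same idea can be wrapped in a small helper that also reports whether a module was found. This is only a sketch: load_if_available is a hypothetical name, and it loads at runtime with require rather than at compile time with use, so it would need to be called from a BEGIN block to behave like the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module if it is installed, otherwise carry on.
# Returns 1 if the module was loaded and 0 if it was not.
sub load_if_available {
    my ( $module, @import_args ) = @_;

    # require with a string needs a file path, not a package name.
    ( my $file = "$module.pm" ) =~ s{::}{/}g;

    my $loaded = eval {
        require $file;
        $module->import(@import_args);    # imports into this helper's package (main)
        1;
    };
    return $loaded ? 1 : 0;
}

# List::Util ships with Perl; the second name is assumed not to exist.
for my $module ( 'List::Util', 'Hypothetical::Module::Not::Installed' ) {
    my $status = load_if_available($module) ? 'loaded' : 'skipped';
    print "$module: $status\n";
}
```

The trailing 1 inside the eval block is what distinguishes success from failure, since a failed require dies and leaves the result undefined.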

Unless what?

You know, I really would love to have a heart-to-knife conversation with developers who use complex unless conditions. While trying to tease apart some code, I came across this gem:

unless ( $blocklist->has_block_with_id( $db_obj->id )
      || ( ! $allow_duplicates 
        && exists $episode_ids->{$db_obj->version->episode_id} ) ) {

To be fair, code grows over time and it's easy to understand how issues like this crop up (and this is a fairly old bit of code), but it took me a while to make sure I understood this. My naive conversion of this to an if statement failed. I had to reduce it down to its simplest components and do a truth table:

unless ( x || ( !y && z ))

     x y z    x  !y&&z  x||(!y&&z)  runs
     0 0 0    0    0        0         1
     0 0 1    0    1        1         0
     0 1 0    0    0        0         1
     0 1 1    0    0        0         1
     1 0 0    1    0        1         0
     1 0 1    1    1        1         0
     1 1 0    1    0        1         0
     1 1 1    1    0        1         0

And reversing that gives us:

if ( !$x && ( $y || !$z ) )

     x y z    !x  y||!z  !x&&(y||!z)  runs
     0 0 0    1     1         1         1
     0 0 1    1     0         0         0
     0 1 0    1     1         1         1
     0 1 1    1     1         1         1
     1 0 0    0     1         0         0
     1 0 1    0     0         0         0
     1 1 0    0     1         0         0
     1 1 1    0     1         0         0

Or:

if (
  ! $blocklist->has_block_with_id( $db_obj->id )
  && ( $allow_duplicates
    || !exists $episode_ids->{ $db_obj->version->episode_id } )
  )
{

I know that many of you would have no problem breaking down the conditional and have De Morgan's Law encoded in your DNA, but I'm not that smart. I still have to do it the hard way.

Please, take pity on us mere mortal programmers and don't use complex "unless" conditions!
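For those of us who don't trust our own truth tables, the conversion can also be checked by brute force. The following sketch runs both forms over all eight combinations of x, y and z; the sub names are made up for this example.

```perl
use strict;
use warnings;

# The original shape: the body runs unless ( x || ( !y && z ) ).
sub runs_with_unless {
    my ( $x, $y, $z ) = @_;
    my $ran = 0;
    unless ( $x || ( !$y && $z ) ) { $ran = 1 }
    return $ran;
}

# The rewritten shape: the body runs if ( !x && ( y || !z ) ).
sub runs_with_if {
    my ( $x, $y, $z ) = @_;
    my $ran = 0;
    if ( !$x && ( $y || !$z ) ) { $ran = 1 }
    return $ran;
}

# Try every combination of three booleans and collect any disagreement.
my @mismatches;
for my $bits ( 0 .. 7 ) {
    my ( $x, $y, $z ) = ( ( $bits >> 2 ) & 1, ( $bits >> 1 ) & 1, $bits & 1 );
    push @mismatches, "$x$y$z"
      if runs_with_unless( $x, $y, $z ) != runs_with_if( $x, $y, $z );
}

print @mismatches
  ? "Mismatch for: @mismatches\n"
  : "Both forms agree for all 8 combinations\n";
```

With only three inputs the exhaustive check is instant, and it makes a handy throwaway test when untangling a condition like this one.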

Dear Recruiters

To any and all recruiters who might stumble across this blog entry, here's a useful tip: be courteous.

I'm not looking for work, but I certainly understand why you contact me on LinkedIn or via referrals. That's how you make your money and while some people I know loathe any contact from you, I shrug it off as a fact of life. I am contacted by enough recruiters that I don't remember all of you, but there is one group of you I strive to remember. The email exchange almost always goes like this:

You: I have a great opportunity that I think is perfect for you.
Me: I'm not looking for work, but if I were, here are my minimum requirements.
You: ... crickets chirping ...

Excuse me? I admit the fact that I'm not looking for work is not appealing to you. I also admit the fact that the "great opportunity" as a junior developer doesn't fit me. I don't, however, accept that you can initiate a conversation and then not so much as acknowledge my reply once you realize that the huge pay cut you tried to entice me with isn't enticing.

I do remember your name and when you contact me again, I'll remind you why I won't do business with you.

Next QA Hackathon -- What Do You Need?

So I hear that the next Perl QA Hackathon will be in Vienna. What should we accomplish? The following is not complete as I was so focused on the areas I was working on that I really didn't follow the other areas.

In the first QA Hackathon, in Oslo, we nailed down a bunch of issues we'd like to see in TAP. We clarified part of the spec and started work on tests for TAP itself. (And convinced Nadim Khemir to release App::Asciio).

The second QA Hackathon, in Birmingham, UK, saw the creation of nested TAP (i.e., subtests).

The third one, I think, should result in either better parsing of nested TAP or shoe-horning structured diagnostics into TAP.

Another possibility is to do something really, really awful and take the most popular Perl testing modules and manually register all of their testing functions. By registering a function, we can better associate a diagnostic with a given test. This would also allow much cleaner behavior on Test::Most's 'die' and 'bail' on fail behaviors. It's an internal hack which should be invisible to most people writing tests, but if you're writing test modules, it makes a lot of sense to be explicit about what your testing functions are.

What's important, though, is what you want to see produced. If there's consensus, maybe shifting priorities would be good?

Run Individual Test::Class Methods via Vim

A couple of years ago I had a simplistic way to run Test::Class methods on my use.perl blog. Unfortunately, it littered the test class with $ENV{TEST_METHOD} assignments. I should have fixed that. Here's a better version:

noremap <buffer> <leader>tm ?^sub.*:.*Test<cr>w"zye:!TEST_METHOD='<c-r>z' prove -v %<cr>

With that, if your cursor is inside of a test method, typing ",tm" (without the quotes and assuming that a comma is your leader), will run just that test method.

If you can't run your Test::Class classes directly, see my in-depth tutorial on Test::Class.

Most Popular Testing Modules - January 2010

Back in September 2008, I had a list of the most popular testing modules on the CPAN. I created this list because I was writing Test::Most and I needed to know them. Tonight, after hearing that the next Perl-QA hackathon is in Vienna, I thought about what I might want to accomplish and decided to see if the most popular modules had changed. In 2008, out of 373 test modules, we had the following top 20:

 1  Test::More                          69396
 2  Test                                10912
 3  Test::Exception                     2314
 4  Test::Simple                        962
 5  Test::Base                          610
 6  Test::NoWarnings                    386
 7  Test::Builder::Tester               308
 8  Test::Deep                          254
 9  Test::Pod                           223
10  Test::Warn                          217
11  Test::Differences                   213
12  Test::MockObject                    187
13  Test::Pod::Coverage                 157
14  Test::Builder                       131
15  Test::WWW::Mechanize::Catalyst      118
16  Test::XML                           113
17  Test::Block                         112
18  Test::Perl::Critic                  110
19  Test::Distribution                  107
20  Test::SQL::Translator               101

In January 2010, out of 493(!) Test:: modules, we have the following top 20:

 1  Test::More                          86194
 2  Test                                10789
 3  Test::Exception                     3548
 4  Test::Base                          1006
 5  Test::Simple                        962
 6  Test::NoWarnings                    884
 7  Test::Deep                          413
 8  Test::Warn                          350
 9  Test::Pod                           335
10  Test::Builder::Tester               322
11  Test::Differences                   311
12  Test::Pod::Coverage                 277
13  Test::Perl::Critic                  256
14  Test::MockObject                    205
15  Test::Builder                       203
16  Test::Most                          170
17  Test::Block                         169
18  Test::WWW::Mechanize::Catalyst      162
19  Test::Distribution                  150
20  Test::Kwalitee                      143

That's when I noticed something very interesting. My Test::Most module, the one I created this list for, was already in the number 16 slot (it's currently at the number 13 slot if you ignore modules which are in the Test:: namespace but don't export test functions)! That's when my failure dawned on me. Given that people who use Test::Most are more serious about testing, it's reasonable to assume that they write more tests, thus artificially inflating my stats. I was counting the number of times the modules were being used. However, if I write, say, an extra 160 test programs for my modules, each of which uses Test::Most (or indeed, just type use Test::Most 160 times in a single module), then I could push my module into the top ten. So I reran my stats per distribution:

1   Test::More                          14111
2   Test                                 1736
3   Test::Exception                       744
4   Test::Simple                          331
5   Test::Pod                             328
6   Test::Pod::Coverage                   274
7   Test::Perl::Critic                    248
8   Test::Base                            228
9   Test::NoWarnings                      155
10  Test::Distribution                    142
11  Test::Kwalitee                        138
12  Test::Deep                            128
13  Test::Warn                            127
14  Test::Differences                     102
15  Test::Spelling                        101
16  Test::MockObject                       87
17  Test::Builder::Tester                  84
18  Test::WWW::Mechanize::Catalyst         79
19  Test::UseAllModules                    63
20  Test::YAML::Meta                       61
21  Test::Synopsis                         57
22  Test::Compile                          56
23  Test::Portability::Files               54
24  Test::Most                             49

OK, so I dropped down to a more reasonable 24th place. But a quick check of CPANTS showed that I wasn't finding as many uses of Test::Most as I should because there were cases where:

There was also the case where it turns out that Robert Krimen is a huge fan of Test::Most (Hi Robert!) and has enough modules out there to artificially inflate my stats. So here's yet another list, showing test modules per author:

1   Test::More                          2669
2   Test                                 685
3   Test::Exception                      260
4   Test::Simple                         161
5   Test::Pod                             76
6   Test::Warn                            66
7   Test::Perl::Critic                    65
8   Test::Deep                            60
9   Test::Pod::Coverage                   55
10  Test::NoWarnings                      51
11  Test::Differences                     50
12  Test::Builder::Tester                 49
13  Test::MockObject                      49
14  Test::Base                            47
15  Test::WWW::Mechanize::Catalyst        36
16  Test::Builder                         27
17  Test::Kwalitee                        26
18  Test::MockObject::Extends             24
19  Test::Distribution                    21
20  Test::Output                          19
21  Test::XML                             17
22  Test::Moose                           16
23  Test::Harness                         16
24  Test::LongString                      15
25  Test::Tester                          13
26  Test::Spelling                        12
27  Test::MockModule                      11
28  Test::UseAllModules                   10
29  Test::TCP                             10
30  Test::Most                            10

That seems a touch more reasonable, though I confess I'd like to see more folks using this module. It really does make your testing life simpler.
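The difference between the three lists above comes down to what counts as one "use". Here is a small sketch of that idea with made-up data; the authors, distributions and the count_by helper are hypothetical and are not the script that produced the numbers above.

```perl
use strict;
use warnings;

# Hypothetical sample data: [ author, distribution, test file, module used ].
my @uses = (
    [ 'ALICE', 'Foo', 't/a.t', 'Test::Most' ],
    [ 'ALICE', 'Foo', 't/b.t', 'Test::Most' ],
    [ 'ALICE', 'Bar', 't/a.t', 'Test::Most' ],
    [ 'ALICE', 'Bar', 't/b.t', 'Test::Most' ],
    [ 'ALICE', 'Foo', 't/c.t', 'Test::More' ],
    [ 'BOB',   'Baz', 't/a.t', 'Test::More' ],
    [ 'CAROL', 'Qux', 't/a.t', 'Test::More' ],
);

# Count each module once per file, per distribution or per author.
sub count_by {
    my ( $level, @rows ) = @_;
    my ( %seen, %count );
    for my $row (@rows) {
        my ( $author, $dist, $file, $module ) = @$row;
        my $unit =
            $level eq 'author' ? $author
          : $level eq 'dist'   ? "$author/$dist"
          :                      "$author/$dist/$file";
        $count{$module}++ unless $seen{$module}{$unit}++;
    }
    return \%count;
}

for my $level (qw/file dist author/) {
    my $count  = count_by( $level, @uses );
    my @ranked = sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count;
    print "per $level: ", join( ', ', map { "$_ ($count->{$_})" } @ranked ), "\n";
}
```

In this toy data one prolific author puts Test::Most on top when counting files, while the per-author count puts it last, which is the same effect seen in the real lists.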

And as for the beautiful Test::Class which more people should be using:

  • 95 test programs (25th place)
  • 19 distributions (42nd place)
  • 9 authors (33rd place)

Come on, folks! You can do better than that, right?

Test::Class and FindBin

I'm on a new team at the BBC and I was rather curious to note that I couldn't run the Test::Class tests by simply doing a prove -lv t/lib/Path/To/Test.pm. A bit of research reveals the culprit is FindBin, a module I've never been terribly happy with. Seems we have configuration information located relative to the $FindBin variable that module sets.

package Dynamite::Test::FindBin;

my $EXECUTABLE;
BEGIN { 
  $EXECUTABLE = $0;
  # point it at the test directory if we're not there
  if ( $0 !~ m{^t/[^./]+\.t$} ) {

    # FindBin requires a real filename
    $0 = ( glob('t/*.t') )[0];    # glob takes shell wildcards, not a regex
  }

  # In case something does a "use FindBin" before we get here
  FindBin->again if exists $INC{'FindBin.pm'};
}
use FindBin;
BEGIN { $0 = $EXECUTABLE };

1;

Seems FindBin uses $0 (not surprising) to do its magic. Regrettably, that assumption makes it hard to run a test class directly. This also explained why the test classes required driver *.t files, something which drives me up a wall. Now they're no longer needed, but I confess I feel like I need to take a shower and scrub myself really, really clean.

Cool Things in Perl 6

Yeah, Perl 6 is going to allow us to do very interesting things. Given this code:

use v6;

subset Filename of Str where { $_ ~~ :f };

sub foo (Filename $name) {
    say "Houston, we have a filename: $name";
}

my Filename $foo = $*EXECUTABLE_NAME;
foo($foo);
foo($*EXECUTABLE_NAME);
foo('no_such_file');

We get this output:

Houston, we have a filename: /Users/ovid/bin/perl6
Houston, we have a filename: /Users/ovid/bin/perl6
Constraint type check failed for parameter '$name'
in Main (file src/gen_setting.pm, line 324)

Perl 6 Config::INI Improvements

I've renamed the Config::INI repository to something more sensible (since no one had watched/forked it) and now you can write out Config::INI files.

my Config::INI $config .= new;
my %properties = (
    a => 'b',
    c => 'd',
);
my %next = (
    one   => 'two',
    three => 'four',
);
$config.add_properties(:%properties);
$config.add_properties(properties => %next, name => 'next');
$config.write($ini_filename);

It needs more work (primarily solid error checki…

Maintaining State in Perl 6 Grammars?

I've been curious how one would match the following with Perl 6 grammars/regexen (spaces represented by dots)?

    XXXX
    ..XXXX
    ....XXXX
    ....XXXX
    ......XXXX
    ..XXXX

In short, you can indent or unindent by a particular amount, with allowed indentation levels being multiples of the first indentation amount found. Each level of indentation must equal the previous or be one greater. You can unindent as much as you want so long as it's a multiple of that first indentation level. Seems like you need to have a recursive grammar that allows you …

About Ovid

Have Perl; Will Travel. Freelance Perl/Testing/Agile consultant. Photo by http://www.circle23.com/. Warning: that site is not safe for work. The photographer is a good friend of mine, though, and it's appropriate to credit his work.