Eating my own dogfood (parallel tests)

I'm relatively pleased with my work on parallel testing for Test::Class::Moose, but I wanted to make sure that it worked with a real-world example, so today I took a small but real test suite, converted it, and tried out my parallel testing code. The results were interesting.

The tests were for a personal project of mine that I've hacked on for a while and they were originally written using Test::Class::Most. The test suite is small and has a total of 24 test classes, 53 test methods and 469 tests. It takes around 8 seconds to run on my box. That's very small, but real.

The code is a standard Catalyst, DBIx::Class, Template and Moose stack. I consistently found that just loading the core modules takes about 0.5 seconds on my iMac and about a second on my MacBook Pro.

$ time perl -MCatalyst -MMoose -MDBIx::Class -MDateTime -e 1

real    0m0.483s
user    0m0.455s
sys 0m0.025s

For 24 test classes, that would add about 12 seconds if I ran them in separate processes. So I expected that my test suite would now take a little over 20 seconds to run.

Boy was I wrong. You see, considering class loading time isn't enough. You also have to consider that when you actually use (not just load) those classes, they do a lot of things internally that don't always need to be done more than once. Here, in a real test suite, running the test classes with separate *.t files ballooned the run time of the test suite from 8 seconds to 57 seconds. In other words, running separate *.t files slowed the test suite by a factor of seven.

(Note: because I don't use inheritance in my code, the above slowdown is not due to accidentally duplicated tests).

After seeing that, I converted the tests to use Test::Class::Moose. I had expected, given the overhead of Moose, to see the tests run a bit slower. That was a concern because I could see people pointing to that and saying "Moose is too slow for production work" (yes, I still hear this). Instead, the test suite ran marginally faster. It wasn't much of a gain, but it was consistently about half a second faster. I was very surprised, not to mention pleased!

Plus, because I could now use roles, some common code I was using was easy to refactor into a role to share a test fixture (the slight runtime gain was with the roles, I might add).

Next, it was time to see how parallel testing did.

I was pretty careful when I designed the original test suite to ensure that it would run in parallel when I could figure out how to make that happen. I did that with the following test control methods in my base class:

sub test_setup {
    my $test = shift;
    $test->app->schema->txn_begin;
}

sub test_teardown {
    my $test = shift;
    $test->app->schema->txn_rollback;
}

With that, every test method will run in its own transaction. It was a simple matter of adding the Test::Class::Moose::Role::Parallel role to my base class and running my tests:

prove -l t/tcm.t :: -j 2 # using Getopt::Long to capture the number of jobs
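The driver itself can be tiny. Here's a minimal sketch of what a t/tcm.t like that might look like; this is not my exact code, and the constructor details are assumptions that have changed across Test::Class::Moose versions (newer releases moved runtests to Test::Class::Moose::Runner):

    # t/tcm.t - minimal sketch of a single-driver runner (illustrative, not exact)
    use strict;
    use warnings;
    use Getopt::Long;
    use Test::Class::Moose::Load 't/lib';    # load every test class under t/lib

    # everything after prove's "::" separator lands in @ARGV
    GetOptions( 'j=i' => \( my $jobs = 1 ) );

    Test::Class::Moose->new( jobs => $jobs )->runtests;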

That only brought the run time of the test suite down to 6 seconds (from 8). For a test suite this small, the benefits of forking off extra processes are marginal, at best. Of course, over time, transaction contention via locked tables may prove to be an issue, too.

I stepped it up to 4 and 8 jobs and got the same results, but with intermittent test failures. Hmm.

I added -v to make my tests verbose and realized my error. One of my classes, TestsFor::App::DBIx::Migration, requires a test database without a certain table. Oops! So I added a noparallel tag to all tests that could not safely run in parallel. For example:

sub test_migrate : Tags(noparallel) {
    my $test = shift;
    my $m    = $test->migrator;

    for my $level (qw/1 2 1 0 2 0/) {
        my $old = $m->version // 0;
        ok $m->migrate($level),
          "We should be able to migrate from level $old to level $level";
        is $m->version, $level,
          '... and have our database at the correct level';
    }
}

And then in my base class, I had to write my own schedule:

with qw(
  Test::Class::Moose::Role::Parallel
  Test::Class::Moose::Role::AutoUse
);
use aliased 'Test::Class::Moose::TagRegistry';

# skip some code

sub schedule {
    my $self   = shift;
    my $config = $self->test_configuration;
    my $jobs   = $config->jobs;
    my @schedule;

    my $current_job = 0;
    my %sequential;

    # distribute parallel-safe test methods across the jobs round-robin;
    # anything tagged 'noparallel' is collected to run in a single group
    foreach my $test_class ( $self->test_classes ) {
        my $test_instance = $test_class->new( $config->args );
        METHOD: foreach my $method ( $test_instance->test_methods ) {
            if ( TagRegistry->method_has_tag( $test_class, $method, 'noparallel' ) ) {
                $sequential{$test_class}{$method} = 1;
                next METHOD;
            }

            $schedule[$current_job] ||= {};
            $schedule[$current_job]{$test_class}{$method} = 1;
            $current_job++;
            $current_job = 0 if $current_job >= $jobs;
        }
    }

    # the noparallel tests all go into a single group at the front of the schedule
    unshift @schedule => \%sequential;
    return @schedule;
}

And then my tests blew up due to a bug in the Test::Class::Moose forks branch. I fixed a couple of issues and pushed the fixes.

Now I can safely run all of my tests in parallel, and as the test suite grows, the win will only get bigger. If a test can't run in parallel, I just add the noparallel tag and forget about it.

Interestingly, chromatic wrote about his preference for one test class per file. I didn't comment there, as he's had to disable comments due to blog spam, so I'll comment here.

chromatic wrote that he prefers separate test drivers per test class because he likes:

  • The ability to run an individual class's tests apart from the entire suite
  • The knowledge that each test's environment is isolated at the process level

For the first, it's because he doesn't like to type this to run an individual test class:

prove -l t/test_class_runner.t :: Name::Of::Class::To::Run

He states that this is laziness and concedes that it's not that big of a deal. For me, with my mappings in vim, I never notice this. I just hit ,t and run the individual class.

His next concern is the more serious one (and is the most valid objection):

Second--and this is more important--I like the laziness of knowing that each individual test class I write will run in its own process. No failing test in another class will modify the environment of my test class. No garbage left in the database will be around in my process's view of the database. Maybe that's laziness on my part for not writing copious amounts of cleanup code to recover from possible failures in tests, but it is what it is.

I can understand that concern, and I wondered why people using JUnit don't seem to worry about this. Then I realized that Java is far less of a dynamic language, so the quick 'n easy hacks we use to just get stuff done are less common.

I don't have to worry about garbage in the database thanks to my use of transactions, and I generally avoid nasty hacks that impact global state. Maybe it's just me, but speeding up my test suite by a factor of seven seems like enough of a win that I'm willing to pay the price. Plus, if my application is naughtily munging global state, running my tests in separate processes is less likely to catch that, while running them in the same process increases the odds of finding it tremendously.

So far, everything here appears to be a huge win. Test::Class::Moose is shaping up to be (in my humble opinion) the best testing framework for Perl. Roles make it easy to share fixtures. Running tests in a single process is a huge win for performance. Running tests in parallel works, but it remains to be seen what the impact will be in the long run.

7 Comments

I'm currently converting a test suite to use TCM, and your transactional approach is pure genius. What happens when there's a transaction within a transaction, e.g. when the application code already uses transactions? Thank you.

Ovid, very nice. I have some questions regarding the use of transactions for your tests and making sure not to step on any other tests. Have you used DBICx::TestDatabase, and if so, is there a reason for using transactions instead? Does using the TestDatabase module add enough time to make it impractical? I'm wondering only because I've started to use it for my projects and I'm curious if I should switch to transactions vs. using an in-memory database.

  • Joel

Very interesting. Thanks for posting this.

Based on experience with a relatively small project, using transactions is faster than recreating the database for every single test. Dropping and recreating a database is also much faster on an SSD or a ramdisk than on an HDD.

For code that already uses transactions, I create a separate test database: all of the regular tests operate on the same database with transactions (at the beginning of each test I start a transaction, and at the end I roll it back), except for the special tests that use transactions themselves. Those each operate on a separate database and don't wrap each test in a transaction/rollback. So far it is working quite well for me.

I have figured it out. MySQL supports nested transactions via savepoints. You can set "auto_savepoint => 1" on the connection to enable that functionality.

More info: https://metacpan.org/pod/DBIx::Class::Manual::Cookbook#Nested-transactions-and-auto-savepoints
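A sketch of what that connection might look like with DBIx::Class (the schema class, DSN, and credentials here are placeholders):

    # Sketch: enable savepoint-based nested transactions in DBIx::Class.
    # My::App::Schema, the DSN, and credentials are placeholders.
    use My::App::Schema;

    my $schema = My::App::Schema->connect(
        'dbi:mysql:database=myapp_test',
        $user, $pass,
        { auto_savepoint => 1 },    # inner transactions become SAVEPOINTs
    );

    # The test fixture's outer transaction...
    $schema->txn_begin;
    # ...and any txn_do in application code now issues a SAVEPOINT
    # instead of trying to start a second real transaction.
    $schema->txn_rollback;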


About Ovid

Freelance Perl/Testing/Agile consultant and trainer. See http://www.allaroundtheworld.fr/ for our services. If you have a problem with Perl, we will solve it for you. And don't forget to buy my book! http://www.amazon.com/Beginning-Perl-Curtis-Poe/dp/1118013840/