Merry Christmas! Parallel testing with Test::Class::Moose has arrived

You'll want to check out the forks branch to see it in action. Read the docs for Test::Class::Moose::Role::Parallel to see how to use it (you'll probably need to create your own schedule).

What follows is a very naïve benchmark in which I reduced a 12-minute test suite to about 30 seconds.

Before we start, here is the script I used to generate the benchmarking material. It generates about 300 test classes. It simulates an average load time of one second for the lib/ directory and, for added fun, puts a one-second sleep in a test in each parent class (the child classes override that test, so they don't experience the slowdown).

use 5.10.1;
use strict;
use warnings;
use autodie ':all';

use File::Path 'make_path';

make_path('lib');
make_path('t/lib/TestsFor');
make_path('tcm');

#
# Create a slow loading class
#
open my $fh, '>', 'lib/SlowLoader.pm';
print $fh <<END;
package SlowLoader;
use strict;
use warnings;
BEGIN { sleep 1 }
1;
END
close $fh;

#
# Create our test base class
#

open my $base_class, '>', 't/lib/MyBaseClass.pm';
print $base_class <<'END';
package MyBaseClass;
use Test::Class::Moose;
with 'Test::Class::Moose::Role::Parallel';
1;
END

#
# Create our test classes and their driver .t files
#
my $module = 'A';
for ( 1 .. 100 ) {
    foreach my $sub ( \&parent, \&child, \&grandchild ) {
        my ( $name, $code ) = $sub->($module);
        my $filename;
        if ( $name  =~ /TestsFor::(.*)/ ) {
            $filename = $1;
            open my $fh, '>', "t/lib/TestsFor/$filename.pm";
            print $fh $code;

            open my $driver, '>', "t/$filename.t";
            print $driver <<"END";
use lib 't/lib';
use $name;
Test::Class::Moose->new->runtests;
END
        }
    }
    $module++;
}

#
# Create a Test::Class::Moose single process driver
#
open my $tcm, '>', 'tcm/tcm_standard.t';
print $tcm <<'END';
use Test::Class::Moose::Load 't/lib';
MyBaseClass->new(
    jobs => ( $ENV{NUM_JOBS} // 0 ),
    statistics => 1
)->runtests;
END

sub parent {
    my $module = shift;
    my $name   = "TestsFor::$module";
    my $code   = <<"END";
    package $name;

    use Test::Class::Moose extends => 'MyBaseClass';
    use SlowLoader;

    sub test_this { sleep 1; ok 1, "test \$_" for 1 .. 5; }
    sub test_that { ok 1, "test \$_" for 1 .. 5; }

    1;
END
    return $name, $code;
}

sub child {
    my $module = shift;
    my $name   = "TestsFor::Child$module";
    my $code = <<"END";
    package $name;

    use Test::Class::Moose extends => 'TestsFor::$module';

    sub test_this { ok 1, "test \$_" for 1 .. 3; }

    1;

END
    return $name, $code;
}

sub grandchild {
    my $module = shift;
    my $name   = "TestsFor::GrandChild$module";
    my $code   = <<"END";
    package $name;
    use Test::Class::Moose extends => 'TestsFor::Child$module';
    1;
END
    return $name, $code;
}

We can run the test suite with prove:

$ prove -l t
t/A.t ............. ok   
t/AA.t ............ ok   
t/AB.t ............ ok   
t/AC.t ............ ok   
t/AD.t ............ ok    
...
t/O.t ............. ok   
t/P.t ............. ok   
t/Q.t ............. ok   
t/R.t ............. ok   
t/S.t ............. ok   
t/T.t ............. ok   
t/U.t ............. ok   
t/V.t ............. ok   
t/W.t ............. ok   
t/X.t ............. ok   
t/Y.t ............. ok   
t/Z.t ............. ok   
All tests successful.
Files=300, Tests=1200, 704 wallclock secs
Result: PASS

So that's almost 12 minutes. As you know from my previous post, there are a lot of duplicated tests in there due to test inheritance.

So now let's run them in a single process, using Test::Class::Moose::Load (note: this program is also created by the script above):

$ prove -l tcm/
tcm/tcm_standard.t .. 301/301 # Test classes:    301
# Test methods:    600
# Total tests run: 2600
tcm/tcm_standard.t .. ok       
All tests successful.
Files=1, Tests=301, 106 wallclock secs
Result: PASS

Not bad! Less than two minutes.

We can do even better with forkprove:

$ forkprove -Ilib -MSlowLoader -j8 t
...
t/Z.t ............. ok   
All tests successful.
Files=300, Tests=1200, 58 wallclock secs
Result: PASS

Unfortunately, forkprove doesn't offer scheduling, a key need of large test suites.

So now let's try again with 8 jobs, using Test::Class::Moose::Role::Parallel features:

$ NUM_JOBS=8 prove -l tcm/
tcm/tcm_standard.t .. ok   
All tests successful.
Files=1, Tests=8, 31 wallclock secs
Result: PASS

Down to half a minute. Awesome!
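
To put the four runs side by side, here is a small sketch that computes each run's speedup over the plain prove run. The wallclock figures are copied from the output above; the labels and the speedups helper are just names made up for this illustration.

```perl
use strict;
use warnings;

# Wallclock seconds copied from the prove/forkprove summaries above.
my @runs = (
    [ 'prove, one .t file per class'    => 704 ],
    [ 'Test::Class::Moose, one process' => 106 ],
    [ 'forkprove -j8'                   => 58 ],
    [ 'Test::Class::Moose, NUM_JOBS=8'  => 31 ],
);

# Hypothetical helper: speedup of every run relative to the first
# (slowest) one, rounded to one decimal place.
sub speedups {
    my @runs     = @_;
    my $baseline = $runs[0][1];
    return map { [ $_->[0], sprintf( '%.1f', $baseline / $_->[1] ) ] } @runs;
}

printf "%-34s %5sx\n", @$_ for speedups(@runs);
```

That works out to roughly 6.6x for the single-process run, 12.1x for forkprove and 22.7x for the 8-job parallel run.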

There are some caveats:

  • You cannot (currently) use Test::Class::Moose reporting features with parallel tests
  • The tests will currently appear to hang until they're finished
  • The scheduler is naïve and you'll probably need to write your own

While those numbers look impressive, it's important to remember that your results are very unlikely to be this good. Once you've forked off processes, you will likely have code fighting for resources (bandwidth, databases, etc.).

Also, you'll probably need to provide your own schedule() method because:

  • Not all tests can be run in parallel
  • Some tests can be run in parallel, but only with a subset of other tests
  • You'll want to distribute long-running methods across separate jobs
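
To make the last point concrete, here is a minimal sketch of one way to spread long-running methods across jobs: sort the methods by estimated run time, longest first, and always hand the next one to the job with the least work so far. Everything in it is an assumption for illustration: the build_schedule name, the "Class::method" keys, the timings and the returned structure are hypothetical, and this is not the format schedule() is required to return (check the Test::Class::Moose::Role::Parallel docs for that). It also ignores the first two points entirely.

```perl
use strict;
use warnings;

# Hypothetical helper: greedy "longest first" distribution.
# $duration_for maps "Class::method" to an estimated run time in seconds.
sub build_schedule {
    my ( $num_jobs, $duration_for ) = @_;
    my @jobs = map { +{ load => 0, methods => [] } } 1 .. $num_jobs;

    # Longest methods first; ties broken by name so the result is stable.
    my @by_length =
      sort { $duration_for->{$b} <=> $duration_for->{$a} || $a cmp $b }
      keys %$duration_for;

    for my $method (@by_length) {

        # Pick the first job with the smallest load so far.
        my $lightest = $jobs[0];
        for my $job (@jobs) {
            $lightest = $job if $job->{load} < $lightest->{load};
        }
        push @{ $lightest->{methods} }, $method;
        $lightest->{load} += $duration_for->{$method};
    }
    return @jobs;
}

# Made-up timings, purely for demonstration.
my @jobs = build_schedule(
    2,
    {   'TestsFor::A::test_this' => 5,
        'TestsFor::B::test_this' => 4,
        'TestsFor::A::test_that' => 3,
        'TestsFor::B::test_that' => 2,
        'TestsFor::C::test_this' => 1,
    }
);
printf "job %d (%ds): %s\n", $_ + 1, $jobs[$_]{load},
  join( ', ', @{ $jobs[$_]{methods} } )
  for 0 .. $#jobs;
```

With these made-up timings the two jobs end up with 8 and 7 seconds of work, instead of the 15 seconds a single process would need.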

For those curious how I pulled this off: this is all subject to wild change but, surprisingly, I didn't have to do any monkey-patching. It works like this:

I use Parallel::ForkManager to create our jobs.
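
The general shape of "one child process per job, with the parent collecting each child's output" can be sketched with nothing but core Perl. Note that this is a stand-in using plain fork and pipes, not Parallel::ForkManager and not the module's actual code; run_jobs is a hypothetical name.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run $code->($job) in one child per job and
# return the text each child produced, ordered by job number.
sub run_jobs {
    my ( $num_jobs, $code ) = @_;
    my %child;    # pid => [ job number, read handle ]

    for my $job ( 1 .. $num_jobs ) {
        pipe( my $reader, my $writer ) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ( $pid == 0 ) {    # child: send the result to the parent
            close $reader;
            print {$writer} $code->($job);
            close $writer;
            POSIX::_exit(0);    # skip END blocks inherited from the parent
        }

        close $writer;          # parent keeps only the read end
        $child{$pid} = [ $job, $reader ];
    }

    my @output;
    for my $pid ( keys %child ) {
        my ( $job, $reader ) = @{ $child{$pid} };
        local $/;               # slurp everything the child wrote
        $output[ $job - 1 ] = <$reader>;
        close $reader;
        waitpid $pid, 0;
    }
    return @output;
}

my @results = run_jobs( 3, sub { my $job = shift; "ok - job $job\n" } );
print @results;
```

Parallel::ForkManager wraps this kind of bookkeeping (and limiting how many children run at once) behind a much smaller interface.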

For each job, I grab the schedule for that job number, and the test_classes and test_methods methods return only the classes and methods in that job's schedule. Then I run only those tests, capturing the output like this:

my $builder = Test::Builder->new;

my $output;
$builder->output( \$output );
$builder->failure_output( \$output );
$builder->todo_output( \$output );

$self->runtests;

# $output contains the TAP
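
If you want to try that capture trick in isolation, here is a self-contained sketch. It assumes a private builder from Test::Builder->create rather than the shared Test::Builder->new singleton used above, so that the demo doesn't interfere with a surrounding test script; capture_tap is a hypothetical helper name.

```perl
use strict;
use warnings;
use Test::Builder;

# Hypothetical helper: run $tests->($builder) against a private builder
# and return everything it emitted (TAP, diagnostics, TODO output).
sub capture_tap {
    my ($tests) = @_;
    my $builder = Test::Builder->create;

    my $output = '';
    $builder->output( \$output );
    $builder->failure_output( \$output );
    $builder->todo_output( \$output );

    $tests->($builder);
    $builder->done_testing;
    return $output;
}

my $tap = capture_tap(
    sub {
        my $builder = shift;
        $builder->ok( 1, 'first test' );
        $builder->diag('a diagnostic');
        $builder->ok( 1, 'second test' );
    }
);
print $tap;
```

Nothing reaches STDOUT or STDERR until the final print; the test results, the diagnostic and the plan all land in the same string.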

Afterwards, if there are any sequential tests, I run them using the above procedure.

All output is assembled using the experimental TAP::Stream module bundled with this one. If it works, I may break it into a separate distribution later. That module allows you to combine multiple TAP streams into a single stream using subtests.
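
For readers who haven't seen subtests at the TAP level, here is a hand-rolled sketch of the idea. It is not the TAP::Stream API: merge_tap is a hypothetical helper, and its pass/fail check is deliberately simplistic. Each captured stream is indented by four spaces, followed by a single top-level ok or not ok line, with one plan at the very end.

```perl
use strict;
use warnings;

# Hypothetical helper: takes [ name, tap_text ] pairs and returns one
# TAP document with each input stream nested as a subtest.
sub merge_tap {
    my @streams = @_;
    my $merged  = '';
    my $number  = 0;

    for my $stream (@streams) {
        my ( $name, $tap ) = @$stream;
        my @lines = split /\n/, $tap;
        $number++;

        # Subtest lines are indented by four spaces.
        $merged .= "    $_\n" for @lines;

        # Simplification: any "not ok" line that isn't a TODO is a failure.
        my $failed = grep { /^not ok\b/ && !/#\s*TODO\b/i } @lines;
        $merged .= ( $failed ? 'not ok' : 'ok' ) . " $number - $name\n";
    }
    return $merged . "1..$number\n";
}

my $combined = merge_tap(
    [ 'job 1' => "ok 1 - a\nok 2 - b\n1..2\n" ],
    [ 'job 2' => "ok 1 - c\nnot ok 2 - d\n1..2\n" ],
);
print $combined;
```

A harness counts only the top-level lines, so two merged streams show up as two tests. That would be consistent with the 8-job run above reporting Tests=8.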

Then I simply print the resulting combined TAP to the current Test::Builder output handle (defaults to STDOUT) and prove can read the output as usual.

Note that because we're merging the regular output, failure output, and TODO output into a single stream, there could be side effects if your failure output or TODO output resembles TAP (and doesn't have a leading '#' mark to indicate that it should be ignored).

Have fun and let me know what you think!

1 Comment

Cool beans. Maybe a next level parallelism can be achieved by distributing tests via Gearman.

About Ovid

Freelance Perl/Testing/Agile consultant and trainer. See http://www.allaroundtheworld.fr/ for our services. If you have a problem with Perl, we will solve it for you. And don't forget to buy my book! http://www.amazon.com/Beginning-Perl-Curtis-Poe/dp/1118013840/