Speeding Up Perl Test Suites & Test2::Aggregate

I gave a talk at TPC 2019 based on my experience speeding up the Perl test suite at the room/roommate finding service SpareRoom, which also served as an introduction to the then newly released Test2::Aggregate. The talk was a bit too dense: I had prepared a pretty packed 20-minute presentation, only to realize a couple of days before (newbie speaker) that I had just 15 minutes of real time, excluding the Q&A. So some attendees asked me to put up a blog post with the notes and, especially, more about Test2::Aggregate, which is why I am writing this. I will try to give a longer and more detailed talk on the subject at one of the Perl conferences this summer.

In any case, the talk is up on YouTube and gives an overview of the lessons learned and techniques used while making our frustratingly slow 20-minute test suite almost 10x faster, which made a huge difference to our dev process.

The slide deck is available here.

Test2::Aggregate

One of the most interesting parts of the presentation seemed (judging from the reactions during the live demo) to be the introduction of the Test2::Aggregate module. While trying to optimize our test suite I wrote a little module that loaded unit tests as subtests of a single test, in order to easily profile them together. When I noticed how much faster it was to run them like that, even compared to a Test2::Harness with preloading, I looked into modules that did this. Unfortunately, Test::Aggregate tried to do a bit too much and was not compatible with most tests and the Test2 framework, so I went with my own simple solution to use at SpareRoom and then published it.

It is straightforward to use; the simplest way is to pass a list of files/directories:

use Test2::Aggregate;

my $stats = Test2::Aggregate::run_tests(
    dirs => ['t']
);
which runs the tests and returns a hashref with the individual test results:
$stats = {
  'test.t' => {
    'test_no'   => 1,
    'pass_perc' => 100,
    'timestamp' => '20190705T145043',
    'time'      => '0.1732'
  }
};
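As a rough illustration of what you can do with that hashref, here is a minimal sketch that picks out the tests that did not fully pass and orders everything by run time. The sample data (file names and numbers) is made up; only the key names come from the output shown above.

```perl
use strict;
use warnings;

# Sample data in the same shape as the run_tests return value shown above.
# The file names and numbers are invented for illustration.
my $stats = {
    'basic.t'  => { test_no => 1, pass_perc => 100, timestamp => '20190705T145043', time => '0.1732' },
    'slow.t'   => { test_no => 2, pass_perc => 100, timestamp => '20190705T145044', time => '2.5010' },
    'broken.t' => { test_no => 3, pass_perc => 0,   timestamp => '20190705T145046', time => '0.0420' },
};

# Tests that did not fully pass, in the order they ran.
my @failed = sort { $stats->{$a}{test_no} <=> $stats->{$b}{test_no} }
             grep { $stats->{$_}{pass_perc} < 100 } keys %$stats;

# All tests ordered from slowest to fastest.
my @by_time = sort { $stats->{$b}{time} <=> $stats->{$a}{time} } keys %$stats;

print "Failed: @failed\n";
printf "%-10s %8.4fs\n", $_, $stats->{$_}{time} for @by_time;
```

The slowest entries are the first candidates to leave out of an aggregated run, since aggregation only saves setup time.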

The caveat is that not all tests can be aggregated (and not all tests need to be). An incompatible test can break the test run completely and fail everything after it, so it takes some effort to at least find the compatible tests. In our own Test2-based suite, about 80% of the tests run under the aggregator, and most of those required no or minimal changes. Fortunately we only had a handful of Test::Class tests, which seem to be a no-go (for more reasons than just the aggregator, in my opinion!).

For my talk I looked at the Moose-2.2011 distribution because of its popularity, even though, being Test::More-based, it is not well suited to Test2::Aggregate. Nevertheless, of its 478 tests, at least 204 pass aggregated without any changes (8 of them with warnings - and if you change the order, some might break and some might start passing). I demoed running 180 of them (excluding the slowest for dramatic effect): they took 39 seconds under prove, 23 seconds under yath -PMoose and... a mere 2 seconds under Test2::Aggregate!

If you are curious to try, you can download the sample set here - 'prove/yath aggr.t' will run the aggregated tests. As you can see, I load Test::More in the aggregator test, which helps with a couple of tests, but in general you should really try to switch to Test2::Suite. It helped us in many important ways, the least of which is Test2::Aggregate.

Obviously you can only speed up the test setup time, so the more numerous and smaller your tests are, the greater the effect. In our own test suite, the overall benefit of using Test2::Aggregate on an already heavily parallel workload is a speedup of almost 2.5x - still impressive. We pass lists of files to the aggregator, and we keep those lists in check so that no aggregated unit takes too long compared to the entire test suite. For an example of using a list, try the aggr_lst.t test in the Moose example above, which uses the tests.lst file.
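One possible way to keep lists in check is to split tests into lists by a time budget, using per-test times such as those in the run_tests output. The following is only a sketch of that idea, not the script we actually use: pack_lists, the budget value and the sample timings are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split tests into lists so that each list stays within
# $budget seconds, based on previously recorded per-test times.
# A single test that exceeds the budget on its own gets a list to itself.
sub pack_lists {
    my ($times, $budget) = @_;
    return () unless %$times;

    my @lists = ([]);
    my $total = 0;

    # Sort by name so that the result is deterministic.
    for my $test (sort keys %$times) {
        my $t = $times->{$test};

        # Start a new list if adding this test would exceed the budget.
        if (@{ $lists[-1] } && $total + $t > $budget) {
            push @lists, [];
            $total = 0;
        }
        push @{ $lists[-1] }, $test;
        $total += $t;
    }
    return @lists;
}

# Invented timings in seconds, with a one second budget per list.
my %times = ('a.t' => 0.5, 'b.t' => 0.4, 'c.t' => 0.3, 'd.t' => 1.2);
my @lists = pack_lists(\%times, 1.0);
print join(' ', @$_), "\n" for @lists;
```

Each resulting arrayref could then be written out, one file per line, as a .lst file for the aggregator.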

Now, if you start adding test files to list files, do you have to move the aggregated files so that yath/prove do not run them separately as well? In our case, we use a replacement yath script that adds exclusions for all our aggregated files. It looks a bit like this:

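BEGIN {
    use File::Slurp;

    # Rewrite @ARGV before App::Yath sees it
    my @args = ();
    foreach (@ARGV) {

        if (/--exclude-lists=(\S+)/) {
            # Expand the pattern into the matching list files
            my @lists = split /\n/, `find $1`;
            foreach my $file (@lists) {
                # Each non-empty line of a list file names an aggregated test
                my $list = read_file($file);
                foreach (split(/\n/, $list)) {
                    push @args, "--exclude-file=/secure/t/$_" if $_;
                }
            }
        } else {
            # Pass every other argument through unchanged
            push @args, $_ if $_;
        }
    }

    @ARGV = @args;
}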

use App::Yath(\@ARGV, \$App::Yath::RUN);
exit($App::Yath::RUN->());

So you call it with the extra argument:

yath_agg test -jXX --exclude-lists=t/aggregate/*.lst t/

I suggest you try aggregating batches of tests. Trying your entire suite at once will be quite disappointing, as cascading failures will hide how many tests can actually be aggregated. You can script something that adds tests to lists a few at a time and verifies them using the Test2::Aggregate::run_tests output. Note, though, that it is possible for a Test::More test to fail while the Test2::V0::subtest wrapping it (which Test2::Aggregate::run_tests uses) still returns a pass.
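The batching idea could be sketched roughly as follows. This is an illustration under assumptions, not a finished tool: find_compatible is a hypothetical helper, and $run_batch stands for whatever you use to run a set of files under the aggregator and return a hashref shaped like the run_tests output. Here it is replaced by a fake runner that simulates a cascading failure.

```perl
use strict;
use warnings;

# Hypothetical driver: grows a list of aggregatable tests a batch at a time.
# $run_batch takes an arrayref of test files and must return a hashref
# shaped like the run_tests output: { file => { pass_perc => ... } }.
sub find_compatible {
    my ($candidates, $batch_size, $run_batch) = @_;
    my @queue = @$candidates;
    my (@accepted, @rejected);

    # True if every given test was run and fully passed.
    my $all_pass = sub {
        my @tests = @_;
        my $stats = $run_batch->(\@tests);
        return !grep { !$stats->{$_} || $stats->{$_}{pass_perc} < 100 } @tests;
    };

    while (my @batch = splice(@queue, 0, $batch_size)) {
        if ($all_pass->(@accepted, @batch)) {
            push @accepted, @batch;
            next;
        }
        # The batch broke something: retry its tests one at a time.
        for my $test (@batch) {
            if ($all_pass->(@accepted, $test)) { push @accepted, $test }
            else                                { push @rejected, $test }
        }
    }
    return (\@accepted, \@rejected);
}

# Fake runner for demonstration only: any file starting with "bad" fails
# and takes every test after it down with it.
my $fake_runner = sub {
    my ($tests) = @_;
    my (%stats, $broken);
    for my $test (@$tests) {
        $broken ||= $test =~ /^bad/;
        $stats{$test} = { pass_perc => $broken ? 0 : 100 };
    }
    return \%stats;
};

my ($ok, $bad) = find_compatible([qw(a.t b.t bad.t c.t d.t)], 2, $fake_runner);
print "Aggregate: @$ok\nKeep separate: @$bad\n";
```

In real use the runner would probably launch each batch in its own process, since an incompatible test can break the whole run, and the accepted list still deserves a manual check because of the false passes mentioned above.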

Feel free to post issues or write to me at dkechag at cpan.org. Use the same email to send a CV if you are interested in joining our wonderful Perl team in Manchester, UK - we almost always have openings for mid/senior level Perl devs...

About Dimitrios Kechagias

Computer scientist, physicist, amateur astronomer.