Refactoring When Tests Are Failing

Today I'm working on some code. That's not surprising. It's also not surprising that the colleague I'm working on the code with is on holiday until the new year. Before he left, he made some significant changes to optimize our code, but for one subsystem he didn't have time to apply his changes. This means that we have 7 tests failing out of 397. A quick check showed plenty of conflicts between his code and mine, and my ongoing work was generating even more. That left us with two choices for how to deal with this while he was gone:

  1. I could continue my work with passing tests and deal with even more conflicts when he returns
  2. I could merge his code, deal with some failing tests, get rid of the conflicts, and have a much faster code base

We opted for the latter, but when running the tests, how do I know if I broke something? This is one reason we argue that all tests in a suite must pass: otherwise, do you scan through all of the broken tests on every run to see if they've somehow broken differently? If you do, you'll start to ignore failures because failures are "normal." Eventually you'll get serious failures that showed up in testing but that you didn't notice.

So what follows isn't perfect, but if you have to temporarily deal with failing tests, it's far better than just trying to remember which tests are "allowed" to fail.

There's a feature of prove (part of the Test::Harness distribution) that many people don't know about: --state. With it, you can do the following:

prove -lr --state=save

And that will save the state of this test run in a file named .prove. In it, you'll see YAML output with individual test file results like this:

  "./t/Advertiser/events.t":
    elapsed: 1.73660922050476
    gen: 3
    last_fail_time: 1356516251.85145
    last_result: 2
    last_run_time: 1356516251.85145
    last_todo: 0
    seq: 38
    total_failures: 1

In this case, we see from last_result that we had two test failures in this test file. In fact, we have this number for all test files. I used this knowledge to whip up a quick provediff program:

#!/usr/bin/env perl

use Modern::Perl;
use App::Prove::State;

my ( $prove1, $prove2 ) = @ARGV;
unless ( $prove1 and $prove2 and -f $prove1 and -f $prove2 ) {
    die "Pass two .prove files, please";
}

# XXX Must fix this in App::Prove. We want to be able to read and parse these
# .prove files even if we're in a different directory.
no warnings 'redefine';
local *App::Prove::State::_prune_and_stamp = sub { };
my $first = App::Prove::State->new( { store => $prove1 } )->results;
my %first_tests = map { $_ => $first->test($_) } $first->test_names;

my $second = App::Prove::State->new( { store => $prove2 } )->results;
my %second_tests = map { $_ => $second->test($_) } $second->test_names;

foreach my $test ( sort keys %first_tests ) {
    unless ( exists $second_tests{$test} ) {
        header($prove1, $prove2);
        say "Test '$test' found in '$prove1' but not '$prove2'";
    }
}
foreach my $test ( sort keys %second_tests ) {
    unless ( exists $first_tests{$test} ) {
        header($prove1, $prove2);
        say "Test '$test' found in '$prove2' but not '$prove1'";
        next;
    }

    my $first_result  = $first_tests{$test}->result;
    my $second_result = $second_tests{$test}->result;

    if ( $first_result != $second_result ) {
        header($prove1, $prove2);
        say
          "Test '$test' results are different: $first_result failure(s) versus $second_result failure(s)";
    }
}

sub header {
    my ( $prove1, $prove2 ) = @_;
    state $header_printed = 0;
    return if $header_printed;
    say "provediff results for '$prove1' versus '$prove2'";
    $header_printed = 1;
}

It's not particularly brilliant, but first I run prove -lr --state=save and then cp .prove prove_orig. Subsequent test runs while refactoring then look like this:

prove -lr --state=save; provediff prove_orig .prove
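If you run this cycle often, the two steps can be wrapped in a tiny shell script. This is only a sketch under the assumptions of this post: provediff is somewhere on your PATH, and prove_orig is the baseline snapshot name used above.

```shell
#!/bin/sh
# retest: save a baseline on the first run, diff against it afterwards.
# Assumes "provediff" is on PATH; "prove_orig" is the baseline snapshot
# name used in this post.
if [ ! -f prove_orig ]; then
    prove -lr --state=save && cp .prove prove_orig
else
    prove -lr --state=save
    provediff prove_orig .prove
fi
```

The first run records the baseline; every later run diffs against it. Delete prove_orig whenever you want a fresh baseline.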

If we don't see a different number of failures, provediff gives no output. If we do see different failures, we get results like this:

provediff results for 'prove_orig' versus '.prove'
Test 't/Advertiser/conversions.t' results are different: 2 failure(s) versus 3 failure(s)
Test 't/Advertiser/conversions_functional.t' results are different: 4 failure(s) versus 3 failure(s)

This is a far cry from perfect, but it's also far better than visually inspecting the prove results every time and seeing if the failures are different. This is also only a stopgap measure until the other dev gets back and we can safely fix the last of the test failures.

Happy holidays! (I should write more about --state. It has a lot of neat features).
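Since I mentioned it: --state accepts a comma-separated list of instructions, and the prove documentation lists quite a few. A small taste of what it can do, assuming a .prove file saved from a previous run:

```shell
# Re-run only the tests that failed on the last run, saving new state
prove -lr --state=failed,save

# Run the whole suite, slowest tests first (useful with parallel jobs)
prove -lr -j9 --state=slow,save

# Run recently-failing tests first, then the rest of the suite
prove -lr --state=hot,all,save
```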

6 Comments

Another solution I commonly use is TODO blocks (see Test::More for more info about TODO blocks).

Before merging your own code, change the code in the common repository to add TODO blocks around the failing tests:

+{
+    local $TODO = "failing test from colleague in holidays";
     fail("Failing test");
+}

After this change, check that the test suite passes, and that there are few or no unexpected TODO passes (none is ideal). Commit all of those patches to the test suite in a single commit (that will make it easier to revert later).

Then merge your own code. The tests that fail are your own failures. Fix them.

Your colleague will have to fix the TODOs when he comes back. You can even revert the TODO commit to nudge him into doing so.

Of course, TODO blocks issue an "unexpected pass" message when they are fixed, so you should be able to notice that the changes have been made and that the TODOs may be removed.

The drawback of the TODO strategy compared to the prove --state approach is that if any of the tests marked TODO start failing differently than they did when you marked them, you won't notice.
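To make the comment above concrete, here is a minimal, self-contained Test::More file with a TODO block (the frobnicate() function is made up for illustration):

```perl
use strict;
use warnings;
use Test::More tests => 2;

ok 1, 'a test that passes normally';

TODO: {
    local $TODO = "failing test from colleague on holiday";

    # This failure is reported, but does not cause the suite to fail.
    is frobnicate(), 42, 'frobnicate() returns 42';    # hypothetical function
}

sub frobnicate {0}
```

When frobnicate() is eventually fixed, the harness flags the test as an unexpected pass, which is the signal to delete the TODO block.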

"I should write more about --state. It has a lot of neat features"

I presented at a London Perl Mongers technical meeting about TAP::Harness and some of the cooler things you can do with prove late-ish last year. Slides are here: http://www.slideshare.net/kaokun/continuous-testing-and-deployment-in-perl-londonpm-technical-meeting

Would love to hear/read if I missed any good tricks!


About Ovid

Have Perl; Will Travel. Freelance Perl/Testing/Agile consultant. Photo by http://www.circle23.com/. Warning: that site is not safe for work. The photographer is a good friend of mine, though, and it's appropriate to credit his work.