You Can't Test Everything, But At Least Test What's Important.

The key word being important. After releasing fetchware to CPAN, I got back CPAN Testers reports that had hundreds of given/when deprecation warnings. These warnings would just be completely insane when actually running the program on newer Perls, so given/when had to go byebye.

I used vim's cool but ancient s/// command to find and replace the givens and whens with their appropriate ifs. Unfortunately, I goofed up so profoundly that before bugs were reported to github, I had absolutely no idea that fetchware's command line interface was completely broken.

Fetchware has a gigantic test suite of some 33 files at the time, but none of them actually tested fetchware's most important interface--its command line interface. Which if broken, results in fetchware being unusable. During the great given/when massacre, I profoundly messed up the most important nested if in fetchware, the one for its command line interface making fetchware completely unusable. Hillariously, it still passed its test suite with flying colors, because fetchware had no tests for its command line interface. I never deemed it important enough to test. I obliviously never realized that if the command line interface is broken, then fetchware is broken. Oops.

The damage was done in commit e961f6abb3a66c76470a9b88f6c52eb646ff8269.
An example is below:

From:


given (shift @ARGV) {
when('install') {
cmd_install(@ARGV);
...

To:


my $command = shift @ARGV;
if ($command eq 'install') {
cmd_install($command eq @ARGV);
...

The cmd_install() subroutine is the main subroutine in fetchware that actually does the installing of fetchware packages. It's the subroutine with tons of tests that calls other subroutines with tons of tests, but what calls it has no tests! Notice: That stupid "$command eq @ARGV" part. That's an expression that will return true or false, but that's not what cmd_install() expects its first argument to be. It needs it to be a path to a Fetchwarefile or a fetchware package (from @ARGV) not a useless true or false value. Oops.

Since, there were zero tests of fetchware's command line interface, I released this to CPAN in version App-Fetchware-1.001, and didn't notice. This remained broken until App-Fetchware-1.008 released a week later :)

Someone decided to give fetchware a spin, and noticed how none of fetchware's commands worked properly. Luckily, they reported this problem in fetchware's first three bugs.

This still needed tests added, which happened in the 1.009 release of Fetchware. I added a t/bin-fetchware-run.t to fetchware's test suite to test fetchware's run() modulino subroutine, which is run when fetchware is run as a command line program, but not run when fetchware is loaded as a library. Also, I added the t/bin-fetchware-command-line.t file to actually call bin/fetchware different ways as a command line program to test if it prints out the right things or exits with success.

Breaking fetchware in such a spectacular silly fashion made me wonder about fetchware's test coverage. So, after installing an awesome Devel::Cover dzil command plugin. I ran Devel::Cover with "dzil cover." And the output was pretty good:

----------------------------------- ------ ------ ------ ------ ------ ------
File                                  stmt   bran   cond    sub   time  total
----------------------------------- ------ ------ ------ ------ ------ ------
bin/fetchware                         66.3   47.1   21.6   87.3    1.8   61.8
blib/lib/App/Fetchware.pm             67.5   53.0   38.1   86.1    2.3   63.1
blib/lib/App/Fetchware/Config.pm      98.3   89.3    n/a  100.0    2.6   96.1
...Fetchware/CreateConfigOptions.pm   98.3   71.4   33.3  100.0    0.2   92.1
blib/lib/App/Fetchware/ExportAPI.pm  100.0   80.0    n/a  100.0    0.0   96.8
blib/lib/App/Fetchware/Util.pm        66.7   42.6   50.0   89.4    2.8   59.2
...b/App/FetchwareX/HTMLPageSync.pm   34.8   13.6    n/a   68.4    0.0   35.9
blib/lib/Test/Fetchware.pm            86.7   54.2   27.8   87.9   90.4   75.6
Total                                 69.1   50.1   38.3   87.7  100.0   63.9
----------------------------------- ------ ------ ------ ------ ------ ------

63.9% is pretty good considering I've never bothered to run Devel::Cover before. However, Devel::Cover only produces pretty metrics to impress your boss. It fails to tell you where you should test your software, and where not testing is ok.

----------------------------------- ------ ------ ------ ------ ------ ------
File                                  stmt   bran   cond    sub   time  total
----------------------------------- ------ ------ ------ ------ ------ ------
bin/fetchware                         61.4   38.5   21.6   81.9    6.2   56.0
blib/lib/App/Fetchware.pm             67.5   53.0   38.1   86.1   14.2   63.1
blib/lib/App/Fetchware/Config.pm      98.3   89.3    n/a  100.0    7.7   96.1
...Fetchware/CreateConfigOptions.pm   98.3   71.4   33.3  100.0    2.0   92.1
blib/lib/App/Fetchware/ExportAPI.pm  100.0   80.0    n/a  100.0    0.0   96.8
blib/lib/App/Fetchware/Util.pm        66.7   42.6   50.0   89.4   15.1   59.2
...b/App/FetchwareX/HTMLPageSync.pm   34.8   13.6    n/a   68.4    0.1   35.9
blib/lib/Test/Fetchware.pm            86.7   54.2   27.8   87.9   54.6   75.6
Total                                 67.7   48.3   38.3   86.4  100.0   62.4
----------------------------------- ------ ------ ------ ------ ------ ------

Above is the same devel cover report, but with t/bin-fetchware-run.t and t/bin-fetchware-command-line.t deleted, and not included. It only decreases the total test percentage by a mere 1.5%. And if your tests are a mere 1.5% lower, does that tell you that your test suite is fundamentally broken. Of course not. 1.5% is nothing. But not testing your command line interface is a huge bug waiting to bite you.

Test what's most important first. Test what your users need most. 63.9% is meaningless. But the awesome HTML report Devel::Cover produces is just a tool. Like how a debugger does not actually "debug" your code for you. Devel::Cover just shows you what parts, paths, subroutines of your code you lack tests for. It's up to you to produce meaningful answers from the report. You still need to use the debugger (the fancy HTML report) to find and fix the bug (Prioritize where you really do need more test coverage. And, of course actually add it.).

Leave a comment

About dly

user-pic Perl rules.