Test Hierarchy Produces Poor Unit Tests
The first part of this series described Test Hierarchy, a hierarchy of test classes that mirrors the classes under test, and explained why it’s an antipattern. For how common it is, this practice doesn’t even produce good unit tests.
A unit test, by definition, tests a unit of software, no more, no less. On the one hand, we have unit tests, which test a single module or class. On the other hand, we have integration tests, which test how multiple modules or classes work together. We want each unit test to poke and prod only the class that it tests. We want each subsystem integration test to test a natural subsystem, e.g., the data-export subsystem. We want our system tests to test the whole system. And we don’t want any test to be affected by any other units, subsystems, or systems.
When our software depends on other software that may change over time, our tests may suddenly start failing because the behavior of the other software has changed. This problem, which is called Context Sensitivity, is a form of Fragile Test…
Whatever application, component, class, or method we are testing, we should strive to isolate it as much as possible from all other parts of the software that we choose not to test. This isolation of elements allows us to Test Concerns Separately and allows us to Keep Tests Independent of one another. It also helps us create a Robust Test by reducing the likelihood of Context Sensitivity caused by too much coupling between our SUT [system under test] and the software that surrounds it. (xUnit Test Patterns: Refactoring Test Code. Gerard Meszaros. Addison-Wesley Professional, 2007.)
If terms like Context Sensitivity and Fragile Test feel familiar, it’s not just a coincidence.
Test Hierarchy produces tests that purport to be unit tests but that don’t actually test isolated units.
Let’s say we have a Bat class that is a subclass of Mammal. If the tests use Test Hierarchy, then BatTest not only tests the Bat code, but also the superclass Mammal code and its superclass Animal code.
This seems to make some intuitive sense, because after all, Bat can do all the things that Mammal and Animal can do, all the methods that it inherits from those classes. But this intuition misses an important distinction. The unit is whatever code is in the Bat.pm module, not whatever the Bat class can do.
If a Bat unit test fails, it should indicate that we made a mistake in Bat.pm, not any other module.
Under Test Hierarchy, however, BatTest tests not just the code in Bat.pm but also the code in Mammal.pm and Animal.pm. This makes it an integration test, not a unit test, because it tests other modules as well as its own.
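To make the problem concrete, here is a minimal sketch in core Perl. The packages stand in for Animal.pm, Mammal.pm, and Bat.pm, and the tiny runner is not a real test framework. The method names nurse and echolocate, and the helpers checks, class_under_test, and failed_checks, are hypothetical names invented for illustration. The point is only that a regression in Animal shows up as a BatTest failure even though Bat has not changed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-ins for Animal.pm, Mammal.pm, and Bat.pm.
package Animal {
    sub new     { my $class = shift; return bless {}, $class }
    sub breathe { return 'inhale' }
}
package Mammal {
    use parent -norequire, 'Animal';
    sub nurse { return 'milk' }
}
package Bat {
    use parent -norequire, 'Mammal';
    sub echolocate { return 'ping' }
}

# Test Hierarchy: the test classes mirror the classes under test,
# so each test class inherits every check from its parents.
package AnimalTest {
    sub class_under_test { return 'Animal' }
    sub checks {
        return ( breathe => sub { $_[0]->breathe eq 'inhale' } );
    }
}
package MammalTest {
    use parent -norequire, 'AnimalTest';
    sub class_under_test { return 'Mammal' }
    sub checks {
        my $self = shift;
        return ( $self->SUPER::checks, nurse => sub { $_[0]->nurse eq 'milk' } );
    }
}
package BatTest {
    use parent -norequire, 'MammalTest';
    sub class_under_test { return 'Bat' }
    sub checks {
        my $self = shift;
        return ( $self->SUPER::checks,
            echolocate => sub { $_[0]->echolocate eq 'ping' } );
    }
}

# Run every check a test class reports against its class under test
# and return the names of the checks that failed.
sub failed_checks {
    my ($test_class) = @_;
    my $object = $test_class->class_under_test->new;
    my @checks = $test_class->checks;
    my @failed;
    while ( my ( $name, $check ) = splice @checks, 0, 2 ) {
        push @failed, $name unless $check->($object);
    }
    return @failed;
}

my @before = failed_checks('BatTest');

# Simulate a regression in Animal only; Bat is untouched.
{
    no warnings 'redefine';
    *Animal::breathe = sub { return 'cough' };
}

my @after = failed_checks('BatTest');
print "BatTest failures before: @before\n";
print "BatTest failures after:  @after\n";
```

In this sketch the failure reported by BatTest points at breathe, a method that lives two modules away from Bat. A real unit test for Bat.pm would only need the echolocate check.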
And it’s an integration test we don’t need. Generally, we write integration tests that exercise some system feature, like “export Foo data in CSV format.” This might involve setting up the Foo data fixture, invoking the appropriate export feature, then validating the CSV file that it generates. But we don’t need module tests that invoke low-level methods on other modules.
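A feature-level integration test of that kind might look roughly like the following sketch, which uses only core modules (Test::More and File::Temp). The export_foo_csv function, the id and name fields, and the fixture rows are all hypothetical; they stand in for whatever the real export subsystem provides.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More;
use File::Temp qw(tempfile);

# Hypothetical export feature; stands in for the real subsystem.
sub export_foo_csv {
    my ( $rows, $path ) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "id,name\n";
    print {$out} join( ',', @{$_}{qw(id name)} ), "\n" for @$rows;
    close $out or die "Cannot close $path: $!";
    return;
}

# 1. Set up the Foo data fixture.
my @foo = ( { id => 1, name => 'alpha' }, { id => 2, name => 'beta' } );

# 2. Invoke the export feature.
my ( $fh, $path ) = tempfile( UNLINK => 1 );
close $fh;
export_foo_csv( \@foo, $path );

# 3. Validate the CSV file that it generates.
open my $in, '<', $path or die "Cannot read $path: $!";
chomp( my @lines = <$in> );
close $in;
is_deeply(
    \@lines,
    [ 'id,name', '1,alpha', '2,beta' ],
    'exported CSV matches the fixture'
);

done_testing;
```

The test talks to the feature through its public entry point and checks the file it produces. It never reaches into the low-level methods of the modules behind the export.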
In fact, who said that there’s only one test per class?
Here at The Perl Shop, we generally create a separate test module per method or feature. So we’d create scripts such as breathe.t, each of which tests a different Animal method. This way, we can group tests by class method, and the file names themselves document which tests correspond to which feature.
We also inline our test classes in our .t scripts, which keeps the test code close to the test script and cuts in half the number of files we need to maintain. It also makes them impossible to subclass.
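As a rough illustration, a per-method script with an inline test class could be shaped like the sketch below. It is not our actual test code: it uses only Test::More, the BreatheTest package and its test methods are hypothetical, and the inline Animal package (with a made-up breath counter) exists only so the sketch runs on its own. A real breathe.t would load the module under test instead.

```perl
#!/usr/bin/env perl
# breathe.t: tests only the breathe method of Animal.
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in so the sketch is self-contained; a real
# script would load the module under test with "use Animal;".
package Animal {
    sub new     { my $class = shift; return bless { breaths => 0 }, $class }
    sub breathe { my $self  = shift; return ++$self->{breaths} }
}

# The test class lives inline in the .t script, not in its own file.
package BreatheTest {
    use Test::More;

    sub new { my $class = shift; return bless {}, $class }

    sub test_first_breath {
        my $animal = Animal->new;
        is( $animal->breathe, 1, 'first breath is counted' );
        return;
    }

    sub test_breaths_accumulate {
        my $animal = Animal->new;
        $animal->breathe for 1 .. 2;
        is( $animal->breathe, 3, 'breaths accumulate' );
        return;
    }

    # Run each test method of this script, in order.
    sub run {
        my $self = shift;
        $self->$_ for qw(test_first_breath test_breaths_accumulate);
        return;
    }
}

BreatheTest->new->run;
done_testing;
```

Because BreatheTest is defined inside the script rather than in a loadable module file, no other test file can conveniently inherit from it, which is exactly the property described above.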
We’ve had great success with these practices, and they’re fundamentally incompatible with Test Hierarchy.
You might have also noticed this in Testing Strategies for Modern Perl. In chapter 2, we create a TicTacToe::BusinessLogic::Game class, which is tested by
In the concluding post, I’ll reflect on some of the reasons developers use Test Hierarchy, and why these reasons don’t stack up.
Peace, love, and may all your TAP output turn green…
This post originally appeared on The Perl Shop blog as “Test Hierarchy Produces Poor Unit Tests.”
Bat.pm doesn’t directly define the Bat class. Rather, it defines an implicit “subclass mixin”: all the methods and attributes in Bat.pm that are added to (“composed with”) its superclass Mammal. The object system, then, composes the Bat mixin with Mammal to create the full Bat class. Similarly, Mammal.pm conceptually defines a Mammal mixin that is composed with its superclass Animal in order to create the full Mammal class.