100% Test Coverage?
OK, you know how it works. People who are really comfortable with testing tend to offer the following advice:
- Use Devel::Cover
- Don't strive for 100% coverage
This advice is absolutely correct, but misleading. It leaves you with an obvious question: what coverage do I want? This is where the danger lies.
While hacking on my Veure project, I often find myself digging into code that I don't understand terribly well. This also means I don't understand my interface. As a result, I often implement something I think will work, hit my browser and view the results. Which are wrong. Again. So I tinker some more. Again.
Lather, rinse, repeat. The benefit of this strategy is that I tend to add new functionality much faster. The drawback is that once something is working, it's easy to be careless with testing. And here's what I've found:
If it's not covered, it's broken. Every. Single. Time.
I admit that anecdotes are not data and maybe I'm just a bad programmer, but my code is getting complex enough that there are parts of it which look perfectly fine but are nonetheless wrong. I recently discovered a URL-hacking exploit while trying to increase my code coverage: I had focused on a few modules with around 98% coverage and tried to get them to 100%. Three modules in a row with just a couple of untested lines had bugs in those lines.
You don't want to re-test things that are already tested elsewhere (don't test Moose-generated getters/setters unless you're hacking on Moose itself), but you had better make sure that your code is setting them appropriately.
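Here's a sketch of the distinction, assuming a hypothetical User class with a lazily built attribute. Moose already guarantees the generated reader works; the builder is your code, so it's the part that deserves coverage:

    package User;
    use Moose;

    has name     => ( is => 'ro', isa => 'Str', required => 1 );
    has greeting => (
        is      => 'ro',
        isa     => 'Str',
        lazy    => 1,
        builder => '_build_greeting',
    );

    # This is your logic, not Moose's, so cover it.
    sub _build_greeting {
        my $self = shift;
        return 'Hello, ' . $self->name . '!';
    }

    package main;
    use Test::More tests => 1;

    # Pointless: is( User->new( name => 'Alice' )->name, 'Alice' );
    # That only re-tests Moose's generated reader.
    is( User->new( name => 'Alice' )->greeting, 'Hello, Alice!', 'greeting built from name' );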
So don't strive for 100%, but be very, very careful about the seductive thought of "I have 95% coverage". If your project is sufficiently complex, that last 5% will be a minefield.
For me, the place where I start drawing the line on coverage is conditionals for multiple operating systems, or error handling where provoking the error state is itself a substantial coding hassle, or where the error handling is trivial.
For example, consider the common "open ... or die ..." idiom. Do I really need to provoke a failure to open a file just to get 100% coverage of that "or" condition? What does that tell me? That Perl can successfully die? On the other hand, if the "or ..." clause tried to recover in some way, then it probably is worth testing.
Note also that leaving out the "or die ..." clause would improve my coverage ratio, but make for worse code. [cough]autodie[cough]. That's why people say that 100% coverage is good but no guarantee that things are done well.
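Concretely, the idiom looks something like this (a sketch; the file name is a placeholder):

    use strict;
    use warnings;

    my $path = 'config.txt';    # placeholder file name

    # Covering the right-hand side of this "or" means provoking an
    # open() failure just to prove that Perl can die:
    open my $fh, '<', $path or die "Can't open $path: $!";

    # With autodie, open() throws on failure by itself, and the
    # hard-to-cover branch disappears from the source entirely:
    use autodie;
    open my $fh2, '<', $path;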
So I'd say that 100% is a good goal, but one shouldn't lose sight of the cost-benefit tradeoff to be made for where one's time is best spent.
-- dagolden
@dagolden: I agree that the cost-benefit trade-off is important, but how do we define when the benefit is great enough to justify the cost? Generally, people define it by example rather than giving an actual definition, and that leaves a lot of leeway for error.
If having an "or die" fail (via an unchecked eval or mishandling of exceptions as flow control) could cost your company a few thousand pounds, then it's probably worth the time to fix it.
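The "unchecked eval" failure mode looks something like this (a sketch; transfer_funds and log_failure are hypothetical stubs):

    use strict;
    use warnings;

    my $account = 'ACC-123';                          # hypothetical
    sub transfer_funds { die "insufficient funds\n" } # hypothetical stub
    sub log_failure    { warn "logged: $_[0]" }       # hypothetical stub

    # Unchecked: if transfer_funds() dies, the error vanishes and
    # execution continues as though the transfer succeeded.
    eval { transfer_funds( $account, 1_000 ) };

    # Checked: inspect $@ and actually handle the failure.
    eval { transfer_funds( $account, 1_000 ) };
    if ( my $error = $@ ) {
        log_failure($error);
        die "Transfer failed: $error";
    }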
So I would say that, as a general rule, don't test code you didn't write (modules), but strive for complete coverage of the code you did write, so long as the cost of failure is greater than the cost of testing.
That being said, it's worth remembering that expensive bugs are often cheap to fix.
Here is a question for you. Which is more important: the cleaner and less error-prone code that is possible with MooseX::Declare, or the code coverage reports provided by Devel::Cover?
The problem is that Devel::Cover will not work on code written with MooseX::Declare.
@ironcamel: for smaller projects, MooseX::Declare is great. For larger ones, I would take Devel::Cover, no questions asked.
If you have the luxury of starting with a fresh new codebase, an incremental approach to 100% coverage is to start with a small baseline and get that fully covered. Every additional piece of code from that point on would require 100% coverage.
I've reminded people that when Devel::Cover says 100%, it doesn't necessarily mean that all your code is covered.
- It doesn't track operator coverage: 1/$a counts as covered even if $a is never zero (see the sketch after this list).
- It doesn't look inside regular expressions: $b =~ /complex regexp/ counts as covered regardless of which parts of the pattern ever matched.
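Here's a sketch of the first point, using a trivial hypothetical sub:

    use strict;
    use warnings;
    use Test::More tests => 1;

    sub ratio {
        my ( $n, $d ) = @_;
        return $n / $d;    # reported as covered after any single call
    }

    is ratio( 10, 2 ), 5, 'happy path';
    # Devel::Cover marks the division as covered, but nothing flags the
    # untested ratio( 10, 0 ) case, which would die.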
It also doesn't mean that your test coverage is 100%. It just means that every line and branch of code has been run through at least once, so it's as much a measure of how well and thoroughly you write your tests.
Similarly, your code can be fully exercised in practice and Devel::Cover can still refuse to report 100%.
my $option = shift || $self->default();
is not something that Devel::Cover handles well. A programmer looks at that and says, "Oh, take whatever the default is." But Devel::Cover asks, "What happens when $self->default() is false?"
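Here's a sketch of why, using a hypothetical Widget class. For A || B, Devel::Cover's condition coverage wants to see three cases: A true; A false and B true; A false and B false:

    package Widget;
    use strict;
    use warnings;

    sub new     { bless { default => 'standard' }, shift }
    sub default { $_[0]{default} }

    sub configure {
        my $self   = shift;
        my $option = shift || $self->default();    # the line in question
        return $option;
    }

    package main;
    use Test::More tests => 2;

    my $w = Widget->new;
    is $w->configure('custom'), 'custom',   'left side true';
    is $w->configure(),         'standard', 'left false, right true';
    # The third case (left false and right false) is unreachable when
    # default() can never return a false value, so Devel::Cover will
    # flag this condition forever.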
I'm surprised no one has mentioned this already. 100% code coverage means you have written extremely fragile tests and your tests are highly coupled to your implementation. I would consider that a bad thing. Lately, I have been taking more of a BDD approach and testing just the business requirements and the public interfaces. One benefit is that I can refactor the implementation to my heart's content without breaking any tests, as long as the interface doesn't change.
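For what it's worth, here's a sketch of that approach (Cart is a hypothetical class). The test touches only the public interface, so the internal storage can be swapped out without breaking it:

    package Cart;
    use Moose;

    has _items => ( is => 'ro', isa => 'ArrayRef', default => sub { [] } );

    sub add {
        my ( $self, %line ) = @_;
        push @{ $self->_items }, \%line;
    }

    sub total {
        my $self  = shift;
        my $total = 0;
        $total += $_->{qty} * $_->{price} for @{ $self->_items };
        return $total;
    }

    package main;
    use Test::More tests => 1;

    my $cart = Cart->new;
    $cart->add( sku => 'X1', qty => 2, price => 10 );
    is $cart->total, 20, 'total derived from line items';
    # Nothing here reaches into the object's internals, so the storage
    # strategy can be refactored without touching the test.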
@ironcamel: what you're referring to sounds simply like a melange of acceptance/integration tests. Of course, if you're really writing those tests in a natural language that can be shown to stakeholders, then it sounds like you're approaching BDD, but the term, like DSL, has been so bandied about that it's almost become meaningless.
Unfortunately, in the testing world, descriptions of different testing approaches have become almost meaningless, as one person's 'regression testing' is another person's 'integration testing'.
And I confess I'm not sure I buy your argument that 100% code coverage means the tests are fragile. I can see how that would be possible, even likely, but I don't know that it's guaranteed.
It's not always feasible to have 100% coverage.
For example, suppose you have code along these lines (a sketch; the exact file names are placeholders):
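    # reconstructed sketch; the shape follows the description below
    open my $in,  '<', 'filea' or die "Can't read filea: $!";
    open my $out, '>', 'fileb' or die "Can't write fileb: $!";
    rename 'fileb', 'filec'    or die "Can't rename fileb to filec: $!";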
causing the third line to fail is at best impractical -- you can't make it fail by making filea unreadable, as execution will never get there; filec is only the rename target, and rename needs directory permissions rather than file permissions, so making filec unwritable doesn't help; and you can't make the directory unwritable and still get there either, since the second line has to create fileb. (This is a minimal version of something I ran into last night, and it's 5am so I may be missing a case, but in my actual case I have yet to find a way to reach the last line and still have it fail.)
The main problem with 100% coverage is that it is measured against the granddaddy of bad metrics: lines of code.
If you had 100% branch coverage, 100% normal case coverage, 100% edge case coverage, and 100% input parameter permutation coverage, you'd have some robust code.
You'll never even get close unless you do TDD, because only then will the micro-API of your method calls be structured such that you can reach 100%.
After you get accustomed to it, it won't take much longer; but it will still feel like it does.