When Must You Test Your Code?
Recently I wrote about how to be agile without testing (if you haven't read that, you should do so before reading this). I was planning on a follow-up after some comments came in and so far the reaction was decidedly mixed. I think that's a shame because not many people seemed to focus on the punchline:
And that's really the most interesting idea of this entire post: your customer's behavior is more important than your application's behavior.
For open source, this is clearly not true. Business success is not the primary driver and if it's an open source project, you should run it however you feel like it. I mean, heck, when you look at my open source code, you see that I'm heavily testing my code. I don't write that code for financial gain and I have different motivations.
However, if you're trying to build a business (and I'm focusing on that in this post), then it should be blindingly obvious that customers are more important than code, but that assertion leaves many people scratching their heads. Reddit, for example, was underwhelmed by the post, but one commenter (the only one on Reddit, honestly), reasoned through what I described and hit the nail squarely on the head:
And if you're relying on watching user behaviors to detect a noticeable but insignificant bug, there are many you're going to miss. And if you're relying on a production rollback to fix those bugs, you're going to let many of them through because it's less costly than a rollback. And eventually your program is going to be a big steaming mess that nobody likes, but that ill will can't be traced back to a particular "unexpected behavior" but rather to the confluence of all of them.
I loved that comment because it divined a very real problem with the approach I described.
In my "no testing" experience, your code, while not always turning into a big ball of mud, at minimum turns into an unwieldy pile of ones and zeroes that isn't very fun to work on. I've worked at many different companies and I've never seen a well-designed, easy-to-hack system that doesn't have tests. It simply never happens (in my experience). As a result, you get a system that's harder to extend and harder to refactor, has more bugs and, crucially for many companies, is less attractive for developers to work on. I still remember one company in London that had acquired this reputation and while they paid a reasonable salary, no one wanted to work there (London.pm members might guess who this is, but I won't confirm it and I hope you won't post that here).
So you can be agile without testing, but you are going to accumulate technical debt and you have to be willing to service that debt. But if you want to minimize your technical debt, when must you test? That's simple: when a failure would cause harm.
When people argue about whether or not there is a universal moral standard, the only one they really agree on is "do no harm" (though there's little agreement about what constitutes harm). For me, from an ethical standpoint, I have to argue in favor of "do no harm" with regard to software testing. In fact, it's not just a passive "do no harm", it's an active "mitigate harm" that I think is the only ethical choice. If you're not going to write tests to verify that the thumbnail size is correct, that's your choice. If you're not going to write tests verifying that you don't double-charge a customer who accidentally double-clicks "Submit", then I have to say that you've made an unethical decision. I might want a lot more tests written, but the only tests I'm going to insist on are those which prevent actual harm.
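As a rough illustration of a "harm" test, here is a minimal Python sketch of the double-click case. Everything in it is an assumption made for the example: `PaymentGateway` is a hypothetical in-memory stand-in rather than any real payment API, and it assumes each order form carries a unique token that is sent with every submission.

```python
class PaymentGateway:
    """Hypothetical in-memory stand-in for a real payment provider."""

    def __init__(self):
        # Maps a submission token to the amount charged, in cents.
        self.charges = {}

    def charge(self, token, amount_cents):
        # A repeated token means the same form was submitted twice
        # (for example, a double-click), so no new charge is recorded.
        if token not in self.charges:
            self.charges[token] = amount_cents
        return self.charges[token]

    def total_charged(self):
        return sum(self.charges.values())


def test_double_click_charges_once():
    gateway = PaymentGateway()
    # Both clicks carry the token issued with the order form.
    gateway.charge("order-1001", 2500)
    gateway.charge("order-1001", 2500)
    assert gateway.total_charged() == 2500


test_double_click_charges_once()
```

The point is not the implementation, which will differ in every system, but that the harmful outcome (charging twice) has a test that fails loudly if it ever becomes possible.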
So those are the tests you must write. What about the tests you should write? If you don't follow the monitoring approach that I described, then I would argue that you should write tests for anything and everything and hope that your customers like what you do. Note that this is actually what many "agile" shops do, so in this regard, I'm not proposing anything radical. However, if you do follow the monitoring approach, I would argue that at the very least you should write integration tests catching fatal errors. A customer still might buy something if they only see one column of products instead of two, but they won't buy anything if the application crashes (not writing tests for this area is exposing your company to harm, but at least it's a choice).
I would also argue that if you have actions that customers must take for the success of your business, write enough integration tests to verify that your customers can take said action. Have a Web site? Make sure you know your conversion funnels and that people can get through them. Have an iPhone app? Make sure that the in-game extras people can buy actually show up and can be purchased.
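To make the funnel idea concrete, here is a small Python sketch. It is only a sketch under stated assumptions: `Shop` is a hypothetical in-memory storefront invented for this example, and the three steps (view, add to cart, checkout) are an assumed funnel. A real test would drive your deployed application instead.

```python
class Shop:
    """Hypothetical storefront; a real test would drive the actual app."""

    def __init__(self, catalog):
        self.catalog = catalog  # sku -> price in cents
        self.cart = []
        self.orders = []

    def view_product(self, sku):
        return self.catalog[sku]

    def add_to_cart(self, sku):
        self.cart.append(sku)

    def checkout(self):
        if not self.cart:
            raise ValueError("cart is empty")
        total = sum(self.catalog[sku] for sku in self.cart)
        self.orders.append(total)
        self.cart = []
        return total


def walk_funnel(shop, sku):
    """Run each funnel step in order.

    Returns the name of the first step that raises, or None if the
    customer can get all the way through.
    """
    steps = [
        ("view", lambda: shop.view_product(sku)),
        ("add_to_cart", lambda: shop.add_to_cart(sku)),
        ("checkout", lambda: shop.checkout()),
    ]
    for name, step in steps:
        try:
            step()
        except Exception:
            return name
    return None


# The funnel test itself: a customer must be able to buy a product.
assert walk_funnel(Shop({"mug": 1200}), "mug") is None
```

Reporting which step failed, rather than just pass or fail, tells you where in the funnel customers would be getting stuck.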
Note that this approach, again, focuses on customers and not developers. You must not bring harm to your customers and you must always provide a way for your customers to complete the actions you wish them to complete. It's also good if you have logging to verify the most common actions your users take so you can at least verify that those actions aren't fatal: the Pareto rule is very much your friend here.
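One way to apply the Pareto rule to logged actions is sketched below in Python. The log format (a flat list of action names) and the `top_actions` helper are assumptions for illustration; your logging will look different.

```python
from collections import Counter


def top_actions(action_log, share=0.8):
    """Return the most frequent actions that together make up at
    least `share` of all logged actions, most frequent first."""
    counts = Counter(action_log)
    total = sum(counts.values())
    covered = 0
    selected = []
    for action, count in counts.most_common():
        if covered >= share * total:
            break
        selected.append(action)
        covered += count
    return selected


# Hypothetical log: 100 recorded user actions.
log = ["search"] * 50 + ["view"] * 35 + ["buy"] * 10 + ["export"] * 5
print(top_actions(log))  # ['search', 'view']
```

The actions this returns are the first candidates for "make sure this isn't fatal" integration tests; the long tail can wait.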
Over time, you'll wind up with enough integration tests that devs can still refactor with a modicum of safety and, if it's a large enough refactor, it's easy to add extra tests to verify that the behavior you're targeting is successful.
There is also a curious side-benefit to this: test coverage might be more useful. This sounds paradoxical, but remember, this approach focuses on whether or not your customers take the actions you need for your business to be successful, not on whether or not your application does what you think it's supposed to do. As a result, your tests are more focused on the bulk of your business needs instead of that "Facebook Like" button and, if tests are managed correctly, your code coverage could show that there are large parts of your system that you're not actually using. Remember: for most of us, a unit test on a function will show that function as being tested, regardless of whether or not it's dead code. However, it's a lot harder to get coverage of dead code when you focus your tests on what your customers are actually doing.
To be fair, I generally don't test this way because very few companies have strong enough monitoring or deployment procedures for this to work. It's very much a "cutting edge" approach and even if you can get your systems evolved enough to allow this, my experience is that many developers are going to struggle with the idea that customer behavior is more important than application behavior.
Instead, I'll leave you with this quote from the Reddit thread I referred to:
I think your overall message is, "don't succumb to cargo-cult programming practices"--which is a great one! But the structure of your post sounds like you're suggesting doing user-monitoring in place of traditional testing, while you're really suggesting doing user-monitoring in addition to traditional testing, employed strategically.
Clearly I need to hire that person to be my editor.
If you want to try a radically different agile strategy:
- Use continuous deployment
- Monitor any behaviors that absolutely impact the bottom line
- Write integration tests to catch all fatal errors
- Write integration tests to validate conversion funnels
- Write tests to prevent any "harmful" behavior
Instead of repeating mantras, try focusing on your customers instead of the code. I've seen this in action (minus some of the testing I recommend) and it works astonishingly well. Which would you rather have: code that does what you want or customers that do what you want? It's your call.
See also: Code evolution versus intelligent design.
A couple of thoughts that might even be relevant here.
0) Although you had a comment about developers, I think you have not explicitly mentioned that one should not "harm" the developers and the maintenance programmers either. How does that fit in here?
1) I have this strange feeling that Python developers argue for beautiful code, Perl developers argue for working code, even if it is a hack. Do you think this might be related to what you write?
2) Are you trying to offer consulting services specifically for Perl shops or do you plan to be more general? In either case, but especially if you'd like to go wider, you probably want to write on your own site ....
3) When I arrive at a company that asks for help with their Perl code, usually they have 0 tests and a big ball of spaghetti that mostly works. Writing unit tests is both a waste of time and impossible (there are no separate units). Starting with the tests you mentioned is the only possible way to provide value.
Gabor, regarding "harm" to maintenance developers, this may be an area I alluded to in the post where "do no harm" is universally accepted as a moral standard but the definition of "harm" varies. In this case, I would argue again that buggy software giving someone lethal doses of radiation is harm. Overcharging a customer because they double-clicked on a link is harm. Storing their credit card numbers without encryption is a strong potential for harm (and violates PCI compliance guidelines). Annoying maintenance developers is annoying (Redundancy 101), but whether or not that crosses the threshold of harm can be open to debate. You may very well define this as harm, but others can argue that the maintenance programmers at least have a choice if they wish to continue working there.
Regarding the Python/Perl split: that's a very interesting question. I would love to see studies which show how users of different programming languages approach these issues.
Also, you're absolutely right that this should be going on our Web site. This material will eventually be there, but our Web site designer is getting married and this has put some of the development plans on hold. However, we already have enough business that we're not overly concerned about this yet.
can i comment?
Adrian, of course you can comment. What a strange question :)
A few random musings.
First, alluding back to my comment-that-never-made-it-through-MT-sucky-sign-in - a large chunk of your approach to when to write tests is based on the assumption that most tests are written to detect error (or harm - in your elegant repositioning of priorities).
There are other reasons to write tests. To drive the design (you know I'm a huge TDD fan). To mark a goal for completion (a bunch of the BDD / acceptance test driven school). I'm sure there are more.
Personally I tend to write lots of design-driving tests, and very few harm detection tests.
That's not because I don't value preventing harm - it's because my purpose in writing tests is to drive the design. That's going to produce "waste" if you look at the test suite as protecting from harm. However, for me anyway, it produces less waste than the inevitable debugging sessions that result when I don't write in a TDD style.
I am a massive, massive fan of continual deployment/delivery - but in phrasing it as an alternative to testing I think you're doing both CD and testing a disservice.
They are servicing very different needs, and the advantages that CD gives you are often interestingly different from the advantages that testing gives you.
Tests (for me) are, mostly, about the health of the code.
The metric / experimental culture that a CD approach produces is, mostly, about the health of the business.
Adding CD to the mix hasn't really altered my approach to testing at all. Because most of the tests that I write are not about harm or business health. This is the pattern I've seen in other organisations too.
It has, however, completely changed the way I attack defining and discovering features, managing deployment, etc. Story writing has changed. The ideas of "delivery" and "success" and "done" have all changed.
Testing - not so much.
You also - of course - have to be working in an environment where CD works to the fullest effect. Doing this stuff in, for example, the mobile app space is considerably more challenging.
Can you kindly recommend resources (books, websites, etc) for learning how to write Perl test cases?
There are several things that I like about this article. I like that the article acknowledges technical debt is sometimes appropriate and provides some suggestions on managing that.
Your concluding paragraph says it all.