How to be agile without testing
What's a bug?
Fair warning: if you're someone who has the shining light of the converted in your eyes and you've discovered the One True Way of writing software, you might feel a bit challenged by this post.
Your newest developer just pushed some code, but it has a bug. She screwed up the CSS and one of the links is a bright, glaring red instead of the muted blue that your company requires. While you're sitting with her, counseling her on the importance of testing, you get a call from marketing congratulating you on the last release. Sales have jumped 50%.
You know that the only change is the link color.
Was that change really a bug? Are you honestly going to roll it back?
More importantly, and this is the question that many people get wrong: what are you going to learn from this?
Why we write tests
Why do we go agile? Because we believe we can improve the software process. Because we believe we can share information better and, most important to agile, get feedback early and often (hint: that theme is going to recur). Another interesting thing about agile is that every agile methodology, without exception, says that you need to adjust the methodology to meet your particular needs. I've said this before and it bears repeating: you can't be agile unless you're, well, agile. So some agile teams don't have fixed iterations. Others (most?) don't do pair programming. Code review is only done on the tricky bits, or maybe for newer programmers. And guess what? These companies often do very well, despite doing things differently.
But none of them talk about getting rid of testing. They just don't.
And yet in our "red link" example above, testing might well have made it harder to discover a 50% increase in sales.
Most of us learned about testing years ago, and it was good. Then we learned about TDD and realized we had found testing Nirvana. FIT testing was a nifty idea that's heavily evangelized by those who offer FIT testing consulting services, and pretty much no one else. And now BDD leaves some breathless while others yawn. It's just the next craze, right?
But what is testing for? From a technical perspective, we might argue that it's to make sure the software does what we want it to do. But is that the most important thing? Remember that past some mumble number of lines of code, all software has bugs. All of it.
Instead, I think it's better to say that all software has unexpected behavior. We write tests because we hope the software will do what we want it to do, but isn't it better if the customers do what we want them to do? You can build a better mousetrap, but there's no guarantee the world will beat a path to your door. And if there's anything to learn from Digg or other tech disasters, it's this: customers are going to do what they damned well please, regardless of whether or not your software "works", which experts you've consulted, or how many focus groups you've held.
So rather than introduce software testing as some proxy for customer behavior, let's think about the consumers of our software for a moment.
A list of undesirable things
Considering our "bright red link" example above, I ask again: is it a bug? In that example (which was not chosen at random), it's easy to argue that it's a software bug, but only because the software exhibited unexpected behavior. In this case, that unexpected behavior was a 50% increase in sales.
So now, instead of bugs — always bad! — we can think in terms of "unexpected behavior", sometimes good, sometimes bad.
So how do you know which is which?
You make lists of undesirable things. 500 errors on your Web site are bad, but are 302s? Tough to say. Maybe you want to keep RAM usage below a certain level, or never see a significant drop in sales. And you probably want to make sure that responses never take longer than some threshold you've agreed on.
Make a list of everything that's unequivocally undesirable (for example, a Facebook "like" button going away doesn't count as unequivocally undesirable) and add monitoring for all of those behaviors. Every time you change or add technologies, go over your list of undesirable things again. Are they up to date? Is there anything you need to change?
Some of those undesirable things are reversible (dropping sales), and their opposites are good news, so monitor for those, too. Maybe you want to be notified when a release improves response time by 10%.
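To make that concrete, here's a minimal sketch of what a monitor for a list of undesirable things might look like. Everything in it is hypothetical: count_recent_responses(), median_response_time(), and notify_team() are stand-ins for whatever metrics store and alerting system you actually use, and the thresholds are made up.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Each undesirable thing is a named check returning true when
    # something is wrong. The thresholds are illustrative, not advice.
    my %undesirable = (
        '500 error rate above 1%' => sub {
            my $total  = count_recent_responses();
            my $errors = count_recent_responses(500);
            return $total && $errors / $total > 0.01;
        },
        'median response time over 2s' => sub { median_response_time() > 2 },
    );

    while (1) {
        for my $problem ( sort keys %undesirable ) {
            notify_team($problem) if $undesirable{$problem}->();
        }
        sleep 60;    # check once a minute
    }

    # Hypothetical stubs -- wire these to your real metrics and alerting.
    sub count_recent_responses { my ($status) = @_; return 0 }
    sub median_response_time   { return 0 }
    sub notify_team            { warn "UNDESIRABLE: $_[0]\n" }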
Monitoring is great, but it doesn't replace testing. Not by a long shot. You've made your bi-weekly release, RAM consumption has skyrocketed, you're swapping like mad, and now you have a 3,000 line diff to go through. Finding a memory leak can be hard at the best of times, and normal testing often misses them (but check out Test::LeakTrace). Now you have to roll back a huge change and dig through 3,000 lines of code to find your problem.
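As an aside, since Test::LeakTrace came up: a leak test with it looks roughly like this. The classic Perl leak is a circular reference, which is what the commented-out failing case demonstrates.

    use strict;
    use warnings;
    use Test::More;
    use Test::LeakTrace;

    # Passes: the hash is freed when it falls out of scope.
    no_leaks_ok {
        my %config = ( link_color => 'blue' );
    } 'lexical data does not leak';

    # This would fail: the circular reference keeps the hash alive.
    #no_leaks_ok {
    #    my %node;
    #    $node{self} = \%node;
    #} 'circular references leak';

    done_testing;

Either way, hunting a leak through a 3,000 line diff is miserable.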
So you don't do that. Instead, you switch to continuous deployment. With this model, you push code to production the moment it's ready. Ideally, you push it to a single box first, watch it, push it to a cluster, watch it, and only then push it to all servers. With your extensive monitoring, undesirable things usually show up quickly, and your memory leak is hiding in a 30 line diff instead of a 3,000 line diff.
Which one do you want to deal with?
(Naturally, I used a memory leak as an example even though leaks are among the problems that often take longer to show up. I'm too lazy to change the example, so pretend I wrote "a 5% increase in 404s.")
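If you want a feel for the push-to-one-box-then-widen dance, here's a hypothetical sketch. deploy_to(), roll_back(), and undesirable_things_detected() are stand-ins for your real deployment tooling and the monitoring described above, and the host names and soak times are invented.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Staged (canary) rollout: push to one box, watch it, and only
    # widen the blast radius while the monitoring stays quiet.
    my @stages = (
        { hosts => ['web01'],               soak => 15 * 60 },  # one box
        { hosts => [qw(web02 web03 web04)], soak => 30 * 60 },  # a cluster
        { hosts => ['*'],                   soak => 0       },  # everywhere
    );

    for my $stage (@stages) {
        deploy_to( @{ $stage->{hosts} } );
        sleep $stage->{soak};    # watch it
        if ( undesirable_things_detected() ) {
            roll_back( @{ $stage->{hosts} } );
            die "Rollout aborted: undesirable behavior detected\n";
        }
    }

    # Hypothetical stubs -- wire these to your real deploy tooling.
    sub deploy_to                   { print "deploying to @_\n" }
    sub roll_back                   { print "rolling back @_\n" }
    sub undesirable_things_detected { return 0 }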
In my experience, customers are fairly forgiving of minor quirks, and most unexpected behaviors are things like "this image isn't showing up" or "these search results are ordered incorrectly." Those tend not to be catastrophic. In fact, many times this unexpected behavior goes unnoticed. Most of the time the unexpected behavior will turn out to be neutral or bad, but sometimes it turns out to be good. You'll never know if you don't try.
As you may expect, this technique works very well with A/B testing, and if you have the courage to look for unexpected behaviors instead of bugs, A/B testing is the next logical step.
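As a sketch of the A/B side, assume you bucket users deterministically by ID so that each user always sees the same variant; the 50/50 split and the variant names are invented for illustration.

    use strict;
    use warnings;
    use Digest::MD5 qw(md5_hex);

    # Hash the user ID so the same user always lands in the same
    # bucket, then watch each bucket with your undesirable-things
    # monitoring to see which variant wins.
    sub variant_for {
        my ($user_id) = @_;
        my $n = hex( substr( md5_hex($user_id), 0, 8 ) );
        return $n % 2 ? 'red_link' : 'blue_link';
    }

    print variant_for('user_1234'), "\n";    # stable across requests

Pair the buckets with the same monitoring described above: if the red-link bucket outsells the blue-link one, you've found your good unexpected behavior on purpose rather than by accident.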
Note: None of the above precludes writing tests. None of it. I've seen the "monitoring undesirable things" strategy work extremely well, and I firmly believe it can work in conjunction with software testing. However, it's a different way of testing software, one that relies more on customer behavior than on exacting specifications. So the title of this post is actually a bit of a lie; it's just a different way of looking at testing.
And that's really the most interesting idea of this entire post: your customer's behavior is more important than your application's behavior.
See also: when must you test your code?