More stupid testing tricks

By Ovid on February 28, 2011 9:01 AM

For the guy who wrote the test harness that currently ships with Perl and who has commit rights to an awful lot of the Perl testing toolchain, I sure do seem to do a lot of stupid things while testing. That being said, sometimes I need to do those stupid testing tricks. That's because there seem to be roughly two types of developers:

Those who work in a perfect world
Those who work in the real world

I say the latter with a bit of bitterness because invariably I keep hearing YOU MUST DO X AND NOTHING ELSE where "X" is a practice that I often agree with, but it's the "and nothing else" bit that really frosts my Pop Tart (tm).

I'm in the rather unfortunate position of having an NDA so I can't exactly explain what's driving a particular use case, but I have a fantastic job which nonetheless has some serious constraints which I'm not in a position to deviate from. So not only am I not in a position to follow best practices in what I'm about to describe, I'm not even in a position to tell you why. Suffice it to say that I'm faced with an enormous system, and many things which I would take for granted in other environments are not the case here, so I'm forced to improvise. (Note that I didn't say it's a bad system. It's a different system and there is at least one fundamental assumption about software development which doesn't apply here, but I can't say more.)

So let's say that you have a rather large dataset you're testing and you have some constraints you must face:

You have no control over the actual data
You cannot mock up an interface to that data
The data is volatile

How do you test that? Let's say a function returns an array of array refs. At first, I tried writing something like the Levenshtein edit distance for data structures, but our data is so volatile that this only bought me a little time: instead of having the tests fail the day after they're written (the data I test against is more-or-less stable for one day), I could have them last several days before failure hits.
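To make the edit-distance idea concrete, here is a minimal sketch of one way it could look. It is not the code from the project described here; the function name `row_edit_distance`, the `$tolerance` threshold and the sample rows are all assumptions made up for illustration. Each row is flattened to a string and a classic Levenshtein distance is computed over the rows, so a test can pass as long as only a few rows were inserted, deleted or changed.

```perl
use strict;
use warnings;

# Hypothetical helper: Levenshtein edit distance between two arrays of
# array refs, where one "edit" is inserting, deleting or replacing a row.
# Assumes every row holds only defined, simple scalars.
sub row_edit_distance {
    my ( $have, $want ) = @_;

    # Flatten each row to a single string so rows can be compared with eq.
    my @h = map { join "\0", @$_ } @$have;
    my @w = map { join "\0", @$_ } @$want;

    # Standard two-row dynamic programming table.
    my @prev = ( 0 .. scalar(@w) );
    for my $i ( 1 .. scalar(@h) ) {
        my @curr = ($i);
        for my $j ( 1 .. scalar(@w) ) {
            my $cost = $h[ $i - 1 ] eq $w[ $j - 1 ] ? 0 : 1;
            my $best = $prev[ $j - 1 ] + $cost;    # replace the row (or match)
            $best = $prev[$j] + 1                  # delete a row from "have"
              if $prev[$j] + 1 < $best;
            $best = $curr[ $j - 1 ] + 1            # insert a row into "have"
              if $curr[ $j - 1 ] + 1 < $best;
            push @curr, $best;
        }
        @prev = @curr;
    }
    return $prev[-1];
}

# Made-up sample data: one row has disappeared since "want" was recorded.
my $want      = [ [ 1, 'a' ], [ 2, 'b' ], [ 3, 'c' ] ];
my $have      = [ [ 1, 'a' ], [ 3, 'c' ] ];
my $tolerance = 1;    # assumed threshold, tune to how fast the data drifts

print row_edit_distance( $have, $want ) <= $tolerance ? "ok\n" : "not ok\n";
```

The weakness is visible in the threshold: once the data has drifted further than `$tolerance` rows, the test fails even though the code is fine, which is why this only delays the problem.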

Still, coming back a week later and still having the tests fail is not good. Further, by the time the data bubbles up to me, the criteria by which it's assembled and sorted are not present, so I have no way of duplicating that in my test (and it's complex enough that I wouldn't want to duplicate it).

Thus, I'm stuck with the awful problem of tests which are going to break quickly. I thought about the excellent Test::Deep, but that can let me validate the structure of the data, not the meaning. Test::AskAnExpert could let me know the meaning by punting to the human (me, in this case), but this doesn't do anything about the data being so volatile.

So I've written the abysmally stupid Test::SynchHaveWant. The idea is that the results you want are in the __DATA__ section of your .t file and if the test(s) fail, you can look at the failures and if they're not really failures, you can then "synch" your "wanted" results to the new results and watch them pass again. We do this by writing the synched results to the __DATA__ section.
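The mechanism can be sketched with nothing but core modules. What follows is not the actual Test::SynchHaveWant interface; `read_want`, `synch_want` and the `SYNCH_WANT` environment variable are hypothetical names chosen for illustration. The sketch assumes the expected data is stored after `__DATA__` as a Perl data structure that can be read back with `eval`.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: return the structure stored after the __DATA__
# marker of $file, or nothing if there is no data yet.
sub read_want {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot read $file: $!";
    my $source = do { local $/; <$fh> };
    close $fh;

    my ( undef, $data ) = split /^__DATA__\n/m, $source, 2;
    return unless defined $data && $data =~ /\S/;

    my $want = eval $data;    # we trust the file: it is our own test
    die "Bad __DATA__ section in $file: $@" if $@;
    return $want;
}

# Hypothetical helper: replace everything after __DATA__ in $file with a
# dump of $have, so the results we just got become the new expectation.
sub synch_want {
    my ( $file, $have ) = @_;
    open my $in, '<', $file or die "Cannot read $file: $!";
    my $source = do { local $/; <$in> };
    close $in;

    # Keep the code above the marker untouched.
    my ($code) = split /^__DATA__\n/m, $source, 2;
    $code = '' unless defined $code;
    $code .= "\n" if length $code && $code !~ /\n\z/;

    local $Data::Dumper::Indent   = 1;
    local $Data::Dumper::Sortkeys = 1;    # stable output, smaller diffs
    local $Data::Dumper::Terse    = 1;    # no "$VAR1 =" prefix

    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} $code, "__DATA__\n", Dumper($have);
    close $out or die "Cannot close $file: $!";
    return;
}

# Intended use inside a .t file (names are assumptions):
#
#   my $have = function_under_test();
#   my $want = read_want($0);
#   is_deeply $have, $want, 'results match the recorded data';
#   synch_want( $0, $have ) if $ENV{SYNCH_WANT};
```

Making the rewrite opt-in, here through an environment variable, keeps a normal test run from silently overwriting its own expectations.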

For example: let's say that commit X on Feb 3rd is a known good commit, but your tests are now failing on Feb 27th. Roll your code back to X and rerun the tests. If they fail in the same way, you can assume that it's merely data changes. Simply "synch" your test data, rerun the test to verify, then check out "head" again and make sure the tests pass.

This is an incredibly bad idea for several reasons:

Simply asserting that the results you want are the results you got is begging for laziness and false positives.
Rewriting your source code on disk is very stupid.
The data you want is now in the __DATA__ section, pulling it away from the code which should have it, masking the intent.
It's still a lot of manual work when there are failures.

All things considered, this is probably one of the dumbest testing ideas I've had, but it's working. I've a few more ideas to make it easier to use, but I'm still trying to figure out a cleaner way of making this work.


Tagged as:

testing, tests

8 Comments

targ.myopenid.com | February 28, 2011 9:50 AM

FYI, the accepted abbreviation of synchronize is sync, not synch.

Maddingue | February 28, 2011 10:00 AM

Living as well in the real world, and even more, in the production world, I know the kind of problems you describe. "Upgrade! Upgrade!" isn't an answer when working on a non trivial production platform where 5.8 is the main version of Perl.

WRT your tests, are you testing the code or the data? If the code, why test against live data? Why not keep some well-known data to provide stable samples to work with?

Ovid replied to comment from targ.myopenid.com | February 28, 2011 10:01 AM

@targ: [citation needed] :)

I've seen both.

Ovid replied to comment from Maddingue | February 28, 2011 10:25 AM

I'm unfortunately having to test both the code and data at the same time. And for reasons I cannot describe, keeping well-known data to provide stable samples would be a ridiculously Herculean effort. I think I can give an analogy which doesn't violate my contract terms.

Imagine if your data came from many, many different sources and the quality, structure and location of those sources change so rapidly that trying to "mock" all of those sources would require so much time maintaining the test suite that you couldn't develop code any more. You either compromise on how you test or you give up testing. It's not a perfect analogy, but I think it gives you an idea of what I might be looking at.

Mithaldu | February 28, 2011 12:05 PM

I think I might be doing things similar to what you described here. I have to convert lots of data in such a way that actually writing the $want data isn't feasible, since there's so much of it AND it changes all the time. My rescue was Test::Regression.

It allows me to do things like this:

# on the shell:
set TEST_REGRESSION_GEN=1

# in the perl script:
use Test::Regression;
use Data::Dumper;
my $data = get_complex_stuff();
ok_regression( sub{ Dumper $data }, "t/data/complex.dump", 'complex data matches' );

The ENV assignment there tells Test::Regression to not actually do a comparison, but to just generate the data from the sub and dump it into the file name given. Then I look at git diff to see if my data changed and if the change is something I wanted.

Additionally, on a user system, TEST_REGRESSION_GEN won't be set, so it does a proper comparison there.

Aristotle | February 28, 2011 8:04 PM

You work at booking.com, right?

Ovid replied to comment from Aristotle | February 28, 2011 8:09 PM

@Aristotle: yes, I do. Why do you ask?

Aristotle | March 1, 2011 5:03 AM

Because of the analogy. :-)


About Ovid

Freelance Perl/Testing/Agile consultant and trainer. See http://www.allaroundtheworld.fr/ for our services. If you have a problem with Perl, we will solve it for you. And don't forget to buy my book! http://www.amazon.com/Beginning-Perl-Curtis-Poe/dp/1118013840/
