Testing Refactored Webapp Against Current Version With Geest (Ruby Kage port)

I'm currently working on a big, and I mean /BIG/ codebase, like 200K LOC with about 10 years of history behind it.

In this post I'll briefly describe how I'm refactoring code using a little tool called Geest (github), which I completely stole from Ruby's kage.

tl;dr: With Geest you can check differences between new / old code transparently. It's really handy. Please let me know what you think, or file issues if you find any.

I'm currently refactoring mod_perl (1.3, mind you) code into PSGI. However, there aren't many, uh, any tests, and a lot of the code depends on wonky mod_perl behavior and stuff, so the refactoring basically proceeds as follows:

  1. Read the code
  2. Write some code
  3. Check /very/ carefully

Of course, I'm trying to write tests while writing new features, but when you have no formal spec, it's extremely hard to write automated tests. Luckily most of what I'm currently refactoring is code to "view" the content, and not a whole lot of code that mutates data, so all I really need to make sure of is that the contents render correctly. So at this stage, it's much easier to check with your own eyes whether the code you have refactored is doing what it used to.

And I know, at this point you are all like "WTF, write your tests first". Yeah yeah. While I can't share the code with you, all I can say is "I will, but only after I kill all of this god forsaken mod_perl code". There are many reasons that make it hard to test against this code base. So my current priority is to just fix the shit out of it.

So what to do now? Well, my goal is to replicate the output of the production server. i.e., I have the correct output being served from there, so all I need to do is to check that given the same request, I write code that generates the same content. This is where Geest/kage enters into the picture.

First, kage is a tool written in Ruby, and it does exactly what I'm about to describe for Geest. kage is a great tool, and it does what it does correctly. The reason I wrote Geest is because, well, why not. I wanted to port it so I'd know how it worked.

So back to my original problem of trying to test my refactored code: Let's say you have production.myapp.com and staging.myapp.com. You deployed your new refactored code to staging.myapp.com, and you want to make sure all that new stuff matches your old code base.

Geest is basically a simple proxy server. It receives requests from your browser or whatever, and relays each request to all of the servers that you specify. In my setup, I put Geest in front of my staging server, and configured it to relay to production.myapp.com AND staging.myapp.com. You might do something like this:

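    use strict;
    use Geest;

    my $server = Geest->new();
    # "master" backend: its response is the one sent back to the client
    $server->add_master(staging => (
        host => "staging.myapp.com",
        port => 5050,
    ));
    # extra backend: receives the same request for comparison
    $server->add_backend(production => (
        host => "production.myapp.com",
        port => 8080,
    ));

    # relay every request to both backends
    $server->on(select_backend => sub {
        return [ qw(staging production) ];
    });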

This tells Geest the following:

  1. You have "staging" and "production" backends
  2. Staging is considered to be "master", which means that response from "staging" is preferred when replying to the client
  3. On every request, you want to relay the request to both staging and production

And finally, after the above, you can tell Geest to create a PSGI app:

    return $server->psgi_app;

This needs to be run in an AnyEvent-compatible PSGI server, e.g., Twiggy:

    twiggy -a app.psgi

After all of this is done, you can point your browser to http://staging.myapp.com:5000/. All requests will be relayed to BOTH production and staging. When responses come in from "master" (which is "staging") you get a reply. Well, that's nice, but it still doesn't check for differences between production and staging.

So then at that point you can go back to your app.psgi and add the "backend_finished" hook to check if the responses are in fact the same:

    use Text::Diff; # exports diff()
    $server->on(backend_finished => sub {
        my ($responses) = @_;
        # $responses = {
        #    name_of_backend => {
        #       backend  => ...,  # Geest::Backend object
        #       response => ...,  # HTTP::Response object
        #       request  => ...,  # HTTP::Request object
        #    },
        #    ...
        # };
        # Keys are the backend names registered above.
        # Bail out unless both backends responded.
        if (! ($responses->{production} && $responses->{staging})) {
            return;
        }

        my $data_prod = $responses->{production}->{response}->decoded_content;
        my $data_dev  = $responses->{staging}->{response}->decoded_content;
        if ($data_prod ne $data_dev) {
            # You probably want to check that both responses are
            # content_type -> text/* before running diff()
            print STDERR diff(\$data_prod, \$data_dev);
        }
    });

This hook is called after all of the backends have finished. As you can see, a simple diff() will get me the result I wanted. With this, I can be assured that the new code base is acting like the one in production. Yay! Mission accomplished!
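The comment in the hook suggests only diffing text/* responses. Here is a minimal sketch of that guard. The helper names `is_text_type` and `should_diff` are made up for illustration and are not part of Geest. They work on plain strings so they can be tried without a running server:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Geest): true if a Content-Type
# value looks like text/*, e.g. "text/html; charset=utf-8".
sub is_text_type {
    my ($content_type) = @_;
    return 0 unless defined $content_type;
    return $content_type =~ m{\A\s*text/}i ? 1 : 0;
}

# Hypothetical helper: true only when both sides are text/* and
# their bodies differ. Each argument is a hashref with
# "content_type" and "body" keys.
sub should_diff {
    my ($old, $new) = @_;
    return 0 unless is_text_type($old->{content_type})
                 && is_text_type($new->{content_type});
    return $old->{body} ne $new->{body} ? 1 : 0;
}
```

Inside the hook you would presumably fill those hashrefs from each HTTP::Response object, using its content_type and decoded_content methods, and only call diff() when should_diff() returns true.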

So, in summary: Of course, it's better to have a perfect spec and automated tests in place beforehand, but when push comes to shove, Geest is a good fallback.

Also, I'm sure you can do more with this. For example, you might change this to receive some of your traffic from the production environment, but make sure to 1) make "production" the master backend, and 2) only send "idempotent" requests to the staging backend. This way you get to check your code base against live traffic.
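As a rough sketch of point 2, the backend choice could be driven by the HTTP method. I'm reading "idempotent" here as read-only methods (GET, HEAD, OPTIONS), which is my assumption. `backends_for_method` is a made-up helper, not a Geest API. How you get the request method inside the select_backend hook depends on what Geest passes to it, which this post doesn't cover:

```perl
use strict;
use warnings;

# Assumption: only read-only methods are safe to replay on staging.
my %READ_ONLY = map { $_ => 1 } qw(GET HEAD OPTIONS);

# Hypothetical helper (not part of Geest): given an HTTP method,
# return the list of backend names the request should go to.
# "production" is always included because it is the master here.
sub backends_for_method {
    my ($method) = @_;
    $method = uc(defined $method ? $method : '');
    return $READ_ONLY{$method}
        ? [ qw(production staging) ]
        : [ qw(production) ];
}
```

The arrayref it returns has the same shape as the one returned from the select_backend hook earlier in this post.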

Let me know if there are any problems, or please send me comments!

About lestrrat

Japan Perl Association director; LINE, Inc; Tokyo, Japan