Exceptions as Flow Control

So, you're deep inside a series of functions and you find that you really, really want what is effectively a goto: that is to say, you want to jump right out of all of those functions to the top level. Many people think "Aha! I'll just throw an exception."

Years ago, many Java programmers used to do this (of course, Exceptions are first class in Java). To paraphrase Mencken, it was simple, easy, and wrong. As understanding grew of how exceptions should be used, it was learned, surprisingly to some, that exceptions should be used for exceptional conditions. One rule of thumb was that exceptions (like aspects later) should be reserved for code where, in theory, if nothing went wrong, they could be completely removed without affecting the program's correctness. So you might throw an exception if you fail to open a file, but if you're iterating over the lines in that file (and if it's OK for that file to have no lines), then you don't throw an exception just because you've reached the end of the file. EOF is expected. Failure to connect is not. Unfortunately, failure to understand failure caused some horrible knock-on effects.

Here's some pseudo-code:

try {
    lines = read_file();
}
catch (exception File::Missing e) {
    // handle it
}
catch (exception File::Cant::Open e) {
    // handle it
}
catch (exception File::Cant::Read e) {
    // handle it
}
catch (exception File::EOF e) {
    // handle it
}

One of those things is not like the other. Three of the exceptions you can throw are errors and one of them is for flow control. See anything odd about that?
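Perl 5 has no native try/catch; the idiom for that pseudo-code is eval plus inspecting $@. A minimal, runnable sketch, where the File::Missing class and the always-failing read_file stub are invented for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical exception class, standing in for File::Missing above.
package File::Missing;
sub new { bless {}, shift }

package main;

# Stub that always fails, so we can see the catch path.
sub read_file { die File::Missing->new }

my $lines = eval { read_file() };
if ( my $error = $@ ) {
    if ( ref $error && $error->isa('File::Missing') ) {
        warn "file is missing\n";    # handle it
    }
    else {
        die $error;                  # re-throw anything we don't understand
    }
}
```

The File::EOF case from the pseudo-code would be just one more isa branch here, which is exactly the problem: an expected condition dispatched through the same machinery as genuine errors.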

Many programmers know (or should know) that overloading the semantics of something is fraught with error. Overloading behavior is one thing ("if" statements in functions often do this) and we know these can be sources of bugs, but overloading the meaning (the semantics) is begging for trouble. This is a common type of issue in C programs because the language doesn't natively have exceptions (though you can try to fake 'em). I've seen problems where a "customer_account" function might return a negative value indicating an error. That works until customers are allowed to have negative balances. You've overloaded the semantics of the return value, and now everyone who checks the return value must remember to dispatch appropriately.
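The same trap is easy to reproduce in Perl; a contrived sketch, where the customer_balance function, the %balances data, and the magic -1 are all invented for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %balances = ( alice => 100, bob => -1 );

# Bad: -1 means "no such customer" ... until a customer legitimately owes us money.
sub customer_balance {
    my ($name) = @_;
    return exists $balances{$name} ? $balances{$name} : -1;
}

# Bob's real balance is indistinguishable from the error value.
print customer_balance('bob'), "\n";    # -1: overdrawn, or missing?
print customer_balance('eve'), "\n";    # -1: missing

# Better: keep the error channel separate from the data channel.
sub customer_balance_or_die {
    my ($name) = @_;
    die "no such customer: $name\n" unless exists $balances{$name};
    return $balances{$name};
}
```

With the second version, a caller who forgets the error case gets a noisy failure instead of silently treating "missing" as "overdrawn."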

PHP has another interesting example of a design flaw. Consider this code:

$a = array();
$a[2] = "foo";
$a["two"] = "bar";
echo $a[2];
echo $a["two"];

That will print "foo" and "bar". What they've done is overload the meaning of array indexes. Use a string and you'll get an associative array. Use a number and you get a numeric array. So what happens if you do this?

$a["2"] = "bar";
echo $a[2];

It will print "bar". Now you might be thinking "so what? We're obviously coercing a string into a number". That seems reasonable in a dynamic language, but it causes bugs in PHP that are very difficult to diagnose. Fortunately, Perl disambiguates arrays and hashes (associative arrays) and doesn't have this particular problem. (For PHP, you can argue whether they're overloading the meaning or the behavior here, but I think associative arrays are different beasts from normal arrays, and thus they're overloading the meaning.)
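To make the contrast concrete, here's the Perl equivalent of the PHP snippet. Arrays and hashes are distinct variables with distinct sigils, so the numeric and string "indexes" live in separate containers and can never collide:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @a;    # a numeric array
my %h;    # a hash (associative array)

$a[2]     = "foo";
$h{"two"} = "bar";

print $a[2], "\n";        # foo
print $h{"two"}, "\n";    # bar

# Assigning with the string "2" goes to the hash, leaving the array alone.
$h{"2"} = "baz";
print $a[2], "\n";        # still foo
```

(Hash keys are always strings, so $h{2} and $h{"2"} are the same key, but that coercion happens inside one container and never bleeds into @a.)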

Perl, of course, brings its own bag 'o bugs. Remember when pseudo-hashes were introduced? Pseudo-hashes were basically array references where the first element was a hashref. Forget about what the behavior was for a moment. Think about the meaning. We overloaded the semantics of the first element in array refs. This led to countless issues where programmers legitimately needed an arrayref of hashrefs, only to discover bizarre pseudo-hash behavior and warnings. Perl programmers were getting confused by this issue. When Schwern finally demonstrated the performance issue with pseudo-hashes, it was the nail in the coffin and they were deprecated.

The Perl 6 designers, realizing that overloaded semantics are problematic, decided, for example, that string eval and block eval should be "eval" and "try", respectively. Different types of loop constructs now have different names. Many things which were ambiguous in Perl 5 are disambiguated in Perl 6. Overloading semantics is bad.

So why am I mentioning this? Well, I've been debugging a nasty problem and cleared the error log. I then ran a few lines of code, only to find my error log at over 3,000 lines! The application software throws exceptions when exceptions are needed. It also throws exceptions for flow control. The logging software, not realizing it was dealing with overloaded semantics, dutifully logged everything in the error log. I spent hours trolling through that error log, only to finally conclude that the error I'm facing is not in that log.

Getting back to Java, James Duncan Davidson (you may know him as the creator of Apache Ant) was explaining to me the problems with Java exceptions. Because many programmers were using them for flow control, the JVM would have to catch the exceptions and do a lot of work recording stack frames for every one. This turned out to be a huge performance hit. For legacy Java apps, it turned out that only using exceptions for exceptions and not for flow control was a huge performance gain. Thus, I was curious about this for Perl.

If you have a legitimate exception (particularly non-recoverable) in Perl, you're probably not worried about performance at that point. If you're using exceptions to break out of several subroutine levels (and hoping a lower level eval doesn't trap and ignore it), what's the performance hit?

I'm not saying this is the best example, but here's a bare-bones benchmark:

#!/usr/bin/env perl

use strict;
use warnings;

use Benchmark 'cmpthese';

sub normal    { return }
sub exception { die }

cmpthese(
    5000000,
    {
        'normal' => sub {
            normal();
        },
        'double eval' => sub {
            eval { eval { normal() } };
        },
        'eval' => sub {
            eval { normal() };
        },
        'exception' => sub {
            eval { exception() };
        },
    }
);

Stripping away everything else, can you guess what the performance metrics of this are?

                 Rate   exception double eval        eval      normal
exception    465983/s          --        -79%        -87%        -91%
double eval 2242152/s        381%          --        -38%        -59%
eval        3597122/s        672%         60%          --        -34%
normal      5434783/s       1066%        142%         51%          --

As you can see, there's a serious performance hit using die/eval for flow control. A simple return is almost 10 times faster.

Of course, this very simplistic bit of code doesn't even begin to mirror the complexity of real-world applications. However, knowing the problems Java faced with this and knowing that, in general, overloading the semantics of something is dangerous, combined with this benchmark, should give you some pause here. If your die/eval flow control is buried at the bottom of a tight loop, you might want to rethink it.

So what's the right answer? Unfortunately, that entirely depends on your application and what it's doing. If you feel that you must have exceptions for flow control, use them, but like any technique, be aware of what you're doing and why you're doing it. Tradeoffs are dangerous if you don't know you're making them.
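It's worth noting that for the narrower case of escaping nested loops (rather than nested subroutine calls), Perl already has cheap, explicit flow control in labeled last, so exceptions aren't needed at all. A minimal sketch with an invented search:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $found;

SEARCH: for my $row ( 0 .. 9 ) {
    for my $col ( 0 .. 9 ) {
        if ( $row * 10 + $col == 42 ) {
            $found = [ $row, $col ];
            last SEARCH;    # jump straight out of both loops
        }
    }
}

print "found at row $found->[0], col $found->[1]\n";    # row 4, col 2
```

Unlike die/eval, the label names the thing being escaped, so a reader (and the logging software) can't mistake it for an error.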

10 Comments

One alternative is to use something like (my) Continuation::Escape, which lets you jump higher up the stack without using eval/die.

Continuation::Escape is just a bit of sugar over Vincent Pit's awesome Scope::Upper, which probably does some really strange things to the internals. YMMV. :)

You start with "it's wrong" and end with, "well, you can do it, but there are trade-offs."

Exceptions are flow control. Viewing them as an error-reporting mechanism only is just giving up some of your own understanding of the tools at your disposal.

They have plenty of benefits when used for non-error reporting. They avoid the semi-predicate problem, they let you ignore any layers of indirection between a caller and callee for multi-level return, and so on.

The key point is: if your language makes them expensive or twitchy, be familiar with the expense and tics.

Thankfully, Perl has non-local goto for exactly this purpose:

#!/usr/bin/env perl

use Benchmark 'cmpthese';

sub normal    { return }
sub exception { die }
sub going     { goto DONE }

cmpthese(
    1000000,
    {
        'normal'      => sub { normal() },
        'double eval' => sub { eval { eval { normal() } } },
        'eval'        => sub { eval { normal() } },
        'exception'   => sub { eval { exception() } },
        'done'        => sub { going(); DONE: },
    }
);

__END__
                 Rate   exception double eval        eval        done      normal
exception    136240/s          --        -77%        -84%        -85%        -90%
double eval  591716/s        334%          --        -28%        -37%        -58%
eval         826446/s        507%         40%          --        -12%        -41%
done         934579/s        586%         58%         13%          --        -34%
normal      1408451/s        934%        138%         70%         51%          --

By not designing an application to properly work without an exception flow control mechanism, you can cause confusion like my several thousand line error log which didn't contain any errors.

I'm not sure just what you mean, but I will say this: this is where something like "checked exceptions" would make the most sense. I don't want them to make sure that my code will handle actual errors; I'm okay with crashing, sometimes. On the other hand, they'd allow you to say: did I handle every possible terminating condition of this routine?

Obviously, CPS is superior here. Also, I wouldn't really recommend that the use of exceptions for flow control leave the boundaries of a given library. That is, if Foo::Plugin throws Foo::Stop, Foo::Plugins should really be limited to Foo. The issue is that exceptions alone are not for flow control. They're for use with catch blocks, and you have to ensure that both exist if you don't want to risk insane program termination.

As for your error log: (I predict that) either someone forgot to catch, which is equivalent to "forgot to check for magic return value" and isn't very interesting, or you're using $SIG{__DIE__}, which is a whole separate can of worms.

I really enjoyed this post.

Thank you.


About Ovid

Have Perl; Will Travel. Freelance Perl/Testing/Agile consultant. Photo by http://www.circle23.com/. Warning: that site is not safe for work. The photographer is a good friend of mine, though, and it's appropriate to credit his work.