Exceptions as Flow Control

So, you're deep inside a series of functions and you find that you really, really want what is effectively a goto: that is to say, you want to jump right out of all of those functions to the top level. Many people think "Aha! I'll just throw an exception."

Years ago, many Java programmers used to do this (of course, Exceptions are first class in Java). To paraphrase Mencken, it was simple, easy, and wrong. As understanding grew of how exceptions should be used, it was learned, surprisingly to some, that exceptions should be used for exceptional conditions. One rule of thumb was that exceptions (like aspects later) should be reserved for code where, in theory, if nothing went wrong, they could be completely removed without affecting the program's correctness. So you might throw an exception if you fail to open a file, but if you're iterating over the lines in that file (and if it's OK for that file to have no lines), then you don't throw an exception just because you've reached the end of the file. EOF is expected. Failure to connect is not. Unfortunately, failure to understand failure caused some horrible knock-on effects.

Here's some pseudo-code:

try {
    lines = read_file();
}
catch (exception File::Missing e) {
    // handle it
}
catch (exception File::Cant::Open e) {
    // handle it
}
catch (exception File::Cant::Read e) {
    // handle it
}
catch (exception File::EOF e) {
    // handle it
}

One of those things is not like the other. Three of the exceptions you can throw are errors and one of them is for flow control. See anything odd about that?
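Perl 5 has no native try/catch; the idiom for that pseudo-code is eval plus inspecting $@. A minimal, runnable sketch, where the File::Missing class and the always-failing read_file stub are invented for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical exception class, standing in for File::Missing above.
package File::Missing;
sub new { bless {}, shift }

package main;

# Stub that always fails, so we can see the catch path.
sub read_file { die File::Missing->new }

my $lines = eval { read_file() };
if ( my $error = $@ ) {
    if ( ref $error && $error->isa('File::Missing') ) {
        warn "file is missing\n";    # handle it
    }
    else {
        die $error;                  # re-throw anything we don't understand
    }
}
```

The File::EOF case from the pseudo-code would be just one more isa branch here, which is exactly the problem: an expected condition dispatched through the same machinery as genuine errors.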

Many programmers know (or should know) that overloading the semantics of something is fraught with error. Overloading behavior is one thing ("if" statements in functions often do this) and we know these can be sources of bugs, but overloading the meaning (the semantics) is begging for trouble. This is a common type of issue in C programs because the language doesn't natively have exceptions (though you can try to fake 'em). I've seen problems where a "customer_account" function might return a negative value indicating an error. That works until customers are allowed to have negative balances. You've overloaded the semantics of the return value, and now everyone who checks the return value must remember to dispatch appropriately.
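The same trap is easy to reproduce in Perl; a contrived sketch, where the customer_balance function, the %balances data, and the magic -1 are all invented for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %balances = ( alice => 100, bob => -1 );

# Bad: -1 means "no such customer" ... until a customer legitimately owes us money.
sub customer_balance {
    my ($name) = @_;
    return exists $balances{$name} ? $balances{$name} : -1;
}

# Bob's real balance is indistinguishable from the error value.
print customer_balance('bob'), "\n";    # -1: overdrawn, or missing?
print customer_balance('eve'), "\n";    # -1: missing

# Better: keep the error channel separate from the data channel.
sub customer_balance_or_die {
    my ($name) = @_;
    die "no such customer: $name\n" unless exists $balances{$name};
    return $balances{$name};
}
```

With the second version, a caller who forgets the error case gets a noisy failure instead of silently treating "missing" as "overdrawn."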

PHP has another interesting example of a design flaw. Consider this code:

$a = array();
$a[2] = "foo";
$a["two"] = "bar";
echo $a[2];
echo $a["two"];

That will print "foo" and "bar". What they've done is overload the meaning of array indexes. Use a string and you'll get an associative array. Use a number and you get a numeric array. So what happens if you do this?

$a["2"] = "bar";
echo $a[2];

It will print "bar". Now you might be thinking "so what? We're obviously coercing a string into a number". That seems reasonable in a dynamic language, but it causes bugs in PHP that are very difficult to diagnose. Fortunately, Perl disambiguates arrays and hashes (associative arrays) and doesn't have this particular problem. (For PHP, you can argue whether they're overloading the meaning or the behavior here, but I think associative arrays are different beasts from normal arrays, and thus they're overloading the meaning.)
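To make the contrast concrete, here's the Perl equivalent of the PHP snippet. Arrays and hashes are distinct variables with distinct sigils, so the numeric and string "indexes" live in separate containers and can never collide:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @a;    # a numeric array
my %h;    # a hash (associative array)

$a[2]     = "foo";
$h{"two"} = "bar";

print $a[2], "\n";        # foo
print $h{"two"}, "\n";    # bar

# Assigning with the string "2" goes to the hash, leaving the array alone.
$h{"2"} = "baz";
print $a[2], "\n";        # still foo
```

(Hash keys are always strings, so $h{2} and $h{"2"} are the same key, but that coercion happens inside one container and never bleeds into @a.)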

Perl, of course, brings its own bag 'o bugs. Remember when pseudo-hashes were introduced? Pseudo-hashes were basically array references where the first element was a hashref. Forget about what the behavior was for a moment. Think about the meaning. We overloaded the semantics of the first element in array refs. This led to countless issues where programmers legitimately needed an arrayref of hashrefs, only to discover bizarre pseudo-hash behavior and warnings. Perl programmers were getting confused by this issue. When Schwern finally demonstrated the performance issue with pseudo-hashes, it was the nail in the coffin and they were deprecated.

The Perl 6 designers, realizing that overloaded semantics are problematic, decided, for example, that string eval and block eval should be "eval" and "try", respectively. Different types of loop constructs now have different names. Many things which were ambiguous in Perl 5 are disambiguated in Perl 6. Overloading semantics is bad.

So why am I mentioning this? Well, I've been debugging a nasty problem and cleared the error log. I then ran a few lines of code, only to find my error log at over 3,000 lines! The application software throws exceptions when exceptions are needed. It also throws exceptions for flow control. The logging software, not realizing it was dealing with overloaded semantics, dutifully logged everything in the error log. I spent hours trolling through that error log, only to finally conclude that the error I'm facing is not in that log.

Getting back to Java, James Duncan Davidson (you may know him as the creator of Apache Ant) was explaining to me the problems with Java exceptions. Because many programmers were using them for flow control, the JVM would have to catch the exceptions and do a lot of work recording stack frames for every one. This turned out to be a huge performance hit. For legacy Java apps, it turned out that only using exceptions for exceptions and not for flow control was a huge performance gain. Thus, I was curious about this for Perl.

If you have a legitimate exception (particularly non-recoverable) in Perl, you're probably not worried about performance at that point. If you're using exceptions to break out of several subroutine levels (and hoping a lower level eval doesn't trap and ignore it), what's the performance hit?

I'm not saying this is the best example, but here's a bare-bones benchmark:

#!/usr/bin/env perl

use strict;
use warnings;

use Benchmark 'cmpthese';

sub normal    { return }
sub exception { die }

cmpthese(
    5000000,
    {
        'normal' => sub {
            normal();
        },
        'double eval' => sub {
            eval { eval { normal() } };
        },
        'eval' => sub {
            eval { normal() };
        },
        'exception' => sub {
            eval { exception() };
        },
    }
);

Stripping away everything else, can you guess what the performance metrics of this are?

                 Rate   exception double eval        eval      normal
exception    465983/s          --        -79%        -87%        -91%
double eval 2242152/s        381%          --        -38%        -59%
eval        3597122/s        672%         60%          --        -34%
normal      5434783/s       1066%        142%         51%          --

As you can see, there's a serious performance hit using die/eval for flow control. A simple return is almost 10 times faster.

Of course, this very simplistic bit of code doesn't even begin to mirror the complexity of real-world applications. However, knowing the problems Java faced with this and knowing that, in general, overloading the semantics of something is dangerous, combined with this benchmark, should give you some pause here. If your die/eval flow control is buried at the bottom of a tight loop, you might want to rethink it.

So what's the right answer? Unfortunately, that entirely depends on your application and what it's doing. If you feel that you must have exceptions for flow control, use them, but like any technique, be aware of what you're doing and why you're doing it. Tradeoffs are dangerous if you don't know you're making them.
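It's worth noting that for the narrower case of escaping nested loops (rather than nested subroutine calls), Perl already has cheap, explicit flow control in labeled last, so exceptions aren't needed at all. A minimal sketch with an invented search:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $found;

SEARCH: for my $row ( 0 .. 9 ) {
    for my $col ( 0 .. 9 ) {
        if ( $row * 10 + $col == 42 ) {
            $found = [ $row, $col ];
            last SEARCH;    # jump straight out of both loops
        }
    }
}

print "found at row $found->[0], col $found->[1]\n";    # row 4, col 2
```

Unlike die/eval, the label names the thing being escaped, so a reader (and the logging software) can't mistake it for an error.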

10 Comments

One alternative is to use something like (my) Continuation::Escape, which lets you jump higher up the stack without using eval/die.

Continuation::Escape is just a bit of sugar over Vincent Pit's awesome Scope::Upper, which probably does some really strange things to the internals. YMMV. :)

You start with "it's wrong" and end with, "well, you can do it, but there are trade-offs."

Exceptions are flow control. Viewing them as an error-reporting mechanism only is just giving up some of your own understanding of the tools at your disposal.

They have plenty of benefits when used for non-error reporting. They avoid the semi-predicate problem, they let you ignore any layers of indirection between a caller and callee for multi-level return, and so on.

The key point is: if your language makes them expensive or twitchy, be familiar with the expense and tics.

Thankfully, Perl has non-local goto for exactly this purpose:

#!/usr/bin/env perl

use Benchmark 'cmpthese';

sub normal    { return }
sub exception { die }
sub going     { goto DONE }

cmpthese(
    1000000,
    {
        'normal'      => sub { normal() },
        'double eval' => sub { eval { eval { normal() } } },
        'eval'        => sub { eval { normal() } },
        'exception'   => sub { eval { exception() } },
        'done'        => sub { going(); DONE: },
    }
);

__END__
                 Rate   exception double eval        eval        done      normal
exception    136240/s          --        -77%        -84%        -85%        -90%
double eval  591716/s        334%          --        -28%        -37%        -58%
eval         826446/s        507%         40%          --        -12%        -41%
done         934579/s        586%         58%         13%          --        -34%
normal      1408451/s        934%        138%         70%         51%          --

By not designing an application to properly work without an exception flow control mechanism, you can cause confusion like my several thousand line error log which didn't contain any errors.

I'm not sure just what you mean, but I will say this: this is where something like "checked exceptions" would make the most sense. I don't want them to make sure that my code will handle actual errors; I'm okay with crashing, sometimes. On the other hand, they'd allow you to say: did I handle every possible terminating condition of this routine?

Obviously, CPS is superior here. Also, I wouldn't really recommend that the use of exceptions for flow control leave the boundaries of a given library. That is, if Foo::Plugin throws Foo::Stop, Foo::Plugins should really be limited to Foo. The issue is that exceptions alone are not for flow control. They're for use with catch blocks, and you have to ensure that both exist if you don't want to risk insane program termination.

As for your error log: (I predict that) either someone forgot to catch, which is equivalent to "forgot to check for magic return value" and isn't very interesting, or you're using $SIG{__DIE__}, which is a whole separate can of worms.

I really enjoyed this post.

Thank you.


About Ovid

Have Perl; Will Travel. Freelance Perl/Testing/Agile consultant. Photo by http://www.circle23.com/. Warning: that site is not safe for work. The photographer is a good friend of mine, though, and it's appropriate to credit his work.