When a failure is not a failure

By Buddy Burden on June 15, 2014 1:32 AM

[What I really should have done here is something like Tom Wyant or Joel Berger. Being a literary type, that appeals to me. Of course, with Dupin and Holmes taken, who would I pay homage to? Nero Wolfe, perhaps? Batman? Michael Westen? Harry Dresden?

Anyways, maybe next time, if I have more time to prepare. For now, you’ll have to settle for plain ol’ me.]

At $work, I’m the guy who maintains the tool we use for branching, merging, pushing ... release management, I suppose you’d say. I don’t actually do the pushes most of the time, but I’m usually the guy who takes the tickets when there’s something wrong with the process.

At the end of the push, we build a new AMI. That’s Amazon-speak for a master virt which we can clone to bring up new production machines quickly. By making a new one after every push, any new instances we bring up will automatically have the latest code on them. Recently, my coworker doing the push reported that the AMI build failed. Well, he clarified: it says it failed, although it actually looked like it succeeded. Probably a transient error, I assured him (with fingers crossed). But, no such luck: the next week, the same thing happened. This time my coworker figured out what underlying command the release management script was running and ran it by hand. Yep, it definitely succeeds, he reported. It just says it failed.

And so it went for the next three pushes. You know how it is ... it’s irksome, but there’s nothing actually broken, so why waste time fiddling with it? Finally someone decided to make a ticket for it. I spoke up in tech chat: you know guys, probably someone changed something with the underlying command so it doesn’t exit cleanly. The poor release management script is stupid; all it does is run the command, then check the exit value. Non-zero, report failure. I bet anyone could fix this with a bit of time and some Git archaeology.

Two more pushes with people grumbling about it and I finally just grabbed the ticket myself. I found the exact date of the first push where it happened (and that’s why you always log your tech chat, people), then started poking around the Git commit logs for any changes just before that date.

As it turns out, another coworker had decided to refactor the command entirely, moving the main logic out of the bin/ script and into its own class. Well, nothing wrong with that, I thought ... sounds like a pretty good idea to me. The guts of the script was reduced to dealing with configuration, environment, and options, and then this code:

my $command = CE::Command::ec2_launch_instance->new(
    # bunches o' parameters here
);
$command->execute();
exit $command->success ? 0 : 1;

Well, nothing much to see there, I thought. Better look where all the action is and see what’s really going on:

sub execute {
    my $self = shift;
    # do some stuff
    return unless $self->wait;
    # more stuff, including creating $manager
    if (not $manager->all_success) {
        $self->success(0);
        return;
    }
    # a wee bit more stuff
    $self->success(1);
}

Hunh, I thought. I still don’t see it.

Do you? I’m sure some of you out there spied it instantly and are even now nodding your heads in commiseration. If you didn’t spot it though, go back and look. I promise you, I gave you enough of the code to figure it out ... nothing I left out is relevant to the problem. Go ahead and ponder it for a bit. I’ll wait.

Sure you want to peek at the answer?

Really sure?

Okay, I’ll tell you what it was.

I was all set for a marathon debugging session ... I figured I’d have to run the whole stupid script a bunch of times, adding more and more debugging. Or maybe (<shudder>) actually fire up the debugger.^* Which wouldn’t be so bad, except building an AMI takes a fair amount of time, so it’s not the sort of thing you really want to be doing over and over again until you winkle out an obscure bug. Especially for a bug where nothing’s actually broken.

But then I had a nagging hunch. I wish I could tell you what triggered it, but I honestly don’t know. Probably something I read somewhere a long time ago ... and this is why you must read about Perl (or whatever your language-of-choice is). Read the manuals, read the books, read the blogs. Go download Modern Perl. Subscribe to Perl Weekly. The more you know, the better chance that, when you hit that weird bug, something’s going to tickle the back of your brain, something’s going to stick in the sewer grate at the bottom of your mind, something’s going to blink a little light way off at the periphery of your vision, and the little light will say ...

Wait. Go back a bit. What was the middle thing?

Although my forebrain told me to concentrate on the execute method—that’s where all the action is, after all—something made me glance back at the script itself. Remember this part?

my $command = CE::Command::ec2_launch_instance->new(
    # bunches o' parameters here
);
$command->execute();
exit $command->success ? 0 : 1;

Even more specifically ... this part?

exit $command->success ? 0 : 1;

Something wrong here ... maybe exit can’t take an expression?

[absalom:~] perldoc -f exit
       exit EXPR
       exit    Evaluates EXPR and exits immediately with that value.    Example:
       :
       :

Well, so much for that theory. Still ... something ...

Could this be one of Perl’s weird precedence things? I mean, I know ternaries are pretty low in the precedence list ... let’s look:

[absalom:~] perldoc op
PERLOP(1)                                  User Contributed Perl Documentation                                  PERLOP(1)
 
 
 
NAME
       perlop - Perl operators and precedence
 
DESCRIPTION
   Operator Precedence and Associativity
       Operator precedence and associativity work in Perl more or less like they do in mathematics.
       :
       :
 
           left        terms and list operators (leftward)
           left        ->
           nonassoc    ++ --
           right       **
           right       ! ~ \ and unary + and -
           left        =~ !~
           left        * / % x
           left        + - .
           left        << >>
           nonassoc    named unary operators
           nonassoc    < > <= >= lt gt le ge
           nonassoc    == != <=> eq ne cmp ~~
           left        &
           left        | ^
           left        &&
           left        || //
           nonassoc    ..  ...
           right       ?:
           right       = += -= *= etc.
           left        , =>
           nonassoc    list operators (rightward)
           right       not
           left        and
           left        or xor
 
       :
       :

What’s that thing roughly in the middle there? “named unary operators”? What are the chances, do you suppose, that exit is a “named unary operator”?

Pretty good, as it happens.

[absalom:~] perl -le 'exit 2' ; echo $?
2
[absalom:~] perl -le 'exit 1 ? 2 : 3' ; echo $?
1

Crap.

You mean this

exit($command->success ? 0 : 1);

would fix it?

Yep. That fixes it.

Crap.

Well, lesson learned and all that, and maybe this will help someone else out there, but I also wonder if maybe the man page could be clarified a bit. I can’t help but think that this:

[absalom:~] perldoc -f exit
       exit EXPR

is a bit misleading. Sure, exit EXPR ... but only if EXPR consists only of terms and operators higher in the precedence list than “named unary operators.” That seems like a bit of a gotcha.

Anyways, mystery solved, and here’s my commit message:

Author: Buddy Burden <barefootcoder@gmail.com>
Date:   Wed Jun 4 15:06:49 2014
 
    oh look: named unary operators have a higher precedence than the ternary <sigh>

[P.S.: I hope to see many of you at YAPC in a week or so. If you see a short dumpy guy wandering the halls with a scruffy beard and no shoes, walk over and say “hi!”]

* You may recall from previous posts that I hate the debugger. Part of the reason is that I never reach for it unless I have a really weird problem, and really weird problems are not the debugger’s forté. In the past several months, I’ve only tried to use it twice, and once it worked like a charm, and the other time it did something completely different (and even wronger) than I got when I just ran it straight. Actually, 50/50 is a better record than I usualy have with the stupid thing. But I digress.

10 comments

Tagged as:

bugs

10 Comments

blog.plover.com | June 15, 2014 4:29 AM | Reply

I have been bitten by that at least twice that I can remember and probably more like three or four times. So it is with no pride at all that I can say that I recognized the problem immediately.

James | June 15, 2014 6:56 AM | Reply

not running with warnings on?

$ perl -Mwarnings -le 'exit 1 ? 2 : 3' ; echo $?
Useless use of a constant (2) in void context at -e line 1.
Useless use of a constant (3) in void context at -e line 1.

confuseAcat | June 15, 2014 12:07 PM | Reply

James sure has a point there. But this is also clearly a documentation bug. Why would some built-in functions have different precedence than others? perldoc even calls exist a "function".

Octavian Rasnita | June 16, 2014 1:33 PM | Reply

First trial:

use strict;
use warnings;
exit 1 ? 2 : 3;

Result:
Useless use of a constant (2) in void context at E:\z1.pl line 4.
Useless use of a constant (3) in void context at E:\z1.pl line 4.
(completed with exit code 1)

Second trial:

use strict;
use warnings;
foo();

sub foo {
exit 1 ? 2 : 3;
}

No result
(completed with exit code 1)

It is strange that if exit() is called in a subroutine then "use warnings" has no effect.

Toby Inkster replied to comment from Octavian Rasnita | June 16, 2014 2:10 PM | Reply

The (lack of) warning is nothing to do with exit(). Try this:

$ perl -we'42;'
Useless use of a constant (42) in void context at -e line 1.

Now let's wrap that useless constant in a sub...

$ perl -we'sub xxx { 42; } xxx();'

No warning! Why? Because 42 is no longer a useless use of a constant - it provides a return value for the sub. What about xxx();... is that a useless function call in void context? Well, in this case it is, but Perl doesn't warn about such things because it doesn't know for sure whether the function call might have side-effects, and thus might be useful to call in void context.

sthoenna replied to comment from James | June 16, 2014 4:51 PM | Reply

The actual code used 0 and 1 and those two numbers are exempt from that warning.

sthoenna replied to comment from confuseAcat | June 16, 2014 4:57 PM | Reply

Why would different functions have different precedence? Because they always have, and because it is convenient that they do. Even the EXPR that was found confusing is a giveaway here: perlfunc carefully denotes the list operators by saying LIST instead.

Ether replied to comment from Toby Inkster | June 16, 2014 5:05 PM | Reply

tobyink is correct. Consider:

$; perl -Mwarnings -le 'exit 1 ? 2 : 3' ; echo $?
Useless use of a constant in void context at -e line 1.
Useless use of a constant in void context at -e line 1.
1

and then:

$; perl -Mwarnings -le 'sub foo { exit 1 ? 2 : 3} foo()' ; echo $?
1
$; echo $?
0

The conditional operator is no longer useless, because it provides the return value for the sub.

Buddy Burden | June 16, 2014 6:42 PM | Reply

In the question of why warnings didn't catch it, sthoenna is right, and the sub question is a red herring (the original bug wasn't in a sub anyway). That is:

[absalom:~] perl -Mwarnings -le 'exit 1 ? 2 : 3' Useless use of a constant (2) in void context at -e line 1. Useless use of a constant (3) in void context at -e line 1.

true, but:

[absalom:~] perl -Mwarnings -le 'exit 1 ? 0 : 1' ; echo $? 1

For some reason, using 0 and 1 is immune to the warning. I'm sure there's a very good reason for this ... somewhere. But it turns out to be very frustrating here. :-/

However, I have to disagree with sthoenna on another point:

> Even the EXPR that was found confusing is a giveaway here: perlfunc carefully denotes the list operators by saying LIST instead.

This isn't a difference between EXPR and LIST. Because $command->success ? 0 : 1 is definitely an EXPR and not a LIST. It's just an EXPR that isn't valid after a named unary operator such as exit. Thus my comment that the doc was a bit misleading here.

Toby Inkster replied to comment from Buddy Burden | June 16, 2014 10:28 PM | Reply

Perl actually has about four different things it calls an expression internally. The thing which is valid after a named unary operator is an "arithexpr"; this is the most limited form of expression, only allowing operators with precedence down to the bitshift operators.

Next is the "termexpr", which allows anything down to assignment. After that is the "listexpr", which allows anything down to the comma.

Lastly there is the "fullexpr", which would even accept the low-precedence boolean operators (e.g. and and or).

In a standard function call, each parameter is parsed as a "listexpr", and thus you can use higher precedence operators such as the ternary operator freely within their parameters.

However, named unary operators are not standard function calls. Their parameter is parsed as an "arithexpr".

User-defined functions with particular prototypes are also parsed like named unary operators. I don't know of a definitive list of these prototypes. perl5140delta contains a list of prototypes which were changed to act more like unary operators in Perl 5.14, but there were already some prototypes which acted like that prior to 5.14. ($) is one of them.

For a demonstration, compare the output of the following under both Perl 5.12.x and 5.14.x:

perl -MO=Deparse -E'sub foo (;$) {}; my $x = foo $a > $b;'

Name

Email Address

URL

Remember personal info?

Comments (You may use HTML tags for style)

About Buddy Burden

14 years in California, 25 years in Perl, 34 years in computers, 55 years in bare feet.

More info »

Buddy Burden