Scalar context gotchas

On Twitter, Curtis Poe (@OvidPerl) posted some interesting and unintuitive Perl code; I've slightly reformatted it and changed some values for the sake of the following discussion.

use Data::Dumper;
sub boo { 4,5,6 }
my @x = ( boo() || 5,8,7);
print Dumper \@x;

What do you think this prints?

Let's look at some simpler examples of code:

$ perl -le'@x = (4,5,6,7,8); $y = @x; print $y'
5

An array like @x, in scalar context, evaluates to the number of elements in that array. In this case, @x contains five elements.

$ perl -le'$y = (4,5,6,7,8); print $y'
8

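A parenthesized, comma-separated list, as opposed to an array, evaluates to its last element in scalar context; in this case, 8. Strictly speaking, this is the comma operator at work: it evaluates each operand in turn and yields the last one.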

Let's see how perl parses this line. We can use the B::Deparse module with the -p option, which adds extra parentheses to make the structure clearer.

$ perl -MO=Deparse,-p -le'$y = (4,5,6,7,8); print $y'
BEGIN { $/ = "\n"; $\ = "\n"; }
($y = ('???', '???', '???', '???', 8));
print($y);
-e syntax OK

As you can see, the constants whose values are never used have been optimized away; Deparse shows them as '???'.

perldoc perldata says this about arrays and lists in scalar context:

If you evaluate an array in scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value, like the C comma operator, nor of built-in functions, which return whatever they feel like returning.)
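To make the three cases in that quote concrete, here is a small sketch. The sub names returns_array and returns_list are made up for illustration, and reverse is just one example of a built-in with its own scalar-context rule.

```perl
use strict;
use warnings;
no warnings 'void';   # the comma expression below would otherwise warn about unused constants

my @x = (4, 5, 6, 7, 8);

my $count = @x;                     # array in scalar context: element count
my $last  = (4, 5, 6, 7, 8);        # comma operator in scalar context: last value
my $rev   = reverse('ab', 'cd');    # built-in with its own rule: reversed concatenation

# Hypothetical subs: the return expression is evaluated in the caller's context.
sub returns_array { my @a = (4, 5, 6); return @a }
sub returns_list  { return (4, 5, 6) }

my $from_array = returns_array();   # 3, the element count
my $from_list  = returns_list();    # 6, the last value

print "$count $last $rev $from_array $from_list\n";   # 5 8 dcba 3 6
```

The last two lines are the ones that matter for the puzzle: a sub whose final expression is a comma-separated list behaves like that list in whatever context the caller imposes.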

And finally:

$ perl -le'$y = (4..8); print $y'

This prints an empty line. You might have thought that this is the same as the line with the list above, but the range operator is somewhat different. In list context, it does indeed return a list of values counting from the left value to the right value. In scalar context, however, the range operator behaves like a flip-flop. When an operand is a constant, as here, it is compared against the current input line number ($.); no input has been read, so the operator starts out false, and that false value stringifies to an empty string.

perldoc perlop says this about the range operator:

In scalar context, ".." returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each ".." operator maintains its own boolean state, even across calls to a subroutine that contains it. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again.
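The flip-flop is easier to picture with non-constant operands. The following is a minimal sketch, with a made-up list of lines and made-up BEGIN/END markers, that collects everything between the two markers:

```perl
use strict;
use warnings;

# Hypothetical input; in real code this would usually come from a filehandle.
my @lines = ('header', 'BEGIN', 'alpha', 'beta', 'END', 'footer');
my @inside;

for my $line (@lines) {
    # The if condition supplies scalar context, so .. acts as a flip-flop:
    # false until the left match succeeds, then true up to and including
    # the line on which the right match succeeds.
    if ($line =~ /^BEGIN$/ .. $line =~ /^END$/) {
        push @inside, $line;
    }
}

print join(',', @inside), "\n";   # BEGIN,alpha,beta,END
```

The operator keeps its state between iterations of the loop, which is why 'alpha' and 'beta' are collected even though neither pattern matches them.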

After this discussion, you probably know what the program at the beginning of this post prints. Let's look at how perl parses it:

$ perl -MO=Deparse,-p test.pl 
use Data::Dumper;
sub boo {
    (4, 5, 6);
}
(my(@x) = ((boo() || 5), 8, 7));
print(Dumper((\@x)));
test.pl syntax OK

If we run the program, it prints:

$ perl test.pl 
$VAR1 = [
          6,
          8,
          7
        ];

The expression (boo() || 5) is the same as ((4, 5, 6) || 5) and since a list returns the last value in scalar context, this is the same as (6 || 5), which is 6.
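If the last element is not what you wanted, you have to decide which value you do want and ask for it explicitly. The sketch below shows two alternatives next to the original line; which one (if either) matches the original author's intent is an assumption on my part.

```perl
use strict;
use warnings;

sub boo { 4, 5, 6 }

# As written: || puts boo() in scalar context, so the comma operator yields 6.
my @as_written = (boo() || 5, 8, 7);

# A list slice calls boo() in list context and picks the first element.
my @first_elem = ((boo())[0] || 5, 8, 7);

# Assigning to an empty list and then to a scalar counts the returned values.
my $count = () = boo();

print "@as_written\n";   # 6 8 7
print "@first_elem\n";   # 4 8 7
print "$count\n";        # 3
```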

6 Comments

I covered some of this in perlfaq4's "What is the difference between a list and an array?", which is another version of what I wrote under the same title for The Effective Perler. :)

And, this isn't really a context problem although it contributes to some of the confusion. I'm surprised you never said "precedence".

However, remember that there is no such thing as a list in scalar context. There is the comma operator in scalar context, but no list in scalar context.

The trick here is the intermediate expression with ||, a scalar operator which has higher precedence than ,. So, boo() gets called in scalar context for the scalar operation. It's a bit easier to see with this code:

use Data::Dumper;
sub boo {
    wantarray ? (4,5,6) : 'scalar'
}
my @x = ( boo() || 5,8,7);
print Dumper \@x;

That gives ( 'scalar', 8, 7 ).

Although Ovid calls this a fail, it's no more a fail than misunderstanding precedence in any other language.

@brian d foy - Let's imagine that ',' binds more tightly than '||'. The example would still be confusing, since the precedence would not penetrate into the 'boo' sub; that is, it would fix something like:

( 4,5,6 || 5,8,7 )

but not the original example.

I think it is a context problem although precedence contributes some of the confusion :)

I now agree that it's not *directly* a fail because apparently this was the behaviour the original developer intended: toggling the first value in a list.

It's now been changed by a dev to something a bit more intuitive:


my @x = ( (foo()|| 5), 6, 7 );


That's still not great, but it's far better than it was.

I don't need to imagine that the precedence table is some other thing. That would be some other problem. The only problem here is not recognizing that the || is a scalar operator and enforces scalar context. There is no other problem here.

Bah, I wrote the wrong thing. The only problem is one of precedence. If you mix up precedence, of course things are going to get mixed up. This sort of thing is a newbie mistake, even if experienced people make it or misinterpret it.


About hanekomu
