A Tiny Code Quiz
I'll be speaking at the 2014 German Perl Workshop, so I hope to see some of you there.
Ben Tilly posted the following on Facebook a few days ago (I've modified it ever so slightly to make the possible answers clearer):
@ar1 = qw(foo bar);
@ar2 = qw(a b c d);
print scalar (@ar1, @ar2);
He argued that even experienced developers will often get this wrong. Looking at the above, we could possibly argue for any of the following to be printed:
1
2
3
4
6
d
foo
Try and guess the answer without consulting the docs or reading any of the responses! I'll give a rationale for each of those after the cut (and for the record, I got the answer wrong the first time), but I won't be giving the answer -- I assume that will show up pretty quickly in the comments. Note that, of course, the following rationales are wrong (badly so in some cases), but this gives you an idea of how difficult context can be in Perl.
1
Due to the comma being a list operator, the two arrays are first flattened into a list and either the first or last element is returned, either of which in scalar context evaluates to 1.
2
Reasoning: @ar1 has two elements and scalar() takes an array and evaluates it in scalar context.
3
Due to the comma being a list operator, the two arrays are first flattened into a list and the first element, foo, is returned, which evaluates to 3 in scalar context.
4
All of the arrays are evaluated in scalar context, returning a list of 2, 4, but only the last element of the list is returned by scalar().
6
The two arrays, combined, have six elements, which, evaluated as a scalar, causes 6 to be printed.
d
Due to the comma being a list operator, the two arrays are first flattened into a list and the last element is returned, which in scalar context is itself.
foo
Due to the comma being a list operator, the two arrays are first flattened into a list and the first element is returned, which in scalar context is itself.
Discuss!
Note: if you want to be really picky, according to a strict reading of the documentation, while one of the answers above is right, none of the rationales are. Have fun!
Hmm... my first guess was 6, thinking of something like
print scalar ( split ",", join ",",(@ar1, @ar2));
and surprise
Okay, I'll be real picky.
scalar puts the parenthesised expression, and hence the comma operator, in scalar context. The scalar comma operator then evaluates the left-hand operand in void context and throws away the result, then returns the right-hand operand evaluated in scalar context.
No list involved. Perl lists are never in scalar context. :)
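A minimal sketch of the behaviour described above, using the arrays from the quiz. The variable names $from_comma and $from_last are only illustrative, and the void-context warning is silenced on purpose so the snippet runs quietly:

```perl
use strict;
use warnings;

my @ar1 = qw(foo bar);
my @ar2 = qw(a b c d);

my ($from_comma, $from_last);
{
    # The comma operator in scalar context evaluates @ar1 in void
    # context, which would normally warn; silence that category here.
    no warnings 'void';
    $from_comma = scalar(@ar1, @ar2);
}

# Only the right-hand operand contributes a value, so this is equivalent.
$from_last = scalar(@ar2);

print "$from_comma $from_last\n";   # both hold the element count of @ar2
```

Both variables end up with the same value, because @ar1 is evaluated and then discarded.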
Many Perl programmers might not guess the correct answer when presented with this "quiz".
However, I believe very few would accidentally write such code in practice while ignorant of its true function.
Passing a comma-separated parenthesized list of multiple variables to a built-in operator that operates on a single variable, is a rather weird thing to do and although many won't know what it does, they will also know that they don't know what it does.
Hence I don't think this example shows that context handling in Perl is excessively complex or even 'dangerous'.
PS: With use warnings; you even get a warning about using @ar1 in void context in that code snippet, which should make things clear.
smls, you wrote:
Passing a comma-separated parenthesized list of multiple variables to a built-in operator that operates on a single variable, is a rather weird thing to do and ...
That's a subtle misunderstanding about Perl that many have. scalar() doesn't operate on a single variable. It takes an expression as its argument and evaluates that expression in scalar context.
Since scalar (@anyarray) returns the number of elements in @anyarray, and since @new_array = ( @ar1, @ar2 ) (as I recall), I would've thought the answer is '6', the number of elements in the new anonymous array.
Great quiz! I got it wrong the first time, and only got it after reading the docs and the reasoning. And of course I had to test it. I learned something today :) Thanks
The explanation for 4 is slightly wrong. It should be, to quote perldoc -f scalar, "Evaluates all but the last element in void context and returning the final element evaluated in scalar context"
But it's curious how perl allows passing more than one arg to unary operators.
Dmitry: I wrote "according to a strict reading of the documentation, while one of the answers above is right, none of the rationales are".
Try this one:
The relevant Concise is:
s  <@> print vK ->t
l     <0> pushmark s ->m
-     <1> scalar sK/1 ->s
r        <@> list sK ->s
m           <0> pushmark v ->n
o           <1> rv2av[t5] vK/1 ->p
n              gv(ar1) s ->o
q           <1> rv2av[t6] sK/1 ->r
p              gv(ar2) s ->q
So scalar just sets the GIMME for list to G_SCALAR. And list does this:
PP(pp_list)
{
    dVAR; dSP; dMARK;
    if (GIMME != G_ARRAY) {
        if (++MARK <= SP)
            *MARK = *SP;    /* unwanted list, return last item */
        else
            *MARK = &PL_sv_undef;
        SP = MARK;
    }
    RETURN;
}
Ha! Pushing the two items onto the stack and then expecting them to be flattened is intuitive, but it is not implemented that way. perl treats it as an error. I would call that a bug. A list in scalar context should return the number of flattened list entries, not the number of the last list entry.
My answer is 4.
Because scalar is a unary operator, it evaluates the first array in void context and the second array in scalar context. It then returns the result of evaluating the second array in scalar context, which is the number of elements in the second array.
Fwiw, using the current Rakudo/Parrot evaluator on #perl6:
my @ar1 = <foo bar>;
my @ar2 = <a b c d>;
print $(@ar1, @ar2);
foo bar a b c d
TIL.
So how would you modify the statement if you want to get 6? The best I can come up with is:
One of the things I love about perl is the generous application of the principle of least astonishment. One of the main values of languages like perl is that it is pretty intuitive to understand what a given bit of code is going to do. That is one of the things that makes perl a fast language to work with.
I see why the code does what it does and why perl actually works this way but I was surprised by what it returned. At least in my case it violated the principle. Maybe I need to adjust how I think about perl in this context so that the behaviour is what I expect.
No, a list in scalar context returns its last element. It has to, that's how the 'scalar comma operator' works. What seems weird here is that the scalar context propagates down to the arrays, because we expect arrays to interpolate into lists, but in fact that only happens (and only can happen) when the list is in list context.
I think what we forget here is that an array is not the same as a list; it's an expression which has to be evaluated, and it doesn't return the same thing in scalar context as the equivalent list. The result is exactly the same as this code, which is much less surprising:
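As a rough sketch of the array-versus-list distinction (not necessarily the snippet the commenter had in mind; the variable names are illustrative, and the void-context warning is silenced deliberately):

```perl
use strict;
use warnings;

my @letters = qw(a b c d);

# An array evaluated in scalar context yields its element count.
my $array_in_scalar = @letters;

my $comma_in_scalar;
{
    no warnings 'void';   # the discarded constants would otherwise warn
    # The "same" values written as a comma expression: in scalar
    # context the comma operator yields its last operand.
    $comma_in_scalar = ('a', 'b', 'c', 'd');
}

print "$array_in_scalar $comma_in_scalar\n";
```

The array reports how many elements it holds, while the comma expression simply hands back its final operand.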
..And why does the comma operator behave like this?
The Camel Book mentions "This is just like C's comma operator."
I have not thought about it enough, but the current behaviour does not seem intuitive, and therefore seems to violate the famous principle of least astonishment (for me). Ha! :-)
But as chris says, it is probably my duty not to be astonished.. ;-)
print scalar @{[ @ar1, @ar2 ]};
will do what you want. But that's ugly.
If you want 6:
print scalar (()=( @ar1, @ar2 ));
print @ar1 + @ar2;
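The counting idioms suggested in this thread can be compared side by side. This is only a sketch; the variable names are made up for illustration:

```perl
use strict;
use warnings;

my @ar1 = qw(foo bar);
my @ar2 = qw(a b c d);

# Build an anonymous array from the flattened list, then count it.
my $via_anon_array = scalar @{ [ @ar1, @ar2 ] };

# Assign to an empty list; a list assignment in scalar context returns
# the number of elements on its right-hand side.
my $via_empty_list = () = (@ar1, @ar2);

# Add the two counts explicitly; + puts both arrays in scalar context.
my $via_addition = @ar1 + @ar2;

print "$via_anon_array $via_empty_list $via_addition\n";
```

All three give the combined element count. The explicit addition avoids building a temporary list at all.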
But there is no list in scalar context here; just a comma operator in scalar context doing what it is supposed to do. "Fixing" it to work as you suggest would be a major change with many repercussions.
My first guess was 6 which is of course not correct. Have to revisit http://perldoc.perl.org/perlop.html
The easiest way to make it return 6, the total number of elements in both arrays, is:
ie: make the addition explicit.
I note that if you write it correctly (below) and run it you get:
Useless use of private array in void context at test.pl line 11.
4
All done.
Line 11 is the print line.
use strict;
use warnings;
my @ar1 = qw(foo bar);
my @ar2 = qw(a b c d);
print scalar (@ar1, @ar2);
print "\n\nAll done.\n";
__END__