The Case of the Overloaded Curlys
When I entered our rooms in Paris after a day on the boulevards, I found C. Auguste Dupin at his desk contemplating a scrap of paper. "A problem?" I asked.
"A death. And a rather gruesome one, n'est-ce pas?" he replied, handing me the paper. On it I saw:
$ perl -e '%h = map { "-$_" => 1 } qw{ foo bar };'
Not enough arguments for map at -e line 1, near "} qw{ foo bar }"
syntax error at -e line 1, near "} qw{ foo bar }"
Execution of -e aborted due to compilation errors.
Well, a `map` takes either a block or an expression followed by a comma, and either must be followed by a list. The block was there, and so was the list. Clearly a neophyte had been playing with bleadperl, and paid the price. With a rueful shake of my head, I silently passed the paper back to the detective.
"No, this was a production release of Perl, mon ami."
"But how --" I expostulated.
"-- did I know what you were thinking? Simple observation. When I gave you the paper, your brow contracted in puzzlement. This was succeeded by the abstracted expression of deep thought. Your right thumb played back and forth between the first two fingers, as though considering two alternatives; thus I knew you were thinking of the two forms of `map`. Your expression of rueful pity for the victim as you returned the paper to me completed the picture."
"But how --" I began again.
"-- can a production version of Perl fail to parse such an obviously correct `map`? Ah, that is a deeper problem. The solution lies in the fact that in Perl, curly brackets are overloaded, and may be used either as block delimiters or as a hash constructor.
"The `map` takes either an expression or a block as its first argument, and the documentation states plainly that the Perl parser must choose between these alternatives before it analyzes the entire statement, and that, absent clear clues, it guesses." His face assumed an expression of something like revulsion as he pronounced the last word. "Since the parse failure makes no sense with a block, it must be that Perl parsed the curly brackets as a hash constructor, and therefore as an expression. The comma which must follow an expression was not found, and for want of it a painful death ensued."
"But how --" I broke in for the third time.
"-- could this tragedy have been prevented? Nothing simpler. The `perlref` document states that a unary plus sign before the left curly bracket forces it to be interpreted as a hash constructor, and a semicolon after it forces the interpretation to be a block. Two simple strokes could have prevented this horrible death."
And with his pen, he inserted the semicolon which would have spared the victim's life:
$ perl -e '%h = map {; "-$_" => 1 } qw{ foo bar };'
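For anyone who wants to see both of M. Dupin's strokes side by side, here is a minimal sketch. The variable names are only for illustration, and the ambiguous form is wrapped in a string eval because it fails at compile time and would otherwise take the whole script down with it.

```perl
use strict;
use warnings;

my @words = qw( foo bar );

# The ambiguous form from the scrap of paper. It is a compile-time
# error, so it is compiled inside a string eval to keep the script alive.
my $broken_ok = eval q{ my %h = map { "-$_" => 1 } @words; 1 };
print "ambiguous form failed to compile\n" if !$broken_ok;

# "{;" forces a block: each iteration returns a key/value pair.
my %h = map {; "-$_" => 1 } @words;
print join(', ', sort keys %h), "\n";

# "+{" forces a hash constructor, which is an expression, so the comma
# after the closing curly is required. Each iteration returns a hash ref.
my @refs = map +{ "-$_" => 1 }, @words;
print scalar(@refs), " hash refs\n";
```

Note that the two strokes are not interchangeable: the semicolon gives one flat hash, while the plus sign gives a list of one-key hash references.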
With apologies to Edgar Allan Poe for borrowing his detective (whom I return, I hope, only a little worse for wear), and to Francophones for possible butchering of their language. The one-liner is an abstraction of a piece of code that puzzled me, until I remembered that Perl might not see the `map` the same way I did.
Since that is so obviously a useful `map`, I was curious why your failure was happening and how (else) it could be prevented.
It seems that `perl -e '%h = map { $_ => 1 } qw{ foo bar };'` works, so I guess the parser thinks that the quoting is too obviously hash construction.
The reason for the quoting here was to get a `-` preceding the key text, but this reminds me that a string can actually be negated! Therefore a working solution is simply `perl -e '%h = map { -$_ => 1 } qw{ foo bar };'`.
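A small sketch of what string negation does, as described in `perlop` under unary minus: an identifier-like string gets a minus sign prepended, while a string that already starts with a minus sign has it flipped to a plus. The variable names are just for illustration.

```perl
use strict;
use warnings;

# The block starts with "-", not with a quoted string, so the parser
# takes the curlies as a block and no disambiguation is needed.
my %h = map { -$_ => 1 } qw( foo bar );
print join(', ', sort keys %h), "\n";

# Negating a string that already starts with "-" flips the sign
# rather than adding a second dash.
my $flipped = -"-foo";
print "$flipped\n";
```

So the trick covers the single-dash case only; it cannot be stacked to produce `--foo`.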
Brilliant!
MOAR! :)
Joel -
You are quite right.
When preparing my post, I should have been a better detective, because I omitted from my toy example a character other than the one M. Dupin pointed out.
A better representation of the real-life code would be more like
What I was doing was preparing to construct a command line from an options hash, and the particular command I was issuing required double-dash syntax. I thought the omission of one of the dashes was immaterial to my example but, as you have shown, I was wrong.
The fact that you almost never see blocks written as `{; ... }`, or hash constructors written as `+{ ... }`, shows that, despite M. Dupin's revulsion, Perl's guessing is generally pretty good. I wrote the blog post to try to publicize the general solution, but unfortunately my example was broken. Fortunately, you found the problem. Thanks!
Tom,
I knew that my fix was limited to that one use case. And I really am glad to see the general solution! The broken case is terrifyingly close to code that we all write, and I'm not sure that I would have caught it or that I would have known what to do when I did! Thanks for the provocative post.
Got bitten by that once or twice... as I tend to use 'map' a lot, I got into the habit of disambiguating these with a pair of parens inside the curlies:
$ perl -e '%h = map { ("-$_" => 1) } qw{ foo bar };'
It looks more natural to me than adding a semicolon in the middle. In fact if I'd seen the "map {; ..." code in the wild, I'd probably have imagined it was a harmless typo.
@ordabadar, you know, I think I like your method better. At the same time that it helps the parser understand, it makes the code more readable: "This map is returning a list of two values." Once again though, it's good to know that, should all else fail, `{;` will keep the parser from hurling.
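A quick sketch of the parenthesized form, including the double-dash case that string negation cannot reach. The option names here are made up for the example.

```perl
use strict;
use warnings;

# The first thing after "{" is "(", so the curlies are parsed as a block.
my %single = map { ("-$_" => 1) } qw( foo bar );

# The same shape works for double-dash keys (hypothetical option names).
my %double = map { ("--$_" => 1) } qw( verbose force );

print join(', ', sort keys %single), "\n";
print join(', ', sort keys %double), "\n";
```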
I always, without intentional exception, write:
map {; ... } @array
and the same for grep. Sometimes I have been asked, "Why don't you only put the semicolon when it will be needed?" Well, because then I'd need to think about it.
Somewhat frustratingly, this problem is not fixed in Perl 6. I asked once why they still used {} for two things. I was told, "Well, it actually isn't ambiguous anymore" but the implementations on hand still choked on a number of cases due to ambiguity.
Fixing the ambiguity in {hashref} and {block} is on my list of things to fix in Perl, when I get a time machine.
I confess to being with RJBS on this one. I have yet to take my disambiguation to the extent he has, but when it's needed I tend to prefer the documented way to disambiguate.
That said, TMTOWTDI. Not one but two alternatives have been brought forth, and I'll bet ordabadar's solution is general. Thanks to both Joel Berger and ordabadar for giving us all yet another way to deal with pesky corner cases.
One of the many things that keeps me with Perl is that I am always learning more about it.
Just noting that it's the double-quoting of any var (e.g. `map { "$baz" => 1 }`) that makes the parser mis-guess. So `map { "--" . $_ => 1 }` works too. Why the quoted var befuddles so, I dunno.
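One way to poke at this is to string-eval a handful of candidate forms and record which ones compile. This is only a sketch: the list of forms is my own selection, and the results describe the perl it is run on, not a guarantee from the documentation. The pattern is at least consistent with the parser deciding from the first few tokens after the opening curly (a quoted string followed directly by `=>` looks like a hash constructor).

```perl
use strict;
use warnings;

# Candidate forms, kept in single quotes so $_ is not interpolated here.
my @forms = (
    'map { "--$_" => 1 } qw( foo bar )',       # quoted string, then =>
    'map { "--" . $_ => 1 } qw( foo bar )',    # quoted string, then .
    'map { $_ => 1 } qw( foo bar )',           # starts with a variable
    'map {; "--$_" => 1 } qw( foo bar )',      # explicit block
);

my %compiles;
for my $form (@forms) {
    my @result = eval $form;                   # compile and run the snippet
    $compiles{$form} = $@ eq '' ? 1 : 0;       # $@ is empty on success
    printf "%-42s %s\n", $form, $compiles{$form} ? 'ok' : 'syntax error';
}
```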