The Case of the Overloaded Curlys

When I entered our rooms in Paris after a day on the boulevards, I found C. Auguste Dupin at his desk contemplating a scrap of paper. "A problem?" I asked.

"A death. And a rather gruesome one, n'est-ce pas?" he replied, handing me the paper. On it I saw:


$ perl -e '%h = map { "-$_" => 1 } qw{ foo bar };'
Not enough arguments for map at -e line 1, near "} qw{ foo bar }"
syntax error at -e line 1, near "} qw{ foo bar }"
Execution of -e aborted due to compilation errors.

Well, a map takes either a block or an expression followed by a comma, and either must be followed by a list. The block was there, and so was the list. Clearly a neophyte had been playing with bleadperl, and paid the price. With a rueful shake of my head, I silently passed the paper back to the detective.

"No, this was a production release of Perl, mon ami."

"But how --" I expostulated.

"-- did I know what you were thinking? Simple observation. When I gave you the paper, your brow contracted in puzzlement. This was succeeded by the abstracted expression of deep thought. Your right thumb played back and forth between the first two fingers, as though considering two alternatives; thus I knew you were thinking of the two forms of map. Your expression of rueful pity for the victim as you returned the paper to me completed the picture."

"But how --" I began again.

"-- can a production version of Perl fail to parse such an obviously correct map? Ah, that is a deeper problem. The solution lies in the fact that in Perl, curly brackets are overloaded, and may be used either as block delimiters or as a hash constructor.

"The map takes either an expression or a block as its first argument, and the documentation states plainly that the Perl parser must choose between these alternatives before it analyzes the entire statement, and that, absent clear clues, it guesses." His face assumed an expression of something like revulsion as he pronounced the last word. "Since the parse failure makes no sense with a block, it must be that Perl parsed the curly brackets as a hash constructor, and therefore as an expression. The comma which must follow an expression was not found, lacking which a painful death ensued."

"But how --" I broke in for the third time.

"-- could this tragedy have been prevented? Nothing simpler. The perlref document states that a unary plus sign before the left curly bracket forces it to be interpreted as a hash constructor, and a semicolon after it forces the interpretation to be a block. Two simple strokes could have prevented this horrible death."

And with his pen, he inserted the semicolon which would have spared the victim's life:


$ perl -e '%h = map {; "-$_" => 1 } qw{ foo bar };'

With apologies to Edgar Allan Poe for borrowing his detective (whom I return, I hope, only a little worse for wear), and to Francophones for possible butchering of their language. The one-liner is an abstraction of a piece of code that puzzled me, until I remembered that Perl might not see the map the same way I did.
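For anyone who would like to examine the body without risking their own program, here is a minimal sketch that compiles the victim's map and two repaired forms with a string `eval`, so a parse failure lands in `$@` instead of killing the script. The labels, `@variants`, `%result`, and `@refs` are names invented for this illustration. The unary plus inside the braces is a form listed in the `map` documentation; the last line shows the other reading Dupin mentions, where `+{` makes the braces a hash constructor and a comma must follow.

```perl
use strict;
use warnings;

# Each entry pairs a label with source text for one spelling of the map.
# Single quotes keep $_ literal until the string eval compiles the code.
my @variants = (
    [ 'plain'     => 'map {  "-$_" => 1 } qw(foo bar)' ],
    [ 'semicolon' => 'map {; "-$_" => 1 } qw(foo bar)' ],
    [ 'unary +'   => 'map { +"-$_" => 1 } qw(foo bar)' ],
);

my %result;
for my $v (@variants) {
    my ($label, $src) = @$v;
    my %h   = eval $src;    # compile and run the snippet at run time
    my $err = $@;           # non-empty if the snippet failed to compile
    $result{$label} = $err ? 'syntax error' : join ',', sort keys %h;
    printf "%-10s %s\n", $label, $result{$label};
}

# The opposite disambiguation: +{ ... } is a hash constructor, so map
# takes the EXPR form, needs a comma, and returns one hash ref per item.
my @refs = map +{ "-$_" => 1 }, qw(foo bar);
printf "%-10s %d hash refs\n", 'map +{...},', scalar @refs;
```

On the Perl versions I have in mind, the first variant reports a syntax error and the other two yield the keys `-bar` and `-foo`.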

9 Comments

I was curious, since that is so obviously a useful map, why your failure was happening and how (else) it could be prevented.

It seems that `perl -e '%h = map { $_ => 1 } qw{ foo bar };'` works, so I guess the parser thinks that the quoting is too obviously hash construction.

The reason for the quoting here was to get a `-` preceding the key text, but this reminds me that a string can actually be negated! Therefore a working solution is simply `perl -e '%h = map { -$_ => 1 } qw{ foo bar };'`
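A small sketch of the string negation trick, for the curious. It relies on the behavior described in perlop: unary minus applied to a string that looks like an identifier returns that string with a `-` in front, and applied to a string that already starts with a sign it returns the string with the opposite sign. The variable names are only for illustration, and the trick covers just this one kind of key.

```perl
use strict;
use warnings;

# Unary minus on identifier-like strings prepends "-"; the block no
# longer starts with a quoted string, so map is parsed as map BLOCK LIST.
my %h = map { -$_ => 1 } qw(foo bar);
print join(',', sort keys %h), "\n";    # -bar,-foo

# A string that already carries a sign gets the opposite sign instead.
my $signed  = "-foo";
my $flipped = -$signed;
print "$flipped\n";                     # +foo
```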

Brilliant!

MOAR! :)

Tom,

I knew that my fix was limited to that one use case. And I really am glad to see the general solution! The broken case is terrifyingly close to code that we all write, and I'm not sure that I would have caught it or that I would have known what to do when I did! Thanks for the provocative post.

Got bitten by that once or twice... as I tend to use 'map' a lot, I got into the habit of disambiguating these with a pair of parens inside the curlies:

$ perl -e '%h = map { ("-$_" => 1) } qw{ foo bar };'

It looks more natural to me than adding a semicolon in the middle. In fact if I'd seen the "map {; ..." code in the wild, I'd probably have imagined it was a harmless typo.

@ordabadar, you know, I think I like your method better. At the same time that it helps the parser understand, it makes it more readable. "This map is returning a list of two values". Once again though, good to know that should all else fail `{;` will keep the parser from hurling.

I always, without intentional exception, write:

map {; ... } @array

and the same for grep. Sometimes I have been asked, "Why don't you only put the semicolon when it will be needed?" Well, because then I'd need to think about it.

Somewhat frustratingly, this problem is not fixed in Perl 6. I asked once why they still used {} for two things. I was told, "Well, it actually isn't ambiguous anymore" but the implementations on hand still choked on a number of cases due to ambiguity.

Fixing the ambiguity in {hashref} and {block} is on my list of things to fix in Perl, when I get a time machine.

Just noting that it's the dbl quoting of any var (e.g. `map { "$baz" => 1 }`) that makes the parser mis-guess. So `map { "--" . $_ => 1 }` works too. Why the quoted var befuddles so, I dunno.
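A hedged sketch for checking this observation: it compiles three spellings with a string `eval` and records which ones parse. The labels, `@cases`, and `%compiles` are made up for the example; the `lc($_)` form is one the `map` documentation lists as working. What the three seem to have in common is whether the first thing inside the braces is a quoted string followed directly by `=>`.

```perl
use strict;
use warnings;

# Source text for each spelling; single quotes keep $_ uninterpolated.
my @cases = (
    [ 'quoted var'    => 'map { "$_" => 1 } qw(foo bar)'       ],
    [ 'concatenation' => 'map { "--" . $_ => 1 } qw(foo bar)'  ],
    [ 'function call' => 'map { lc($_) => 1 } qw(foo bar)'     ],
);

my %compiles;
for my $c (@cases) {
    my ($label, $src) = @$c;
    my @pairs = eval $src;                  # a parse failure sets $@
    $compiles{$label} = $@ eq '' ? 1 : 0;
    printf "%-14s %s\n", $label, $compiles{$label} ? 'ok' : 'syntax error';
}
```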

About Tom Wyant

Fine Perl code for over 0.005 centuries.