Perl and Parsing 12: Beating up on the "use" statement

If you have been following the Perl blogosphere recently, you may have noticed that it has been a bad few weeks for Perl's use statement. I have been picking it apart in this series, and chromatic, on his blog, recently pointed out a documentation issue. Unlike chromatic, who focuses on user concerns, I use Perl as a way to implement and to illustrate parsing. Though, to be sure, one of the points I try to make is that the choice of parsing strategy is ultimately very much a user concern.

I find Perl's use statement especially interesting because it is a good example of a natural syntax that you would like to be easy to parse, but which proves problematic with current parsing technology. With a general BNF parser, like Marpa, Perl's use statement is easy to parse. But the use statement strains Perl's LALR parse engine to its limits. Indeed, as I will show next, even a bit beyond.

Reversed use statements

Consider this statement:

use 2 Fatal;

Perl accepts this without error or warning, and interprets it as a request for at least version 2 of the Fatal module. But there's a problem. If you missed it, look carefully -- module name and version are reversed from the documented order. As documented, the statement should be

use Fatal 2;

What happens if we try to include an argument list with a reversed use statement? If we keep things simple, we are still fine:

use 2 Fatal qw(open close);

is equivalent to

use Fatal 2 qw(open close);

The real problem

If reversed use statements were orthogonal -- that is, if they treated argument lists in the same way as their documented brethren -- one could argue that they were a misfeature or even a feature. But, alas, reversed use statements are not orthogonal. Perl complains about reversed use statements with some lists, but not with others. As examples, the following statements are all reported as syntax errors:

use 2 Fatal +"open", "close";
use 2 Fatal "open", "close";
use 2 Fatal +qw(open close);
use 2 Fatal ("open", "close");
use 2 Fatal +("open", "close");

The argument lists in the above statements are all syntactically correct. To see this, transpose module name and version number into their documented order and try the "un-reversed" statements out. Perl will accept them, and will interpret their lists as argument lists.
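One way to try the comparison yourself is to feed each statement to a string eval and see whether it compiles. What follows is only a sketch: the helper name compiles and the throwaway Probe packages are my own inventions for illustration, and it assumes the Fatal module is installed. I show no expected output, because the results depend on which version of Perl you run it under.

```perl
use strict;
use warnings;

my $sandbox_count = 0;

# Compile one statement with string eval, inside a fresh throwaway
# package so that imports made by one probe cannot affect the next.
# Returns (1, '') on success, or (0, first line of the error) on failure.
# The name "compiles" and the "Probe" packages are illustrative only.
sub compiles {
    my ($statement) = @_;
    my $package = 'Probe' . $sandbox_count++;
    my $ok = eval "package $package;\n$statement\n1;";
    return ( 1, '' ) if $ok;
    my ($first_line) = split /\n/, $@;
    return ( 0, $first_line // 'unknown error' );
}

# The argument lists discussed in this post.
my @argument_lists = (
    'qw(open close)',
    '+"open", "close"',
    '"open", "close"',
    '+qw(open close)',
    '("open", "close")',
    '+("open", "close")',
);

# Try every list in both the reversed and the documented order.
for my $list (@argument_lists) {
    for my $prefix ( 'use 2 Fatal', 'use Fatal 2' ) {
        my $statement = "$prefix $list;";
        my ( $ok, $error ) = compiles($statement);
        printf "%-36s %s\n", $statement,
            $ok ? 'accepted' : "rejected: $error";
    }
}
```

Each line of output pairs a statement with either "accepted" or the first line of Perl's complaint, so a reversed statement and its documented counterpart can be compared side by side.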

Back it out?

Is backing out the reversed use statement an option? My guess is no. Almost certainly, in production environments somewhere out there, Perl scripts contain reversed use statements. Most of these, I would expect, came about by accident. No diagnostic identifies reversed use statements and the most common ones will perform just fine under testing. Even a careful desk-checking of the code could miss them, and a jargon file entry from back in 1996 mockingly informs me that desk checks went out of style some time prior to that.

Keep it?

Let's suppose, then, that the reversed use statement is kept on as a misfeature, documented or undocumented. That implies that it should continue to handle all the argument lists it currently handles. After all, we don't want to break any scripts -- that is the whole point of keeping it.

But what of those argument lists that the reversed use statement does not handle? If you look at the code for the use statement you will see it is extremely complicated. It would be very difficult to keep the reversed use statement's behavior perfectly stable with respect to argument lists. So difficult that it would make the code for the use statement almost untouchable.

One candidate for the least bad solution is to allow changes to the list-handling behavior of the reversed use statement, but only in the direction of expanding the syntaxes allowed. It may not be desirable to document the reversed use statement, but it probably should be added to the test suite. The reversed use statement would then live on, a deprecated but permanent part of Perl.

3 Comments

Wow, I had no idea the perl parser was that loose. A better approach than expanding the allowed list syntaxes would be to deprecate the reversed syntax and make it emit a warning, and then remove it completely in a future version of perl.

BTW, there's a missing double quote in the second example in the list of erroring reversed statements.

Interesting that the qw() syntax succeeds in the indirect-object "use" but very little else does.

There's little value in breaking old programs that do this syntax, though. I would prefer version-specific grammars so that under "use 5.016" this is a syntax error but under "use 5" it is not. Then we can slowly permute the language but people have to opt-in and old stuff continues "just working". Not that it is possible for every behavior change to be so accommodating.

About Jeffrey Kegler

I blog about Perl, with a focus on parsing and Marpa, my parsing algorithm based on Jay Earley's.