Perl and Parsing 12: Beating up on the "use" statement
If you have been following the Perl blogosphere recently,
you may have noticed that it has been a bad few weeks
for Perl's use
statement.
I have been
picking it apart
in this series,
and chromatic, on his blog,
recently pointed out a documentation issue.
Unlike chromatic, who focuses on user concerns,
I use Perl as a way to implement and to illustrate parsing.
Though, to be sure,
one of the points I try to make
is that the choice of parsing strategy
is ultimately very much a user concern.
I find
Perl's use
statement
especially interesting because it is a good example of
a natural syntax that you would like to be easy to parse,
but which proves problematic with current parsing technology.
With a general BNF parser, like Marpa,
Perl's use
statement is easy to parse.
But the use
statement strains
Perl's LALR parse engine to its limits.
Indeed, as I will show next, even a bit beyond.
Reversed use statements
Consider this statement:
use 2 Fatal;
Perl accepts this without error or warning, and interprets it as
a request for at least version 2 of the Fatal
module.
But there's a problem.
If you missed it,
look carefully -- module name and version are reversed
from the documented order.
As documented, the statement should be
use Fatal 2;
What happens if we try to include an argument list with a reversed use statement? If we keep things simple, we are still fine:
use 2 Fatal qw(open close);
is equivalent to
use Fatal 2 qw(open close);
The real problem
If reversed
use
statements were orthogonal --
that is, if they treated argument
lists in the same way as their documented brethren --
one could argue
that they were a misfeature or even
a feature.
But, alas,
reversed use
statements
are not orthogonal.
Perl complains about reversed
use
statements with some lists,
but not with others.
As examples,
the following statements are all
reported as syntax errors:
use 2 Fatal +"open", "close";
use 2 Fatal "open", "close";
use 2 Fatal +qw(open close);
use 2 Fatal ("open", "close");
use 2 Fatal +("open", "close");
The argument lists in the above statements are all syntactically correct. To see this, transpose module name and version number into their documented order and try the "un-reversed" statements out. Perl will accept them, and will interpret their lists as argument lists.
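One way to try this comparison out is a small harness that compiles each statement in a string eval and reports whether Perl accepted it. This is only a sketch: the compiles helper and the Sandbox package names are my own inventions for illustration, and it prints whatever your perl reports rather than assuming a result, since the outcome may differ between Perl versions.

```perl
use strict;
use warnings;

# Argument lists taken from the examples in this post.
my @arg_lists = (
    'qw(open close)',
    '+"open", "close"',
    '"open", "close"',
    '+qw(open close)',
    '("open", "close")',
    '+("open", "close")',
);

# Hypothetical helper: compile one statement in a string eval.
# Each call uses a fresh package so that any subs installed by
# Fatal stay out of main. Returns 1 if it compiled, 0 otherwise.
my $sandbox = 0;
sub compiles {
    my ($statement) = @_;
    $sandbox++;
    local $@;
    my $ok = eval "package Sandbox$sandbox; $statement; 1";
    return $ok ? 1 : 0;
}

# Report both orders side by side for every argument list.
for my $args (@arg_lists) {
    my $documented = compiles("use Fatal 2 $args");
    my $reversed   = compiles("use 2 Fatal $args");
    printf "%-20s documented: %-8s reversed: %s\n",
        $args,
        ( $documented ? 'accepted' : 'rejected' ),
        ( $reversed   ? 'accepted' : 'rejected' );
}
```

Because the string eval traps compile-time errors, a rejected statement shows up as "rejected" in the table instead of aborting the script.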
Back it out?
Is backing out the reversed use
statement an option?
My guess is no.
Almost certainly,
in production environments somewhere out there,
Perl scripts contain reversed use
statements.
Most of these, I would expect, came about by accident.
No diagnostic identifies
reversed use
statements
and the most common ones
will perform just fine under testing.
Even a careful desk-checking of the code
could miss them,
and
a jargon file entry from back in 1996
mockingly informs me that desk checks
went out of style some time prior to that.
Keep it?
Let's suppose, then, that the
reversed use
statement
is kept on as a misfeature, documented or undocumented.
That implies that it should continue to handle all the argument
lists it currently handles.
After all, we don't want to break any scripts --
that is the whole
point of keeping it.
But what of those argument lists that the
reversed use
statement does not handle?
If you look at the code
for the use
statement
you will see it is extremely complicated.
It would be very difficult
to keep
the reversed use
statement's
behavior
perfectly stable
with respect to argument lists.
So difficult that it would make the code for
the use
statement almost untouchable.
One candidate for the least bad solution is to
allow changes to the list-handling
behavior of the
reversed use
statement,
but only in the direction of expanding the
syntaxes allowed.
It may not be desirable to document
the reversed use
statement,
but it probably should be added to the test suite.
The reversed use
statement would then live on,
a deprecated but permanent part of Perl.
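The "only in the direction of expanding" rule can be stated as a check that a test suite might run. What follows is a minimal sketch, not anything taken from Perl's actual test suite: narrowed_forms is a hypothetical helper, the baseline table simply restates behavior described in this post, and the "after change" table is invented to show the idea.

```perl
use strict;
use warnings;

# Hypothetical helper: each argument maps statement text to a
# true/false "accepted" flag. Returns the statements that the
# baseline accepted but the current parser rejects. Statements
# that are newly accepted are not reported, since expanding the
# allowed syntax is permitted.
sub narrowed_forms {
    my ( $baseline, $current ) = @_;
    my @lost = sort grep { $baseline->{$_} && !$current->{$_} }
        keys %{$baseline};
    return @lost;
}

# Illustrative data only. The baseline follows this post; the
# second table is an invented outcome of some parser change.
my %baseline = (
    'use 2 Fatal;'                   => 1,
    'use 2 Fatal qw(open close);'    => 1,
    'use 2 Fatal ("open", "close");' => 0,
);
my %after_change = (
    'use 2 Fatal;'                   => 1,
    'use 2 Fatal qw(open close);'    => 1,
    'use 2 Fatal ("open", "close");' => 1,    # newly accepted: allowed
);

my @lost = narrowed_forms( \%baseline, \%after_change );
print @lost
    ? "Forms no longer accepted: @lost\n"
    : "No previously accepted form was lost\n";
```

In practice the two tables would be filled in by compiling each statement under the old and new parsers, rather than typed in by hand.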
Wow, I had no idea the perl parser was that loose. A better approach than expanding the allowed list syntaxes would be to deprecate the reversed syntax and make it emit a warning, and then remove it completely in a future version of perl.
BTW, there's a missing double quote in the second example in the list of erroring reversed statements.
Interesting that the qw() syntax succeeds in the indirect-object "use" but very little else does.
There's little value in breaking old programs that do this syntax, though. I would prefer version-specific grammars so that under "use 5.016" this is a syntax error but under "use 5" it is not. Then we can slowly permute the language but people have to opt-in and old stuff continues "just working". Not that it is possible for every behavior change to be so accommodating.
@ilmari.org: I fixed the typo. Thanks for catching it.