[ This is cross-posted by invitation, from
its home on the Ocean of Awareness blog. ]
Comparisons between top-down and bottom-up parsing
are often either too high-level or too low-level.
Overly high-level treatments reduce the two approaches to buzzwords,
and the comparison to a recitation of received wisdom.
Overly low-level treatments get immersed in the minutiae of implementation,
and the resulting comparison is as revealing as placing
two abstractly related code listings side by side.
In this post I hope to find the middle level;
to shed light on why advocates of bottom-up
and top-down parsing approaches take the positions
they do;
and to speculate about the way forward.
Top-down parsing
The basic idea of top-down parsing is
as brutally simple as anything in programming:
Starting at the top, we add pieces.
We do this by looking at the next token and deciding then and there
where it fits into the parse tree.
Once we've looked at every token,
we have our parse tree.
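To make the idea concrete, here is a minimal sketch of a top-down
(recursive descent) parser in Python.
The arithmetic grammar, the tuple-based tree,
and the names `tokenize`, `Parser` and `parse`
are all assumptions made for illustration;
none of them come from this post.
Each method looks at the next token and decides on the spot
where it belongs in the tree.

```python
import re


def tokenize(text):
    # Integers and single-character operators or parentheses.
    # Any other character is silently skipped in this sketch.
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    """Top-down parser for a small, hypothetical arithmetic grammar."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # The next token, or None once the input is used up.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr ::= term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.term())
        return node

    def term(self):
        # term ::= factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = (op, node, self.factor())
        return node

    def factor(self):
        # factor ::= NUMBER | '(' expr ')'
        tok = self.take()
        if tok == "(":
            node = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok)


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

In this sketch `parse("1+2*3")` builds the tree from the top:
`expr` is entered first, and the multiplication ends up
nested under the addition because `term` is called from within `expr`.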
[ This is cross-posted by invitation, from
its home on the Ocean of Awareness blog. ]
Marpa::XS, Marpa::PP, and Marpa::HTML are obsolete versions of
Marpa, which I have been keeping on CPAN for the convenience of legacy
users.
All new users should look only at
Marpa::R2.
I plan to delete the obsolete releases from CPAN soon.
Legacy users who need copies will still be able to find them on BackPAN.
[ This is cross-posted by invitation, from
its home on the Ocean of Awareness blog. ]
In many contexts, programs need to identify
non-overlapping pieces of a text.
One very direct way to do this
is to use a pair of delimiters.
One delimiter of the pair marks the start
and the other marks the end.
Delimiters can take many forms:
Quote marks, parentheses, curly braces, square brackets,
XML tags, and HTML tags
are all delimiters in this sense.
Mismatching delimiters is easy to do.
Traditional parsers are often poor at reporting these errors:
hopeless after the first mismatch,
and for that matter none too precise about the first one.
This post outlines a scalable method for the accurate
reporting of mismatched delimiters.
I will illustrate the method with a simple
but usable tool --
a utility which reports mismatched brackets.
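
For contrast, here is a sketch of the usual baseline:
a plain stack-based bracket checker in Python.
This is not the method this post describes;
the function name `check_brackets` and the message formats
are assumptions made for illustration.
It shows where the weakness mentioned above comes from.
On a mismatch the checker has to guess which bracket was wrong,
and every report after that depends on the guess.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}   # closer -> the opener it needs
OPENERS = set(PAIRS.values())


def check_brackets(text):
    """Return a list of problem reports, using 1-based line:column positions."""
    problems = []
    stack = []          # (opener, line, column) for each opener not yet closed
    line, col = 1, 0
    for ch in text:
        if ch == "\n":
            line += 1
            col = 0
            continue
        col += 1
        if ch in OPENERS:
            stack.append((ch, line, col))
        elif ch in PAIRS:
            if stack and stack[-1][0] == PAIRS[ch]:
                stack.pop()                      # properly matched pair
            elif stack:
                # Guess: treat the closer as a wrong spelling of the one
                # the innermost opener needed. The guess may be wrong.
                opener, oline, ocol = stack.pop()
                problems.append(
                    f"{line}:{col}: '{ch}' does not match "
                    f"'{opener}' opened at {oline}:{ocol}"
                )
            else:
                problems.append(f"{line}:{col}: '{ch}' has no opener")
    for opener, oline, ocol in stack:
        problems.append(f"{oline}:{ocol}: '{opener}' is never closed")
    return problems
```

Calling `check_brackets("(]")` reports one mismatch at position 1:2.
Whether the author meant `()` or `[]` is something
this baseline cannot tell.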