Results tagged “marpa parser”

"Marpa and procedural parsing"

The newest post on the "Ocean of Awareness" blog is "Marpa and procedural parsing": Marpa's procedural parsing is more flexible and more powerful than recursive descent's.

"Parsing with Pictures"

The newest entry in my Ocean of Awareness blog is about the paper "Parsing with Pictures" by Pingali and Bilardi. It describes a new, easier and more natural way to look at parsing. I try it out, looking at Marpa and the Might/Darais algorithm.

Why is parsing considered solved?

"Why is parsing considered solved?" is the newest entry on my Ocean of Awareness blog.

It is often said that parsing is a "solved problem". Given the level of frustration with the state of the art, the underuse of the very powerful technique of Language-Oriented Programming due to problematic tools, and the vast superiority of human parsing ability over computers, this requires explanation.
On what grounds would someone say that parsing is "solved"? To understand…

Is language just a set of strings?

The newest entry on my Ocean of Awareness blog: "Is language just a set of strings?"

"Or is there something wrong with the way we go abo…

Parsers and their useful power

I have posted a new entry on the Ocean of Awareness blog: "Parsers and useful power". I look at what parser users want and what makes a parser successful, in light of the 1960s contest between the Irons parser, the first ever published, and recursive descent. One of those is very much with us today, and one survives only in the literature.

For more about Marpa, my own parsing project, there is the semi-official web site, maintained by Ron Sa…

Version 3 of "Parsing: a timeline"

I have published version 3 of my parsing timeline. It has many changes. The new material includes coverage of combinator and monadic parsing, and of operator expression parsing, making it considerably less Marpa-centric.

The link above is to the announcement on my own blog. You can also "cheat" and go st…

Introduction to Marpa book in progress

My latest blog post is the introduction to my Marpa Book, currently in progress. The book will be a theory monograph, so it's kind of stuffy, but it's a good summary of Marpa's features. It also discusses the implications of these features for applications.

What parser do birds use?

In my new blog post, I compare parsing, as practiced by birds and by computer programmers.

What are the reasonable programming languages?

My latest blog post is "What are the reasonable programming languages?" Nowadays, we think we know what languages are realistically possible. But in the 1970s, programmers knew that they didn't know. So they asked for the languages they actually wanted. What kinds of language did they ask for?

Grammar reuse

My latest blog post looks at grammar reuse, comparing regular expressions, PEG, Perl 6 grammars and general BNF parsers, including Marpa. A good property to have in itself, grammar reusability is crucial if a parser is going to be the basis for language-driven programming.

Fast handy languages

My new blog post is a summary of what the Marpa parser is about. It even includes a section on when not to use Marpa.

Parsing: top-down versus bottom-up

[ This is cross-posted by invitation, from its home on the Ocean of Awareness blog. ]

Comparisons between top-down and bottom-up parsing are often either too high-level or too low-level. Overly high-level treatments reduce the two approaches to buzzwords, and the comparison to a recitation of received wisdom. Overly low-level treatments get immersed in the minutiae of implementation, and the resulting comparison is as revealing as placing two abstractly related code listings side by side. In this post I hope to find the middle level; to shed light on why advocates of bottom-up and top-down parsing approaches take the positions they do; and to speculate about the way forward.

Top-down parsing

The basic idea of top-down parsing is as brutally simple as anything in programming: Starting at the top, we add pieces. We do this by looking at the next token and deciding then and there where it fits into the parse tree. Once we've looked at every token, we have our parse tree.
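
To make the idea concrete, here is a minimal sketch of a top-down (recursive descent) parser in Python. The toy arithmetic grammar and the names `tokenize`, `Parser` and `parse` are assumptions made for illustration; they are not taken from the post. Each method looks at the next token and decides on the spot which piece of the tree it belongs to.

```python
import re


def tokenize(text):
    # Integers and single-character operators; anything else is silently skipped.
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    """Recursive descent parser for a toy grammar (hypothetical example):

    expr   ::= term (('+' | '-') term)*
    term   ::= factor (('*' | '/') factor)*
    factor ::= NUMBER | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # The next token, or None once the input is exhausted.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        # Decide from the next token alone whether the expression continues.
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        tok = self.advance()
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("1 + 2 * 3"))  # ('+', 1, ('*', 2, 3))
```

The parse tree is built as nested tuples of the form `(operator, left, right)`, with the root decided first and the leaves filled in as tokens arrive.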

Removing obsolete versions of Marpa from CPAN

[ This is cross-posted by invitation, from its home on the Ocean of Awareness blog. ]

Marpa::XS, Marpa::PP, and Marpa::HTML are obsolete versions of Marpa, which I have been keeping on CPAN for the convenience of legacy users. All new users should look only at Marpa::R2.

I plan to delete the obsolete releases from CPAN soon. Legacy users who need copies will still be able to find them on BackPAN.

Reporting mismatched delimiters

[ This is cross-posted by invitation, from its home on the Ocean of Awareness blog. ]

In many contexts, programs need to identify non-overlapping pieces of a text. One very direct way to do this is to use a pair of delimiters. One delimiter of the pair marks the start and the other marks the end. Delimiters can take many forms: Quote marks, parentheses, curly braces, square brackets, XML tags, and HTML tags are all delimiters in this sense.

Mismatching delimiters is easy to do. Traditional parsers are often poor at reporting these errors: hopeless after the first mismatch, and for that matter none too precise about the first one. This post outlines a scalable method for the accurate reporting of mismatched delimiters. I will illustrate the method with a simple but usable tool -- a utility which reports mismatched brackets.
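
For a rough feel of what such a utility does, here is a small Python sketch. It uses a plain stack with a simple recovery heuristic, which is an assumption of this sketch and not the method the post describes. The names `PAIRS`, `CLOSERS` and `find_mismatches` are hypothetical. The point it illustrates is that the report does not have to stop at the first mismatch.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = {close: open_ for open_, close in PAIRS.items()}


def find_mismatches(text):
    """Return a list of 'line:column: message' strings, one per problem."""
    problems = []
    stack = []  # entries are (opener, line, column)
    line, col = 1, 0
    for ch in text:
        if ch == "\n":
            line, col = line + 1, 0
            continue
        col += 1
        if ch in PAIRS:
            stack.append((ch, line, col))
        elif ch in CLOSERS:
            wanted = CLOSERS[ch]
            if stack and stack[-1][0] == wanted:
                stack.pop()
            elif any(opener == wanted for opener, _, _ in stack):
                # Heuristic: a matching opener exists further down the stack,
                # so treat every opener above it as unclosed and keep going.
                while stack[-1][0] != wanted:
                    opener, oline, ocol = stack.pop()
                    problems.append(f"{oline}:{ocol}: '{opener}' is never closed")
                stack.pop()
            else:
                problems.append(f"{line}:{col}: '{ch}' has no matching opener")
    # Whatever is left on the stack was opened but never closed.
    for opener, oline, ocol in stack:
        problems.append(f"{oline}:{ocol}: '{opener}' is never closed")
    return problems


for problem in find_mismatches("(a[b)c]"):
    print(problem)
```

On the input `(a[b)c]` this reports two problems rather than giving up after the first: the `[` at column 3 is never closed, and the `]` at column 7 has no matching opener.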

Evolvable languages

[ Cross-posted by invitation, from its home on the Ocean of Awareness blog. ]

Ideally, if a syntax is useful and clear, and a programmer can easily read it at a glance, you should be able to add it to an existing language. In this post, I will describe a modest incremental change to the Perl syntax.

It's one I like, but that's beside the point, for two reasons. First, it's simply intended as an example of language evolution. Second, regardless of its merits, it is unlikely to happen, because of the way that Perl 5 is parsed. In this post I will demonstrate a way of writing a parser, so that this change, or others, can be made in a straightforward way, and without designing your language into a corner.

Significant newlines? Or semicolons?

[ Cross-posted by invitation, from its home on the Ocean of Awareness blog. ]

Should statements have explicit terminators, like the semicolon of Perl and the C language? Or should they avoid the clutter, and separate statements by giving whitespace syntactic significance and a real effect on the semantics, as is done in Python and JavaScript?

Actually we don't have to go either way. As an example, let's look at some BNF-ish DSL. It defines a small calculator. At first glance, it looks as if this language has taken the significant-whitespace route -- there certainly are no explicit statement terminators.


About Jeffrey Kegler

I blog about Perl, with a focus on parsing and Marpa, my parsing algorithm based on Jay Earley's.