User experiences with Marpa: some observations

When it comes to user experiences with Marpa, I confess to being a highly biased source. I hope the following observations will be useful nonetheless. (Marpa, for those new to this blog, is a new, powerful and fast parser and parsing algorithm. To learn more, check out its web page.)

Marpa does the job

If you've read user's accounts of work with BNF grammars over the years (I have studied many), you know they follow a familiar pattern. The user has some BNF. He then tries tool X (for X substitute yacc, bison, PEG, recursive descent, etc.) and finds that it almost works. Almost, but not quite. The rest of the account describes the user beating up his grammar in an effort to make it fit the tool. Perhaps 50% of the time, he reports that his effort was wasted.

The accounts from Marpa users are different, because the problem the users of other tools spend most of their time describing, and spent most of their time solving, does not exist with Marpa. Once they have their grammar worked out, Marpa parses it. As a practical matter, this difference is a big deal. A methodology that, half the time, takes you to a dead end, is one that practitioners will avoid.

The reason that regular expressions became so dominant, I am convinced, is that they did the job. If you can write your problem as a regular expression (here I speak of regular expressions, not regexes) you are guaranteed two things. First that your regular expression engine *WILL* handle it. Second, that it will do so in acceptable time. Regular expressions did the job. With Marpa, BNF does the job.

BNF as a lost art

An obstacle to learning Marpa seems to be that writing BNF is something of a lost art. BNF is not difficult. By my accounting, BNF is actually simpler than regular expressions. BNF involves one idea: concatenation. Regular expressions involve all of concatenation, sequence and alternation. Even if you don't agree with my way of calculating difficulty, I think you'll agree that BNF is vastly simpler than Perl regexes.

When I learned programming (around 1970), BNF was much better known than regular expressions. BNF had been established for years as the way of specifying languages. Regular expressions, on the other hand, did not come in prominence before Thompson's ed editor. The first PDP11 assembler version of that came out in 1971, the year I was taking the introduction to Computer Science with Alan Perlis. Prof. Perlis spent a lot of time on BNF and Markov machines. I cannot recall that he mentioned regular expressions. Since then, at least in the US, BNF and Markov machines seem to have been consigned to nearly the same degree of oblivion.

It makes some sense that those with a pragmatic bent might forget BNF. Aside from reading standards documents, there wasn't much you could do with it -- yacc to the contrary notwithstanding. Meanwhile, in editing we came to use regular expressions so often many of us can write short ones without thinking.

When I wrote Marpa, I assumed that its users would already know BNF. A number of users seem to have done it the opposite way -- they've learned Marpa and used it to get the hang of BNF. Now that I think of it, I learned regular expressions, not in the classroom, but from the ed editor. When there's a real use for it, I have no doubt the lost art of BNF will be once again found.

2 Comments

Do keep in mind that being a strong programmer does not imply being a computer scientist. At the same time, it seems that any computer science curriculum that *never* teaches BNF (with at least some notion of E) is probably not worth attending.

My familiarity with BNF comes from reading standards documents, I suspect that'd be common for people who never took a course in compiler design.

About Jeffrey Kegler

user-pic I blog about Perl, with a focus on parsing and Marpa, my parsing algorithm based on Jay Earley's.