Marpa has a new web site

[ Cross-posted by invitation from its home on the Ocean of Awareness blog. ]

Marpa has a new official public website, which Ron Savage has generously agreed to manage. For those who have not heard of it, Marpa is a parsing algorithm. It is new, but very much based on earlier work by Jay Earley, Joop Leo, John Aycock and R. Nigel Horspool. Marpa is intended to replace, and to go well beyond, recursive descent and the yacc family of parsers.

Language design: Exploiting ambiguity

[ Cross-posted by invitation, from its home on the Ocean of Awareness blog. ]

Currently, in designing languages, we don't allow ambiguities -- not even potential ones. We insist that it must not even be possible to write an ambiguous program. This is unnecessarily restrictive.

This post is written in English, which is full of ambiguities. Natural languages are always ambiguous, because human beings find that that's the best way for versatile, rapid, easy communication. Human beings arrange things so that every sentence is unambiguous in context. Mistakes happen, and ambiguous sentences occur, but in practice, the problem is manageable. In a conversation, for example, we would just ask for clarification.

Evolvable languages

[ Cross-posted by invitation, from its home on the Ocean of Awareness blog. ]

Ideally, if a syntax is useful and clear, and a programmer can easily read it at a glance, you should be able to add it to an existing language. In this post, I will describe a modest incremental change to the Perl syntax.

It's one I like, but that's beside the point, for two reasons. First, it's simply intended as an example of language evolution. Second, regardless of its merits, it is unlikely to happen, because of the way that Perl 5 is parsed. In this post I will demonstrate a way of writing a parser, so that this change, or others, can be made in a straightforward way, and without designing your language into a corner.

Significant newlines? Or semicolons?

[ Cross-posted by invitation, from its home on the Ocean of Awareness blog. ]

Should statements have explicit terminators, like the semicolon of Perl and the C language? Or should they avoid the clutter, and separate statements by giving whitespace syntactic significance and a real effect on the semantics, as is done in Python and JavaScript?

Actually we don't have to go either way. As an example, let's look at some BNF-ish DSL. It defines a small calculator. At first glance, it looks as if this language has taken the significant-whitespace route -- there certainly are no explicit statement terminators.

A Marpa-powered C parser

Jean-Damien Durand has just released MarpaX::Languages::C::AST, which parses the C language into an abstract syntax tree (AST). MarpaX::Languages::C::AST has been tested against Perl's C source code, as well as Marpa's own C source.

[ This is cross-posted from its home on the Ocean of Awareness blog. ]