[ Cross-posted by invitation, from its home on the Ocean of Awareness blog. ]
Ideally, if a syntax is useful and clear,
and a programmer can easily read it at a glance,
you should be able to add it to an existing language.
In this post, I will describe
a modest incremental change to the Perl syntax.
It's one I like, but that's beside the point, for two reasons.
First, it's simply intended as an example of language evolution.
Second, regardless of its merits, it is unlikely to happen,
because of the way that Perl 5 is parsed.
In this post I will demonstrate a way of writing a parser,
so that this change,
or others, can be made in a straightforward way,
and without designing your language into a corner.
[ Cross-posted by invitation, from
its home on the Ocean of Awareness blog. ]
Should statements have explicit terminators, like the semicolon of Perl and
the C language?
Or should they avoid the clutter, and separate statements by giving whitespace
syntactic significance and a real effect on the semantics?
Actually we don't have to go either way.
As an example, let's look at some BNF-ish DSL.
It defines a small calculator.
At first glance, it looks as if this language has taken the
significant-whitespace route -- there certainly are no explicit statement terminators.
Jean-Damien Durand has just released MarpaX::Languages::C::AST,
which parses the C language into an abstract syntax tree (AST).
MarpaX::Languages::C::AST has been tested against
Perl's C source code, as well as Marpa's own C source.
[ This is cross-posted from its home on the Ocean of Awareness blog. ]
Abstract Syntax Forests (ASF's) are my most recent project.
I am adding ASF's to my Marpa parser.
Marpa has long supported ambiguous parsing,
and allowed users to iterate through
all the parses of an ambiguous parse.
This was enough for most applications.
Even applications which avoid ambiguity benefit from better ways to detect
and locate it.
And there are applications that require the ability to select among and
manipulate very large sets of ambiguous parses.
Prominent among these is Natural Language Processing (NLP).
This post will introduce an experiment.
Marpa in fact seems to have some potential for NLP.
Writing an efficient ASF is not a simple matter.
The naive implementation is to generate a complete set
of fully expanded abstract
syntax trees (AST's).
This approach consumes resources that can become
exponential in the size of the input.
Translation: the naive implementation quickly becomes unusably slow.
Marpa optimizes by aggressively identifying identical subtrees
of the AST's.
Especially in highly ambiguous parses,
many subtrees are identical,
and this optimization is often a big win.
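To give a rough feel for the numbers involved, here is a small Python sketch. It is not Marpa's implementation. It assumes a hypothetical, maximally ambiguous toy grammar, S ::= S S | 'a', and the function names are made up for illustration. It compares the number of fully expanded trees with the number of nodes needed when every identical subtree (here, one per input span) is stored only once.

```python
from functools import lru_cache


def count_trees(n: int) -> int:
    """Count the distinct parse trees for n tokens under the toy
    grammar S ::= S S | 'a' (an assumption for illustration only)."""

    @lru_cache(maxsize=None)
    def trees(length: int) -> int:
        if length == 1:
            return 1  # a single 'a' has exactly one parse
        # Split the span at every possible point; each side parses independently.
        return sum(trees(k) * trees(length - k) for k in range(1, length))

    return trees(n)


def shared_forest_nodes(n: int) -> int:
    """Count the nodes needed if each (start, end) span of the input
    is represented once and shared by every tree that uses it."""
    return n * (n + 1) // 2


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, count_trees(n), shared_forest_nodes(n))
```

Under these assumptions the tree count grows exponentially with the input length, while the shared representation grows only quadratically, which is the kind of saving that identifying identical subtrees is after.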
[ This is cross-posted from the Ocean of Awareness blog. ]
In a recent post,
I looked at an unusual language which serializes arrays and strings,
using a mixture of counts and parentheses. Here is an example: