Yet another BNF: Extended Marpa Scanless InterFace

This post introduces another BNF, namely MarpaX::ESLIF. As the name suggests, it is largely inspired by Marpa::R2's BNF and aims to extend the latter.

The intent was to provide the following features:


  • native regular expressions, using a built-in version of PCRE2

  • support for syntactic exceptions

  • an externalized data reader, in a streaming-compatible architecture

  • an unlimited number of sub-grammars

Although it looks like Marpa's BNF, it is not fully backward compatible with it! I invite readers to read the Introduction, which covers the architecture and the main features, as well as its BNF.

The inner implementation is an XS proxy to a complete C library, c-marpaESLIF, which is built on top of Marpa::R2's core engine.

This could never have existed without the remarkable Marpa library, copyrighted by Jeffrey, whom I applaud here for his fantastic work, which deserves a wide audience IMHO.

I have also uploaded a JSON parser, MarpaX::ESLIF::ECMA404, to give a concrete example of how MarpaX::ESLIF works. If these packages can boost Marpa's reputation, great.

Impatient readers? Here is an ESLIF version of the JSON grammar, hopefully correct ;-)

# ----------------------------
# JSON Grammar as per ECMA-404
# ----------------------------
#
# Default action is to propagate the first RHS value
#
:default ::= action => ::shift
#
# JSON starting point is value
#
:start ::= value
#
# ----------------------------
# I explicitly expose string grammar for one reason: inner string
# elements have specific actions
# ----------------------------
#
object   ::= '{' members '}'         action => ::copy[1]
members  ::= pairs* separator => ',' action => members
pairs    ::= string ':' value        action => pairs
array    ::= '[' elements ']'        action => ::copy[1]
elements ::= value* separator => ',' action => array_ref
value    ::= string
           | number
           | object
           | array
           | 'true'
           | 'false'
           | 'null'

# ------------------------
# Insignificant whitespace
# ------------------------
:discard ::= /[\x{9}\x{A}\x{D}\x{20}]*/

# -----------
# JSON string
# -----------
# Executed in the top grammar and not as a lexeme.
# This is why we temporarily switch off :discard inside it.
#
string ::= '"' discardOff chars '"' discardOn action => ::copy[2]
discardOff ::=
discardOn ::=

event :discard[on] = nulled discardOn
event :discard[off] = nulled discardOff

chars ::= filled
filled ::= char+ action => ::concat
chars ::= action => empty_string
char ::= [^"\\[:cntrl:]]
       | '\\' '"' action => ::copy[1]
       | '\\' '\\' action => ::copy[1]
       | '\\' '/' action => ::copy[1]
       | '\\' 'b' action => backspace_character
       | '\\' 'f' action => formfeed_character
       | '\\' 'n' action => newline_character
       | '\\' 'r' action => return_character
       | '\\' 't' action => tabulation_character
       | '\\' 'u' /[[:xdigit:]]{4}/ action => hex2codepoint_character
# ------------------------------------------------------------------
# JSON number: defined as a single terminal.
# ECMA-404 numbers are 100% compliant with Perl number syntax AFAIK.
# ------------------------------------------------------------------
#
number ::=
/\-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?/
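
For readers who want to sanity-check a number terminal outside of ESLIF, here is a minimal sketch in Python. It assumes a strict reading of the ECMA-404 number syntax: an optional minus sign, an integer part without leading zeros, and at least one digit in the fraction and in the exponent when they are present. The pattern, the is_json_number helper and the two sample lists are written for this illustration only; they are not part of MarpaX::ESLIF or MarpaX::ESLIF::ECMA404. Python's re engine is not PCRE2, but the pattern only uses constructs that both engines share.

```python
import re

# Strict ECMA-404 number syntax, written for this sketch (an assumption,
# not necessarily the pattern shipped with any ESLIF package):
#   optional minus, then "0" or a non-zero digit followed by digits,
#   optional fraction, optional exponent; digits are mandatory in both.
JSON_NUMBER = re.compile(
    r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
)


def is_json_number(text: str) -> bool:
    """Return True when the whole of `text` is a valid ECMA-404 number."""
    return JSON_NUMBER.fullmatch(text) is not None


# Hypothetical sample inputs chosen for illustration.
ACCEPTED = ["0", "-0", "12", "-3.25", "1e10", "2.5E-3", "0.0e+0"]
REJECTED = ["", "-", "01", "1.", ".5", "1e", "+1", "0x10"]

if __name__ == "__main__":
    for sample in ACCEPTED + REJECTED:
        print(f"{sample!r:10} -> {is_json_number(sample)}")
```

The rejected samples are the interesting ones: an empty match, a leading zero, a dangling decimal point and a dangling exponent are exactly the cases where a pattern built from `*` quantifiers is more permissive than the specification.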

About Jean-Damien Durand

About::Me::And::Perl