Transpiling JavaScript with Marpa
To transpile JavaScript to any language (Perl 5 first), I wanted a generic methodology, independent of the target language. The proof of concept is the default case: transpiling JavaScript to JavaScript.
Marpa::R2 came to the rescue again: Marpa lets the programmer introspect the grammar. This means that one can:
- automatically generate a package, based on that introspection, to do the transpilation
- use an AST that contains the rule IDs which, together with the generated package mentioned above, makes it possible to transpile JavaScript to any target language.
Let's be concrete. Suppose that, in the Marpa grammar, you specify a user-space action like this:
:default ::= action => valuesAndRuleId
lexeme default = action => [start,length,value]
The valuesAndRuleId method being:
sub valuesAndRuleId {
    my $self = shift;
    return {values => [ @_ ], ruleId => $Marpa::R2::Context::rule};
}
This means that, in the parse tree value of the grammar:
- every lexeme (let's say terminal) will have an associated value that is a reference to an array with three items:
  - start: the position in the stream
  - length: the length
  - value: the terminal itself
- every G1 (let's say non-terminal) rule will have an associated value that is a reference to a hash with the following structure:
  - ruleId: the non-terminal rule ID
  - values: a reference to an array of all RHS values
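The two node shapes described above can be told apart by their reference type alone. Here is a minimal sketch in plain Perl, with no Marpa dependency. The helper names is_lexeme and collect_lexemes, the hand-built sample tree, and its rule IDs are all made up for illustration; they are not part of any generated package.

```perl
use strict;
use warnings;

# A lexeme is an array ref [start, length, value];
# a rule node is a hash ref {ruleId => ..., values => [...]}.
sub is_lexeme {
    my ($node) = @_;
    return ref($node) eq 'ARRAY';
}

# Depth-first, left-to-right walk returning every lexeme in source order.
sub collect_lexemes {
    my ($node) = @_;
    return ($node) if is_lexeme($node);
    return map { collect_lexemes($_) } @{ $node->{values} };
}

# Hand-built sample tree; the rule IDs 1 and 2 are hypothetical.
my $tree = {
    ruleId => 1,
    values => [
        [ 0, 3, 'var' ],
        { ruleId => 2, values => [ [ 4, 1, 'i' ] ] },
    ],
};

my @lexemes = collect_lexemes($tree);
print join(' ', map { $_->[2] } @lexemes), "\n";    # prints "var i"
```

Because every rule node keeps its children in RHS order, a plain depth-first walk visits the terminals in the order they appeared in the source.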
Let's evaluate the parse tree value of a simple piece of JavaScript source code:
var i;
using the grammar in the __DATA__ section of this package. The result is:
$VAR1 = {
  'ruleId' => 230,
  'values' => [
    {
      'ruleId' => 227,
      'values' => [
        {
          'values' => [
            {
              'ruleId' => 233,
              'values' => [
                {
                  'values' => [
                    {
                      'values' => [
                        [ 0, 3, 'var' ],
                        {
                          'values' => [
                            {
                              'values' => [
                                [ 4, 1, 'i' ],
                                { 'values' => [], 'ruleId' => 171 }
                              ],
                              'ruleId' => 168
                            }
                          ],
                          'ruleId' => 164
                        },
                        [ 5, 1, ';' ]
                      ],
                      'ruleId' => 163
                    }
                  ],
                  'ruleId' => 146
                }
              ]
            }
          ],
          'ruleId' => 231
        }
      ]
    }
  ]
};
Now, using the generated transpilation package, one can regenerate source code that yields the same AST as the original.
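To give an idea of how rule IDs can drive the output, here is a minimal sketch in plain Perl. It is not the generated package itself: the transpile and default_rule subs and the %to_perl table are hypothetical names, and the sample tree is only the rule 163 subtree copied from the dump above. The default handler simply joins the children with a space, which gives JavaScript-to-JavaScript output; overriding one rule ID is enough to change the target.

```perl
use strict;
use warnings;

# Default handler (an assumption of this sketch): re-emit the children,
# separated by one space. This is the JavaScript-to-JavaScript case.
sub default_rule { return join ' ', @_ }

sub transpile {
    my ($node, $handlers) = @_;
    # Lexeme [start, length, value]: emit the token text unchanged.
    return $node->[2] if ref($node) eq 'ARRAY';
    # Rule node: transpile the children first, dropping empty results
    # (such as rule 171 below, which has no RHS values).
    my @parts = grep { length } map { transpile($_, $handlers) } @{ $node->{values} };
    # Dispatch on the rule ID, falling back to the default handler.
    my $handler = $handlers->{ $node->{ruleId} } || \&default_rule;
    return $handler->(@parts);
}

# The rule 163 subtree of the `var i;` dump.
my $ast = {
    ruleId => 163,
    values => [
        [ 0, 3, 'var' ],
        {
            ruleId => 164,
            values => [
                {
                    ruleId => 168,
                    values => [ [ 4, 1, 'i' ], { ruleId => 171, values => [] } ],
                },
            ],
        },
        [ 5, 1, ';' ],
    ],
};

# Hypothetical toy override: treat rule 163 as "var NAME ;" and emit Perl.
my %to_perl = (
    163 => sub { my ($var, $decl, $semi) = @_; return 'my $' . $decl . ';' },
);

print transpile($ast, {}), "\n";          # prints "var i ;"
print transpile($ast, \%to_perl), "\n";   # prints "my $i;"
```

The first output differs from the original only in whitespace, so it parses to a tree with the same rule IDs; only the lexeme start offsets change. A real generated package would have one handler per grammar rule, derived from grammar introspection, rather than a hand-written table.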
This is not only a proof of concept but also the starting point of JavaScript::Transpile, which should become a generic JavaScript transpiler with Perl 5 as its main target. It will also provide (yet another) JavaScript engine: the original source code will be transformed to an AST using MarpaX::Languages::ECMAScript::AST, then transpiled to Perl.
Here is an example of a full JavaScript-to-JavaScript transpilation of jquery-1.10.2 (the same here). It looks like jQuery, but it is *not* a cut/paste of jQuery: it is a transpilation of JavaScript to JavaScript via an AST produced by Marpa.
Nothing prevents transpiling to another target language ;-)
A minor quibble: could we drop the term "transpiling"? There isn't an interesting distinction between compiling to machine code or a virtual machine versus compiling to a different high-level language. The interesting part happens in the AST; the source and target languages are just implementation details.
Otherwise, awesome work.
I think the cat may be out of the bag at this point. And personally I like to see this feline on the loose.
I had not heard the term until Jean-Damien used it, but Wikipedia has a good article on it: http://en.wikipedia.org/wiki/Source-to-source_compiler. I like the term very much and find it fills a true need. In describing the kind of work people are doing with Marpa, I've had trouble -- "compiler" suggests something that not only parses and produces another usable format, but also takes the original language down to the hardware level. I like to point out the great things being done with Marpa, but I also try to avoid exaggeration, and if you say "Marpa allows you to produce compilers easily and quickly", it suggests that Marpa makes it easy to write another GCC. I believe Marpa would make writing another GCC considerably easier, but to write a full compiler like GCC you have to solve a whole lot of problems that Marpa does not even begin to address.
The distinction between "compilers", which take the language a major distance down toward the metal, and "transpilers", which convert between languages at the same level of abstraction, is extremely significant in theory and in practice. Having a word for it makes it easier to talk about it.