5.10 regex features: Build a nested structure while matching

In trying to learn 5.10 grammar-like regex features and wrote a program that, while matching simple Lisp-like constructs, builds up a data structure. This involves recursing into subpatterns, but I couldn't find a way to assemble the resulting tree from the bottom up. Therefore I used a stack.

Do you know of a better way of building the data structure with the given regex/grammar?

(I'm aware of Parse::RecDescent and Regexp::Grammars, but wanted to do this as simply as possible.)

#!/usr/bin/env perl

# Exercising perl 5.10 regular expression features:
# Return a tree structure while matching simple Lisp-like constructs

use warnings;
use strict;
use 5.010;
use Test::More;
use Test::Differences;

sub parse {
    my $string = shift;
    # 'my @S' produces 'Variable "@S" will not stay shared' and failures
    our @S = ();
    # 'my @t' produces 'Can't localize lexical variable @t'
    our @t;

    # Use a stack because I can't see a way of localizing variables when
    # recursing into subpatterns.
    state $re = qr{
               (?{push @S, 'MARK'})
               (?&element) (?: \s+ (?&element) )*
                   local @t;
                   while ((my $el = pop @S) ne 'MARK') { unshift @t, $el }
                   push @S, \@t;
          (?<element>  (?&value) | (?&tree) )
          (?<value> (\w+) (?{ push @S, $^N }) )
    $string =~ $re;
    pop @S;

sub check {
    my ($string, $expect) = @_;
    eq_or_diff parse($string), $expect, $string;
check('(print)', ['print']);
check('(print foo bar baz)', [qw(print foo bar baz)]);
check('(print (foo 1 2) (bar 3 (baz 4 5)))',
    [ 'print', [ 'foo', '1', '2' ], [ 'bar', '3', [ 'baz', '4', '5' ] ] ]);
check('((foo (bar 1)) baz)', [ [ 'foo', [ 'bar', '1' ] ], 'baz' ]);

You can get the code at https://gist.github.com/1257636 .

Leave a comment

About Marcel Grünauer

user-pic perlservices.at is a B2B service provider for Perl in Vienna, Austria. I'm a professional software developer since 1989 and a Perl specialist since 1998. I also like Go (囲碁), Japanese and Korean culture and language.