Just saw [2] and [1] (which has to do with $Test::Builder::Level).
[2] says: "CPAN test modules might — in their own tests — be looking for specific output and start breaking. Such tests are inherently fragile, and should be changed to use a TAP parser instead of parsing literal text, but in the meantime, if your favorite test class fails its tests, this Test::More change might be the reason."
[1] https://github.com/Test-More/test-more/pull/396/files
[2] http://www.dagolden.com/index.php/2374/the-next-testmore-might-break-fragile-test-modules/
If the operator is defined as intersection (which is also commutative), then I F = F I = I, and I F can parse only integers, just as F I can. If it is defined as multiplication, e.g. Cartesian product, then I F = F I = F and the above statement on union applies.
"putting F first means I never gets a chance to parse anything" has more to do with algebra of parser combinators that is defined in terms of parse results rather than parseable sets.
read()ing the tokens for specific applications, e.g., natural language parsing (NLP).
P.S. This behavior is provided by passing default_action to the parser; it is set on the grammar and used to determine the type (if known) and thus to construct the parse tree, or whatever else default_action is set up to do.
Not sure what I *F must do though. :)
Akin to parser combinators, perhaps.
Executable notation with Marpa just as the relational algebra is executable with SQL.
Looks like an algebra of grammars can be defined in terms of their parseable sets (Algebra of Sets). Then I *F must parse only integers (intersection).
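To make the set view concrete, here is a minimal Perl sketch that models a grammar only by its parseable set, i.e. as a predicate on strings. The two regexes standing in for I (integers) and F (floats), and the helper names intersection() and union(), are assumptions made for illustration; F is assumed to accept integers as well, which is what makes the intersection equal to I.

```perl
use strict;
use warnings;

# Hypothetical recognizers: each "grammar" is reduced to its parseable set.
my $I = sub { $_[0] =~ /\A[+-]?\d+\z/ ? 1 : 0 };                       # integers
my $F = sub { $_[0] =~ /\A[+-]?(?:\d+(?:\.\d*)?|\.\d+)\z/ ? 1 : 0 };   # floats, integers included

# Set operations on recognizers; both are commutative by construction.
sub intersection { my ( $p, $q ) = @_; return sub { $p->( $_[0] ) && $q->( $_[0] ) ? 1 : 0 } }
sub union        { my ( $p, $q ) = @_; return sub { $p->( $_[0] ) || $q->( $_[0] ) ? 1 : 0 } }

my $I_and_F = intersection( $I, $F );
my $F_and_I = intersection( $F, $I );
my $I_or_F  = union( $I, $F );

for my $input (qw(42 -7 3.14 .5 abc)) {
    printf "%-5s I*F=%d F*I=%d union=%d\n",
        $input, $I_and_F->($input), $F_and_I->($input), $I_or_F->($input);
}
```

In this model the order of operands never matters, which is exactly where it differs from ordered choice in parser combinators.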
And then, if a problem domain can be reduced to an algebra of parseable entities, it must be parsed by an algebra of grammars with all benefits of mixing, matching, and reusing grammars at will.
This all probably looks too trivial or too vague, but it starts making much more sense (to me at least) when doing general practical BNF parsing with Marpa::R2 now that grammars like this see the light of day.
It is made possible by the check_terminal(), rule_ids(), and rule() accessors provided by Marpa::R2::Grammar.
It works because token names to be read() by Marpa::R2::Recognizer must be terminal symbols of Marpa::R2::Grammar.
A literal (and terminal, in some cases) fits the definition of a token perfectly, so tokenizing input by (pre-)splitting on literals (terminals) looks obvious, but I was unable to find definitive links and so feel a bit uneasy as to whether or not this would work in general.
But more testing will show the truth, I think. :)
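The pre-splitting idea can be sketched in plain Perl. The list of literals below is hypothetical and hard-coded; in the setup described above it would be collected from the grammar's terminals instead, and this sketch does not call any Marpa::R2 method.

```perl
use strict;
use warnings;

# Hypothetical literals; assumed to be the grammar's literal terminals.
my @literals = ( '+', '*', '(', ')' );

# Longest literals first, so a longer literal is never split by a shorter one.
my $alternation = join '|', map { quotemeta } sort { length $b <=> length $a } @literals;

my $input = '2 + 3 * (4 + 5)';

# The capturing group makes split() return the literals themselves as well;
# grep drops the empty fields left between adjacent separators.
my @tokens = grep { length } split /\s*($alternation)\s*/, $input;

print join( ' | ', @tokens ), "\n";
```

Whether this holds up in general depends on the literals never occurring inside other tokens, which is the open question raised above.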
Is it possible to make timeit return the time spent to be used as part of the explanation of a test later, like this:
my $tree_expected = 'some fancy stuff';
my $tree_got;
my $time_spent = timeit( sub { $tree_got = $p->parse($input) } );
is $tree_got, $tree_expected, "parsed in $time_spent seconds";
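For illustration only, a helper with that behavior could look like the sketch below. The name timeit_sketch is hypothetical and this is not the existing timeit; it simply wraps a code reference with Time::HiRes (a core module) and returns the elapsed wall-clock seconds.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run the code reference once, return elapsed seconds.
sub timeit_sketch {
    my ($code) = @_;
    my $start = [gettimeofday];
    $code->();
    return tv_interval($start);    # floating-point seconds since $start
}

my $sum;
my $time_spent = timeit_sketch( sub { $sum = 0; $sum += $_ for 1 .. 1000 } );
printf "summed to %d in %.6f seconds\n", $sum, $time_spent;
```

The returned value can then be interpolated into a test description the same way as $time_spent above.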
As for natural language parsing: couldn't agree more, use of Earley in comp-ling notwithstanding. :)
For the purposes of the above definition, [1] below is an alternation with E being the symbol appearing on the LHS of more than one rule, is it not?
However, it also serves to showcase ambiguity in Marpa's documentation and, as such, "can produce more than one parse".
Hence, [1] somehow serves as both alternation and ambiguity.
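A small plain-Perl sketch, independent of Marpa, may help show how the rules in [1] are both at once: the three rules share one LHS (alternation), and because nothing fixes precedence or associativity, one input can be derived in more than one way (ambiguity). The function parse_values is a hypothetical brute-force enumerator written for this illustration, not a parsing algorithm anyone would use in practice.

```perl
use strict;
use warnings;

# Enumerate the value of every parse of $tokens->[$lo .. $hi] under
#   E ::= E Add E | E Multiply E | Number
sub parse_values {
    my ( $tokens, $lo, $hi ) = @_;
    return ( $tokens->[$lo] ) if $lo == $hi;    # E ::= Number

    my @values;
    # Every operator in the span may serve as the top-level rule.
    for ( my $op = $lo + 1 ; $op < $hi ; $op += 2 ) {
        for my $left ( parse_values( $tokens, $lo, $op - 1 ) ) {
            for my $right ( parse_values( $tokens, $op + 1, $hi ) ) {
                push @values,
                    $tokens->[$op] eq '+' ? $left + $right : $left * $right;
            }
        }
    }
    return @values;
}

my @tokens = ( 2, '+', 3, '*', 4 );
my @values = parse_values( \@tokens, 0, $#tokens );
print scalar(@values), " parses: @values\n";    # prints "2 parses: 14 20"
```

The two results correspond to 2 + (3 * 4) and (2 + 3) * 4.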
But then, another question: suppose a lexeme cannot be classified unambiguously at lexing time but can be so classified at parse time based on the preceding/succeeding lexemes, as e.g. in "they are jogging" where 'jogging' can be a gerund or a present participle and that ambiguity is resolved by the preceding 'are'.
Can Marpa, given rules like
rules => [
[ 'Sentence', [qw/Pronoun Verb PresentParticiple/], 'do_sentence' ],
],
and token stream like
$recce->read( 'they', 'Pronoun');
$recce->read( 'are', 'Verb');
$recce->read( 'jogging', 'Gerund', 'PresentParticiple'); # ambiguous lexeme
parse "they are jogging" as 'Sentence', or am I just asking too much and knocking on the wrong door?
[1]
rules => [
[ 'E', [qw/E Add E/], 'do_add' ],
[ 'E', [qw/E Multiply E/], 'do_multiply' ],
[ 'E', [qw/Number/], ],
],
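To spell out what is being asked about the ambiguous lexeme, here is a plain-Perl sketch that does not use Marpa at all: each token carries every candidate type, and the rule's right-hand side decides which candidate survives. The helper match_rule and the data layout are hypothetical; whether and how Marpa supports the same thing is exactly the open question.

```perl
use strict;
use warnings;

# RHS of: Sentence ::= Pronoun Verb PresentParticiple
my @rule_rhs = qw(Pronoun Verb PresentParticiple);

# Each token: the word followed by all of its candidate types.
my @tokens = (
    [ 'they',    'Pronoun' ],
    [ 'are',     'Verb' ],
    [ 'jogging', 'Gerund', 'PresentParticiple' ],    # ambiguous lexeme
);

# Hypothetical helper: returns "word/type" pairs if the rule matches,
# or an empty list if some token has no candidate type the rule accepts.
sub match_rule {
    my ( $rhs, $tokens ) = @_;
    return if @$rhs != @$tokens;
    my @chosen;
    for my $i ( 0 .. $#$rhs ) {
        my ( $word, @types ) = @{ $tokens->[$i] };
        my ($type) = grep { $_ eq $rhs->[$i] } @types;
        return unless defined $type;
        push @chosen, "$word/$type";
    }
    return @chosen;
}

my @parse = match_rule( \@rule_rhs, \@tokens );
print "Sentence: @parse\n";
```

Here 'jogging' is resolved to PresentParticiple only because the rule expects that type in third position.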
My apologies in advance if this is just another stupid question, but I can't help thinking that what you call ambiguity (as per [1]) is really a BNF alternation in disguise, all the more so since Marpa::Grammar seems to provide no direct support for BNF alternations, unless I'm missing something.
I'd be very grateful for an explanation and thanks a lot for all of your work.
[1]
rules => [
[ 'E', [qw/E Add E/], 'do_add' ],
[ 'E', [qw/E Multiply E/], 'do_multiply' ],
[ 'E', [qw/Number/], ],
],