Now, backtracking is tricky to get right. Currently Parrotlog has a problem with cuts. A cut is supposed to remove all choice points encountered after the currently executing predicate was invoked. Parrotlog's cut prunes all choice points since the invocation of the last predicate that matched successfully. Close, but no cigar.
Graham's CL implementation (which Parrotlog owes quite a bit to) solves this by taking a reference to the choice point stack on invocation and just restoring that on cut. That works because Lisp lists are chains of cons cells, so a saved reference still points at the old stack no matter what gets pushed later. Parrot RPAs are arrays that get modified in place, so I'll have to be a bit more clever. At the moment, I've two possible outs: 1) Copy the whole stack on invocation. Not very sophisticated, but it works. 2) Change the mark method to generate a unique mark each time the stack is marked, so that cut can backtrack arbitrarily far into the stack, as long as you know the ID of the destination.
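To make option 2 a bit more concrete, here's a rough sketch in Python rather than PIR. Everything in it (the `ChoiceStack` class and its method names) is made up for illustration and is not Parrotlog's actual code; plain closures stand in for the continuations.

```python
import itertools


class ChoiceStack:
    """Choice point stack whose marks carry unique ids (hypothetical API)."""

    def __init__(self):
        self._entries = []               # ("mark", id) or ("choice", resume)
        self._next_id = itertools.count(1)

    def push(self, resume):
        """Record a choice point; resume is a zero-argument callable."""
        self._entries.append(("choice", resume))

    def mark(self):
        """Push a fresh mark and return its id; call on predicate invocation."""
        mark_id = next(self._next_id)
        self._entries.append(("mark", mark_id))
        return mark_id

    def cut(self, mark_id):
        """Drop the mark with this id and everything pushed after it."""
        for i in range(len(self._entries) - 1, -1, -1):
            if self._entries[i] == ("mark", mark_id):
                del self._entries[i:]
                return
        raise KeyError(f"no mark {mark_id} on the stack")

    def fail(self):
        """Backtrack to the most recent choice point, skipping stale marks."""
        while self._entries:
            kind, value = self._entries.pop()
            if kind == "choice":
                return value()
        return None  # nothing left to try

    def __len__(self):
        return len(self._entries)


stack = ChoiceStack()
stack.push(lambda: "earlier alternative")
outer = stack.mark()                      # the outer predicate is invoked
stack.push(lambda: "outer clause 2")
inner = stack.mark()                      # a subgoal is invoked and succeeds
stack.push(lambda: "inner clause 2")
stack.cut(outer)                          # a cut in the outer predicate's body
result = stack.fail()                     # resumes "earlier alternative"
```

Because the cut names the mark it wants, it doesn't matter how many other predicates have pushed their own marks in the meantime.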
I think I'm gonna take a look at the materials I have on the WAM as well, see if I can't get some inspiration from how cuts are implemented there. But first, I have to decide if I want sushi or crêpes for dinner. Choices, choices...
There's a module with the same name on CPAN, but I've based the code on a Common Lisp version I wrote for school a while back. At the moment the module is pretty basic, and doesn't support assigning probabilities to unseen data, for example. The module is available on GitHub.
A more advanced version will support computing the sum of the log-probabilities rather than the product, smoothing and unobserved data, and the option of mixing in a role to domain objects so that they can be passed directly to the decode method from the client code.
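A common reason for summing log-probabilities instead of multiplying probabilities is floating-point underflow on long sequences. Here's a small Python sketch of the effect; the function names are invented for the example and have nothing to do with the module's interface.

```python
import math


def path_probability(step_probs):
    """Probability of a path as the plain product of its step probabilities."""
    return math.prod(step_probs)


def path_log_probability(step_probs):
    """The same quantity in log space: a sum of logs instead of a product."""
    return sum(math.log(p) for p in step_probs)


# 400 steps that each have probability 0.01: the true product is 1e-800,
# which is far too small for a double, so the product collapses to 0.0.
steps = [0.01] * 400
product = path_probability(steps)
log_sum = path_log_probability(steps)
```

Since the logarithm is monotonic, comparing two paths by their log sums picks the same winner as comparing the products would, but without every long path ending up as zero.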
Now, the good news is that the Prolog spec actually defines its syntax as an operator precedence grammar, which happens to be how NQP does its expression parsing as well. The bad news is that the spec uses term for everything, while NQP makes a distinction between terms (atomic expressions) and expressions (expressions, with or without operators). This means that I have to figure out whether I should use term or EXPR whenever the spec says term. Let's see how deep the rabbit hole is.
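For the curious, here's a tiny precedence-climbing parser in Python that handles a handful of Prolog's infix operators. It's only an illustration of the idea, not the NQP grammar: `parse_primary` loosely plays the role of an atomic term and `parse` the role of an expression, the token format (whitespace-separated) is a simplification I've assumed, and prefix and postfix operators are left out.

```python
# Infix operators only: name -> (priority, type), as in the standard op table.
OPS = {
    ":-": (1200, "xfx"),
    ",": (1000, "xfy"),
    "=": (700, "xfx"),
    "+": (500, "yfx"),
    "-": (500, "yfx"),
    "*": (400, "yfx"),
}


def parse_primary(tokens):
    """Parse an atom or a parenthesised expression (priority 0 either way)."""
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens, 1200)
        tokens.pop(0)  # discard the closing ")"
        return inner
    return token


def parse(tokens, max_priority=1200):
    """Parse an expression whose priority is at most max_priority."""
    left, left_priority = parse_primary(tokens), 0
    while tokens and tokens[0] in OPS:
        priority, kind = OPS[tokens[0]]
        # An "x" argument must have strictly lower priority than the operator,
        # a "y" argument may have the same priority.
        left_max = priority if kind[0] == "y" else priority - 1
        right_max = priority if kind[2] == "y" else priority - 1
        if priority > max_priority or left_priority > left_max:
            break
        op = tokens.pop(0)
        right = parse(tokens, right_max)
        left, left_priority = (op, left, right), priority
    return left


def parse_text(text):
    """Parse a whole whitespace-separated string, rejecting leftovers."""
    tokens = text.split()
    tree = parse(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return tree
```

With this, `parse_text("a - b - c")` nests to the left (yfx), `parse_text("a , b , c")` nests to the right (xfy), and `parse_text("a = b = c")` is rejected because `=` is xfx.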
Now, as I've mentioned before, in Parrotlog this is implemented using continuations, based on example code from Graham's On Lisp book (chapter 22). Simply put, continuations allow you to restore the execution of your program to a previous state. For the C programmers, this is similar to setjmp(3) and longjmp(3), except that returning from the originating function doesn't invalidate the saved state. On Lisp chapter 20 has more about continuations, and so do the Parrot docs.
This, then, is the core of our backtracking, and with continuations it's actually pretty simple. Each time we encounter a choice point, we just store a continuation which will try a different value from the one we're going with right now. Failure is then just a matter of popping the top continuation off the stack (since we want to backtrack to the last choice point encountered, the LIFO semantics of a stack are what we want) and invoking it, returning execution to the choice point. We do this until we find a match, or we backtrack out of the search entirely.
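Here's roughly what that looks like in Python. Python has no first-class continuations, so this sketch swaps in continuation-passing style: the "rest of the computation" is passed around as an ordinary function `k`, and the stack holds closures. The names `choose`, `fail` and `find_pair` are mine, picked for the example.

```python
def fail(stack):
    """Backtrack: pop the most recent choice point and resume it."""
    if not stack:
        return None  # backtracked out of the search entirely
    resume = stack.pop()
    return resume()


def choose(stack, options, k):
    """Go with the first option, remembering how to try the rest later."""
    if not options:
        return fail(stack)
    first, rest = options[0], options[1:]
    # This closure stands in for the saved continuation at the choice point.
    stack.append(lambda: choose(stack, rest, k))
    return k(first)


def find_pair(total):
    """Find x and y in 0..5 with x + y == total, or None if there are none."""
    stack = []
    digits = list(range(6))

    def with_x(x):
        def with_y(y):
            if x + y == total:
                return (x, y)
            return fail(stack)  # wrong guess: back to the last choice point
        return choose(stack, digits, with_y)

    return choose(stack, digits, with_x)
```

Calling `find_pair(7)` first exhausts every `y` for `x = 0` and `x = 1`, each failure popping the stack, before it lands on `(2, 5)`.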
Then we have cuts. A cut can be seen as pruning the search tree, or committing to some of the choices that have been made. We implement this just like Graham's Scheme code does: we mark the limit of the cut with mark(), which stores a subroutine reference to fail() on the stack, so that failure still consists of popping the top element off the stack and invoking it. Popping the mark will just result in a recursive call to fail(). cut() simply pops items off the stack until it finds the mark.
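A minimal Python sketch of mark(), fail() and cut(), with the stack passed explicitly as the first argument. This is an illustration and not the PIR from Parrotlog; in particular, the `alternative` helper is a made-up stand-in for a saved continuation, and I've assumed cut() removes the mark itself as well.

```python
def fail(stack):
    """Pop the top entry and invoke it, passing the stack along."""
    if not stack:
        return None  # backtracked out of the search entirely
    return stack.pop()(stack)


def mark(stack):
    """Push fail itself as the mark, so popping a mark just fails again."""
    stack.append(fail)


def cut(stack):
    """Pop entries up to and including the nearest mark."""
    while stack:
        if stack.pop() is fail:
            break


def alternative(name):
    """Stand-in for a saved continuation: resuming it just reports its name."""
    return lambda stack: name


stack = [alternative("a")]
mark(stack)
stack.append(alternative("b"))
stack.append(alternative("c"))
cut(stack)                 # commits: "b" and "c" are gone, and so is the mark
after_cut = fail(stack)    # resumes "a"
```

Without the cut, the same fail() would have resumed "c" instead.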
The main difference between my code and Graham's (apart from the fact that my code is PIR and his is Scheme) is that I explicitly thread the stack of continuations through the various functions as their first argument. This is based primarily on a gut feeling that it might come in handy at some later point. Someone once told me that global variables were a bad idea, and I've found that to be right most of the time. Primarily I think it might be useful for the metalogical predicates like findall/3, where the code will have several nested backtracking searches. I think a single global stack might work, but I'm not sure, and I think an explicit stack would make that code clearer anyways.
This means that the core infrastructure I need should now be in place: unification, backtracking and cuts (a post on those last two is coming up). Now it's time to start looking into writing the grammar, and figuring out how to represent rules and the fact database.
Now, on to the implementation. Most of the documents I've found tell you to start your Parrot HLL project with a script called mk_language_shell.pl, but I found that it doesn't do quite what it says on the tin. Instead I used tools/dev/create_language.pl. This script creates a basic folder hierarchy similar to the one used by Rakudo. A quick tour of the files and folders:
build/ contains everything that has to do with the build process. Most interesting is PARROT_REVISION, which specifies which Parrot is required, and Makefile.in, which is where the build process can be extended.
Configure.pl does what it's called. Call it with --gen-parrot to build the required Parrot version as well.
t/ and src/ contain the usual bits.
EDIT: I lied. Backtracking is chapter 22 of Graham's book. Chapter 20 is continuations.
To try to suppress my natural tendency to move from subject to subject (being a fox and not a hedgehog) I'm going to keep a development diary here. Also, I've found that the documentation on implementing HLLs with Parrot is either thin on the ground or a bit dated. So hopefully some of my scribblings turn out to be useful to someone else.
I've started work on the project, and keep a repository on GitHub. If you're interested, have a look at http://github.com/arnsholt/parrotlog.