Count-up to 100 CPAN Distributions: Test-XML-Ordered, SDLx-Betweener, and more
Well, it's no longer a secret that I'm craving to join the "100 CPAN Distributions Club" by releasing some code that is hopefully not too useless. Here I would like to blog about the two new additions to my CPAN page, which are my 83rd and 84th distributions respectively, although the story is a bit more complicated than that.
The first upload is SDLx-Betweener, which allows for creating high-performance and smooth animations in SDL by making use of Perl/XS. Almost all of the coding (and a related YAPC::Israel talk) was done by Ran Eilam, who is a very cool guy, a good friend and a former boss of mine. I only did some last-minute cleanups (with his permission) and the initial upload to CPAN. So I kinda feel like the frog that sat on top of the elephant who crossed the river and said "We did it!".
However, I needed SDLx::Betweener for something I've been intending to work on, and wanted it on CPAN and packaged in Mageia Linux, so I did that. If you run into any problems with it, drop me a note or file a bug report, and hopefully I can resolve these problems (possibly with some help from Ran).
The next distribution is original, but its story is more complicated. It all started when I decided to resume work on Qantor, which aims to eventually be a modern and saner alternative to TeX/LaTeX and Troff (although it is still extremely far from that now). I noticed that the parsing was still done by Regexp::Grammars, and so decided to convert it to Parser::MGC. Parser::MGC worked eventually, and I was impressed by how straightforward it is to do things with it, but it was time-consuming to get there (like most other parser generators I tried, only a bit better). It involved writing some Moose 'around' code to debug the method calls, and also sometimes delving into the code. Parser::MGC is still not perfect (and some of the code I read there had gems like die $e if $committed or not eval { $e->isa( "Parser::MGC::Failure" ) }; which I found hard to parse, and made me want to drown a kitten), but I think it sucks less and is more transparent than other parser generators I tried.
Anyway, after I got it working, I noticed that the Test::XML::Ordered code (then still unreleased) triggered some memory problems, and I suspected either XML::LibXML or libxml2 to be the culprit. This involved a long investigation with valgrind, gdb, perl -d and other tools, which culminated in a very small change and more lines of test code. The fix was released in XML::LibXML. After that I received a report that the tests got stuck, but that turned out to be due to external loading of DTDs and was easily fixed. There's another report of a problem with perl-5.8.8 (ouch!) on Red Hat Enterprise Linux (ouch again!), but I'm not too motivated to do a lot about it (see this fortune cookie).
Well, after I fixed the problem with XML::LibXML, I decided to release Test::XML::Ordered. I'm well aware of Test::XML, but found it hard to rely on, because it uses XML::SemanticDiff (which I adopted), and that module attempts to rearrange the order of the nodes when it sees fit (so <ul><li>One</li><li>Two</li></ul> and <ul><li>Two</li><li>One</li></ul> may be considered the same). This was not what I wanted, and not what other people who contacted me about XML::SemanticDiff wanted.
After Test::XML::Ordered, I decided it would be a good idea to finally work on the release of XML::GrammarBase, which aims to provide "base classes and roles" to facilitate creating XML validators and processors. Work on it was pretty straightforward once I had Test::XML::Ordered available, but it required refactoring the classes from using Any::Moose to using Moo, on the advice of the people on #moose, and I have yet to convert the XSLT role into a parameterised role.
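To make the difference concrete, here is a minimal sketch of an order-sensitive comparison next to an order-insensitive one. It is written in Python with only the standard library, and it is not the API of Test::XML::Ordered, Test::XML or XML::SemanticDiff: the function names same_ordered, canonical and same_unordered are made up for illustration, and the sketch ignores details such as tail text, namespaces and comments.

```python
import xml.etree.ElementTree as ET


def same_ordered(a, b):
    """True if both elements match in tag, attributes, text and child order."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if len(a) != len(b):
        return False
    # Children are compared pairwise, so their order matters.
    return all(same_ordered(x, y) for x, y in zip(a, b))


def canonical(e):
    """Order-insensitive form: children are sorted by their own canonical form."""
    return (
        e.tag,
        tuple(sorted(e.attrib.items())),
        (e.text or "").strip(),
        tuple(sorted(canonical(c) for c in e)),
    )


def same_unordered(a, b):
    """True if both elements match once sibling order is disregarded."""
    return canonical(a) == canonical(b)


one_two = ET.fromstring("<ul><li>One</li><li>Two</li></ul>")
two_one = ET.fromstring("<ul><li>Two</li><li>One</li></ul>")

print(same_unordered(one_two, two_one))  # the lists count as equal
print(same_ordered(one_two, two_one))    # the lists count as different
```

For document-like XML such as lists, paragraphs or sections, sibling order is part of the meaning, which is why the order-sensitive check is the one wanted here.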
I have some other code I'd like to release on CPAN after all that.
Another related hacktivity was a set of patches to the core Vim that I've written to improve support for DocBook 5. Since I wanted to maintain compatibility with DocBook 4 documents (because DocBook 4 is still popular), this involved writing an amalgam of Perl, Bash and Python to generate the Vim code in question from lists of common, DocBook 4-only and DocBook 5-only tags. You can find it in its BitBucket repository. It is very hacky, but it works. Using the Python code gave me an idea for a missing XML::LibXML feature - DTD introspection - which I'd like to implement in the future.
Well, that's all for now - just wanted to get it out of my system. Enjoy the holidays, and Cheers!
Why generate a tool to write/generate recursive descent parsers instead of using Marpa and its ecosystem?
Hi Jakub, thanks for your comment.
I should note that in general, I have yet to study Marpa extensively (though I looked into it a bit). One thing I dislike about it is that it still requires a separate tokenising/lexing stage instead of allowing one to use regular expressions for tokens/terminals, which is what Parse::RecDescent and Parser::MGC allow.
Another aspect of Parser::MGC that I like is its transparency, and the fact that I don't have to put my grammar in one big string, and can instead write it piecemeal using a combination of OOP, inheritance and closures. I do hope to take a closer look at Marpa, though.
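As a rough illustration of that style, here is a minimal Python sketch of a recursive-descent parser whose terminals are plain regular expressions matched at the current position, with no separate lexing stage, and whose grammar is extended by subclassing. This is not Parser::MGC's API: BaseParser, expect, maybe, SumParser and SumDiffParser are hypothetical names invented for the example.

```python
import re


class ParseError(Exception):
    pass


class BaseParser:
    """Hypothetical minimal recursive-descent base class."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def skip_ws(self):
        self.pos = re.compile(r"\s*").match(self.text, self.pos).end()

    def expect(self, pattern):
        """Match a regex terminal at the current position and consume it."""
        self.skip_ws()
        m = re.compile(pattern).match(self.text, self.pos)
        if m is None:
            raise ParseError(f"expected /{pattern}/ at offset {self.pos}")
        self.pos = m.end()
        return m.group(0)

    def maybe(self, pattern):
        """Like expect(), but return None and consume nothing on failure."""
        saved = self.pos
        try:
            return self.expect(pattern)
        except ParseError:
            self.pos = saved
            return None


class SumParser(BaseParser):
    """Grammar rules are ordinary methods: sum := number ('+' number)*"""

    def parse_number(self):
        return int(self.expect(r"\d+"))

    def parse_sum(self):
        total = self.parse_number()
        while self.maybe(r"\+") is not None:
            total += self.parse_number()
        return total


class SumDiffParser(SumParser):
    """Extends the grammar by inheritance so that '-' is accepted too."""

    def parse_sum(self):
        total = self.parse_number()
        while True:
            op = self.maybe(r"[+-]")
            if op is None:
                return total
            n = self.parse_number()
            total = total + n if op == "+" else total - n


print(SumParser("1 + 2 + 39").parse_sum())
print(SumDiffParser("10 - 3 + 1").parse_sum())
```

Because each rule is a method, a subclass can override one rule and reuse the rest, and a failing rule can be traced with ordinary debugging tools instead of inspecting a grammar held in a string.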
Replying to myself, I would like to note that the Marpa scanless interface was announced, so it's one less reason for me not to use it. Thanks!