Count-up to 100 CPAN Distributions: Test-XML-Ordered, SDLx-Betweener, and more

By Shlomi Fish on December 6, 2012 5:49 PM

Well, it's no longer a secret that I'm craving to join the "100 CPAN Distributions Club" by releasing some code that is hopefully not too useless. Here I would like to blog about the two new additions to my CPAN page which are the 83rd and 84th distributions respectively. The story is a bit more complicated than that.

The first upload is SDLx-Betweener, which allows for creating high-performance and smooth animations in SDL by making use of Perl/XS. Almost all of the coding (and a related YAPC::Israel talk) was done by Ran Eilam, who is a very cool guy, a good friend and a former boss of mine. I just did (with his permission) some last-minute cleanups and the initial upload to CPAN. So I kinda feel like the frog that sat on top of the elephant who crossed the river and said "We did it!".

However, I needed SDLx::Betweener for something I've been intending to work on, and wanted it on CPAN and packaged in Mageia Linux, so I did that. If you run into any problems with it, drop me a note or file a bug report, and hopefully I can resolve these problems (possibly with some help from Ran).

The next distribution is original, but its story is more complicated. It all started when I decided to resume work on Qantor, which aims to eventually be a modern and saner alternative to TeX/LaTeX and Troff (although it is still extremely far from that now). I noticed that the parsing was still done by Regexp::Grammars, and so decided to convert it to Parser::MGC. Parser::MGC worked eventually and I was impressed by how straightforward it is to do stuff with it, but it was time-consuming to get there (like with most other parser generators I tried, only a bit better). It involved writing some Moose 'around' code to debug the method calls, and also sometimes delving into the code. Parser::MGC is still not perfect (some of the code I read there had gems like die $e if $committed or not eval { $e->isa( "Parser::MGC::Failure" ) }; which I found hard to parse, and made me want to drown a kitten), but I think it sucks less and is more transparent than other parser generators I tried.

Anyway, after I got it working, I noticed that the Test::XML::Ordered code (then still unreleased) triggered some memory problems, and I suspected either XML::LibXML or libxml2 to be the culprit. This involved a long investigation with valgrind, gdb, perl -d and other tools, which culminated in a very small change and more lines of test code. This was released in XML::LibXML. After that I received a report that the tests got stuck, but that turned out to be due to external loading of DTDs and was easily fixed. There's another report for a problem with perl-5.8.8 (ouch!) on Red Hat Enterprise Linux (ouch again!), but I'm not too motivated to do a lot about it (see this fortune cookie).

Well, after I fixed the problem with XML::LibXML, I decided to release Test::XML::Ordered. I'm well aware of Test::XML, but found it hard to rely on, because it uses XML::SemanticDiff (which I adopted), which attempts to rearrange the order of the nodes when it sees fit (so <ul><li>One</li><li>Two</li></ul> and <ul><li>Two</li><li>One</li></ul> may be considered the same). That was not what I wanted, and not what other people who contacted me about XML::SemanticDiff wanted.
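To make the difference between the two kinds of comparison concrete, here is a minimal sketch in Python, using only the standard library's xml.etree.ElementTree. It is an illustration of the idea only, and not how Test::XML::Ordered or XML::SemanticDiff are actually implemented; the function names ordered_equal and unordered_equal are made up for this example.

```python
import xml.etree.ElementTree as ET


def _text(value):
    """Normalise missing or whitespace-only text to an empty string."""
    return (value or "").strip()


def ordered_equal(a, b):
    """Compare two elements recursively, requiring children in the same order."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if _text(a.text) != _text(b.text) or _text(a.tail) != _text(b.tail):
        return False
    if len(a) != len(b):
        return False
    return all(ordered_equal(x, y) for x, y in zip(a, b))


def _canonical(elem):
    """Build an order-independent summary of an element (hypothetical helper)."""
    children = sorted(_canonical(child) for child in elem)
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        _text(elem.text),
        _text(elem.tail),
        tuple(children),
    )


def unordered_equal(a, b):
    """Compare two elements while ignoring the order of sibling nodes."""
    return _canonical(a) == _canonical(b)


one_two = ET.fromstring("<ul><li>One</li><li>Two</li></ul>")
two_one = ET.fromstring("<ul><li>Two</li><li>One</li></ul>")

print(ordered_equal(one_two, two_one))    # False
print(unordered_equal(one_two, two_one))  # True
```

The order-insensitive check treats the two lists as the same document, which is the behaviour I wanted to avoid in my tests.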

After Test::XML::Ordered, I decided it would be a good idea to finally work on the release of XML::GrammarBase, which aims to provide "base classes and roles" to facilitate creating XML validators and processors. Work on it was pretty straightforward after I had Test::XML::Ordered available, but it required refactoring the classes from using Any::Moose to using Moo, per the advice of the people on #moose, and I have yet to convert the XSLT role into a parameterised role.

I have some other code I'd like to release on CPAN after all that.

Another related hacktivity was a set of patches to the core Vim that I've written to improve support for DocBook 5. Since I wanted to maintain compatibility with DocBook 4 documents (because DocBook 4 is still popular), this involved writing a Perl, Bash and Python amalgam to generate the Vim code in question from a list of common, DocBook 4-only and DocBook 5-only tags. You can find it in its BitBucket repository. It is very hacky, but it works. Using the Python code gave me an idea for a missing XML::LibXML feature - DTD introspection - which I'd like to implement in the future.

Well, that's all for now - just wanted to get it out of my system. Enjoy the holidays, and Cheers!

Tagged as:

100, cpan, hacktivity, hacktivity log, perl, sdlx, xml

3 Comments

Jakub Narebski | December 10, 2012 11:28 AM

Why generate a tool to write/generate recursive descent parsers instead of using Marpa and its ecosystem?

Shlomi Fish | December 30, 2012 7:19 AM

Hi Jakub, thanks for your comment.

I should note that in general, I have yet to study Marpa extensively (though I looked into it a bit). One thing I dislike about it is that it still requires a separate tokenising/lexing stage instead of allowing one to use regular expressions for tokens/terminals, which is what Parse::RecDescent and Parser::MGC allow.

Another aspect of Parser::MGC that I like is its transparency, and the fact that I don't have to put my grammar in one big string, and can instead write it piecemeal using a combination of OOP, inheritance and closures. I do hope to take a closer look at Marpa, though.

Shlomi Fish replied to comment from Shlomi Fish | January 3, 2013 4:22 PM

Replying to myself, I would like to note that the Marpa scanless interface was announced, so it's one less reason for me not to use it. Thanks!


About Shlomi Fish

An Israeli software developer, essayist, and writer, and an enthusiast of open/free software and cultural works. I've been working with Perl since 1996.
