When a solution has the same shape
as the problem,
it is a very good thing,
and not just because it looks pretty.
I have described
a Marpa-based, "Ruby Slippers"
approach to parsing liberal
and defective HTML.
A major advantage
is that it looks like
the problem it solves.
HTML parsing: the problem
The problem of parsing an HTML document is
the problem of finding
the hierarchy of its HTML elements.
HTML elements are delimited by start and end tags.
The HTML standards specify that certain of the
start and end tags can be omitted.
In liberal and defective HTML,
any HTML tag might be missing.
In liberal and defective HTML,
unknown and spurious tags
may be present in the physical input.
HTML parsing: the solution
This is a shot of the rooftop terrace on the Pyle Center roof where the YAPC::NA 2012 banquet will be held on Wednesday, June 13th.
I recently released a couple of API clients for the Ge.tt file sharing service, one in Perl and one in Python. (I am just a fan of the service, not an employee or contractor.) I would judge myself an "intermediate" pythonista mostly due to inexperience.
It's a culture shock coming from a background of CPAN. The old joke is that Perl is just a life support system for CPAN and that is arguably true, but I am here to tell you: you may not appreciate how good Perl hackers have it with respect to CPAN and the culture around documenting, packaging and testing distros once they're on CPAN.
A while ago I was converting a simple PHP website to Dancer,
and moving it from being deployed on Apache to Starman.
There wasn't a lot of code,
so the rewrite went quickly, but
the site used a few specific features of Apache,
namely directory indexes (courtesy of mod_autoindex) to allow user access to directories/files on the server,
and htpasswd files to password-protect some of those directories.
I could just deploy the new Dancer website on Apache and keep using those goodies,
but I thought that it would be nice if Dancer itself provided similar features.
I created two plugins that do just that: Dancer::Plugin::DirectoryView and Dancer::Plugin::Auth::Htpasswd.
Let me now show you how to use them.
It was not that difficult getting my first distribution released on CPAN. But getting rid of the rough edges meant spending time with my favourite search engine and on IRC.
My code is hosted on GitHub and I decided to let Dist::Zilla handle most of the work associated with releasing. Oh, and I already had a PAUSE account, so I could start right away.
Using Dist::Zilla turned out to be easy. I installed it, followed the tutorial for a few minutes, and took my new distribution for a first spin:
first a test run to make sure everything works (by the way, I finally got around to writing tests, something I have avoided for far too long),
then a build so I can have a look at the final result,
and finally a release to publish the first development release.
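A minimal sketch of those three steps (test, build, publish), assuming the stock `dzil` subcommands `test`, `build`, and `release`. The wrapper function name `dzil_first_release` is hypothetical, and the `--trial` flag is an assumption about how the development release was marked; the post itself does not say which options were used.

```shell
# Hypothetical wrapper; run it from the root of a distribution that has a dist.ini.
dzil_first_release() {
    # Run the distribution's test suite against a temporary build.
    dzil test || return 1

    # Build the distribution (directory plus tarball) so the result can be inspected.
    dzil build || return 1

    # Assumption: mark the upload as a trial (development) release.
    dzil release --trial
}
```

Calling `dzil_first_release` stops at the first failing step, so nothing is uploaded to PAUSE unless the tests and the build succeed.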
I took over maintenance of Pod-Perldoc, and with the help of a lot of people, I'm ready to merge it into the perl sources. There aren't major new features or a change in structure. I applied a lot of patches. Pod-Perldoc-3.15_12 is on CPAN so you can play with it. This next release fixes 15 old RT tickets, some of them major problems but most of them quick fixes that I merely needed to apply. This week is a good time to test it as the perl sources have a contentious code freeze before Perl 5.16.
The biggest change gets rid of pod2man, which perldoc was using before it turned into the Pod::Man module. Now that it's a module, we can just call it directly. I think that mostly works now, but please test it. See if you can read all of perlfunc.
I’m very pleased to announce that Best Practical Solutions is sponsoring YAPC::NA 2012. Best Practical Solutions are the creators of RT: Request Tracker, the leading open-source issue tracking system.
Best Practical was founded to deliver value to RT’s established base of users by providing custom development and user support for RT. We are fully committed to supporting RT as an open source technology, while providing the quality development and support necessary to operations in commercial enterprises and corporations.
An interesting question came up today regarding our team’s Act development for YAPC::NA: Are we going to take the effort to maintain internationalization for the new features we add? My answer was an emphatic yes!
We’re a US conference, with a team largely composed of US programmers (we have a few foreigners lending us a hand; thanks, guys!). Even so, Perl is not exclusively a US programming language; people all over the world use Perl for just about everything! The Act team has been doing a great job making their application available to people speaking a variety of languages, and I’m proud to carry on that tradition. Now, if you’ve never written internationalized code before, the prospect may seem a bit daunting; so here’s a tip you can use when working on Act: