Neil Bowers recently released a survey of Markdown-to-HTML formatters. I thought it was an interesting coincidence, because I recently wrote a CPAN library that goes the opposite way, from HTML to Markdown.
For various and sundry reasons, I wanted to move my blog from a WordPress installation to a static blog where the post content is represented as Markdown. To my complete astonishment, there were no CPAN modules to convert HTML to Markdown, so I decided to write one based on HTML::Format.
In general, I was surprised by the lack of tools (in any language) to convert WordPress exports into Markdown, but now we have something for Perl. I was pleasantly surprised by how quick and straightforward it was to implement the converter. If you need to convert HTML into format X, give HTML::Format some serious consideration as the base platform for that work.
Over the weekend, my new module was merged and released to CPAN by the HTML::Format maintainer. The driver script for the WordPress-to-Markdown conversion is here. I may revise my driver script to put post metadata into TOML, but I haven't done that yet, mostly because the static blog engine is still under construction, so the exact post format requirements are still unstable.
I used a fairly good-sized corpus of posts as tests and had good results, but more tests are always welcome.
A cpanfile is a simple way to declare your project's dependencies in a build-system-independent manner.
- In recent versions of cpanminus, it makes your entire project installable from a git repository, and,
- it also allows you to "pin" your dependencies to a specific CPAN release in a very sophisticated way, rather than "this version or newer," which is the typical Perl dependency resolution.
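To make the difference between "this version or newer" and a pinned release concrete, here is a minimal cpanfile sketch. The module names are only examples and the version numbers are placeholders I made up for illustration, not recommendations.

```perl
# cpanfile (illustrative sketch; all version numbers below are placeholders)

# Typical Perl dependency resolution: this version or newer.
requires 'Try::Tiny', '>= 0.12';

# Pinned: exactly this version and no other.
requires 'JSON::PP', '== 2.27202';

# Dependencies that are only needed while running the test suite.
on 'test' => sub {
    requires 'Test::More', '>= 0.98';
};
```

With a file like this in the project root, running `cpanm --installdeps .` asks cpanminus to install whatever the file declares.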
Why would you want to install a project from git instead of the normal CPAN download/build/test/install process? There are a lot of use cases, but the one I c…
One of the fun / cool things about Perl is that it can easily inhabit that space between "too complex for bash" and "too insignificant to invest in a C implementation." In my opinion, a lot of the command line tools for EC2 are pretty terrible: they have a steep learning curve and a large number of dependencies, and they just aren't that easy to get up and going.
So I started by asking what the "minimum viable product" for an EC2 client would be. Something that slaps a valid v2 AWS signature on any arbitrary API request and translates the XML returned into a Perl data structure. And that's…
One of the things that brian d foy worked pretty hard on for perldoc in 5.16 was better UTF-8 support. We found that a huge number of variables affect getting good Unicode support out of the "man" formatting pipeline. perldoc internally uses the "podulators" distribution to turn POD markup into man pages, HTML, XML, etc. But with the "man" formatting, the pipeline of operations looks something like this:
perldoc (a tiny little wrapper around the Pod::Perldoc module) finds the appropriate pod markup (either embedded in a .pm or a .pod), passes it to Pod::Man, takes th…