February 2011 Archives

Patching XS modules

Today, I found out about a new release of Image::Thumbnail, which has support for Image::Epeg, a wrapper around the epeg library that comes from the Enlightenment desktop manager. Epeg is likely quite fast, but the module neither built nor passed its tests under Windows. Conveniently though, Tokuhiro Matsuno maintains the module on GitHub, so it was just a matter of forking and cloning his repository, and then trying to find out what made it break.

The breakage came in three parts:

  1. A build failure where the symbols were not exported. It seems the epeg library wants -DBUILDING_DLL, which I supplied as a cc_optimize_flag through Module::Install, because I couldn't find a better way.
  2. Some test failures because binmode() was not used with the test image files. Easily fixed.

These two fixes made it into v0.11, released about 30 minutes after I told tokuhirom about the patches.
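A minimal sketch of the binmode() fix, not the actual Image::Epeg test code. The byte string and the temporary file are stand-ins for a real test image:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for image data: a few bytes that include a CR LF pair,
# which a text-mode filehandle on Windows would translate.
my $bytes = "\xFF\xD8\x0D\x0A\xFF\xD9";

my ($out, $filename) = tempfile(UNLINK => 1);
binmode $out;                 # write raw bytes, no newline translation
print {$out} $bytes;
close $out or die "Cannot close $filename: $!";

open my $in, '<', $filename or die "Cannot open $filename: $!";
binmode $in;                  # read raw bytes as well
my $read = do { local $/; <$in> };
close $in;

printf "read %d of %d bytes\n", length $read, length $bytes;
```

On Unix-like systems binmode() makes no difference for plain byte streams, which is why such bugs tend to show up only on Windows.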

  3. The remaining problem was that a function call crashed Perl with

    Free to wrong pool ... at ...

This was due to a call to free(pOut) to release a memory block that the epeg library allocated for us. Unfortunately, the library uses the C runtime's malloc and free, and Perl redefines the two, so I had to find a way to make the XS module also call the original free() and not the redefined free(). Tye McQueen pointed out a working hack: make the preprocessor forget the redefinition.

#undef free

That undefinition would not have worked if there were other places below that line that still needed access to the redefined free(), but luckily that was not the case. The only snag was my limited knowledge of C preprocessor specifics: I had some whitespace before the #undef, and it was neither picked up nor flagged as an error. After some trials I figured that out as well, and the pull request went out.

Cascading config parsing

Prompted by some comments about how to handle RAW files from cameras, I revisited App::imagestream, the program I use to automatically publish all images I touch to a gallery page on my website.

An experiment that I run with this program (and with App::fritzgrowl as well) is to specify the configuration information via POD. There is code in Config::Spec::FromPod and Config::Cascade that takes POD and turns it into a hash of hashes containing the configuration model. Then there is more code in App::ImageStream::Config to turn this model into a DSL for a config file, a parser for Getopt::Long, or simply the defaults. Config::Cascade then fills in the values, preferring the most specific values from the command line, then the less specific values from the config file, and finally the least specific values from the application defaults, all driven by the documentation.

=head2 C<< output DIR >>

=for config
    repeat => 1,

Output directory

Specifies the output directory into which the output will be written.

Example:

    output 'C:/ImageStream/';

May appear only once.
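How such a POD section could be turned into a configuration model can be sketched roughly as follows. This is not the Config::Spec::FromPod code; spec_from_pod() and the keys name, arg and summary are names made up for the illustration, and a real implementation would use a proper POD parser instead of regular expressions:

```perl
use strict;
use warnings;

# A shortened copy of the POD section shown above
my $pod = <<'POD';
=head2 C<< output DIR >>

=for config
    repeat => 1,

Output directory

Specifies the output directory into which the output will be written.

POD

# Hypothetical helper: returns a hash of hashes, one entry per config item
sub spec_from_pod {
    my ($text) = @_;
    my (%spec, $current);
    # POD paragraphs are separated by blank lines
    for my $para (split /\n\s*\n/, $text) {
        if ($para =~ /^=head2\s+C<<\s*(\w+)\s+(\w+)\s*>>/) {
            $current = $spec{$1} = { name => $1, arg => $2 };
        }
        elsif ($current and $para =~ /^=for\s+config\s+(.*)/s) {
            # The out-of-band information is embedded Perl, so it is eval'ed
            my %extra = eval $1;
            die "Bad config block for '$current->{name}': $@" if $@;
            %$current = (%$current, %extra);
        }
        elsif ($current and $para !~ /^=/ and not exists $current->{summary}) {
            # The first plain paragraph is the short description
            ($current->{summary} = $para) =~ s/\s+\z//;
        }
    }
    return \%spec;
}

my $spec = spec_from_pod($pod);
printf "%s (%s): %s, repeat=%d\n",
    @{ $spec->{output} }{qw(name arg summary repeat)};
```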

So far, the experiment has been successful in the sense that specifying the config items via POD forced me to document each configuration item immediately. Writing the DSL and Getopt evaluator was fun, and having them data-driven will make it easy to also inspect %ENV for an intermediate stage of configuration between command line and configuration file. Having the first line be a really short description and always listing an example also makes me think more about the values and how to best describe them.
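The cascade itself, including %ENV as an intermediate stage, can be sketched in a few lines. This is an illustration only and not the Config::Cascade API; cascade() is a hypothetical helper, and the size and verbose items and all values are made up:

```perl
use strict;
use warnings;

# Hypothetical helper: the first source that defines a key wins,
# so sources are passed from most specific to least specific.
sub cascade {
    my (@sources) = @_;
    my %config;
    for my $source (@sources) {
        for my $key (keys %$source) {
            $config{$key} = $source->{$key}
                unless exists $config{$key};
        }
    }
    return \%config;
}

# Made-up values for illustration
my %command_line = ( output => 'C:/tmp/preview/' );
my %environment  = ( size => 640 );
my %config_file  = ( output => 'C:/ImageStream/', size => 800 );
my %defaults     = ( output => '.', size => 1024, verbose => 0 );

my $config = cascade( \%command_line, \%environment, \%config_file, \%defaults );
print "$_ = $config->{$_}\n" for sort keys %$config;
```

Because every stage is just a hash reference, adding a new provider means adding one more argument at the right position.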

The bad thing is of course that internationalization / localization will be nasty when the program's configuration is specified through its documentation. If you ruin the POD, the program won't work anymore, as it can't read its configuration model anymore. This is currently of small concern to me as I don't plan on translating the documentation. But if some other well-meaning and unsuspecting individual translates the documentation, that'll be fun breakage at a distance.

Also bad is that my way of specifying out-of-band information implies embedding Perl again via =for blocks. This is a nasty-but-convenient hack.

In the long run, Config::Spec::FromPod should produce something suitable for consumption by Config::Model, so that I can reuse the GUI editors and the schema provided by it. But maybe I'll realize sooner that the experiment is doomed to fail, or I'll abandon the applications using it.

I haven't released the modules onto CPAN for two reasons. First, I think that, while interesting, parsing POD for code is a debatable design choice. Second, the API of Config::Cascade isn't that great yet, and it's missing the provider that fetches the values from %ENV, as well as lots of use and documentation. I've only used the module in two applications, and in my experience, if I haven't used something at least three times, the API will likely be crap.

Win32::Wlan released

Win32::Wlan has now escaped onto CPAN. I've redone the whole structure. There is now Win32::Wlan::API, which is a very thin layer over the Microsoft Wireless API. Win32::Wlan itself provides an object that handles the initialization and deinitialization of the API and provides convenient access to the first (and highly likely, only) wireless connection of the computer.

Now I have to write a convenient "snapshot" program to invoke to "mark" a place, and another program to recognize those wireless snapshots and to associate them with a location. When a location is then recognized, this could trigger scripts that configure other programs for the proper proxy settings.

Making Perl location-aware, one bit at a time

One of my long-term ideas is to make my laptop more location-aware and to have something like cron, except for locations instead of times. It will be useful to automatically reconfigure my network and proxy settings in Firefox and Thunderbird and to prevent information leakage. Also, my laptop can automatically synchronize its minicpan and git repositories with the mothership when it detects that a "fast" or "local" network connection is available.

As I'm not really interested (yet) in the physical location, it is enough for the laptop to know what wireless networks are visible to determine how it should behave. Later (or once GPS sensors become commonplace in laptops, or Perl becomes commonplace on GPS-enabled devices), adding GPS detection etc. will also become interesting.

As a first step, I've written a component that should get me a long way in the right direction. Win32::Wlan can read the list of currently visible wireless networks, and can also read what wireless network(s) the computer currently is associated with, and what encryption is available. This should be enough to drive a small rule engine that fires off scripts that handle the transitions between "no network" / "known network" / "unknown network" etc.
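Such a rule engine could look roughly like the sketch below. It does not call Win32::Wlan at all: the network names, the snapshots and the actions are made up, and the actions only record what a real script would do:

```perl
use strict;
use warnings;

# Hypothetical list of networks the laptop should recognize
my %known = map { $_ => 1 } qw(home-net office-net);

# Classify a snapshot of visible network names into one of three states
sub classify {
    my (@visible) = @_;
    return 'no network' unless @visible;
    my $hits = grep { $known{$_} } @visible;
    return $hits ? 'known network' : 'unknown network';
}

# Transition table: "old state -> new state" maps to an action.
# A real setup would launch scripts here instead of logging.
my @log;
my %on_transition = (
    'no network -> known network'      => sub { push @log, 'sync minicpan' },
    'known network -> unknown network' => sub { push @log, 'enable proxy' },
    'unknown network -> no network'    => sub { push @log, 'go offline' },
);

my $state = 'no network';
for my $snapshot ( ['home-net'], ['home-net', 'cafe'], ['cafe'], [] ) {
    my $new = classify(@$snapshot);
    next if $new eq $state;            # no transition, nothing to do
    if (my $action = $on_transition{"$state -> $new"}) {
        $action->();
    }
    $state = $new;
}
print "$_\n" for @log;
```

Keying the table on the transition rather than on the state means a script fires once when the situation changes, not on every poll.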

Win32::Wlan is in very rough shape. This is why I haven't released it onto CPAN yet. It uses Win32::API, which is somewhat unfortunate, as that precludes its use with a 64-bit Perl. On the upside, it does not need any extra C headers installed. For going native, the wlanapi.h header file needs to be available. MinGW does not have that header file available, and the version that is freely distributable, a reimplementation by Google, is missing interesting functions. Once Win32::Wlan has seen a bit of use, I will likely wrap it in a nicer API than the raw Windows C API, so that managing the various handles becomes more automatic. It would of course be nice to also have a Wlan API for other operating systems, and to maybe find a common ground, but as I know nothing about how other OSes handle Wlan, I'll first work on solving my immediate problem...

For the time being, Win32::Wlan lives at https://github.com/Corion/Win32-Wlan, until it climbs the fence and escapes onto CPAN.

App::scrape - Simple HTML scraping from the command line

Inspired by a demonstration of Mojolicious on the command line, I replicated the relevant functionality as a stand-alone program, tentatively named App::scrape. It's currently based on HTML::TreeBuilder::XPath, the ever-useful HTML::Selector::XPath, and LWP::Simple.

App::part released to Github

Today, I moved the development home of one of my incredibly simple yet incredibly useful data munging tools from http://perlmonks.org/?node_id=598718 to GitHub. I hope that by moving it there, it becomes easier for people to submit patches.

I haven't given much thought about putting the program onto CPAN as well. The packaging is already done, but I'm not sure if I want to split out the meat of the program into a module, or maintain two versions, a module-using version and the stand-al…

WWW::Mechanize::Firefox 0.45 released

I've just pushed out a new release of WWW::Mechanize::Firefox, my favourite interactive web scraping module. The changes in this version are:
  • ->eval_in_page() raises errors from the perspective of the caller instead of raising errors in Firefox.pm.

    This was an especially annoying error, as it always pointed to somewhere deep within Firefox.pm when I tried to access a non-existent Javascript variable or had Javascript disabled in Firefox.

  • Added ->by_id() method and { id…
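Reporting an error from the perspective of the caller, as in the ->eval_in_page() change, is what Carp's croak() is for. The following is a sketch of the general technique, not the WWW::Mechanize::Firefox implementation; My::Page is a hypothetical stand-in class:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module with an eval_in_page-like method
package My::Page;
use Carp qw(croak);

sub new { return bless {}, shift }

sub eval_in_page {
    my ($self, $js) = @_;
    # croak() reports the file and line of the calling package,
    # whereas die() would point at this line inside the module.
    croak "JavaScript error while evaluating '$js'";
}

package main;

my $page = My::Page->new;
my $call_line = __LINE__ + 1;          # line of the call below
my $ok = eval { $page->eval_in_page('no_such_variable'); 1 };
my $error = $@;
print "The call was on line $call_line, the error says:\n$error";
```

Note that croak() only skips frames that belong to the package raising the error, so the caller has to live in a different package for the message to point at the call site.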

About Max Maischein

I'm the Treasurer for the Frankfurt Perlmongers e.V. I have organized Perl events including 9 German Perl Workshops and one YAPC::Europe.