May 2012 Archives

Alien::Base Perl Foundation Grant Report Month 3

This month’s work on Alien::Base started really exciting. I had tracked down several bugs and was honing in on full Linux compatibiliy. Turns out much of my testing problems had been in the test suite, wherein I mimiced make/Makefile with a perl script (in the name of Xplatform) but made some incorrect assumptions.

The problem came when I went to look at OS X. The problem I have found is that Mac bakes the full path to the library into the library itself during compiling/linking. It can be changed manually as long as the new path is shorter than the old one! Yes there are work-arounds but they would involve delving into the library’s Makefile, which I have been ardently trying to avoid (opens a pandoras box). The other mechanism would be to target the library to install in the final place directly, but this breaks the usual workflow of installing a Perl module, in which perl Build.PL and ./Build do not do anything permenant, waiting until ./Build install to move the files to their final places.

This problem cropped up a few weeks ago and since then my work on Alien::Base has been residing in the grey-matter, hoping a solution would present itself. I don’t want to say “forget Mac” because the whole point of Alien is to be as cross-platform as possible, still I keep coming back to thinking that perhaps an early release might have to forgo Mac initially.

My research has picked up, some of you may have seen my PerlGSL and related modules arrive this month; these modules are not purely a diversion from Alien::Base but rather they have been critical in some simulations I am developing. These and my thesis have taken some time from Alien::Base but I knew that (and disclosed it) going into the grant.

I am targeting a Linux release before next month’s grant report as most of the API seems sufficient, though I am reluctant to release while the Mac issue is outstanding like this. Hopefully the Perl Foundation will be patient with me as I deal with these pressing matters both inside and outside of the Alien::Base project.

TeXlipse for LaTeX development

I know that this is a Perl blog, so I will just post a quick link to my other blog. If you are writing any LaTeX, you should look at TeXlipse; look again if you have already tried it in the past. Read more on my other blog.

Using Mojo::DOM

Mojolicious is already well known for its web framework, but I am finding more and more (after being told by our own brian d foy) that its DOM parser (Mojo::DOM) is worth the price of admission as well. Anyway today I was poking around StackOverflow and I ended up answering a question using nothing more than some well crafted DOM calls. Here is my (slightly reworded) response. It makes for a nice example of using simple CSS3 selectors to simplify HTML parsing.

The question goes something like this: Lets say we have some HTML which contains the times that a shop is open. How can we get this information in a HTML5/CSS3 (i.e. modern) way? Mojo::DOM.

#!/usr/bin/env perl

use strict;
use warnings;

use 5.10.0;
use Mojo::DOM;

my $dom = Mojo::DOM->new(<<'HTML');
<div class="box notranslate" id="venueHours">
<h5 class="translate">Hours</h5>
<div class="status closed">Currently closed</div>
<div class="hours">
  <div class="timespan">
    <div class="openTime">
      <div class="days">Mon,Tue,Wed,Thu,Sat</div>
      <span class="hours"> 10:00 AM–6:00 PM</span>
    </div>
  </div>
  <div class="timespan">
    <div class="openTime">
      <div class="days">Fri</div>
      <span class="hours"> 10:00 AM–9:00 PM</span></div>
    </div>
    <div class="timespan">
      <div class="openTime">
        <div class="days">Sun</div>
        <span class="hours"> 10:00 AM–5:00 PM</span>
      </div>
    </div>
  </div>
</div>
HTML

We can use find to get a collection of results, each to make an array, then manually appling the text method.

say "div days:";
say $_->text for $dom->find('div.days')->each;

say "\nspan hours:";
say $_->text for $dom->find('span.hours')->each;

Or equivalently we can let Mojo do a map for us!

say "div days:";
say for $dom->find('div.days')->map(sub{$_->text})->each;

say "\nspan hours:";
say for $dom->find('span.hours')->map(sub{$_->text})->each;

Both forms give the output:

div days:
Mon,Tue,Wed,Thu,Sat
Fri
Sun

span hours:
 10:00 AM–6:00 PM
 10:00 AM–9:00 PM
 10:00 AM–5:00 PM

But say we want to get the times corresponding to the days? We can use the children of the openTimes div:

say "Open Times:";
say for $dom->find('div.openTime')
            ->map(sub{$_->children->each})
            ->map(sub{$_->text})
            ->each;

Output:

Open Times:
Mon,Tue,Wed,Thu,Sat
 10:00 AM–6:00 PM
Fri
 10:00 AM–9:00 PM
Sun
 10:00 AM–5:00 PM

I know it may not seem very impressive to people who do this all the time, but to a relative web outsider, the intuitiveness of this interface makes parsing out the HTML a breeze!

Announcing: PerlGSL - A Collection of Perlish Interfaces to the Gnu Scientific Library

With this post I am happy to announce the release of my new distribution: PerlGSL. This accompanies several other releases I’ve made in the past few days, I’ll get to those in a moment.

A few days ago I asked what I should call my new multidimensional integration module. The discussion centered on whether it was more important that it required the GSL library, or whether it was a set of bindings for the GSL (was that set complete)? Was it a dist in its own right needing a toplevel name, or that it was mathematical and should be under Math::?

After discussion and reflection, I have decided that I wanted a toplevel namespce for this project, mostly because the need to satisfy the external dependency on the GSL separates these modules from others. To make it worthy of that honor, I have made it into a dist in its own right, not unlike other named dists like Mojolicious or Catalyst, though more modular.

Unlike those projects I am not reserving the entire namespace for myself; I want people to contribute to the PerlGSL namespace. Is it a set of bindings to the GSL? No, but close. I’m calling the namespace a ‘collection of interfaces’. Can there can be more than one interface to the same library? I’m OK with that. Does any need to span an entire library? No. Can a library pull several functions from different places to create one useful Perl module? By all means!

So what does the dist named PerlGSL do? First it serves to define the namespace. Second, if you install it, it will install what I am calling the “standard modules”. So far I am the only author of these “standard” modules, but I would love to add yours, though I reserve the right no to. I want the individual modules to live on their own, but be installable together; a modular, bottom-up collection, but one that can be installed together for convenience. PDL, for example, is a great dist, but it’s huge and mostly monolithic; I can’t just install what I need. I hope that PerlGSL finds a nice balance between monolithic and separate dists.

Ok on to the technical stuff. So far I have uploaded two new modules, and rechristened another. The new ones are PerlGSL::Integration::SingleDim and PerlGSL::Integration::MultiDim, I think their names give away their tasks :-). I have rechristened Math::GSLx::ODEIV2 as PerlGSL::DiffEq. These are all part of the “standard” modules, though you only get PerlGSL::DiffEq if you have GSL >= 1.15. I have also released a new version of Math::GSLx::ODEIV2 which announces its deprecation, though it does still provide the ode_solver function via PerlGSL::DiffEq for now.

I hope to add more functionality as I need them. I hope you might do the same. The GSL is too big to ask one person to wrap it all; very few people will ever need all of it. It is really good software though, and it works really well with Perl. I hope you enjoy it.

(Sorry, this has been a bit stream of consciousness, I have been working on this a little too much in the past few days to compose something concise it would seem.)

On CPAN Namespaces: Urban Namespace Planning

I’m having a bit of a conundrum over where to put my next GSL-based module. First some background.

I’m already the author of a GSL-based module (see my first rant), the horribly named Math::GSLx::ODEIV2. This name reflects the same odd namespacing conundrum that I find myself in again, as well as the sub-library name odeiv2.c/h.

Duke Leto has already essentially taken the whole Math::GSL namespace by brute-force SWIG-ing the entire library. Much of this work is not fully implemented, but still parked. Further, since the namespace is already fairly crowded, its next to impossible to tell which parts are his and which would be anyone else’s. So lets call that out of the running. Note that I’m not complaining about his efforts, but it makes choosing a name harder.

I released my first module which uses GSL under the namespace Math::GSLx, but this is also less than desirable since it seems to be related to Math::GSL which it isn’t (at least not in the way that MooseX is related to Moose). Its also hard to type and hard to search for.

This leave me thinking about starting a new toplevel, which I do not undertake lightly. My two concepts are the simple GSL and the more namey PerlGSL. I say namey since toplevels with names like Mojolicious are not contentious since they represent more of a concept than an implementation detail (like Net::).

To be distinct from Math::GSL I would encourage users of PerlGSL to

  1. Make their interfaces Perlish rather than the utilitarian output the SWIG may produce
  2. Give their module a name that is descriptive without squatting on the sub-library name or other implemenations

In this way if two different authors want to provide an interface to GSL’s Monte Carlo multidimensional integrator, one might be PerlGSL::Integration::MultiDim (since there are a number of 1D integrators to be considered) and another might be PerlGSL::Integration::NDim.

Once I settle on a toplevel, I expect that I will “release” a namespace decriptor module (not unlike Alien) describing this for future users. It might also eventually pull in the GSL library via Alien::GSL once my Alien::Base work permits. From there I would release PerlGSL::Integration::MultiDim and rechristen Math::GSLx::ODEIV2 as PerlGSL::DiffEq (assuming the toplevel PerlGSL).

Anyway, I’m interested in your comments, so please let me know!

About Joel Berger

user-pic As I delve into the deeper Perl magic I like to share what I can.