Use the latest ISBN data without upgrading the module

I've made the data in Business::ISBN much fresher, and you can now refresh it yourself without installing a new version.

A long time ago, I created Business::ISBN to help me cleanse a publisher's database. That is, to help them cleanse the Excel workbook that they were using as their database. About 10% of the ISBNs in the file were wrong, and this little bit of Perl identified the problems (while an intern took those titles, looked them up online, and corrected them in Excel). I think this was my second CPAN distribution.

To check an ISBN, there are several things to look at. The last digit is a checksum. If that doesn't check out, something else in the number is wrong: the group code, the publisher code, or the book code. The valid ranges for those codes move around slightly over time, though.
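As a rough illustration of the checksum step, here is a minimal sketch of the standard ISBN-10 and ISBN-13 check digit rules in plain Perl. The function names are made up for this example, and this is not how Business::ISBN does it internally; it only shows the arithmetic.

```perl
use strict;
use warnings;

# Hypothetical helper: expected ISBN-10 check character for the
# first nine digits. Weights run from 10 down to 2, and the whole
# number (including the check digit) must sum to a multiple of 11.
sub isbn10_check_digit {
    my( $first_nine ) = @_;
    my $sum    = 0;
    my $weight = 10;
    for my $digit ( split //, $first_nine ) {
        $sum += $digit * $weight;
        $weight--;
    }
    my $check = ( 11 - $sum % 11 ) % 11;
    return $check == 10 ? 'X' : $check;   # a value of 10 is written as X
}

# Hypothetical helper: expected ISBN-13 check digit for the first
# twelve digits. Weights alternate 1, 3, 1, 3, ... and the total
# must be a multiple of 10.
sub isbn13_check_digit {
    my( $first_twelve ) = @_;
    my @digits = split //, $first_twelve;
    my $sum = 0;
    for my $i ( 0 .. $#digits ) {
        $sum += $digits[$i] * ( $i % 2 ? 3 : 1 );
    }
    return ( 10 - $sum % 10 ) % 10;
}

print isbn10_check_digit( '030640615' ), "\n";     # 0-306-40615-?
print isbn13_check_digit( '978030640615' ), "\n";  # 978-0-306-40615-?
```

A passing checksum only says the digits are self-consistent. Whether the group and publisher codes are actually assigned is what the range data decides.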

A continuing problem has been keeping up with new group codes and new publisher codes assigned within a group. If I wasn't actively using the module, I didn't care about letting the data go stale. When people sent me updates, I applied them, just as I did today. It was a tedious web scraping process.

Instead of updating the Business::ISBN module each time I wanted to do this, I split the data into a separate module, Business::ISBN::Data, so you could update the data while leaving the interesting code alone. That part is about to get a bit more interesting.

I noticed that the International ISBN Agency now publishes these data as the RangeMessage.xml file. It might have been there a long time, but I never noticed it. With regular data, generating the data structure is much easier, at least until they change the XML structure. But this also means that if I load data from that file, I can let you set the source for that file in preference to the one I put in the distribution. You can use a later version of the data without waiting for a new release:


    BEGIN {
        $ENV{ISBN_RANGE_MESSAGE} = '/here/is/my/RangeMessage.xml';
    }
    use Business::ISBN;

You can fetch the very latest every time:

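    BEGIN {
        use LWP::Simple qw(getstore is_success);
        my $file = '/here/is/my/RangeMessage.xml';
        # getstore saves the response body to $file and returns the HTTP status code
        my $status = getstore( 'http://www.isbn-international.org/agency?rmxml=1', $file );
        # Point the module at the downloaded file only if the fetch worked
        if( is_success( $status ) ) {
            $ENV{ISBN_RANGE_MESSAGE} = $file;
        }
        else {
            warn "Could not fetch RangeMessage.xml (HTTP status $status)\n";
        }
    }
    use Business::ISBN;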

Version 20120719 supports that now.

But if you can fetch that file from the network, the module could load it from the network itself, perhaps storing it in a temporary file while it needs it so the file doesn't stick around once you are done. I haven't done that bit yet, and probably won't until I need it.

Despite all this, you can still do things the old way. The module still has a hard-coded data structure as a fallback for when you don't specify a file or the included RangeMessage.xml is missing.
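The order of preference described here can be sketched roughly like this. It is only an illustration: choose_range_source and its return values are hypothetical names, and the existence checks are my assumption about one reasonable way to do it, not the module's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper: pick where the range data should come from.
# Takes the path to the RangeMessage.xml shipped in the distribution
# and returns a small hash describing the chosen source.
sub choose_range_source {
    my( $bundled_file ) = @_;
    my $env_file = $ENV{ISBN_RANGE_MESSAGE};

    # 1. A file the user named in the environment wins, if it exists
    return ( file => $env_file )
        if defined $env_file && -e $env_file;

    # 2. Otherwise use the file included with the distribution
    return ( file => $bundled_file )
        if defined $bundled_file && -e $bundled_file;

    # 3. Otherwise fall back to the hard-coded data structure
    return ( builtin => 1 );
}

my %source = choose_range_source( '/no/such/dir/RangeMessage.xml' );
print exists $source{file} ? "Using $source{file}\n" : "Using built-in data\n";
```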

About brian d foy

I'm the author of Mastering Perl, and the co-author of Learning Perl (6th Edition), Intermediate Perl, Programming Perl (4th Edition) and Effective Perl Programming (2nd Edition).