December 2011 Archives

Tinkering with a (safe) string use

I read Schwern’s post How (not) To Load a Module just as I was wanting to dynamically load different Module::Build subclasses for different OSes. It struck me as odd, as it seems to strike everyone, that use-ing a module from a string should be so hard.

In my spare time, I have been working on some use problems using Devel::Declare, and it gives some interesting hope here. Preliminarily I am calling it UseX::Declare, but hopefully someone will come up with something better. Basically it provides a function called use_from which acts like:

use UseX::Declare;
BEGIN {
  our $var = 'Net::FTP';
}
use_from $var;

Through the magic of Devel::Declare, the parser sees:

use UseX::Declare;
BEGIN {
  our $var = 'Net::FTP';
}
use_from(1); use Net::FTP;

The use_from(1); is no-op cruft that allows me to get around a limitation in Devel::Declare (or if not a limitation, then a failure in my understanding).

The upshot is that NO eval is needed (not even in the UseX::Declare module)! A string is stored, then that string is made to be a bareword. That’s it.
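For comparison, the usual run-time way to avoid a string eval is to turn the module name into a file path and hand that to require. The following is only a minimal sketch of that technique, not of how UseX::Declare works: the helper name require_by_name is made up here, the load happens at run time rather than compile time, and import is never called.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module whose name is only known at run time.
sub require_by_name {
    my ($module) = @_;

    # Refuse anything that does not look like a package name.
    die "Invalid module name: $module"
        unless $module =~ /\A[A-Za-z_]\w*(?:::\w+)*\z/;

    (my $file = "$module.pm") =~ s{::}{/}g;   # Net::FTP -> Net/FTP.pm
    require $file;                            # dies if the module is missing
    return $module;
}

# File::Spec is a core module, so this runs on a stock perl.
my $class = require_by_name('File::Spec');
print $class->catfile('lib', 'Net', 'FTP.pm'), "\n";
```

Because nothing is imported, this only suits modules used through class methods or fully qualified names, which is part of why a compile-time use_from is attractive.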

Does this seem good? Comments welcome!

P.S. Since I am not too set on a name, I don’t have a GitHub repo for it yet; however, here it is as a gist.

P.P.S. Another feature I hope to add in a similar manner, is some better way to only use if something is installed.

Tie::Array::CSV is now more efficient on row ops

Not nearly so exciting as my announcement of Zoidberg, but today I announce the release of Tie::Array::CSV version 0.04.

Tie::Array::CSV leverages Tie::File and Text::CSV to allow access to a CSV file as a native Perl 2D array (i.e. array of array references), without having to read the (entire) file into memory.

The major improvement in 0.04 was inspired by a conversation with David Mertens at the WindyCity.pm informal meeting a couple of weeks back. As I was explaining T::A::CSV to the group over a couple of beers, David asked if the file is updated at every change. I said proudly that it was; however, he noted that this has some drawbacks.

His first point was that it is expensive to write out the whole row even when the next operation might be on the same row. To address this, now, by default, same-row operations do not write out on each change. More specifically as long as the reference to the row stays in scope, the related line is not updated. This allows for more efficient operations like map on the rows.

Additionally, this feature is optional, controlled by the hold_row option to the constructors, which is true by default.
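The held-row idea described above can be sketched in plain Perl: keep the parsed row in an object and write the line back only when the last reference to it goes away. This is an illustration under stated assumptions, not the code in Tie::Array::CSV. The DeferredRow class is hypothetical, an in-memory array stands in for the file, and the naive split on commas ignores CSV quoting.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a held row; NOT the module's implementation.
package DeferredRow;

sub new {
    my ($class, $lines, $index) = @_;
    my $self = {
        lines  => $lines,                          # array of lines, our "file"
        index  => $index,
        fields => [ split /,/, $lines->[$index] ], # naive parse, no quoting
    };
    return bless $self, $class;
}

sub get { my ($self, $col) = @_; return $self->{fields}[$col] }

# Changes touch only the in-memory copy of the row.
sub set { my ($self, $col, $value) = @_; $self->{fields}[$col] = $value; return }

# Write the row back once, when the last reference goes out of scope.
sub DESTROY {
    my ($self) = @_;
    $self->{lines}[ $self->{index} ] = join ',', @{ $self->{fields} };
    return;
}

package main;

my @lines = ('a,b,c', 'd,e,f');
{
    my $row = DeferredRow->new(\@lines, 0);
    $row->set($_, uc $row->get($_)) for 0 .. 2;
    print "held:    $lines[0]\n";   # still the old line
}
print "flushed: $lines[0]\n";       # updated once the row left scope
```

Three changes to the row cost a single write at scope exit instead of three.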

His second point, which I cannot do anything about, is related to Tie::File. He asked whether changes near the top of the file are more costly than changes near the bottom. Honestly, I didn’t know, so I ask you, the reader: if you know, please comment. Either way, I don’t see changing away from Tie::File for line access, but it is something worth understanding.

One other change is that the tie and new constructors now have the same option passing mechanism. In 0.03 the new constructor allowed a more flexible system for passing options, but the tie constructor did not. This mechanism has been ported back to the tie constructor as of the new release.

I hope Tie::Array::CSV can help save people some effort when using CSV files. If you think it might help you, please check it out and, as always, let me know your thoughts.

P.S. development is hosted on my GitHub.

The Triumphant Return of Zoidberg -- A Modular Perl Shell

After more than a year of hoping to be able to say this, here goes: Zoidberg is back!

What is Zoidberg, you ask? Well, it’s a Perl shell, of course. Think of the fun of it all:

mv($_ => lc($_)) for grep /[A-Z]/, <*>

If you use a reserved Perl word first, or if you wrap code in a block, it is interpreted as Perl; if not, then it’s a shell command. Along with having most of the things you would expect from a shell, it features plenty of extra bells and whistles, like a multiline input system (when Term::ReadLine::Zoid, part of Bundle::Zoidberg, is installed) and automatic splitting of variables to arrays like @PATH, which is just $ENV{PATH} split on :.
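Outside zoid, in an ordinary Perl script, the two conveniences mentioned so far look roughly like the sketch below. It is only a plain-Perl approximation: zoid supplies mv and @PATH itself, while here rename and an explicit split stand in for them, and the file names and scratch directory are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# What zoid exposes as @PATH: $ENV{PATH} split on ':'.
my @PATH = split /:/, ( $ENV{PATH} // '' );
print "PATH entry: $_\n" for @PATH;

# The rename one-liner in plain Perl, run in a scratch directory
# so that no real files are touched.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(README Notes.TXT lower)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

opendir my $dh, $dir or die "Cannot read $dir: $!";
my @names = grep { !/\A\.\.?\z/ } readdir $dh;   # skip . and ..
closedir $dh;

# Lower-case every name that contains a capital letter.
for my $name ( grep { /[A-Z]/ } @names ) {
    rename "$dir/$name", "$dir/" . lc $name
        or die "Cannot rename $name: $!";
}
```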

When used as your login shell, it can be used from inside other programs like vim.

:!cat % | {/^\S/}g | wc -

Further, it is easy to extend with the Zoidberg::Fish plugin system.

Read plenty more about it in a couple of rather old, but still relevant articles:

Zoidberg had developed a few warts over the years, which have been fixed. I also noticed a few things which needed immediate work, which has been done. There is plenty more to do, but Jaap was close to declaring it stable back in 2006, so it is very usable already.

Some of the TODOs include modernizing parts of the test suite and investigating the viability of using some standard modules, like Devel::REPL, Moose, and Getopt::Long to bring zoid into the world of Modern Perl.

It is a testament to the stability of Perl that, after all these years without maintenance, it took so little work to get it going again. Please enjoy Zoidberg and let me know what you think of it, or of course of any problems you have.

To try it, you can simply install with:

cpanm Bundle::Zoidberg

and the repo is on GitHub, so fork away!

For some more info start at the zoiduser man page. Then run zoid to take it for a spin!

Announcing MooseX::Types::NumUnit

I use Moose to write scientific simulations, including one very large simulation with a user API. To this point all of the numerical quantities, kept in attributes, needed to be of Num type. This always meant an implied covenant between me and the users, which was to use SI units.

However, I have a few quantities for which I want to use eV units, which makes a lot more sense. Therefore I set up a simple type with coercion to accept a string num eV and coerce it to a number given by qe * num. This got me thinking: why can’t I do this for all my units?
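The conversion step inside such a coercion can be sketched in plain Perl, without Moose. This is an assumption-laden illustration rather than the code from my simulation: the helper name energy_in_joules is made up, bare numbers are assumed to be in joules already, and qe is taken as the elementary charge in coulombs, which is numerically the number of joules in one eV.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Elementary charge in coulombs; numerically the joules in 1 eV.
my $qe = 1.602176634e-19;

# Hypothetical helper: accept a plain number (assumed to be joules)
# or a string such as '2 eV', and return the value in joules.
sub energy_in_joules {
    my ($input) = @_;
    return $input if looks_like_number($input);

    my ($num) = $input =~ /\A\s*(\S+?)\s*eV\s*\z/
        or die "Cannot parse energy: '$input'";
    die "Not a number: '$num'" unless looks_like_number($num);

    return $qe * $num;
}

print energy_in_joules('2 eV'), "\n";
print energy_in_joules(5), "\n";
```

In a Moose class the same sub body would sit in the via block of a coercion from Str to a Num subtype.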

To answer this need, I present MooseX::Types::NumUnit which provides a couple static types, but also provides the function num_of_unit, which creates anonymous types which will automatically coerce a string to a number of the desired unit. For example:

package MyTest;

use Moose;
use MooseX::Types::NumUnit qw/num_of_unit/;
#$MooseX::Types::NumUnit::Verbose = 1;

has 'length' => ( 
  isa => num_of_unit( 'm' ), 
  is => 'rw', 
  default => '1 ft'
);
has 'speed' => ( 
  isa => num_of_unit('ft / hour'),
  is => 'rw', 
  required => 1 
);

no Moose;
__PACKAGE__->meta->make_immutable;

my $test = MyTest->new( speed => '2 m / s' );

print $test->speed, "\n";
print $test->length, "\n";

__END__

prints:
23622.0472440945
0.3048

Now my users can use whichever units they want, and I know that my simulation will see the number in the units system that it needs!

A Zoidberg Story

I just read Buddy Burden’s recent post A Random Story and it was fun to see that his reason for adopting Data::Random was the same as my reason for adopting Zoidberg: failing tests. The difference was that his failing tests were his own and mine were in Zoidberg.

I was a few years into using Linux and I had really fallen in love with Perl, which was my first real programming language (after Maple, Mathematica, and LaTeX), but I had never really gotten the hang of Bash. Since Perl was used in scripting, and even for things that looked like shell scripting, I wondered if there was a Perl shell. I looked and saw a few; Zoidberg and Psh stood out as being functional.

I can’t remember anymore what it was, but something about Zoidberg kept my attention. It had a problem though: a failing test on installation through CPAN. I could install it through apt though, so no problem, right?

A little while later, once I had gotten better at Perl, I decided to see if I could play the sleuth and find the problem. It was an interesting problem (documented at SO) wherein its GetOpt module fails parsing its prototypes. Matches were leaking across loop iterations, not getting reset: after a failed match, $1 still holds the capture from the last successful match. This minimal example shows the problem.

#!/usr/bin/perl

use strict;
use warnings;

my @a = qw/n a$ b@ c/;
my @b = @a;
my @c = @a;

print "Test A -- Doesn't work (c !-> 0)\n";
for (@a) {
  s/([\$\@\%])$//;
  my $arg = $1 || 0;
  print $_ . " -> " . $arg . "\n";
}

print "\nTest B -- Works\n";
for (@b) {
  my $arg;
  if (s/([\$\@\%])$//) {
    $arg = $1;
  }
  $arg ||= 0;
  print $_ . " -> " . $arg . "\n";
}

print "\nTest C -- Works, more clever\n";
for (@c) {
  my $arg = s/([\$\@\%])$// ? $1 : 0;
  print $_ . " -> " . $arg . "\n";
}

The strange thing is that test A worked before 5.10, even though it apparently wasn’t supposed to (cjm confirmed that I wasn’t going crazy). I think this may have scared many people away, but to me it was a great challenge.

Much like Buddy I submitted a patch, and while the author, Jaap Karssenberg, was responsive, he had moved on. He offered that I could adopt it if I wanted, and has helped, finding his old notes and commenting on some of my fixes.

It turns out the failing test had masked something much more sinister: the shared data wasn’t getting installed to the proper location anymore (was it installing at all?). Anyway, Module::Build had progressed a long way since Zoidberg 0.96 had been released and offered the share_dir feature, which, when combined with File::ShareDir, fixes the problem quite nicely.
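In outline, the share_dir approach looks like the Build.PL below. It is a sketch under assumptions, not Zoidberg's actual build script: the directory name share and the license value are placeholders chosen for illustration.

```perl
# Build.PL -- sketch only; 'share' and the license value are assumptions.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    module_name => 'Zoidberg',
    license     => 'perl',
    share_dir   => 'share',                     # installed as the dist's shared data
    requires    => { 'File::ShareDir' => 0 },   # needed to find the data at run time
);

$build->create_build_script;
```

At run time the installed location can then be looked up with File::ShareDir::dist_dir('Zoidberg') rather than being guessed from a hard-coded path.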

These and a few other fixes are essentially complete, and I hope to submit Zoidberg 0.97 to PAUSE soon. There are plenty of modernizations planned and features (literally, I want to add say, state and given/when), but those will probably have to wait until the next release.

This brings me back to the beginning of the story to say: thanks to all the CPAN authors for writing the tests. They can help an intermediate programmer reach in and salvage an excellent piece of software from falling into obscurity before it should. I hope to live up to the example (remember that when you start to feel lazy, Joel!).

And P.S. Buddy, thanks for reminding me about done_testing, I need to look around to see if I have fallen victim to that one (I’m sure I have!).

Auto(Split|Loader) in a modern Perl world

As a few of my previous posts have implied, I am attempting to reinvigorate the Zoidberg Perl shell. Much of the work of getting it back to a functional state has already been done at my GitHub repo. I have a bigger post coming on why this is cool and even another with some examples, but for now I have a question:

Is an AutoSplit/AutoLoader mechanism helpful on modern hardware? I mean, Moose/MOP (and many other projects) are huge and don’t use it. In fact it seems that very few modules depend on it.

Now, I understand that AutoLoader has some cool uses for causing subs to spring into existence. In the context of Zoidberg, though, I only care about its use to defer loading infrequently used subs.
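For readers who have not met the idea, deferred loading can be sketched with a hand-rolled AUTOLOAD. This is not how AutoSplit/AutoLoader do it (they split subs into separate files at build time and require them on first call). Here the package name Lazy::Demo and the sub rarely_used are made up, and the deferred source is kept in a hash and compiled with a string eval purely to keep the example in one file.

```perl
use strict;
use warnings;

package Lazy::Demo;   # hypothetical package name

our $AUTOLOAD;

# Source text for rarely used subs, left uncompiled until first call.
my %source = (
    rarely_used => 'sub { my ($n) = @_; return $n * 2 }',
);

sub AUTOLOAD {
    my $full = $AUTOLOAD;
    ( my $name = $full ) =~ s/.*:://;
    return if $name eq 'DESTROY';

    my $text = delete $source{$name}
        or die "Undefined subroutine $full called";
    my $code = eval $text or die "Cannot compile $name: $@";

    no strict 'refs';
    *{$full} = $code;   # install it, so AUTOLOAD is skipped next time
    goto &$code;        # run it with the caller's original arguments
}

package main;

print defined &Lazy::Demo::rarely_used ? "defined\n" : "not yet defined\n";
print Lazy::Demo::rarely_used(21), "\n";
print defined &Lazy::Demo::rarely_used ? "defined\n" : "not yet defined\n";
```

The cost saved is the compile time of subs that are never called, which is the only benefit in question for Zoidberg.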

To test some things, I created a branch in which I naively removed these bits, and lo and behold, with only one missing my (see AutoLoader Considered Harmful), the tests all pass and a quick run seems fine.

So I fairly call the question: should I leave AutoLoader in, or pull it?

About Joel Berger

As I delve into the deeper Perl magic I like to share what I can.