A milestone for Alien::Base

I have been working on a set of base classes intended to make creating a new Alien:: distribution for some library as easy as making a simple Module::Build based distro. The code isn’t on CPAN yet, but you can follow its progress on GitHub.

I haven’t been feeling so well today, so I have been sitting around watching movies (which I own on DVD) on TV. Of course I can’t sit still that long without doing anything so Alien::Base saw a burst of activity today.

Along with testing I am also keeping an Alien::Base-based Alien::GSL (which provides the GNU Scientific Library) in the examples folder. The big news today is that this example distro can now query the GNU FTP server and pick the newest version of the library. It then downloads, extracts and builds the library in a temporary folder. Finally it “installs” the library in a File::ShareDir directory in the Alien::GSL root/share directory. Even this isn’t as cool as how it does this:

It does it entirely from the Build.PL configuration!

It is my hope that most small/self-contained libraries can be wrapped in this simple way. In this way I hope to increase the number of Alien:: modules available on CPAN.
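To give a flavour of the "pick the newest version" step, here is a minimal sketch in plain Perl. It is not Alien::Base's actual code: the directory listing is made up, and the names newest_tarball and compare_versions are hypothetical helpers invented for illustration. The idea is simply to match file names against a pattern, capture the version, and compare versions numerically part by part.

```perl
use strict;
use warnings;

# Hypothetical directory listing, standing in for what an FTP server returns.
my @listing = qw(
    gsl-1.9.tar.gz
    gsl-1.15.tar.gz
    gsl-1.15.tar.gz.sig
    gsl-1.2.tar.gz
    README
);

# Return the newest file name matching $pattern, or undef if none match.
# $pattern must capture the dotted version string in $1.
sub newest_tarball {
    my ($pattern, @files) = @_;
    my @candidates;
    for my $file (@files) {
        next unless $file =~ $pattern;
        push @candidates, { file => $file, version => [ split /\./, $1 ] };
    }
    # Sort descending so the newest candidate comes first.
    my ($newest) = sort { compare_versions($b->{version}, $a->{version}) } @candidates;
    return $newest ? $newest->{file} : undef;
}

# Compare two versions given as array refs of numeric parts.
sub compare_versions {
    my ($x, $y) = @_;
    my $n = @$x > @$y ? @$x : @$y;
    for my $i (0 .. $n - 1) {
        my $cmp = ($x->[$i] // 0) <=> ($y->[$i] // 0);
        return $cmp if $cmp;
    }
    return 0;
}

my $pick = newest_tarball(qr/^gsl-([\d.]+)\.tar\.gz$/, @listing);
print "$pick\n";
```

Comparing part by part matters here: a plain string sort would rank 1.9 above 1.15.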

Of course it still needs much more functionality, lots more tests, and all the documentation. All of that is coming, however, so keep watching!

Why would I use Tie::Array::CSV?

After (IMO) elegantly solving an SO question using my Tie::Array::CSV, I thought I might share it here to give you all an idea of when you might want to use it. This example is only reading the file, but remember that T::A::CSV gives you full row/column read/write access to the underlying CSV file in place.

The OP needed to find the column with a certain identifier (an employee number, I would imagine), which was 7 chars starting with a letter, and then extract the number of repetitions of each identifier in that column. Here was the solution that I posted.

#!/usr/bin/env perl

use strict;
use warnings;

use File::Temp;
use Tie::Array::CSV;
use List::MoreUtils qw/first_index/;
use Data::Dumper;

# this builds a temporary file from DATA
# normally you would just make $file the filename
my $file = File::Temp->new;
print $file <DATA>;
#########

tie my @csv, 'Tie::Array::CSV', $file;

#find column from data in first row
my $colnum = first_index { /^\w.{6}$/ } @{$csv[0]};
print "Using column: $colnum\n";

#extract that column
my @column = map { $csv[$_][$colnum] } (0..$#csv);

#build a hash of repetitions
my %reps;
$reps{$_}++ for @column;

print Dumper \%reps;

__DATA__
"CRPGAMERBAS05","site","date1","nb96ytl","date2"
"CRPGAMERBAS05","site","date2","nb96ytl","date2"
"CRPGAMERBAS05","site","date1","jb98ytl","date2"

(The OP gave one line of data, so I puffed it up to 3. Also, to play nicely with blogs.perl.org’s width restrictions, the data given here is rewritten. See the original post for the full stuff if you must.)

Of course I know you can do this with Text::CSV directly, but I like that it lets me think in terms of columns rather than objects and parsers and accessors.

My $0.02 on strict and the community

By now most people who would be reading my blog are aware of the kerfuffle going on about people being pushy about strict (and other Modern Perlisms).

As a relatively new Perler (my first scripts are dated 2009) I believe I have an underrepresented opinion on the matter. I was lucky to have had StackOverflow and the community around me as I was learning Perl. Someone, I don’t remember who or with what tone, told me that I should use strict and warnings on my code. Not knowing any better, I did.

Then Perl was easier. Simple as that.

I have learned a lot since then. I know when I need to no strict 'refs' or no warnings 'once'. Personally I wish these pragmas were default. In fact, I have had so little problem with Perl that I’m horrendous at the debugger; I really haven’t needed it. Of course I know that one of Perl’s best assets is its compatibility, and therefore strict/warnings is not default.
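As an illustration of the kind of case I mean, here is a small sketch of where no strict 'refs' earns its keep: generating accessors under computed names. The Point class and its fields are invented for this example. The pragma is lexically scoped, so strict stays on everywhere outside the loop body.

```perl
use strict;
use warnings;

package Point;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# Installing a sub under a computed name needs a symbolic reference,
# so strict 'refs' is relaxed for this block only.
for my $field (qw(x y)) {
    no strict 'refs';
    *{"Point::$field"} = sub { $_[0]{$field} };
}

package main;

my $p   = Point->new(x => 3, y => 4);
my $sum = $p->x() + $p->y();
print "$sum\n";
```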

And yes, I get annoyed now when some new Perler asks on StackOverflow and didn’t use strict, but then I take a deep breath, remind myself I was there, right there, myself.

And yes, I get annoyed when people comment on my blog posts with self-righteous bull, but then I take a deep breath and realize that they might know lots more than me about a great many things.

Open source programming is an incredible social experiment. Many people are working together. We haven’t all met, we don’t even all speak the same language. Often we are not paid. But together we can make incredible products. Then usually we give them away, to help other people.

Think about how awesome that is. Then go explain why you do the things you do to a new Perler.

Should Perl have a `chomped` function?

Edit: originally rchomp, but Aristotle’s suggestion of chomped is perfect!

brian d foy posed an interesting interview question: “What five things do you hate most about language X?” positing that an experienced user of X should know 5 things (s)he hates about it.

In my list is the return value of chomp. Yes I understand why it works as it does

print "chomped" if chomp $input;

but I find that use case happens far less often than the usual

chomp( my $input = <> );

It looks bad, and it is not intuitive, especially to the new user. Just today another one popped up on StackOverflow. This has got to be one of the most common questions on the site.

Wouldn’t it be great to have a chomp function that returns the chomped value or values? In the spirit of the new s///r flag I originally wanted to call it rchomp, but Aristotle’s suggestion of chomped is my new favorite. You would use it like:

my $input = chomped <>;
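A pure-Perl approximation is short. The sketch below is only a stand-in for the idea, not a proposal for how a core version would be implemented: it chomps copies of its arguments and returns them, leaving the originals untouched.

```perl
use strict;
use warnings;

# Hypothetical pure-Perl stand-in for the proposed function.
# It chomps copies, so the caller's variables are left alone.
sub chomped {
    my @copies = @_;
    chomp @copies;
    return wantarray ? @copies : $copies[0];
}

my $raw   = "first line\n";
my $clean = chomped($raw);               # $raw still ends in a newline
my @clean = chomped("a\n", "b\n", "c");  # list in, list out

print "[$clean]\n";
print join(',', @clean), "\n";
```

One wrinkle with doing this as an ordinary sub: its arguments are evaluated in list context, so chomped <> would read every remaining line rather than just the next one. With this sketch you would have to write chomped(scalar <>) to get a single line.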

Since we all love CPAN, of course I could make a CPAN module for this, but no one would add a dependency on it just for one convenience function. I don’t expect large adoption of my Tie::Select for this reason, even though I think it has reasons to be safer than the core’s SelectSaver in some rare circumstances.

So anyway, is chomped something that the community would want? Could it possibly be in CORE:: so that people might actually use it? I even see that a similar concept made it into an RFC for Perl 6. Just daydreaming I guess, but oooh I do hate it.

The Case for Simplicity

Part of my design goal for Tie::Array::CSV was to be an elegant blend of tied objects making hard things easy both at the user and author (me) levels.

A few months back I announced that Tie::Array::CSV is now more efficient on row ops. Since then I have had a nagging thought; this change cost me elegance and simplicity.

To implement the deferred row operations, I made my row objects wait until their destructor to update the file. Sounds nice until you realize that you now have race conditions all over the place. So you hunt them down and store/update more internal data, always keeping track of what has been changed. A simple change became a big undertaking. As the project finished I couldn’t help but yearn for the simplicity of the original design goal.
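To make the problem concrete, here is a toy sketch of the deferred-write pattern. This is not the module's real code: DeferredRow is a hypothetical class, and a plain array stands in for the file. The row buffers its changes and only writes them back in its destructor, which means anything else looking at the "file" in the meantime sees stale data.

```perl
use strict;
use warnings;

package DeferredRow;

# $store is an array ref standing in for the file; $index picks the row.
sub new {
    my ($class, $store, $index) = @_;
    my $self = {
        store  => $store,
        index  => $index,
        fields => [ split /,/, $store->[$index] ],
        dirty  => 0,
    };
    return bless $self, $class;
}

# Change one field in the buffered copy and remember that we did.
sub set {
    my ($self, $col, $value) = @_;
    $self->{fields}[$col] = $value;
    $self->{dirty} = 1;
    return;
}

# The write-back happens only here, when the last reference goes away.
sub DESTROY {
    my $self = shift;
    return unless $self->{dirty};
    $self->{store}[ $self->{index} ] = join ',', @{ $self->{fields} };
    return;
}

package main;

my @file = ('a,b,c', 'd,e,f');

my $row = DeferredRow->new(\@file, 0);
$row->set(1, 'B');
my $before = $file[0];   # stale: the change is still buffered in $row

undef $row;              # destructor runs and flushes the change
my $after = $file[0];

print "$before -> $after\n";
```

Two row objects for the same line, or a fresh read while a row is still alive, would each need extra bookkeeping to stay consistent, which is where the simple change started to grow.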

Yesterday, in this staring match with myself, I finally blinked. I retrieved the old code, merged in a few of the newer niceties that I wanted to keep and moved the more convoluted deferred-row-op logic into a subclass.

Here I announce the release of Tie::Array::CSV version 0.05; featuring simplicity in the base class and deferred row operations in a subclass.

It’s not the most efficient (read: fast) way to read CSV files and it doesn’t handle embedded newlines, but if you just want to act on a CSV file like a 2D Perl array (i.e. array of array references), give it a try.

Fork Tie::Array::CSV on GitHub

PS. Mithaldu, you can now pass a Text::CSV object (or subclass) to the constructor if you would like :)