Tie::File don't while(<>)

By Jesse Shy on January 17, 2014 3:04 PM

Consider the common case where you want to walk a text file and perform routine operations such as stripping whitespace, looking for duplicates, or sorting. I was recently asked to work on a program where part of the process is 'cleaning' uploaded files before doing anything else with them. That step took over 40 lines of subs that opened old and new filehandles, ran the standard while(<>) {} loop, and then renamed the new file over the old one. If you find yourself wanting to do something to a file that you would normally do to an array, often in place, consider Tie::File before while()ing away at it. The documentation has many examples, and there are more on the web, including at Perl Monks. In this case I was able to whittle the whole thing down to this:

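# %seen and Regexp::Common's %RE are set up earlier (not shown).
tie my @file, 'Tie::File', $file or die "Can't tie $file: $!";
@file = grep { !$seen{$_}++ }
        sort
        map  { s/$RE{ws}{crop}//gr }    # /r (Perl 5.14+) returns the cleaned string, not a count
        map  { lc } @file;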

untie @file;
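
For anyone who wants to try the idea end to end, here is a minimal self-contained sketch. It makes a few assumptions that are not in the original program: the helper name clean_file is made up, a plain regex stands in for Regexp::Common's $RE{ws}{crop}, and the cleaned list is built in a temporary array before being assigned back.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper (not from the original program): lowercase, trim,
# sort, and de-duplicate the lines of a file in place.
sub clean_file {
    my ($path) = @_;

    tie my @lines, 'Tie::File', $path
        or die "Cannot tie $path: $!";

    my %seen;
    my @clean = grep { !$seen{$_}++ }
                sort
                map  { s/^\s+|\s+$//gr }   # assumed stand-in for $RE{ws}{crop}; /r needs Perl 5.14+
                map  { lc } @lines;

    # Assigning to the tied array rewrites the file on disk.
    @lines = @clean;

    untie @lines;
    return;
}
```

Calling clean_file('upload.txt') on a file of your own (the filename here is only an illustration) should leave it lowercased, trimmed, sorted, and free of duplicate lines.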

If you're working with CSV or other delimited files, consider Tie::Array::CSV.

Cheers

3 comments

Gnustavo | January 20, 2014 10:42 AM | Reply

I think "sort @file" should be simply "sort", right?

Moreover, if you're using strict, that %seen hash must have been declared above. I usually prefer the uniq function from List::MoreUtils instead because it seems simpler and more intuitive.

I've never used Tie::File before. I'm sure going to give it a try. Thanks!
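
A rough sketch of the uniq approach mentioned above, under a couple of assumptions: it uses uniq from the core List::Util module (version 1.45 or later) rather than List::MoreUtils, which exports a function of the same name, and the helper name sorted_unique is made up for illustration.

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);

# Hypothetical helper: return a sorted copy of the list with duplicates removed.
sub sorted_unique {
    my @items = @_;
    # uniq keeps the first occurrence of each value, so no %seen hash is needed.
    return uniq(sort @items);
}
```

Called in list context, this replaces the grep !$seen{$_}++ idiom without a separately declared hash.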

Steven Haryanto | January 20, 2014 12:47 PM | Reply

One of the advantages of Tie::File is that you don't have to slurp all the file contents into memory. You can just push() onto the array, or do a while (my ($i, $item) = each(@ary)), for example, to work item by item. Note that each() on an array returns an (index, value) pair.

On the other hand, it's slow compared to low-level routines like open/read/<>.

The each(ARRAY) is Perl 5.12+ only.
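
Working record by record might look something like the sketch below. The helper name trim_trailing is hypothetical, and a plain index loop is used here in place of each() so that it also runs on perls older than 5.12.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper: strip trailing whitespace one record at a time,
# without copying the whole file into a separate array first.
sub trim_trailing {
    my ($path) = @_;

    tie my @lines, 'Tie::File', $path
        or die "Cannot tie $path: $!";

    for my $i (0 .. $#lines) {
        my $line = $lines[$i];              # fetches a single record
        (my $trimmed = $line) =~ s/\s+$//;
        $lines[$i] = $trimmed               # writes back only when something changed
            if $trimmed ne $line;
    }

    untie @lines;
    return;
}
```

Each assignment to $lines[$i] goes straight to the file, so this trades speed for a small memory footprint.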

Jesse Shy | January 20, 2014 1:06 PM | Reply

@Gnustavo What? Not use strict and warnings, are you crazy? Yes, there are a couple of setup lines like that that I did not show. And you are correct on the sort, I changed this a little to protect the innocent and did not copy it correctly. @Steven, where I am using this, slurping the file could cause out of memory issues in some cases.

About Jesse Shy

Perl pays for my travel addiction.
