Load a list of lines into an array (easily)
This blog post describes a common task my colleagues often ask about: repeating a dynamic string for each token in a defined list, adding some "or", "and" or "=" in between, and finishing smartly.
I like to use Perl's __DATA__ token at the end of my scripts for this. The strength of the __DATA__ token is that it makes it possible to « "embed" a file inside a Perl program then read it from the DATA filehandle ». It saves you from creating and opening a real file and is very handy for quick prototypes and tests.
#!/usr/bin/env perl
use strict;
use warnings;
# Your script here
# Everything below the __DATA__ token is
# treated as data, not as code
__DATA__
a
lot
lot
of
stuff
here
...
A common practice is to load this data into an array by reading it from the DATA file handle:
my @lines = <DATA>;
But the values would keep their trailing newlines, which you obviously don't want. I have used two solutions for this:
my @lines;
push @lines,
split while <DATA>;
This is quite readable and self-explanatory (remember that Perl borrows from natural language; it was created by a linguist). Feel free to comment if something is unclear so I can improve the post.
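To see why split gets rid of the line endings, here is a small sketch comparing it with a raw read. It uses an in-memory filehandle instead of DATA (an assumption made so the sample input sits next to the code; DATA is read the same way), and the sub names raw_lines and split_lines are made up for this illustration. Called without arguments, split cuts $_ on whitespace, so a line holding two words yields two elements.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return the lines exactly as read, newline included
sub raw_lines {
    my ($fh) = @_;
    my @lines = <$fh>;
    return @lines;
}

# Hypothetical helper: same loop as in the post; split drops the newline
# and also cuts each line on whitespace
sub split_lines {
    my ($fh) = @_;
    local $_;    # do not clobber the caller's $_
    my @words;
    push @words, split while <$fh>;
    return @words;
}

# Sample input, invented for the demo
my $text = "a\nlot of\nstuff\n";

open my $raw_fh, '<', \$text or die "Cannot open in-memory file: $!";
my @raw = raw_lines($raw_fh);

open my $split_fh, '<', \$text or die "Cannot open in-memory file: $!";
my @words = split_lines($split_fh);

printf "raw:   %d elements, first one ends with a newline: %s\n",
    scalar @raw, ( $raw[0] =~ /\n\z/ ? 'yes' : 'no' );
printf "split: %d elements: %s\n", scalar @words, join( ',', @words );
```

With this input the raw read gives three elements and the split version gives four, because "lot of" becomes two entries. If your lines can contain spaces, that difference matters.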
Ok, I have to admit a little secret:
push my @lines,
split while <DATA>;
... without the pre-declaration of @lines, does the same. I had to double-check that it worked, but as often with Perl, when you spontaneously think of something silly, it actually works naturally (I have to admit it sometimes looks like a miracle).
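If you want unique values (you surely do), one way is to use uniq from the core module List::Util:
use List::Util qw(uniq);
my @lines;
push @lines,
split while <DATA>;
# uniq must see the whole list, so apply it once the loop is done
@lines = uniq @lines;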
As always, there is another way to do it:
chomp( my @lines = uniq <DATA> );
I actually prefer this list-context solution for its shortness. I don't know which one is more readable, and it is good to choose the readable way.
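If you need the snippet in more than one script, the list-context version can be wrapped in a small sub that accepts any filehandle. This is only a sketch: the name read_unique_lines is invented for this example, and it calls chomp before uniq so that a last line without a trailing newline is still recognised as a duplicate.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(uniq);

# Hypothetical helper, not part of any module
sub read_unique_lines {
    my ($fh) = @_;
    chomp( my @lines = <$fh> );    # read everything, strip the newlines
    return uniq @lines;            # keep the first occurrence of each value
}

# In-memory filehandle standing in for DATA (an assumption for the demo);
# note that the last line has no trailing newline
my $text = "a\nlot\nlot\nof\nstuff\nlot";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @lines = read_unique_lines($fh);
print join( '|', @lines ), "\n";
```

In a script that has a __DATA__ section, the call would be read_unique_lines(\*DATA).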
Let's say you want to generate a series of "or" conditions for your colleagues or customers. We are actually doing a super advanced language generation thing here:
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(uniq);
chomp( my @lines = uniq <DATA> );
for ( @lines ) {
# $_ is the current loop element
print generate_string( $_ );
# $lines[-1] is the last array element
if ( $_ ne $lines[-1] ) {
print ' or ';
} else {
print "\n";
}
}
sub generate_string {
return 'line == "' . shift . '"';
}
__DATA__
a
lot
lot
of
stuff
here
...
$ perl lines.pl
line == "a" or line == "lot" or line == "of" or line == "stuff" or line == "here" or line == "..."
Lots of other solutions exist; check out Perl one-liners, which teach a lot more about this kind of practice.
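One of those other solutions is join, which puts the separator only between elements, so no test for the last element is needed. A minimal sketch: generate_string is rewritten here to give the same output as the one in the script above, while the sub name join_conditions and the hard-coded list (standing in for the lines read from DATA) are assumptions for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Produces the same string as generate_string in the script above
sub generate_string {
    my ($value) = @_;
    return qq{line == "$value"};
}

# Hypothetical helper: build every condition, then glue them together
sub join_conditions {
    my ( $separator, @values ) = @_;
    return join $separator, map { generate_string($_) } @values;
}

# Hard-coded list standing in for the lines loaded from DATA
my @lines = qw(a lot of stuff here);

print join_conditions( ' or ',  @lines ), "\n";
print join_conditions( ' and ', @lines ), "\n";
```

Switching from "or" to "and" is then just a matter of changing the separator, and a one-element list comes out with no separator at all.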
The number of cool things you can do inside this loop is infinite, from log parsing to code generation or data munging, thanks to the kindness of Perl.
References
- Read a file into an array using Perl
- Stupid DATA Tricks
- This is actually a crosspost of the same article on dev.to
Note
I wrote this because my memory is awful and I was tired of always searching for the exact syntax of the __DATA__ token-to-array process. I hope it will help all kinds of people, including me when I type it into a search engine.