Perl vs Shell Scripts
Last week, I posted on my Other Blog about how I still prefer to use tcsh for my interactive shell. Of course, I maintained that bash was the only real choice for shell scripts.
But then this brings us to another interesting point. I, of course, am a Perl programmer. A choice between Perl and shell scripts is not like a choice between C++ programs and shell scripts. Back when I was a C++ programmer, there was no question but that some tasks should be done in bash (or, actually, I was using ksh back in those days). But Perl is quite different: it’s not only the case that Perl can do anything that bash can do (that’s true of C++ as well), but Perl can also do it just as easily. Perl is often considered a scripting language, and, while we could argue about whether that’s true or not, given that Perl is compiled while bash is not, we can’t (and shouldn’t) dispute that deploying a Perl program is as easy as deploying a shell script, and that’s part of what being a “scripting language” entails.
But, in the end, I still, sometimes, choose to use bash over Perl for certain tasks. I suppose you could argue there’s a certain amount of inertia involved: I got used to doing certain types of things with shell scripts back when my only other (viable) option was C++, or maybe awk. But the comparison to awk is quite appropriate. Before I learned Perl, I used awk a lot. Nowadays ... hardly ever. I certainly would never use awk inside a shell script: it would always be Perl there. At the command line, I still occasionally type awk when I mean perl, but it happens less and less often, and more and more I find myself just giving up on awk before I even get to the end of the line. Perl just really completely replaces awk.
But not bash.
Recently I was doing a personal scripting task (it involved fiddling around with MP3s, if you’re curious). I started out doing it in bash, and then ended up ripping it apart about halfway through and starting over in Perl. I had just made a bad decision on that particular task. But, while I was cursing myself out for not just using Perl in the first place, it occurred to me that maybe I ought to try to articulate the places where bash really is (or might be) better. If I had a checklist, maybe I could more easily identify where to put my efforts right from the get-go. If I had a checklist, and I posted it here on this blog, maybe even you other Perlites would come along and tell me why I’m wrong, and maybe I’ll learn something. ;->
For the impatient, the executive précis is this: I generally write bash scripts for tasks which are essentially job control scripts. Yes, Perl can call external programs just as well as any shell script can, but there are a few things bash gives us which Perl doesn’t. This is not surprising, really: bash (as ksh before it, and the venerable sh before that) was basically invented for doing job control. What sh lacked in that department, csh filled in, and then ksh and bash backported. Perl has other foci. Personally, I’m okay with Perl not being the answer for every job.
So, let’s take a look at the (few) places where bash beats Perl:
Job Failure
If I want to run a command in bash, I simply do it, like so:
run some command
In Perl, I’d have to do it more like so:
system("run some command");
It’s a bit more typing, sure, but that’s not the real problem. The real problem is that, if the bash version has a problem—command not found, not enough memory, process table full—it stops and throws an error. The Perl version just blithely keeps going. Now, these days the situation is better than it used to be, because I can do this:
use autodie qw< :all >;
system("run some command");
And that works as well as the bash version. Except, what if I care whether the command succeeded or not? Here’s the bash version:
if ! run some command
then
    some recovery command
fi
In Perl, perhaps the best we can do is this:
use autodie qw< :all >;
use Try::Tiny;                      # TryCatch is nicer, but more overhead

try
{
    system("run some command");
}
catch
{
    system("some recovery command");
};                                  # do NOT forget this semi-colon!
That’s a lot more typing, and probably not as clear either. And clarity is maintainability, as we know.
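If pulling in Try::Tiny feels like overkill for a quick script, checking the return value of system directly also works. A minimal sketch (this assumes autodie is not loaded, since autodie’s system would die before we could check):

# system returns 0 on success, so test the return value directly
if (system("run some command") != 0)
{
    system("some recovery command");
}

It’s still clunkier than the bash version, but it keeps the dependency count at zero.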
Commands on Exit
In Perl, if you want to run something when your program exits, no matter where it’s exiting from, you can do this:
END
{
    system("run some command");
}
The equivalent in bash looks like this:
trap "run some command" EXIT
Except the bash command actually does what you want. That command in the bash script gets run no matter what. Normal exit, error exit, explosion, Ctrl-C, core dump ... unless someone does a kill -9 on you, you’re pretty much guaranteed to get your command run. Not so in Perl. In fact, the perlmod man page has this to say on the topic:
(But not if it’s morphing into another program via “exec”, or being blown out of the water by a signal—you have to trap that yourself (if you can).)
Lame.
Now, admittedly, the vast majority of times I use trap in this fashion, it looks like this:
trap "/bin/rm -f $tmpfile" EXIT
and, if that were Perl, I’d be using File::Temp, and then I wouldn’t have to worry about removing my tempfile. But, then, I don’t think File::Temp handles signals either. Overall, trap is a much easier way to deal with signals than Perl’s %SIG hash, although I have to admit that I’ve never written a trap statement that didn’t end with EXIT.
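For comparison, covering even just the common signals in Perl takes something like this. A minimal sketch (the cleanup sub and the temp file path are invented for illustration):

my $tmpfile = "/tmp/myscript.$$";   # hypothetical temp file

sub cleanup { unlink $tmpfile if -e $tmpfile }

# END covers normal exits and die() ...
END { cleanup() }

# ... but signals have to be trapped separately, via %SIG
$SIG{$_} = sub { cleanup(); exit 1 } for qw< INT TERM HUP >;

And even that has the same kill -9 blind spot as trap, while being considerably more to type.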
Processing Job Output Lines
If I know my output lines won’t have any spaces in them, I’m golden:
for line in $(run some command)
do
    process each "$line"
done
That’s a good bit simpler than the equivalent Perl:
use autodie qw< :all >;

open(PIPE, "run some command|");
while ( <PIPE> )
{
    chomp;
    system(qw< process each >, $_);
}
close(PIPE);
The only problem I have in bash is if my lines might have spaces. That complicates the shell script version to where it’s not particularly better than the Perl:
OIFS="$IFS"
IFS="
"
for line in $(run some command)
do
    process each "$line"
done
IFS="$OIFS"
Still, the simple case is often sufficient.
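Another shell idiom that sidesteps the $IFS dance entirely is piping into a while read loop. A sketch:

# read handles embedded spaces without touching the global $IFS;
# note, though, that the loop body runs in a subshell, so variable
# assignments made inside it won't survive the loop
run some command | while IFS= read -r line
do
    process each "$line"
done

Whether that’s cleaner than juggling $OIFS is a matter of taste.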
Here Documents
Sure, Perl has “here documents.” But they’re different. In Perl, a here doc defines a string. In shell scripts, it defines STDIN. So, in bash, I could say:
mysql <<END # assume ~/.my.cnf is set up
select count(*) from some_table;
END
whereas in Perl, it would be:
use autodie qw< :all >;
open(PIPE, "| mysql");
print PIPE <<END;
select count(*) from some_table;
END
close(PIPE);
Of course, for this particular example, I could just use DBI instead, but I generally find that to be more of a PITA than I want to deal with for a quick script.
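For the record, the DBI version would look something like this. A minimal sketch (the database name “mydb” is invented; mysql_read_default_file pulls credentials from ~/.my.cnf, paralleling the bash example above):

use DBI;

my $dbh = DBI->connect(
        "dbi:mysql:mydb;mysql_read_default_file=$ENV{HOME}/.my.cnf",
        undef, undef, { RaiseError => 1 });
my ($count) = $dbh->selectrow_array("select count(*) from some_table");
print "$count\n";

Not horrible, but you can see why a two-line heredoc wins for a quick one-off.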
File Equivalencies
I have no idea why Perl doesn’t have something like this. Here’s some bash I’ve needed on several occasions:
if [[ ! $(dirname $0) -ef $(pwd) ]]
then
    echo "must run this from its home dir" >&2
    exit 1
fi
Until recently, this was stupidly difficult to replicate. The Cwd module includes a realpath function, but its original implementation only worked on directories (leading to a number of subs in my Perl code named really_realpath). Finally that was fixed, making it easier. Nowadays, I’d probably use Path::Class to do this in Perl:
use Path::Class;

if (file($0)->dir->absolute->resolve ne dir()->absolute->resolve)
{
    die("must run this from its home dir");
}
which ... well, actually, now that I look at it, isn’t so bad, although awfully verbose. But the bash version reads a lot more cleanly.
File Timestamp Comparisons
This one doesn’t come up that often, but, still. In bash I can do:
if [[ $last_run -ot $touchfile ]]
then
    do it again
    touch $last_run
fi
In Perl, I’d have to do the stat calls myself, and pluck out the mtime from the array (and I always have to look up which element it is) ... moderately irksome. The bash version is just cleaner.
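For comparison, here’s roughly what that hand-rolled Perl looks like. A minimal sketch ($last_run and $touchfile are assumed to hold the two file names, as in the bash version):

# mtime is element 9 of the list stat returns
my $last_run_mtime  = (stat $last_run)[9];
my $touchfile_mtime = (stat $touchfile)[9];
if ($last_run_mtime < $touchfile_mtime)
{
    # do it again ...
    utime(undef, undef, $last_run);   # rough equivalent of touch;
                                      # assumes the file already exists
}

Workable, but the -ot test says the same thing far more tersely.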
Tilde Expansion
I know, I know ... it’s just a convenience. But it’s so very ... well, convenient. In bash:
rcfile=~/.me.rc
In Perl, there’s File::HomeDir, which once-upon-a-time had the vaguely nifty $~, but they went and deprecated it. Yeah, I’m sure it was a perfectly awful idea for multiple reasons. But it was a lot more convenient than:
use File::HomeDir;
my $rcfile = File::HomeDir->my_home . "/.me.rc";
And that’s without even going all Path::Class on it, for portability (not that I’m likely to care much about having most of my personal job control scripts run on Windows or whatnot). Yet another minor place where Perl just gives me more to type without significantly increasing any functionality I might actually use.
Now don’t get me wrong: Perl still beats the crap out of bash for most applications. Reasons I might prefer Perl include (but are not limited to):
- It’s going to be faster. Mainly because I don’t actually have to start new processes for many of the things I want to do (basename and dirname being the most obvious examples, but generally cut, grep, sort, and wc can all be eliminated as well).
- String handling in bash is rudimentary at best, and the whole $IFS thing is super-clunky.
- Conditionals in shell scripts can be wonky.
- Quoting in shell scripts can be a nightmare.
- bash’s case statement leaves a lot to be desired beyond simple cases (NPI).
- Arrays in bash suck. Hashes in bash (assuming your bash is new enough to have them at all) suck even harder.
- Once processing files or command output goes beyond the simple case I listed above, Perl starts really smoking bash.
- CPAN.
So it’s not like bash is going to take over for Perl any time soon. But I still find, after all these years, that a simple shell script can sometimes be simpler than a simple Perl script. As I say, I welcome all attempts to convince me otherwise. But, then again, there’s nothing wrong with having a few different tools in your toolbox.
I too find myself using bash for most of my scripting needs, and I've wondered whether it's mostly inertia as well.
Part of my inertia is my $USRLIB/common.sh, which is a small function library that makes my shell scripts look like lots of this:
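(Something along these lines, presumably; the exact syntax here is a guess based on the description that follows:)

insist -f "$config"  : "no config file at $config"
insist -d "$workdir" : "work directory $workdir missing"
insist -n "$dbname"  : "no database name given"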
where insist runs the test and, if the test fails, prints the message following the colon and exits with a non-zero error code.

The other way scripts happen for me is when I start off with something on the command line, possibly add it to my .bashrc, and over time it grows, possibly ending up as a script in my bin directory.
newest started off that way. All it does is report the newest 10 files in the current directory, unless you include a different number or give a different directory or filespec. It started out as a simple alias to an ls | awk | tail pipe, which has become an 85-line function in .bashrc. It seems as fast as ls, so there's no need to rewrite it.
Perl code tends to happen when I'm actually planning it. And I almost always use perl and DBI for database access that isn't obviously one-off viewing. But I think that's inertia on my part too.

My ~/bin directory breakdown looks like this
Your other blog wouldn't let me comment, but I would have written:
-----
Or you could use zsh and get the best of both worlds. Or even better. It might not be worth it for you if you are happy with what you have, and you don't miss what you never had, but I don't regret switching to zsh 20-odd years ago.
-----
But while I'm here, for File Timestamp Comparisons you might like to look at the -M, -A and -C tests (perldoc -f -x).
And for Tilde Expansion, rcfile=~/.me.rc --> my $rcfile = <~/.me.rc>
I prefer to use Perl for my scripting needs, but if I have to call a lot of external programs, then I use bash(1).
I tend to use the shell for manipulation at the file system level: moving files around, deleting, creating directories, changing permissions, etc. Anything beyond that, like dealing with the data inside the files, I'll almost always go to Perl. Or, to put it another way: if it needs a regex or arithmetic or more than a single variable, I'll use Perl.
I am the new maintainer of Zoidberg, and I wonder if you might want to take it out for a spin. No, I don't think it's going to replace bash, but you might be interested in a Perl shell just as another option. http://p3rl.org/Zoidberg
The shell has one other killer feature that Perl lacks (even on the CPAN!)... Process Substitution.
But I'm working on that right now...
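For anyone unfamiliar with it, process substitution lets a command's output stand in for a file name. A quick sketch (the file names are hypothetical):

# compare the sorted output of two commands, no temp files needed
diff <(sort file1) <(sort file2)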
Thanks for all the comments guys!
@Paul: Yeah, it may be too late for me to switch at this point, but perhaps I'll give zsh another look.

I am familiar with -M and its cousins, but that doesn't do the same thing as -ot/-nt. -M compares a file time to the script time, whereas -ot compares two arbitrary files (one of which could be $0 if you like).

Good tip on the globbing angle brackets tho ... I'd never thought about the possibility that they might do tilde expansion.
@butteredham: Yeah, I think that pretty much sums up how I feel as well.
@Joel: You know, I've glanced at the Zoidberg docs many times, but never actually taken it out for a spin. Perhaps I'll give it a shot. Thanx for the tip!
I also use Perl when I need to do anything more than the simplest command-line parsing. Getopt::Long just makes it too easy to give up.
You could always use parseopt from git for shell ... if only it were a separate library.
@Mike and @jnareb:
For options parsing, I use a block similar to the following, which I just copy-n-paste into pretty much every bash script I write:
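(A representative sketch; the specific flags, -v and -f, are invented for illustration:)

# getopts handles stacked short flags (e.g. -vf) automatically
verbose= force=
while getopts ":vf" opt
do
    case $opt in
        v)  verbose=1 ;;
        f)  force=1 ;;
        *)  echo "usage: $0 [-v] [-f]" >&2; exit 2 ;;
    esac
done
shift $(( OPTIND - 1 ))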
Now, admittedly, this only deals with short (stackable) flags, but that's generally good enough for me. I'm an old-school *nixite who believes that ls as an abbreviation for "list" was a good idea. :-D

Awesome write-up and comparison. Now I am clear about the extent to which I need to put my efforts into learning Perl, and when I should attempt writing a script in Perl.
Thanks Buddy.
~Dinesh
About "Processing Job Output Lines":
In bash you cant set IFS=$'\n' instead of IFS="
"