Perl vs Shell Scripts

Last week, I posted on my Other Blog about how I still prefer to use tcsh for my interactive shell.  Of course, I maintained that bash was the only real choice for shell scripts.

But then this brings us to another interesting point.  I, of course, am a Perl programmer.  A choice between Perl and shell scripts is not like a choice between C++ programs and shell scripts.  Back when I was a C++ programmer, there was no question but that some tasks should be done in bash (or, actually, ksh, which is what I was using back in those days).  But Perl is quite different: not only can Perl do anything that bash can do (that’s true of C++ as well), but Perl can do it just as easily.  Perl is often considered a scripting language, and, while we could argue about whether that’s true (Perl is compiled, after all, while bash is not), we can’t (and shouldn’t) deny that deploying a Perl program is as easy as deploying a shell script, and that’s part of what being a “scripting language” entails.

But, in the end, I still, sometimes, choose to use bash over Perl, for certain tasks.  I suppose you could argue there’s a certain amount of inertia involved: I got used to doing certain types of things with shell scripts back when my only other (viable) option was C++, or maybe awk.  But the comparison to awk is quite appropriate.  Before I learned Perl, I used awk a lot.  Nowadays ... hardly ever.  I certainly would never use awk inside a shell script: it would always be Perl there.  At the command line, I still occasionally type awk when I mean perl, but it happens less and less often, and more and more I find myself just giving up on awk before I even get to the end of the line.  Perl just really completely replaces awk.

But not bash.

Recently I was doing a personal scripting task (it involved fiddling around with MP3s, if you’re curious).  I started out doing it in bash, and then ended up ripping it apart about halfway through and starting over in Perl.  I had just made a bad decision on that particular task.  But, while I was cursing myself out for not just using Perl in the first place, it occurred to me that maybe I ought to try to articulate the places where bash really is (or might be) better.  If I had a checklist, maybe I could more easily identify where to put my efforts from the get-go.  If I had a checklist, and I posted it here on this blog, maybe even you other Perlites would come along and tell me why I’m wrong, and maybe I’d learn something. ;->

For the impatient, the executive precis is this: I generally write bash scripts for tasks which are essentially job control scripts.  Yes, Perl can call external programs just as well as any shell script can, but there are a few things bash gives us which Perl doesn’t.  This is not surprising, really: bash (as ksh before it, and the venerable sh before that) was basically invented for doing job control.  What sh lacked in that department, csh filled in, and then ksh and bash backported.  Perl has other foci.  Personally, I’m okay with Perl not being the answer for every job.

So, let’s take a look at the (few) places where bash beats Perl:

Job Failure

If I want to run a command in bash, I simply do it, like so:

run some command
In Perl, I’d have to do it more like so:
system("run some command");
It’s a bit more typing, sure, but that’s not the real problem.  The real problem is that, if the bash version has a problem—command not found, not enough memory, process table full—it stops and throws an error.  The Perl version just blithely keeps going.  Now, these days the situation is better than it used to be, because I can do this:
use autodie qw< :all >;
system("run some command");
And that works as well as the bash version.  Except, what if I care whether the command succeeded or not?  Here’s the bash version:
if ! run some command
then
    some recovery command
fi
In Perl, perhaps the best we can do is this:
use autodie qw< :all >;
use Try::Tiny;                  # TryCatch is nicer, but more overhead

try
{
    system("run some command");
}
catch
{
    system("some recovery command");
};                              # do NOT forget this semi-colon!
That’s a lot more typing, and probably not as clear either.  And clarity is maintainability, as we know.


Commands on Exit

In Perl, if you want to run something when your program exits, no matter where it’s exiting from, you can do this:

END
{
    system("run some command");
}
The equivalent in bash looks like this:
trap "run some command" EXIT
Except the bash command actually does what you want.  That command in the bash script gets run no matter what.  Normal exit, error exit, explosion, Ctrl-C, core dump ... unless someone does a kill -9 on you, you’re pretty much guaranteed to get your command run.  Not so in Perl.  In fact, the perlmod man page has this to say on the topic:
(But not if it’s morphing into another program via “exec”, or being blown out of the water by a signal—you have to trap that yourself (if you can).)
Lame.


Now, admittedly, the vast majority of times I use trap in this fashion, it looks like this:

trap "/bin/rm -f $tmpfile" EXIT
and, if that were Perl, I’d be using File::Temp, and then I wouldn’t have to worry about removing my tempfile.  But, then, I don’t think File::Temp handles signals either.  Overall, trap is a much easier way to deal with signals than Perl’s %SIG hash, although I have to admit that I’ve never written a trap statement that didn’t end with EXIT.
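
Just to prove the point, here’s a tiny self-contained demonstration that the EXIT trap really does fire on the error paths (the file name is illustrative; a real script would use mktemp):

```shell
#!/bin/bash
# Demonstration: the EXIT trap fires even when the script dies partway.
# (A real script would use mktemp; the name here is just for show.)

tmpfile=/tmp/trap-demo.$$
touch "$tmpfile"

# The subshell stands in for a script that sets the trap and then fails:
(
    trap '/bin/rm -f "$tmpfile"' EXIT
    false               # simulate a failing command
    exit 1              # error exit ... the trap still runs
)

if [ -e "$tmpfile" ]; then echo "still there"; else echo "cleaned up"; fi
```

Run it and you get “cleaned up”: the temp file is gone despite the non-zero exit.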


Processing Job Output Lines

If I know my output lines won’t have any spaces in them, I’m golden:

for line in $(run some command)
do
    process each "$line"
done
That’s a good bit simpler than the equivalent Perl:
use autodie qw< :all >;

open(PIPE, "run some command|");
while ( <PIPE> )
{
    chomp;
    system(qw< process each >, $_);
}
close(PIPE);
The only problem I have in bash is if my lines might have spaces.  That complicates the shell script version to where it’s not particularly better than the Perl:
OIFS="$IFS"
IFS="
"
for line in $(run some command)
do
    process each "$line"
done
IFS="$OIFS"
Still, the simple case is often sufficient.


Here Documents

Sure, Perl has “here documents.” But they’re different.  In Perl, a here doc defines a string.  In shell scripts, it defines STDIN.  So, in bash, I could say:

mysql <<END                       # assume ~/.my.cnf is set up
    select count(*) from some_table;
END
whereas in Perl, it would be:
use autodie qw< :all >;

open(PIPE, "| mysql");
print PIPE <<END;
    select count(*) from some_table;
END
close(PIPE);
Of course, for this particular example, I could just use DBI instead, but I generally find that to be more of a PITA than I want to deal with for a quick script.


File Equivalencies

I have no idea why Perl doesn’t have something like this.  Here’s some bash I’ve needed on several occasions:

if [[ ! $(dirname $0) -ef $(pwd) ]]
then
    echo "must run this from its home dir" >&2
    exit 1
fi
Until recently, this was stupidly difficult to replicate.  The Cwd module includes a realpath function, but its original implementation only worked on directories (leading to a number of subs in my Perl code named really_realpath).  Finally that was fixed, making it easier.  Nowadays, I’d probably use Path::Class to do this in Perl:
use Path::Class;

if (file($0)->dir->absolute->resolve ne dir()->absolute->resolve)
{
    die("must run this from its home dir");
}
which ... well, actually, now that I look at it, isn’t so bad, although awfully verbose.  But the bash version reads a lot more cleanly.


File Timestamp Comparisons

This one doesn’t come up that often, but, still.  In bash I can do:

if [[ $last_run -ot $touchfile ]]
then
    do it again
    touch $last_run
fi
In Perl, I’d have to do the stat calls myself and pluck the mtime out of the return array (and I always have to look up which element it is) ... moderately irksome.  The bash version is just cleaner.
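
To make that concrete, here’s a runnable sketch of the -ot (“older than”) test; the backdated mktemp files stand in for real state files, and note that touch -d is GNU-specific:

```shell
#!/bin/bash
# Sketch of bash's -ot file test, using throwaway files.
# touch -d is GNU coreutils; on BSD you'd reach for touch -t instead.

last_run=$(mktemp)
touchfile=$(mktemp)
touch -d '1 hour ago' "$last_run"    # backdate the last-run marker

if [[ $last_run -ot $touchfile ]]
then
    echo "needs re-run"
    touch "$last_run"                # now it's current again
fi

[[ $last_run -ot $touchfile ]] || echo "up to date"
rm -f "$last_run" "$touchfile"
```

The first check fires (“needs re-run”), and after the touch the second one doesn’t (“up to date”).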


Tilde Expansion

I know, I know ... it’s just a convenience.  But it’s so very ... well, convenient.  In bash:

rcfile=~/.me.rc
In Perl, there’s File::HomeDir, which once-upon-a-time had the vaguely nifty $~, but they went and deprecated it.  Yeah, I’m sure it was a perfectly awful idea for multiple reasons.  But it was a lot more convenient than:
use File::HomeDir;

my $rcfile = File::HomeDir->my_home . "/.me.rc";
And that’s without even going all Path::Class on it, for portability (not that I’m likely to care much about having most of my personal job control scripts run on Windows or whatnot).  Yet another minor place where Perl just gives me more to type without significantly increasing any functionality I might actually use.


Now don’t get me wrong: Perl still beats the crap out of bash for most applications.  Reasons I might prefer Perl include (but are not limited to):

  • It’s going to be faster.  Mainly because I don’t actually have to start new processes for many of the things I want to do (basename and dirname being the most obvious examples, but generally cut, grep, sort, and wc can all be eliminated as well).
  • String handling in bash is rudimentary at best, and the whole $IFS thing is super-clunky.
  • Conditionals in shell scripts can be wonky.
  • Quoting in shell scripts can be a nightmare.
  • bash’s case statement leaves a lot to be desired beyond simple cases (NPI).
  • Arrays in bash suck.  Hashes in bash (assuming your bash is new enough to have them at all) suck even harder.
  • Once processing files or command output goes beyond the simple case I listed above, Perl starts really smoking bash.
  • CPAN.
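
As a footnote to that first point: bash parameter expansion can dodge a couple of those fork/execs itself (basename and dirname, at least), though it never approaches Perl’s full in-process string handling.  A quick illustration (the path is just an example):

```shell
#!/bin/bash
# Each $(basename ...) / $(dirname ...) costs a fork+exec; the
# parameter expansions below give the same answers in-process.

path=/home/buddy/music/track01.mp3

echo "$(basename "$path")"    # external command: track01.mp3
echo "${path##*/}"            # expansion, no new process: track01.mp3
echo "$(dirname "$path")"     # external command: /home/buddy/music
echo "${path%/*}"             # expansion: /home/buddy/music
```

For cut, grep, sort, and wc, though, there’s no expansion trick: the shell has to spawn them, and Perl doesn’t.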

So it’s not like bash is going to take over for Perl any time soon.  But I still find, after all these years, that a simple shell script can often be simpler than a simple Perl script.  As I say, I welcome all attempts to convince me otherwise.  But, then again, there’s nothing wrong with having a few different tools in your toolbox.

12 Comments

I too find myself using bash for most of my scripting needs and have wondered too if it's mostly inertia.

Part of my inertia is my $USRLIB/common.sh which is a small function library that makes my shell scripts look like lots of this:
declare -r kindleDrive="/Volumes/$device_name"
insist -a "$kindleDrive" : "$device_name is not mounted."
insist -d "$kindleDrive" : "'$kindleDrive' is not a disk Volume!"
  
declare -r target="$kindleDrive/kindle"
insist -a "$target" : "Target '$target' does not exist."
insist -d "$target" : "Target '$target' is not a directory."
insist -w "$target" : "Target '$target' is not writable."
where insist runs the test, and if the test fails prints the message following the colon and exits with a non-zero error code.
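
(A minimal sketch of what such an insist helper might look like; this is purely hypothetical, and the real common.sh surely differs:)

```shell
#!/bin/bash
# Hypothetical reconstruction of an insist helper: everything before
# the ":" is handed to test, and on failure the message after the ":"
# goes to stderr and the script exits non-zero.

insist() {
    local cond=()
    while [[ $# -gt 0 && $1 != : ]]; do
        cond+=("$1")
        shift
    done
    shift                          # drop the ":"
    if ! test "${cond[@]}"; then
        echo "$*" >&2
        exit 1
    fi
}

insist -d /tmp : "/tmp is not a directory!"
echo "checks passed"
```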

The other way scripts happen for me is when I start off with something on the command line, possibly add it to my .bashrc, and over time it grows, possibly ending up as a script in my bin directory. newest started off that way. All it does is report the newest 10 files in the current directory. Unless you include a different number or give a different directory or filespec. It started out as a simple alias to an ls/awk/tail pipeline, which has become an 85-line function in .bashrc. It seems as fast as ls so there's no need to rewrite it.

Perl code tends to happen when I'm actually planning it. And I almost always use perl and DBI for database access that isn't obviously one-off viewing. But I think that's inertia on my part too.

My ~/bin directory breakdown looks like this
for f in *; do if [[ $f =~ \. ]]; then echo "${f##*.}"; fi; done | sort | uniq -c
   3 osa
   5 pl
  18 sh

Your other blog wouldn't let me comment, but I would have written:

-----
Or you could use zsh and get the best of both worlds. Or even better. It might not be worth it for you if you are happy with what you have, and you don't miss what you never had, but I don't regret switching to zsh 20-odd years ago.
-----

But while I'm here, for File Timestamp Comparisons you might like to look at the -M, -A and -C tests (perldoc -f -x).

And for Tilde Expansion, rcfile=~/.me.rc --> my $rcfile = <~/.me.rc>

I prefer to use Perl for my scripting needs but if I have to call a lot of external programs, then I use bash(1).

I tend to use the shell for manipulation at the file system level: moving files around, deleting, creating directories, changing permissions, etc. Anything beyond that, like dealing with the data inside the files, I'll almost always go to Perl. Or to put it another way: if it needs a regex or arithmetic or more than a single variable, I'll use Perl.

I am the new maintainer of Zoidberg, and I wonder if you might want to take it out for a spin. No, I don't think it's going to replace bash, but you might be interested in a Perl shell just as another option. http://p3rl.org/Zoidberg

The shell has one other killer feature that Perl lacks (even on the CPAN!)... Process Substitution.
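
(Process substitution being the <(command) syntax, which hands a command's output to anything expecting a file name; a quick example:)

```shell
#!/bin/bash
# Process substitution: <(command) stands in for a file name, so
# tools that insist on files can read straight from pipelines.

diff <(printf 'a\nb\n' | sort) <(printf 'b\na\n' | sort) \
    && echo "same lines after sorting"
```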

But I'm working on that right now...

I also use Perl when I need to do anything more than the simplest command-line parsing. Getopt::Long just makes it too easy to give up.

You can always use parseopt from git for shell... if it were a separate library.

Awesome write-up and comparison. Now I am clear about how far I need to put my efforts into learning perl, and when I should attempt writing a script in perl.

Thanks Buddy.

~Dinesh

About "Processing Job Output Lines":

run some command | while read line
do
    process each "$line"
done


About Buddy Burden

7 years in California, 17 years in Perl, 26 years in computers, 46 years in bare feet.