Programming Archives

Elapsed Time in Test Output

I’ve been writing tests for a web service in which one of the methods would occasionally take an inordinate amount of time. The amount of time was inconsistent and it would be useful to capture some timing data to provide the vendor. But, I didn’t want to litter my test files with code to calculate the elapsed time. Eventually, the bugs would be worked out and the timing information would just be noise.

So I wanted to somehow hook into the test framework itself to give me the timing data.

Repeated Capturing and Parsing

An interesting query was recently posted to the internal Perl mail list at work. The questioner was trying to match a pattern repeatedly, capturing all of the results in an array. But, it wasn't doing quite what he expected. The message, with minor edits, went a little something like the following.

I'm trying to extract key/value pairs from a file with the following contents:
- name = gcc_xo_src_clk, type = rcg
+ name = cxo_clk, type = xo, fgroup = xo, wt = 10, bloo = blah
? type = hm_mnd_rcg, name = bo : type = rcg_mn
+ name = pxo_clk

I was hoping to do something like this:

@list = $_ =~ m{ ^[-+?] \s* (\S+) \s* = \s* (\S+) \s* (?:, \s* (\S+) \s* = \s* (\S+) \s*)* }xms;

Thinking @list would be assigned the alternating key/value pairs. But the above doesn't extract anything sane. Adding the /gc modifiers doesn't make any difference.

If I do the following, it extracts the first two key/value pairs correctly (if the line has more than one pair).

@list = $_ =~ m{
    ^[-+?] \s* (\S+) \s* = \s* (\S+) \s*
    , \s* (\S+) \s* = \s* (\S+) \s*

If I keep repeating the pattern in the second line, it keeps matching more key/value pairs.

I would expect using (?: )* should mean zero or more instances of match inside the parentheses, but obviously it's not working. What am I doing wrong?

When I'm presented with a problem like this, that is some kind of structured data, I immediately think of writing a parser. I'll get back to that in a bit, but I wanted to address the confusion about capturing in the pattern. And, in fact, that's how the discussion on the mail list proceeded.

Automating On Call Jury Instructions

Early in February, I received a jury summons for the United States District Court, Southern District of California. Prospective jurors for federal jury service (at least in this court) are placed on call for a period of about 30 days. I was to call for instructions on April 1 and potentially proceed to do so periodically until May 4 (assuming I wasn’t instructed to report).

Since my initial instruction date was nearly two months away, I created an entry for it in Google Calendar, and promptly forgot about it. On Monday, April 2, I was riding the train to work when I realized that I hadn’t remembered to check my instructions. Fortunately, after arriving at my office and checking my instructions, I had been deferred to the next day.

So I added a new entry in Google Calendar, this time with an SMS reminder. I proceeded to do this for most of April, checking my instructions and duplicating the calender entry with another SMS reminder.

I’m embarrassed to admit that it wasn’t until the last week of April that it occurred to me that I could automate the whole process. After all, isn’t automating drudgery the whole reason I ended up programming Perl in an engineering support group at my day job?

For Want of a Newline

Recently I had the pleasure of spending three hours debugging an obscure bug. An obscure bug I caused by introducing a newline. That little punk, 0x0A.

I released a new version of a command line program. For me, it’s an elegant piece of work, combining a marvelously complex-but-intuitive configuration for system administrators with an absolutely simple interface for users. To use the command, the user runs it with a couple of arguments and it prints out a single line of useful text derived from that marvelously complex configuration.

But, it doesn’t print a newline.

It’s never printed a newline. The original author didn’t include one for some reason. Anyone who has ever encountered a command like this knows well my irritation.

my awesome prompt> some_lame_command
my awesome prompt>e answer


The workaround I’ve seen used, after seeing the above is to face-palm, then run the command again, only differently.

my awesome prompt> echo `some_lame_command`
42 is obviously the answer
my awesome prompt>

I’m embarrassed when users see behavior like this from a program I wrote. Being the arrogant bastard programmer that I am, I decided to fix this. Since all commands print newlines, everyone should already be assuming that this one does too and should already be handling it in the proper manner. Right? When writing a shell script, the distinction between newline-printing and non-newline-printing commands is irrelevant. In either Bourne shell:


Or C shell:

setenv FROBBED `frobnosticate`

The shell is benevolent enough to remove the newline, if it exists. After all, this is the most commonly desired behavior when assigning command output to a variable. However, things are a bit different when switching to a so-called real programming language, like Perl:

$ENV{FROBBED} = `frobnosticate`;  # Caution, newline ahead!

Sure, it looks more or less the same, but veteran Perl programmers will immediately grimace when reading the above. Unlike the shell, Perl, like other programming languages, will preserve the output of the command verbatim. In this case, preserving data and letting the programmer decide how to use it is the most commonly desired behavior. Since everything coming from an external command ends with a newline, the environment variable being set in this case will have a newline. This will almost always cause a problem. One that, as I’ve learned, is not always easy to find. Since stripping input of newlines is just as common as the desire to preserve data, Perl makes this easy and most Perl programmers will habitually write it.

chomp( $ENV{FROBBED} = `frobnosticate` );

Now it doesn’t matter if the command prints a newline or not, the chomp builtin has your back. It’s just like being in the warm embrace of the shell, only with a little extra syntax.

So it turns out that one of the engineering groups I support was using a Perl script that set an environment variable as in the first example. The value of this environment variable was then being passed off to the batch system and used by an engineering program as a network address to connect to. Of course, the program made the fatal mistake of trusting user input and, in a spectacular fashion, failed to connect to the server whose name just happened to contain a newline.

After chasing down a couple of red herrings which left me flummoxed, one of the affected users shared with me an error log and the script that generated it. There, in all its syntax highlighted, monospaced glory was the environment variable being set without any attempt made to trim off the newline. I immediately swallowed my pride and released an updated version of the command reverting the newline behavior and the problem went away. My engineers—at least, the subset using this particular script in this particular way—could once again get their work done.

By far, this isn’t the worst thing I’ve done to our batch system. One time I caused all jobs executing on Solaris hosts to immediately fail. Whoops.

Anyway, what’s the lesson to be learned from today’s experience?

Never—and I’ll repeat that, never—assume everyone will be doing the right thing (remember what they say when you assume something). Inevitably, someone won’t be.

There’s a corollary to today’s lesson. When coming across something that could be improved with a small change, don’t. Seriously, just don’t. Inevitably, someone will be depending on the current behavior, no matter how right or wrong it may seem. This is why the phrase “bug compatible” exists in the lexicon.

Ninja Code

As this is only my second post—my first merely being a plug for a Damian Conway talk at my local Perl Mongers meeting—I thought I’d keep it short and silly.

At OSCON 2008, Amazon had a prominent booth advertising heavily that they were hiring. They didn’t want to hire just anyone. No, the question posed on the large sheet of poster board was, “Are you a ninja coder?” This was combined with a raffle, which as anyone who has ever been to a conference knows, is the most common tool used to get people to hand over their contact information. To enter, one had simply to look over some Perl code written out on that same poster board and tell them what it did. It looked a little something like this:

my $code = qq{
    print 1+1 . "\n";
    $code =~ m/(\d+)\+(\d+)/;
    $new = $1 + $2;
    $code =~ s/\d+\+(\d+)/$2+$new/;

for ( 1 .. 10 ) {

Do you see the bug? It’s actually besides the point, but it’s there.

Yes, it should use q{}, or the variables will interpolate on the initial assignment to $code. To the credit of those staffing the booth, when I pointed this out they responded that the original version used single quotes, but people told them it was too hard to read.

I wasn’t content with just figuring out what the code did and fixing a small bug. I think the code itself can be improved upon. So I hacked something up between sessions:

eval($code = q{
    print 1+1 . "\n";
    $code =~ s/(\d+)(\+)(\d+)/"$3$2" . ($1 + $3)/e;
    eval $code;

Much better. Not only is it more concise, I was able to remove that pesky loop, so I wouldn’t be bothered by any silly upper bounds. Except for, you know, overflows and stuff. What’s more, it’s given me something to scare interview candidates with.

About Chris Grau

user-pic I use Perl to support an EDA engineering organization. Occasionally I write about the things I do with Perl.