IPC::System::Simple: A success story

system seems like such a simple thing to use. You tell it which command to run, and it runs it. When it's done, you continue with your program, or so you think. I recently ran into a problem with this: I was creating a lot of zombies. Dread Pirate Fenwick came to the rescue, and not for the first time.

As part of my DPAN work, I'm indexing every distribution in BackPAN. I collect loads of data on every distribution so I can make a searchable catalog of it. When you want to use DPAN to create private CPAN repositories, you shouldn't have to re-index distributions where we already know the answer. I can provide most of the answers pre-cataloged so you can focus on only the novel distributions in your repository. I have a tarball of sample results for 16,000 distributions.

My indexing has one feature that PAUSE doesn't: I run the distribution code. PAUSE does various things to guess what a distribution contains (and in rare cases guesses incorrectly). I run the build file, look in blib, and use PPI to extract program elements.

Since I'm indexing every distribution, I have to deal with every goofy thing that someone has ever done in a Makefile.PL, all the way back to 1994. That includes things that prompt for information without using MakeMaker's prompt function, which knows how to deal with non-interactive installations:

# Makefile.PL

print "Tell me something> ";
my $answer = <STDIN>;   # waits forever when nobody is there to type
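For contrast, here is a minimal sketch of the same question asked through ExtUtils::MakeMaker's prompt function. The ask_something wrapper and the default answer 'nothing' are made up for illustration. prompt returns the default without waiting when PERL_MM_USE_DEFAULT is true, or when STDIN is not a terminal and is already at end of file:

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

# Hypothetical wrapper around the question from the Makefile.PL above.
# prompt() prints the message and the default, then reads an answer only
# if there is someone (or something) on STDIN to give one.
sub ask_something {
    return prompt( 'Tell me something>', 'nothing' );
}
```

An unattended run could set PERL_MM_USE_DEFAULT=1 in the environment before calling ask_something() so that every well-behaved Makefile.PL takes its defaults. That does nothing for a Makefile.PL that reads STDIN directly.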

That's not a huge deal, or shouldn't be. As I index, I have a timeout value that I enforce with alarm:

# Adapted from MyCPAN::Indexer::Worker
local $SIG{ALRM} = sub { die "Alarm rang for $dist_basename!\n" };
$logger->debug( "Examining $dist_basename" );
alarm( $config->alarm || 15 );               # start the timeout
my $info = eval { $Indexer->run( $dist ) };  # the handler's die ends up in $@
my $at = $@;                                 # save $@ before another call can reset it
alarm 0;                                     # cancel the timer as soon as run() is done
chomp $at;
$logger->debug( "Done examining $dist_basename" );

Eventually in that run(), I'd call system so I could create blib:

system( $^X, 'Makefile.PL' );

For some reason that I don't care to investigate fully, when the alarm would trigger for these prompting Makefile.PLs, I would leave behind a zombie process. That's not terribly bad, but I'm doing this for 170,000 distributions. I discovered, at least on FreeBSD 8, that there is a limit to the number of zombies I can have, and it's around 25 or so. That might be some sort of resource limit that ulimit isn't telling me about. Once I made another zombie, everything hung. That would happen about 15 minutes after I started the process and had already gone off to bed. Those overnight runs didn't get much done.
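The cause of the zombies was never pinned down here, so the following is only a sketch of one general way to keep a timed-out child from lingering when you manage the process yourself: fork, wait with an alarm, and if the alarm fires, kill the child and wait for it again so the kernel can release it. The run_with_timeout helper and its return structure are hypothetical and are not part of MyCPAN::Indexer or IPC::System::Simple.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run a command with a timeout and always reap the child.
sub run_with_timeout {
    my( $timeout, @command ) = @_;

    my $pid = fork;
    die "Could not fork: $!\n" unless defined $pid;

    if( $pid == 0 ) {
        # Child: become the command, or leave immediately if exec fails
        exec { $command[0] } @command
            or POSIX::_exit( 127 );
    }

    my( $reaped, $killed, $status ) = ( 0, 0, undef );
    my $finished = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        waitpid( $pid, 0 );     # normal case: the child exits on its own
        $status = $?;
        $reaped = 1;
        alarm 0;
        1;
    };
    my $error = $@;
    alarm 0;

    unless( $reaped ) {
        kill 'KILL', $pid;      # stop the stuck child...
        waitpid( $pid, 0 );     # ...and collect it so no zombie is left
        $status = $?;
        $killed = 1;
    }

    # Pass along any failure that was not our own timeout
    die $error if !$finished && $error ne "timeout\n";

    return { timed_out => $killed, status => $status };
}
```

Calling run_with_timeout( 15, $^X, 'Makefile.PL' ) would return a hash reference whose timed_out field says whether the child had to be killed, and whose status field holds the raw wait status, in the same form as $?.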

I did various things to try to reproduce this on a small scale, but eventually decided to stop trying to figure it out and start trying to solve it. So, instead of system, I switched to IPC::System::Simple. Paul Fenwick has done a lot of work to make things work properly, so I figured some of that would chop the heads off these zombies. I think it just might have. Since IPC::System::Simple has its own system, I really just needed to load the module so I could replace the built-in version.

use IPC::System::Simple qw(system);
system( $^X, 'Makefile.PL' );
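One difference from the built-in is worth knowing: the IPC::System::Simple functions throw an exception when the command fails or returns a non-zero exit value. A short sketch of that behavior, using the module's systemx (the variant that never goes through the shell) and its EXIT_ANY constant. The one-line child programs here are stand-ins for a real Makefile.PL:

```perl
use strict;
use warnings;
use IPC::System::Simple qw(systemx EXIT_ANY);

# By default, a non-zero exit value turns into an exception
my $ok    = eval { systemx( $^X, '-e', 'exit 2' ); 1 };
my $error = $@;
print "Command failed: $error\n" unless $ok;

# EXIT_ANY accepts every exit value and hands it back to the caller
my $exit = systemx( EXIT_ANY, $^X, '-e', 'exit 2' );
print "Exit value was $exit\n";
```

In the worker shown earlier, the call happens inside an eval, so a Makefile.PL that exits with an error would show up in $@ rather than stopping the whole run.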

Now everything works. At the moment I don't care how or why, as long as it's churning out indexing reports.

About brian d foy

I'm the author of Mastering Perl, and the co-author of Learning Perl (6th Edition), Intermediate Perl, Programming Perl (4th Edition) and Effective Perl Programming (2nd Edition).