Minimizing script startup overhead

By Steven Haryanto on November 24, 2011 5:16 PM

Here are some of the techniques I use to minimize startup overhead. Due to the nature of our application (command-line tools), we would like quick responses every time a script is called from the shell.

1) Avoid Moose. Mouse used to be a bit too heavy for my taste too, so we always use Moo, but nowadays I guess Mouse is okay, especially since Mouse provides many more features. In general, though, I still avoid OO style, or use bare Perl 5 OO, for simpler apps.

2) Avoid heavy modules and use lighter variants. I usually run this script when trying out modules, and take its result into consideration. I used to choose CGI::Lite over CGI for this reason, though nowadays I very seldom have to resort to CGI directly (thanks to PSGI/Plack).

3) Delay loading of modules as much as possible. Some modules are only needed by a subroutine in a using-module, so instead of:

    use Foo;
    use Bar;

I usually write:

    # Foo is used by lorem only
    sub lorem {
        require Foo;
        my ($arg1, $arg2) = @_;
        ...
        Foo::foo();
    }

    # Bar is used by ipsum only
    sub ipsum {
        require Bar;
        ...
    }

Sometimes a module is used by two or three subroutines, and I pepper each with the require statements. I think this is still okay.

4) Use AutoLoader, or a similar home-brewed technique based on the AUTOLOAD mechanism.

5) Split larger modules into smaller ones. Of course, splitting is mainly decided based on design, but sometimes this is done for reducing startup overhead too.

6) Use a persistent Perl environment, e.g. PPerl or SpeedyCGI. Actually I have had little success so far with this approach, because of some pitfalls and restrictions you need to be aware of. Years ago I tried to use PPerl with AWStats (which is a monolithic giant Perl script with at least 1-2s of compilation overhead) and it gave me strange results. I'm also contemplating using PPerl or SpeedyCGI when writing a Perl-based suexec replacement for Apache (for greater flexibility in CGI script execution, e.g. adding extra checks and making it work with dynamic virtual hosts), but so far I have had no luck because setuid support is lacking.

Most of these tools are also no longer maintained. For example, Debian has maintained the speedy-cgi-perl package to compile with Perl 5.10 but the upstream version on CPAN no longer compiles with more current Perl versions.
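Returning to technique 3: repeating the require in several subroutines costs little, because require records each loaded file in %INC and does nothing on later calls. A minimal sketch of that behaviour follows. Text::Wrap is only a stand-in core module chosen for illustration, and the names wrap_short, wrap_long and is_loaded are hypothetical.

```perl
use strict;
use warnings;

# Both subs need Text::Wrap (a stand-in for a heavier module), so each
# one requires it on entry instead of relying on a file-level "use".
sub wrap_short {
    require Text::Wrap;                  # loads the module on first call only
    local $Text::Wrap::columns = 20;
    return Text::Wrap::wrap('', '', @_);
}

sub wrap_long {
    require Text::Wrap;                  # no-op if %INC already lists the file
    local $Text::Wrap::columns = 60;
    return Text::Wrap::wrap('', '', @_);
}

# Hypothetical helper: has the module been loaded yet?
sub is_loaded { return exists $INC{'Text/Wrap.pm'} ? 1 : 0 }
```

A script that never calls either subroutine never pays for loading the module, and is_loaded() stays false until the first call.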

What techniques do you use?

7 comments

confuseAcat | November 24, 2011 9:16 PM

Before I go ahead and change "use" into "require", I go and buy new hardware. If a change like that really makes a difference, you should either optimize the module being used/required or admit that you are working on hardware that should have been replaced years ago.

Steven Haryanto | November 24, 2011 10:15 PM

@confuseAcat: Granted, my work PC is about 2 years behind (Athlon64 X2 5600+, 4GB RAM). But here's an example. Just loading DateTime will pull in about 40 files totalling roughly 28K lines. It adds up to about 0.11s of compilation time. When you run a script that needs to do bash autocomplete, these multiples of 0.1s are *quite* noticeable.
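One way to obtain figures like these is to compare %INC before and after a require and time the call. The following is only a sketch under that assumption, not the script the post refers to; load_cost and its result keys are hypothetical names.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper (not from the post): require one module and report
# how long it took and how many files and lines it pulled in.
sub load_cost {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;

    my %before = %INC;                    # files already loaded
    my $t0     = time;
    my $ok     = eval { require $file; 1 } ? 1 : 0;
    my $secs   = time - $t0;

    my ($files, $lines) = (0, 0);
    for my $key (grep { !exists $before{$_} } keys %INC) {
        my $path = $INC{$key};
        next unless defined $path && !ref $path && -f $path;
        $files++;
        open my $fh, '<', $path or next;
        while (my $line = <$fh>) { $lines++ }
        close $fh;
    }
    return { module => $module, ok => $ok, seconds => $secs,
             files  => $files,  lines => $lines };
}

# Report on each module named on the command line.
for my $r (map { load_cost($_) } @ARGV) {
    printf "%-20s ok=%d  %.3fs  %d files  %d lines\n",
        @{$r}{qw(module ok seconds files lines)};
}
```

Saved as, say, loadcost.pl, it could be run as "perl loadcost.pl DateTime". Measuring one module per process gives the cleanest numbers, since modules loaded earlier in the same run are not counted again.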

Steven Haryanto | November 24, 2011 10:19 PM

Also, I actually would love doing daemons/persistent scripts, that way I don't have to worry about startup times so much, but sometimes it's hard in Unix. Examples are mail filters (.qmail), CGI scripts in shared hosting (yup, VPS plans are cheap nowadays but still there are millions of shared hosting users), Apache external filters, and many others. Those are all one-off, once-per-request invocations.

Aristotle | November 28, 2011 6:02 PM

Have you actually measured that AutoLoader is an improvement? Multiple trips to the hard disk (at least to its caches) involving separate files (and therefore going all the way down the VFS kernel subsystem rathole): the phrase that comes to mind at all this is: “are you insane?” Once upon a time when memory was very scarce and the relative performance of all the tiers of memory (registers, L1, L2, main memory, disk) was much closer to each other, it may have made sense. But on modern hardware where memory is plentiful, bus speeds are near memory (so loading lots of code from disk at once costs effectively the same as loading a little), and it’s literally millions of times more costly to hit disk instead of RAM, I suspect AutoLoader is almost bound to make things worse, maybe much worse.

Its only win is to delay compilation. In all other respects it makes things slower.

So at most I’d use SelfLoader.

In my refactoring of POSIX.pm, I switched it from AutoLoader to an approach that uses AUTOLOAD to compile on demand code stored as strings in a hash.
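The general shape of that idea can be sketched as follows. This is not the POSIX.pm code; the package name Lazy::Demo, the %source hash and the two sample subs are all hypothetical, chosen only to show AUTOLOAD compiling a stored string on first use.

```perl
package Lazy::Demo;    # hypothetical package name
use strict;
use warnings;

our $AUTOLOAD;

# Source of rarely used subs, kept as strings so that perl does not
# compile them at startup.
my %source = (
    square => 'sub { my ($n) = @_; return $n * $n }',
    greet  => 'sub { my ($who) = @_; return "hello, $who" }',
);

sub AUTOLOAD {
    my $sub_name = $AUTOLOAD;
    $sub_name =~ s/.*:://;
    return if $sub_name eq 'DESTROY';

    my $src = delete $source{$sub_name}
        or die "Undefined subroutine $AUTOLOAD called";
    my $code = eval $src
        or die "Cannot compile $sub_name: $@";

    # Install the compiled sub so AUTOLOAD is not consulted again,
    # then jump to it with the original arguments.
    no strict 'refs';
    *{$AUTOLOAD} = $code;
    goto &$code;
}

# Hypothetical helper: true once the named sub has been compiled.
sub is_compiled {
    my ($sub_name) = @_;
    return __PACKAGE__->can($sub_name) ? 1 : 0;
}

package main;
```

Calling Lazy::Demo::square(7) compiles the string on first use and installs the result in the symbol table, so later calls go straight to the compiled sub. Everything lives in one file, so there is no extra file access per subroutine.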

Steven Haryanto | November 29, 2011 6:54 PM

@Aristotle: Yup, my reason for using AutoLoader is to delay compilation, without resorting to creating new subpackages. We have a few (not many) routines which require heavy modules, like Regexp::Grammars. By delaying loading Regexp::Grammars *and* delaying compilation of RG regexps, quite a bit of startup time is shaved off. Another module which benefits from delayed loading in our case is, as mentioned above, DateTime, since it's quite heavy.

Steven Haryanto | November 29, 2011 6:58 PM

@Aristotle: I guess SelfLoader is nice too, except that I suspect it doesn't work nicely with Dist::Zilla/Pod::Weaver and syntax highlighting out-of-the-box.

Aristotle | November 29, 2011 8:56 PM

Hmm. It should be easy to write a small dzil plugin that will turn a ## SelfLoader ## line or some such into an __END__ token as SelfLoader requires, so that syntax highlighting would not be affected during development.
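The text transformation such a plugin would perform might look like the sketch below. It is a plain function, not a Dist::Zilla plugin, and activate_selfloader and the exact marker pattern are hypothetical. SelfLoader's documentation has modules place their deferred subs after a __DATA__ token (__END__ serves that purpose only in package main), so the sketch emits __DATA__.

```perl
use strict;
use warnings;

# Hypothetical marker line a developer would write during development,
# so editors keep highlighting the subs that follow it.
my $MARKER = qr/^##[ \t]*SelfLoader[ \t]*##[ \t]*$/m;

# Return the source with the first marker line replaced by the token
# that SelfLoader looks for in a module.
sub activate_selfloader {
    my ($source) = @_;
    $source =~ s/$MARKER/__DATA__/;
    return $source;
}
```

A build step would apply this to each module's text before the file is written to the distribution, leaving the development copy untouched.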

About Steven Haryanto

A programmer (mostly Perl 5 nowadays). My CPAN ID: SHARYANTO. I'm sedusedan on perlmonks. My twitter is stevenharyanto (but I don't tweet much). Follow me on github: sharyanto.