use B::Stats to check for less bloat

By Reini Urban on December 9, 2011 7:40 PM

I'm constantly concerned about small modules, or p5p itself, adding more and more bloat, especially through "innocent" dependencies. The authors obviously do not care about memory on small devices or on VMware hosting with 256 MB, and, since Moose is so popular, obviously do not care either about the serious run-time penalty of loading all this cruft for almost no gain.

My top concerns are:

1) use warnings

It depends on warnings::register and Carp, and loads a whole bunch of big, generally unnecessary warnings category hashes. And XSLoader. And B, if compiled. With B::C I introduced -fno-warnings for 5.13.5 to save 68 KB of executable size on 32-bit. Uncompiled, the numbers are of course obscenely higher.
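One way to check such a dependency chain on your own perl is to look at %INC in a fresh interpreter. The following is only a sketch: the helper name modules_loaded_by is made up for this example, and the list it prints depends on the perl version, so no output is shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: run a snippet in a fresh child perl and return
# the files it ended up loading (the keys of the child's %INC).
# A child process is used so this script's own modules are not counted.
sub modules_loaded_by {
    my ($snippet) = @_;
    open my $fh, '-|', $^X, '-e',
        "$snippet; print qq{\$_\\n} for sort keys %INC"
        or die "cannot run $^X: $!";
    chomp(my @mods = <$fh>);
    close $fh;
    return @mods;
}

my @mods = modules_loaded_by('use warnings');
print scalar(@mods), " files loaded by 'use warnings':\n";
print "  $_\n" for @mods;
```

Swapping the snippet for 'use Carp' or any other module gives the same kind of list for that module.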

2) use Carp

This requires XSLoader and also pulls in the huge B, at least when compiled. The biggest hit is the Carp warning when AUTOLOADing an XS module fails: Carp is then dynaloaded to print the call stack. Printing the call stack on dynaloader errors should seriously not depend on DynaLoader; it should be provided by the static libperl, e.g. by pp_caller() alone.

3) swash_init utf8

utf8_heavy is autoloaded whenever perl needs the upper-case/lower-case folding tables. Because we are not ASCII-only anymore, these tables cover all of Unicode, and they are loaded as fat perl tables, not as fast C arrays as e.g. Encode or ICU do. Perl might have the most Unicode features, but it is seriously not yet Unicode-ready enough. There is no heuristic to check whether a string can contain non-ASCII characters at all, and there is no ascii pragma (no utf8 would be the correct name) to prevent these tables from being loaded. With B::C I introduced -fno-fold for 5.13.9 to save 1.6 MB of executable size on 32-bit when utf8 folding is not required.

Anyway, I'm now measuring the size of the optree at certain stages with my new module B::Stats. It unfortunately requires B, which itself includes 14 files, 3821 lines and ca. 4883 ops. I have to subtract this constant overhead, similar to a profiler. I tried, but it makes no sense to reimplement B for B::Stats. I could write everything as XS to get rid of the B overhead.

B::Stats counts all ops statically at compile-time and end-time, to see which run-time loaded modules are added, and also counts the actually performed ops dynamically at run-time.
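To make the static half of this concrete, here is a minimal sketch that counts the ops of a single sub using only the core B module. It is not how B::Stats is implemented: count_ops is a hypothetical helper written for this example, it walks one sub's optree rather than the whole program, and it does no run-time counting at all.

```perl
use strict;
use warnings;
use B qw(svref_2object OPf_KIDS);
use List::Util qw(sum);

# Hypothetical helper: return a hashref { op name => count } for the
# compiled optree of one code reference.
sub count_ops {
    my ($code) = @_;
    my %count;
    my @todo = (svref_2object($code)->ROOT);
    while (my $op = shift @todo) {
        next unless $$op;                  # null pointer, e.g. an XS sub has no ROOT
        $count{ $op->name }++;
        next unless $op->flags & OPf_KIDS; # leaf op, nothing below it
        # Children hang off ->first and are chained through ->sibling.
        for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
            push @todo, $kid;
        }
    }
    return \%count;
}

my $ops = count_ops(sub { my $x = 1 + $_[0]; return $x });
printf "%-12s %d\n", $_, $ops->{$_} for sort keys %$ops;
printf "%-12s %d\n", 'total', sum(values %$ops);
```

The exact op names and totals vary between perl versions, since the optimizer rewrites the tree differently from release to release.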

The B::Stats output also gives you exact size and performance numbers independent of the CPU and machine load, contrary to heavy benchmarks. Of course certain ops are more costly than others; I haven't yet averaged the typical op costs to output better numbers. B::Stats is still at an early stage. The options -l and -f and -C fragmentation are not yet implemented.

It is roughly comparable to Devel::Size. Devel::Size does more: it also includes the size of the data, while B::Stats just counts the ops, no data.

The primary need for B::Stats was to come up with fair numbers of op distributions for benchmarks. A benchmark should not be too slow but should cover the typical run-time cost of the to-be-tested program. So the static and run-time distribution of the ops should be comparable.

Running a single benchmark function 10,000 times is a waste of time if the cost stabilizes and becomes measurable after 5-15 runs, or if the benchmark does not compare to the actual usage. Hence B::Stats.

5 Comments

Mithaldu | December 9, 2011 11:29 PM

Wouldn't it be better to just go with Erlang when you're trying to run something on a phone?

Adam Kennedy | December 12, 2011 12:54 AM

I highly approve of your efforts to debloat the core!

Adam Kennedy | December 12, 2011 1:02 AM

One other possibility for optimisation (although I don't know how big...) is $^O.

There's lots of code all over the core that does if ( $^O eq 'something' ) style things.

Although this is treated as a variable and optimised as a variable, as far as I'm aware the odds of $^O changing at any point are effectively zero, yes?

So adding some kind of constant and switching code over to it would greatly increase the amount of code that could be factored away by the compiler.

Or at least, so it seems.

Reini Urban replied to comment from Adam Kennedy | December 12, 2011 6:31 PM

Whow! This is a very good constant-folding idea.
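A small sketch of what such folding looks like in user code today, assuming the check is moved into a constant by hand: perl inlines constant subs and drops the dead branch at compile time. The names IS_WIN32, path_separator, windows_branch and other_branch are made up for this example; nothing like this is claimed to exist in core.

```perl
use strict;
use warnings;
use B::Deparse ();

# Evaluated once while compiling; IS_WIN32 is then an inlinable constant.
use constant IS_WIN32 => $^O eq 'MSWin32';

sub windows_branch { '\\' }
sub other_branch   { '/' }

sub path_separator {
    if (IS_WIN32) { return windows_branch() }
    else          { return other_branch() }
}

# Deparse the compiled sub: only the branch for the current OS survives.
my $text = B::Deparse->new->coderef2text(\&path_separator);
print $text, "\n";
```

The same test written as if ($^O eq 'MSWin32') keeps both branches in the optree, because $^O is an ordinary variable. The flip side is that a constant freezes the value, so code that assigns to $^O afterwards no longer affects the folded branch.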

pjcj.net | December 14, 2011 2:17 PM

Just to note that changing $^O can be a useful technique in testing. I've also used it to work around bugs, but not for a long time. IIRC, an early Windows port used to identify itself as Irix.

About Reini Urban

Working at cPanel on cperl, B::C (the perl-compiler), parrot, B::Generate, cygwin perl and more guts, keeping the system alive.
