Holy bloat, Batman!
Let's compare the latest constant.pm to a minimal equivalent:
    $ ./perl -Ilib -le 'print $^V'; /usr/bin/time -l ./perl -Ilib -le 'use constant X => 1..5; print X' 2>&1 | grep 'maximum resident'
    v5.17.10
    3829760 maximum resident set size
    $ /usr/bin/time -l ./perl -I/tmp -le 'use constant X => 1..5; print X' 2>&1 | grep 'maximum resident'
    1200128 maximum resident set size

That's 2.6MB bloat to define a constant. (The culprit turns out to be utf8, natch, to handle Unicode constants. (Why, God?!)) For reference, /tmp/constant.pm, which does most useful constant-type stuff, is here:

    package constant;

    sub import {
        shift;
        my $caller = caller;
        if (ref $_[0] eq 'HASH') {
            while (my ($k, $v) = each %{$_[0]}) {
                *{"$caller\::$k"} = sub () { $v };
            }
        } else {
            my $k = shift;
            my @vals = @_;
            *{"$caller\::$k"} = sub () { @vals };
        }
    }

    1;
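A low-tech way to see what a pragma drags in, without reaching for /usr/bin/time, is to dump %INC after loading it. This is only a sketch: the list it prints depends entirely on the perl version and the constant.pm in @INC, so no particular output is implied here.

```perl
use strict;
use warnings;

# Load the pragma under test, exactly as in the timing runs above.
use constant X => 1 .. 5;

# %INC maps every file pulled in via use/require/do to the path it was
# loaded from, so its keys show what came along for the ride.
print "$_\n" for sort keys %INC;
```

Running it once with -Ilib and once with -I/tmp should show which extra files the stock constant.pm pulls in compared to the minimal one.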
The offending line:
which is only used for:
Thank you for bringing this to my attention.
(I have sent a message detailing this to the Perl 5 Porters mailing list)
So… this cost… is it not a once-per-interpreter cost that will be incurred anyway if any part of the program processes Unicode?
Are we, in other words, talking about bloat in constant.pm itself, or about missing lazy loading (which one should hope is possible)?

(Unicode, of course, requires an unfortunate amount of what a lot of people think is bloat. It would be nice to cut down on that; on p5p there have been proposals before, regarding how to do that.)
So all your scripts using constants also use Unicode?
Did I write that they do? This question and yours both have the same answer.
My own previous questions were not rhetorical. My agenda in asking them pointedly is not to absolve the pragma and dismiss your concern, but to prevent the pragma’s support for Unicode per se from becoming contested. I do not presume the answers to them to be “yes” and “the former”, though, even if I expect it. If they are, then there are two avenues open here: a) ensure that constant.pm will not incur this cost for code which does not require it, and/or b) find ways to reduce that cost significantly. Either of these may address your issue sufficiently. The first is quick and localised; the latter will benefit Perl long-term, long-range.

(What the interpreter currently does to load the Unicode data is both slow and memory hungry compared to how it is done elsewhere. It runs Perl code to parse the tables at runtime into Perl-level hashes, whereas elsewhere it is precompiled to a binary shared library and simply mapped into memory on demand. There were posts to p5p – I do not remember whether by Karl, Father Chrysostomos, Reini, Yves, or someone else yet – proposing to follow that example.)
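To make avenue (a) concrete, here is a minimal sketch of what "do not pay unless you need it" could look like in an import routine. It is not the actual constant.pm code or any patch sent to p5p: the package name My::Constant and both regexes are my own assumptions, and whether the slow-path check really triggers loading of the Unicode tables depends on the perl version.

```perl
use strict;
use warnings;

{
    package My::Constant;    # hypothetical name, NOT the real constant.pm

    sub import {
        my $class = shift;
        return unless @_;
        my $caller = caller;
        my ($name, @vals) = @_;

        # Fast path: a plain-ASCII identifier is checked with explicit
        # character ranges, so no Unicode property lookup is involved.
        unless ($name =~ /\A_?[A-Za-z][A-Za-z0-9_]*\z/) {
            # Slow path (assumed): only names outside plain ASCII reach
            # the Unicode-aware \w check and whatever it has to load.
            require Carp;
            Carp::croak("Constant name '$name' is invalid")
                unless $name =~ /\A_?[^\W_0-9]\w*\z/;
        }

        no strict 'refs';
        *{"$caller\::$name"} = sub () { @vals };
    }
}

# Stand-in for "use My::Constant X => 1 .. 5;" since the package lives
# in this same file rather than in its own .pm.
BEGIN { My::Constant->import(X => 1 .. 5) }

print join(',', X()), "\n";    # prints 1,2,3,4,5
```

The point of the design is only that the common case never touches the expensive code path; scripts with Unicode constant names still get the full check.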
(FWIW, when it comes to scripts, often as not I forgo constant.pm entirely in favour of just (e.g.) sub DEBUG () { 0 }. It’s such thin syntactic sugar I’m always torn over paying any cost for it. This is especially so in my CPAN modules – I am loath to make users pay for… barely even a convenience to me. I am not torn over it when I’m defining so many constants at once that its pass-a-hash feature spares me loads of copy-pasta :)
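For anyone who has not seen the two styles side by side, a small sketch follows. The constant names are made up for illustration; the hash-reference form is the documented multi-constant interface of constant.pm.

```perl
use strict;
use warnings;

# Hand-rolled constant: an empty prototype lets perl inline the value.
sub DEBUG () { 0 }

# The pass-a-hash form: many constants in one statement.
use constant {
    RED   => 0,
    GREEN => 1,
    BLUE  => 2,
};

print "debugging enabled\n" if DEBUG;
print join(',', RED, GREEN, BLUE), "\n";    # prints 0,1,2
```

With three constants the saving is marginal; with thirty, writing each one as its own sub gets tedious fast.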
I have sent a patch to Perl5 Porters to remedy this situation.
Based on the comment just before "*{chr 256} = \3;" this should have been reworked after the next dev release (5.15.4).

This is sufficient to prevent utf8 from being loaded prematurely.