Preallocating scalars
I'm using the fabulous FFI::Platypus to interface to a C routine which uses caller-allocated buffers to return data. While FFI::Platypus
transparently translates Perl arrays to C arrays and back, the buffers are used only to return data, so I only need the C-to-Perl conversion and not the Perl-to-C conversion.
The first step is to efficiently allocate a buffer of a given size in Perl (the last step, converting the returned data in the buffer to Perl, is done straightforwardly with unpack).
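As an aside, that final unpack step might look something like the sketch below. The buffer layout (three 32-bit little-endian unsigned integers followed by a little-endian double) is purely hypothetical, and pack stands in for the C routine that would normally fill the buffer.

```perl
use strict;
use warnings;

# Stand-in for a buffer that a C routine has filled in. The layout is
# an assumption for illustration: three unsigned 32-bit little-endian
# integers followed by one little-endian double.
my $buffer = pack( 'V3 d<', 10, 20, 30, 2.5 );

# Convert the raw bytes back into Perl values with the same template.
my ( $n1, $n2, $n3, $value ) = unpack( 'V3 d<', $buffer );

print "ints: $n1 $n2 $n3, double: $value\n";
```

The template has to match whatever the C side actually writes, including byte order and any padding.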
If you do your due diligence, you'll find a link to an old PerlMonks post, which provides the following recipe:
my $str;
vec( $str, $length, 8 ) = 0;
$str = '';
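Here is a minimal sketch of why that recipe works, using the core B module's CUR and LEN methods to peek at the scalar. The vec assignment extends the string out to the requested offset, and assigning the empty string resets the visible length while, as far as I can tell, leaving the allocated buffer in place. The $length value is arbitrary.

```perl
use strict;
use warnings;
use B ();    # core module; used here only to inspect the scalar

my $length = 100_000;    # arbitrary size for illustration

my $str;
vec( $str, $length, 8 ) = 0;    # extends the string to $length + 1 bytes
$str = '';                      # visible length goes back to 0

# CUR is the current string length, LEN the size of the allocated buffer.
my $sv = B::svref_2object( \$str );
printf "length: %d, allocated: %d\n", $sv->CUR, $sv->LEN;
```

The exact allocated size depends on the perl build and its allocator, so treat the second number as a lower bound rather than a guarantee.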
That seemed like too many things to type, so I thought back to my telemetry unpacking days and, of course, pack came to the rescue. There are at least a couple of ways to do it with pack.
This one writes $length null bytes:
my $str = pack( "x$length" );
And this one null-fills to a given absolute position:
my $str = pack( ".", $length );
I tend to like this one, as it doesn't require string interpolation.
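A quick sanity check that the two pack templates produce the same thing, namely a string of $length null bytes. This is just a sketch with an arbitrary small $length.

```perl
use strict;
use warnings;

my $length = 16;    # arbitrary size for illustration

my $by_x   = pack( "x$length" );      # template "x16": sixteen null bytes
my $by_dot = pack( '.', $length );    # null-fill up to absolute offset 16

printf "x: %d bytes, dot: %d bytes, identical: %s\n",
    length($by_x), length($by_dot), ( $by_x eq $by_dot ? 'yes' : 'no' );
```

Unlike the vec recipe, both of these leave the scalar with a visible length of $length rather than an empty string, which is what you want when the C routine is about to overwrite the contents anyway.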
Then, there's Convert::Scalar::grow:
my $str;
grow $str, $length;
which is a thin wrapper around Perl's internal sv_grow routine.
So which is the fastest? Here are benchmarks (Perl 5.28.1) for various buffer lengths.
Buffer size = 1000 bytes
                          Rate       vec    pack_x  pack_dot      grow
vec         3.31756e+06+-160/s        --    -50.2%    -57.4%    -66.5%
pack_x      6.6639e+06+-1700/s    100.9%        --    -14.4%    -32.7%
pack_dot     7.784e+06+-1800/s    134.6%     16.8%        --    -21.4%
grow        9.8974e+06+-2900/s    198.3%     48.5%     27.2%        --
Buffer size = 10000 bytes
                          Rate       vec    pack_x  pack_dot      grow
vec           2.82529e+06+-0/s        --    -10.0%    -12.9%    -72.1%
pack_x       3.139e+06+-1200/s     11.1%        --     -3.3%    -69.0%
pack_dot    3.24488e+06+-610/s     14.9%      3.4%        --    -67.9%
grow       1.01173e+07+-2200/s    258.1%    222.3%    211.8%        --
Buffer size = 100000 bytes
                          Rate  pack_dot    pack_x       vec      grow
pack_dot          213512+-94/s        --     -1.5%     -3.3%    -97.9%
pack_x            216718+-87/s      1.5%        --     -1.8%    -97.9%
vec              220760+-180/s      3.4%      1.9%        --    -97.8%
grow        1.01299e+07+-740/s   4644.4%   4574.2%   4488.6%        --
Buffer size = 1000000 bytes
                          Rate       vec  pack_dot    pack_x      grow
vec             20776.6+-3.2/s        --     -0.2%     -0.3%    -99.8%
pack_dot       20822.7+-0.75/s      0.2%        --     -0.1%    -99.8%
pack_x           20843+-0.73/s      0.3%      0.1%        --    -99.8%
grow       1.01142e+07+-2200/s  48580.7%  48473.0%  48425.7%        --
Buffer size = 10000000 bytes
                          Rate  pack_dot    pack_x       vec      grow
pack_dot       1886.17+-0.24/s        --     -0.1%     -6.7%   -100.0%
pack_x         1887.12+-0.21/s      0.1%        --     -6.7%   -100.0%
vec            2022.37+-0.48/s      7.2%      7.2%        --   -100.0%
grow        1.01198e+07+-750/s 536423.5% 536153.1% 500289.6%        --
Not surprisingly, Convert::Scalar::grow blows everything else out of the water.
If you need a core Perl solution, pack( '.', ...) is faster for smaller buffers and comparable to vec for larger ones, so it provides the best overall solution. Unfortunately, pack and vec both allocate and clear the memory. I can't think of a core Perl solution which will just allocate the memory.
And here's the code:
use Benchmark::Dumb qw[ cmpthese ];
use strict;
use warnings;
use Convert::Scalar qw[ grow ];

for my $length ( 1e3, 1e4, 1e5, 1e6, 1e7 ) {
    print "\n\nBuffer size = $length bytes\n";
    cmpthese(
        1000.001,
        {
            pack_x => sub {
                my $str = pack( "x$length" );
            },
            pack_dot => sub {
                my $str = pack( ".", $length );
            },
            vec => sub {
                my $str;
                vec( $str, $length, 8 ) = 0;
                $str = '';
            },
            grow => sub {
                my $str;
                grow $str, $length;
            }
        } );
}
print "\n";
You may also find the utility functions in FFI::Platypus::Memory and FFI::Platypus::Buffer interesting, though those are really for dealing with opaque pointers to strings when FFI::Platypus's standard C-to-Perl string conversion isn't sufficient.
And there's also Acme::SvGrow from back in the day, which uses either Data::Peek::DGrow, or the `push( '.' . $length)` approach.