Preallocating scalars

I'm using the fabulous FFI::Platypus to interface to a C routine which uses caller-allocated buffers to return data. While FFI::Platypus transparently translates Perl arrays to C arrays and back, the buffers are used only to return data, so I only need the C-to-Perl conversion and not the Perl-to-C conversion.

The first step is to efficiently allocate a buffer of a given size in Perl (the last step, converting the returned data in the buffer to Perl, is done straightforwardly with unpack).
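To make the round trip concrete, here is a minimal sketch of the allocate, fill, unpack pattern. The fake_c_fill routine is a hypothetical stand-in for the C function (real code would call it through FFI::Platypus), and the choice of native doubles as the payload is an assumption made only for illustration.

```perl
use strict;
use warnings;

# Size of a native double on this platform (usually 8 bytes).
my $dsize = length pack( 'd', 0 );

# Hypothetical stand-in for the C routine: it overwrites the caller's
# buffer in place with $n native doubles and returns the count.
sub fake_c_fill {
    my ( $buf_ref, $n ) = @_;
    my $data = pack( 'd*', map { $_ * 1.5 } 0 .. $n - 1 );
    substr( $$buf_ref, 0, length $data, $data );
    return $n;
}

my $n      = 4;
my $buffer = "\0" x ( $dsize * $n );    # caller-allocated, zero-filled buffer

fake_c_fill( \$buffer, $n );

# C-to-Perl conversion: decode the raw bytes into a list of numbers.
my @values = unpack( "d$n", $buffer );
print "@values\n";
```

The buffer here is allocated with the repetition operator purely for brevity; the rest of this post is about doing that step faster.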

If you do your due diligence, you'll find a link to an old PerlMonks post, which provides the following recipe:

my $str;
vec( $str, $length, 8 ) = 0;
$str = '';
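As I understand it, the recipe works because vec extends the string far enough to hold byte $length, and the later assignment of an empty string resets the logical length without giving the buffer back. The sketch below checks that using the core B module to read the allocated size; the variable names are mine, and the retained capacity is an implementation detail of perl rather than a documented guarantee.

```perl
use strict;
use warnings;
use B ();

my $length = 1000;

my $str;
vec( $str, $length, 8 ) = 0;    # writing byte $length extends the string to $length + 1 bytes
my $len_after_vec = length $str;

$str = '';                      # logical length drops to 0; the buffer is normally kept

# LEN is the number of bytes allocated for the string buffer.
my $capacity = B::svref_2object( \$str )->LEN;

printf "after vec: %d bytes; after reset: length %d, capacity %d\n",
    $len_after_vec, length $str, $capacity;
```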

That seemed like too many things to type, so I thought back to my telemetry unpacking days and of course, pack to the rescue. There are at least a couple of ways to do it with pack.

This one writes $length null bytes:

my $str = pack( "x$length" );

And this one null fills to a given absolute position:

my $str = pack( ".", $length );

I tend to like this one, as it doesn't require string interpolation.
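A quick sanity check, as a sketch with an arbitrary small $length, that the two pack templates produce the same thing: a string of exactly $length null bytes.

```perl
use strict;
use warnings;

my $length = 16;

my $via_x   = pack( "x$length" );      # template built by interpolation, here "x16"
my $via_dot = pack( '.', $length );    # position passed as an ordinary argument

my $expected = "\0" x $length;

print "pack_x:   ",   ( $via_x   eq $expected ? "ok" : "differs" ), "\n";
print "pack_dot: ",   ( $via_dot eq $expected ? "ok" : "differs" ), "\n";
```

Note that, unlike the vec recipe, both of these leave the scalar with a length of $length rather than an empty string sitting on a large buffer.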

Then, there's grow from Convert::Scalar:

use Convert::Scalar qw[ grow ];

my $str;
grow $str, $length;

which is a thin wrapper around Perl's internal sv_grow routine.

So which is the fastest? Here are benchmarks (Perl 5.28.1) for various buffer lengths.

Buffer size = 1000 bytes
                       Rate    vec pack_x pack_dot   grow
vec      3.31756e+06+-160/s     -- -50.2%   -57.4% -66.5%
pack_x   6.6639e+06+-1700/s 100.9%     --   -14.4% -32.7%
pack_dot  7.784e+06+-1800/s 134.6%  16.8%       -- -21.4%
grow     9.8974e+06+-2900/s 198.3%  48.5%    27.2%     --


Buffer size = 10000 bytes
                        Rate    vec pack_x pack_dot   grow
vec         2.82529e+06+-0/s     -- -10.0%   -12.9% -72.1%
pack_x     3.139e+06+-1200/s  11.1%     --    -3.3% -69.0%
pack_dot  3.24488e+06+-610/s  14.9%   3.4%       -- -67.9%
grow     1.01173e+07+-2200/s 258.1% 222.3%   211.8%     --


Buffer size = 100000 bytes
                       Rate pack_dot  pack_x     vec   grow
pack_dot       213512+-94/s       --   -1.5%   -3.3% -97.9%
pack_x         216718+-87/s     1.5%      --   -1.8% -97.9%
vec           220760+-180/s     3.4%    1.9%      -- -97.8%
grow     1.01299e+07+-740/s  4644.4% 4574.2% 4488.6%     --


Buffer size = 1000000 bytes
                        Rate      vec pack_dot   pack_x   grow
vec           20776.6+-3.2/s       --    -0.2%    -0.3% -99.8%
pack_dot     20822.7+-0.75/s     0.2%       --    -0.1% -99.8%
pack_x         20843+-0.73/s     0.3%     0.1%       -- -99.8%
grow     1.01142e+07+-2200/s 48580.7% 48473.0% 48425.7%     --


Buffer size = 10000000 bytes
                       Rate  pack_dot    pack_x       vec    grow
pack_dot    1886.17+-0.24/s        --     -0.1%     -6.7% -100.0%
pack_x      1887.12+-0.21/s      0.1%        --     -6.7% -100.0%
vec         2022.37+-0.48/s      7.2%      7.2%        -- -100.0%
grow     1.01198e+07+-750/s 536423.5% 536153.1% 500289.6%      --

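Not surprisingly, Convert::Scalar's grow blows everything else out of the water: its rate stays at roughly 1e7 calls per second regardless of buffer size, while the other approaches slow down as the buffer grows.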

If you need a core Perl solution, pack( '.', ... ) is faster than vec for smaller buffers and comparable to it for larger ones, so it is the best overall choice. Unfortunately, both the pack and vec approaches allocate and clear the memory. I can't think of a core Perl solution which will just allocate the memory.

And here's the code:

use Benchmark::Dumb qw[ cmpthese ];

use strict;
use warnings;

use Convert::Scalar qw[ grow ];

for my $length ( 1e3, 1e4, 1e5, 1e6, 1e7 ) {

    print "\n\nBuffer size = $length bytes\n";

    cmpthese(
        1000.001,
        {
            pack_x => sub {
                my $str = pack( "x$length" );
            },
            pack_dot => sub {
                my $str = pack( ".", $length );
            },
            vec => sub {
                my $str;
                vec( $str, $length, 8 ) = 0;
                $str = '';
            },
            grow => sub {
                my $str;
                grow $str, $length;
            }
        } );
}

print "\n";

2 Comments

You may also find the utility functions in FFI::Platypus::Memory and FFI::Platypus::Buffer interesting, though those are really for dealing with opaque pointers to strings when FFI::Platypus's standard C-to-Perl string conversion isn't sufficient.

About Diab Jerius

I blog about Perl.