On interpolating stuff into pattern matches

Tom Wyant:

Interestingly (to me, at least) they reported that the removal of the /o modifier made their case 2-3 times slower. This surprised me somewhat, as I had understood that modern Perls (for some value of "modern") had done things to minimize the performance difference between the presence and absence of /o.

They indeed have.

Ironically, it’s qr objects which don’t get that benefit. On the machine I’m typing on, the following benchmark…

use Benchmark::Dumb 'cmpthese'; # check this out if you haven't
$_ = 'xyzzy';
our $n = 1_000_000;
cmpthese 0, {
'/$str/'  => 'my $x=   "(?:)"; for(1..$::n){++$a if /$x/  }',
'/$str/o' => 'my $x=   "(?:)"; for(1..$::n){++$a if /$x/o }',
'/$qr/'   => 'my $x= qr/(?:)/; for(1..$::n){++$a if /$x/  }',
'/$qr/o'  => 'my $x= qr/(?:)/; for(1..$::n){++$a if /$x/o }',
};

… gives me results like this:

         Rate/s Precision/s /$qr/ /$str/ /$str/o /$qr/o
/$qr/     3.706     0.00022    -- -33.2%  -44.7% -44.7%
/$str/  5.54452     0.00095 49.6%     --  -17.3% -17.3%
/$str/o 6.70235     0.00099 80.9%  20.9%      --   0.0%
/$qr/o  6.70492     0.00074 80.9%  20.9%    0.0%     --

The difference between /$qr/o and /$str/o is mere noise, whereas /$qr/ and /$str/ are separated by a wide gap: the string version runs about 50% faster (or, read the other way, the qr version is about 33% slower).

Or to slice it another way, adding /o to /$str/ speeds it up by 20% (nothing to scoff at, but not a giant difference) whereas /$qr/ gets an 80% boost – i.e. a 4× bigger one.

Now, 20% for adding /o to /$str/ is the best case, helped by the fact that the pattern is trivial. For confirmation I tried much more complex patterns, and as expected all of the differences diminished. The one exception is /$qr/o vs /$str/o, where the difference still barely registers but is now just large enough to make /$qr/o consistently the slower one.

Realistically, then, you are looking at something in the 10–15% range from adding /o to /$str/ – which is still a nice bonus, but nowhere near as significant as the 50–60% speedup available to /$qr/. The factor of 4 difference in how worthwhile /o is holds up even when the relative differences diminish.

The concrete figures, of course, apply on my machine and the particular perl with its specific compile options. What generalises is…

  1. the fact that a qr object is much slower to interpolate into a pattern match than a string,
  2. that this difference is erased by /o,
  3. meaning that /o speeds up even patterns that interpolate plain strings by a small but not necessarily insignificant bit,
  4. but that it therefore makes a very real difference for patterns that interpolate qr objects.
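
Points 2 to 4 hinge on /o compiling the interpolated pattern only once. Here is a minimal sketch of what that means in practice; the variable names are made up for illustration and this is not part of the benchmark above:

```perl
use strict;
use warnings;

my (@with_o, @without_o);
for my $word (qw(foo bar)) {
    # With /o the pattern is compiled from the first value of $word ("foo")
    # and that compiled pattern is reused on every later pass.
    push @with_o, $word if 'foo' =~ /$word/o;

    # Without /o the current value of $word is used on each pass.
    push @without_o, $word if 'foo' =~ /$word/;
}

print "with /o:    @with_o\n";     # foo bar
print "without /o: @without_o\n";  # foo
```

On the second pass the /o match is still testing against "foo", so "bar" is wrongly counted as a hit.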

Of course /o affects the semantics of your code so you can’t just throw it on everything. What if you can’t use it? The evident answer is:

If your interface accepts a qr object to use repeatedly then you should flatten it to a string at the earliest opportunity and use that string instead of the original qr object.
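
As a rough sketch of that advice, assuming an interface shaped like the hypothetical count_matches below (the name and signature are invented for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines that match a caller-supplied pattern,
# which may be a qr object or a plain string.
sub count_matches {
    my ($pattern, @lines) = @_;

    # Flatten once, up front. Stringifying a qr object yields a pattern
    # string with its flags embedded; a plain string passes through as-is.
    my $pattern_str = "$pattern";

    my $count = 0;
    for my $line (@lines) {
        # Interpolates a string rather than a qr object, and needs no /o.
        ++$count if $line =~ /$pattern_str/;
    }
    return $count;
}

my @lines = ('xyzzy', 'Xanadu', 'plugh');
print count_matches(qr/^x/i, @lines), "\n";  # 2
print count_matches('gh$',   @lines), "\n";  # 1
```

Because there is no /o, each call honours its own pattern. One caveat: a qr object containing embedded code blocks such as (?{ ... }) probably will not survive this treatment, since interpolating such a pattern from a string normally requires use re 'eval'.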

2 Comments

Thank you for clarifying a point I, at least, found obscure. And for making this a top-level post, which it deserves to be.

About Aristotle

Waxing philosophical