My Favorite Modules: Devel::NYTProf

'It is a capital mistake to theorize before one has data.' -- Sherlock Holmes, "A Scandal in Bohemia"

The mental excursion that led to this blog post started with a report from Olaf Alders that my Perl::Critic::Policy::Variables::ProhibitUnusedVarsStricter was generating a false positive on variables used only as defaults in subroutine signatures. After the first cut at fixing this I thought a regression test was in order. I did this by running both unpatched and patched versions of the policy against my Mini CPAN, and then diff on the outputs.

This has always taken the better part of a day to run, and given that it had to expand all the distributions first and then run a fairly brute-force policy against anything it found, I accepted this as the price of conscientiousness.

But then I noticed, quite by chance, that running the patched policy against the Number-Phone-FR distribution appeared to hang. Investigation failed to turn up any reason for the modifications to do this, and when I ran the unpatched code against that distribution it also appeared to hang. I knew it eventually completed, though, since the full Mini CPAN scan using unpatched code eventually completed.

I fairly quickly isolated the slowness to the analysis of one file: lib/Number/Phone/FR/ This was only 46 lines long, but contained 182615 characters, most of them in one monster regular expression.

This immediately made me suspect PPIx::Regexp, since I have great faith in my own ability to write bugs. This finds siblings the same way PPI does, which involves traversing an array rather than holding weak links. So I started looking into an optimization that tracked the number of each kind of token in the parse, since the monster regular expression contained no interpolations. I was about to start implementing when I thought, "Maybe I should actually profile this code."

It was a good thing I did, because Devel::NYTProf told a very different story. According to it, the majority of the time was spent on one statement in Text::Tabs' expand() method, which PPIx::Regexp's tokenizer was using in its computation of logical column positions.

For the curious, that statement was $offs = () = /\PM/g;. Now, I do not have huge amounts of Unicode-Fu, but I understand this to be counting all the characters in the topic variable which are not explicitly zero-width. The profiler said that statement consumed 2825 seconds for 161163 executions.

Since PPI, which underpins any Perl Critic policy, did not appear to have this problem, I looked at how it calculated column positions, and found a simple analysis of tab characters and their positions. This is technically less correct than Text::Tabs, but for the sake of expediency I chose consistency over strict correctness. Using the patched PPIx::Regexp, the patched Perl Critic policy analyzed the problem module in 500 seconds, as compared to 4000 seconds for the unpatched PPIx::Regexp.

The moral: Because I took a couple seconds to actually think, and a couple minutes to install Devel::NYTProf, I saved several hours developing an optimization that did not address the actual problem. I do not count the hour or so running the profiler (it was not significantly longer than the time to run the un-profiled code) because that did not require my actual attention.

The actual timings were done on a 2.3 GHz Dual-Core Intel i5 running macOS 12.2.1 Monterey. No effort was made to control other loads on the system, and the timings were not replicated enough to be statistically valid. But, hey, cutting execution time by 7/8 looks like a win to me.

Previous entries in this series:

  1. if
  2. diagnostics
  3. Term::ReadLine::Perl
  4. re


Specifically what it’s counting is characters that aren’t combining marks. E.g. if you have an “é”, while it might be encoded as one single codepoint (same as in Latin-1), it might also be encoded as an “e” followed by a U+0301 (COMBINING ACUTE ACCENT) – meaning there are two characters in the string, but when rendered they only take up a single character’s worth of space. So that pattern counts base characters while ignoring combining marks.

This is essentially irrelevant for most source, particularly as most accented or otherwise marked characters likely to occur in source code files are almost certainly going to show up in their precomposed forms. Actual combining marks are extremely unlikely to show up – probably even in the most hardcore Raku aficionado’s code. So getting rid of Text::Tabs is highly unlikely to ever make an observable difference, whether merely cosmetic or actually important.

Devel::NYTProf outputs the result of beautiful coloring HTMLs.

Leave a comment

About Tom Wyant

user-pic I blog about Perl.