Smart Match in CPAN
There is nothing like looking, if you want to find something. -- The Hobbit, iv, "Over Hill and Under Hill"
Recently on the p5p mailing list the topic of removing smart match re-surfaced. There was a fairly vigorous discussion about the effect this would have on CPAN. So I thought I would look into how many uses there actually were.
Fortunately there are Perl Critic policies for this: Jan Holčapek's Perl::Critic::Policy::ControlStructures::ProhibitSwitchStatements and Perl::Critic::Policy::Operators::ProhibitSmartmatch. All I had to do was run them against my mini-CPAN.
My results:
- Total distributions: 40704
- Distributions with violations: 842
- Files with violations: 1568
A look at the file names involved says that about two-thirds of the violations are in the published modules themselves, and the rest are in support code (directories t/, inc/, and the like).
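The split between module code and support code can be approximated by looking at the first directory in each file's path. What follows is only a sketch of that idea, not the procedure I actually used: the subroutine name classify_file() is made up for illustration, and the list of support directories beyond t/ and inc/ is a guess at what "and the like" might cover.

```perl
use strict;
use warnings;

# Directories treated as support code. t/ and inc/ come from the text
# above; xt/, eg/, and examples/ are assumptions added for illustration.
my %support_dir = map { $_ => 1 } qw{ t xt inc eg examples };

# Hypothetical helper: classify a path relative to the distribution's
# base directory as 'support' or 'module', using only its first directory.
sub classify_file {
    my ( $path ) = @_;
    my ( $top ) = $path =~ m{ \A ( [^/]+ ) / }smx;
    return ( defined( $top ) && $support_dir{$top} ) ? 'support' : 'module';
}

for my $path ( qw{ lib/Foo/Bar.pm t/basic.t inc/Module/Install.pm } ) {
    printf "%-24s %s\n", $path, classify_file( $path );
}
```

Anything not under a listed directory is counted as module code here, which is a simplification: files that sit directly in the base directory of a distribution land in the module bucket too.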
It is possible that the results of Perl::Critic::Policy::ControlStructures::ProhibitSwitchStatements contain false positives simply because someone implemented subroutines named given() or when() unrelated to smart matching.
It is hard for me to see how there could be false positives from Perl::Critic::Policy::Operators::ProhibitSmartmatch, though I have learned long since that reality exceeds my ability to imagine it.
Given the nature of Perl, false negatives may have to be detected on a case-by-case basis. I do know that when smart match was briefly removed in a development release a few years back, only one module that I use broke, and I had an alternative for it.
The mini-CPAN repository used for analysis was most recently updated 2022-06-24 08:10Z. The configuration file is
remote: https://www.cpan.org/
local: <censored>
exact_mirror: 0
skip_perl: 1
dirmode: 0755
path_filters: /Mail-DeliveryStatus-BounceParser-\d
I have unpublished modules in this repository, but they were excluded from the analysis. Also excluded were a few other modules that I have had trouble running Perl Critic against in the past:
CMORRIS/Parse-Extract-Net-MAC48-0.01.tar.gz
DOLMEN/Number-Phone-FR-0.0917215.tar.gz
GSLONDON/Parse-Nibbler-1.10.tar.gz
A list of the distributions containing violations is at https://trwyant.github.io/misc/smart-match-in-cpan/distros-with-violations.txt.
An ugly JSON file containing the results of the critique is at https://trwyant.github.io/misc/smart-match-in-cpan/smart-match.json. By "ugly" I mean non-pretty, non-canonical. This file encodes a hash whose top-level keys are:
- asof: The ISO time the analysis was run;
- critique: A hash reference containing the results of the critique (see below);
- policy: An array reference containing the fully-qualified names of the policies used to critique the code.
The critique is a set of nested hashes keyed by author name, distribution name, and file name relative to the base directory of the distribution. The value for each file is a reference to an array containing the violations for that file: line number, column number, policy violated, violation description, and violation explanation. For brevity's sake files without violations are omitted from the output.
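For anyone who wants to dig through the JSON, here is a minimal sketch of walking that structure with the core JSON::PP module. It makes assumptions: that each violation is itself an array in the order listed above, and that the inline sample (invented author, distribution, file names, and timestamp) resembles the real file. The subroutine name tally_violations() is likewise made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

# Made-up sample in the documented shape. Every value here is a
# placeholder; the real file is far larger.
my $sample = <<'EOD';
{
  "asof": "2000-01-01T00:00:00Z",
  "policy": [
    "Perl::Critic::Policy::ControlStructures::ProhibitSwitchStatements",
    "Perl::Critic::Policy::Operators::ProhibitSmartmatch"
  ],
  "critique": {
    "SOMEAUTHOR": {
      "Some-Dist-0.01": {
        "lib/Some/Dist.pm": [
          [ 10, 5, "Perl::Critic::Policy::Operators::ProhibitSmartmatch", "description", "explanation" ],
          [ 20, 1, "Perl::Critic::Policy::ControlStructures::ProhibitSwitchStatements", "description", "explanation" ]
        ],
        "t/basic.t": [
          [ 3, 9, "Perl::Critic::Policy::Operators::ProhibitSmartmatch", "description", "explanation" ]
        ]
      }
    }
  }
}
EOD

# Walk author => distribution => file => violations and count each level.
# Assumes each violation is an array: line, column, policy, description,
# explanation.
sub tally_violations {
    my ( $data ) = @_;
    my %tally = ( distributions => 0, files => 0, violations => 0, by_policy => {} );
    my $critique = $data->{critique};
    for my $author ( sort keys %{ $critique } ) {
        for my $distro ( sort keys %{ $critique->{$author} } ) {
            $tally{distributions}++;
            my $files = $critique->{$author}{$distro};
            for my $file ( sort keys %{ $files } ) {
                $tally{files}++;
                for my $violation ( @{ $files->{$file} } ) {
                    my ( undef, undef, $policy ) = @{ $violation };
                    $tally{violations}++;
                    $tally{by_policy}{$policy}++;
                }
            }
        }
    }
    return \%tally;
}

my $tally = tally_violations( decode_json( $sample ) );
printf "%d distributions, %d files, %d violations\n",
    @{ $tally }{ qw{ distributions files violations } };
print "$_: $tally->{by_policy}{$_}\n" for sort keys %{ $tally->{by_policy} };
```

To run this against the real data, replace $sample with the contents of the downloaded file. Only distributions that appear under critique are counted, so the distribution count means "distributions with violations" only if clean distributions are left out of the file the same way clean files are.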
That's an excellent, important piece of detective work. Are you going to post a link to it in p5p to get their attention? Also, these are the most recent versions as of 6/24, right?
Thank you.
The short answer is yes, the modules critiqued are most recent as of 6/24. The long answer is a little more complicated, because there are circumstances when more than one distribution can be "current." One such is when a module gets dropped from a distribution. The old distro hangs around because it is the current one for the dropped module.
I am not subscribed to p5p, which is why this information went to my blog. But it is stupid to just hope someone notices. Thank you for the suggestion; I have sent them mail.
It's not that stupid; *I* noticed and made a suggestion which you took. Seems like it worked to me! :-)
If you are porting code to not use ~~, match::smart provides a very similar matching function.