Day 13: *PAN

About the series: perlancar's 2014 Advent Calendar: Introduction to a selection of 24 modules which I published in 2014. Table of contents.

What is XPAN, you ask? I wanted to write *PAN, but since that is not a valid module name (or perhaps it is? Maybe there's a word character somewhere in the character set that resembles an asterisk/star?), I settled on XPAN. The XPAN::Query documentation explains it, and I quote: "XPAN is a term I coined for any repository (directory tree, be it on a local filesystem or a remote network) that has structure like a CPAN mirror, specifically having a modules/02packages.details.txt.gz file. This includes a normal CPAN mirror, a MiniCPAN, or a DarkPAN. Currently it excludes BackPAN, because it does not have 02packages.details.txt.gz, only authors/id/C/CP/CPANID directories."
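As a rough illustration of that definition, the check below tests whether a local directory qualifies as an XPAN by looking for the index file. The function name looks_like_xpan is made up for this sketch and is not part of XPAN::Query; it also only covers the local filesystem case, not remote URLs.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: returns 1 if $root contains the package index
# that defines an XPAN (modules/02packages.details.txt.gz), else 0.
sub looks_like_xpan {
    my ($root) = @_;
    my $index = File::Spec->catfile($root, 'modules', '02packages.details.txt.gz');
    return -f $index ? 1 : 0;
}

# Usage: pass one or more directories on the command line.
for my $dir (@ARGV) {
    printf "%s: %s\n", $dir, looks_like_xpan($dir) ? "XPAN" : "not an XPAN";
}
```

A BackPAN tree would fail this check, since it only has the authors/id/... directories.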

XPAN::Query provides several functions to list packages, authors, modules, and dists by parsing 02packages.details.txt.gz. When querying a remote mirror like http://www.cpan.org, it first downloads the file (and then caches it for 24 hours by default). Since the file is only about 1.6MB for CPAN, this takes only several seconds, and subsequent queries take from a fraction of a second to 1-2 seconds, since both the file content and the parse result are cached.
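To make the parsing step concrete, here is a minimal sketch of how such an index can be read once it has been decompressed. It is not XPAN::Query's actual implementation: the sample entries (ALICE, BOB, Foo::Bar, Baz) and the parse_index helper are made up for illustration, and the dist name is guessed with a simple regex where real code would more likely use something like CPAN::DistnameInfo.

```perl
use strict;
use warnings;

# Sample data in the 02packages.details.txt layout (hypothetical entries):
# a header block, a blank line, then "package  version  path" lines.
my $index = <<'END';
File:         02packages.details.txt
Columns:      package name, version, path
Line-Count:   3

Foo::Bar       1.02   A/AL/ALICE/Foo-Bar-1.02.tar.gz
Foo::Bar::Util undef  A/AL/ALICE/Foo-Bar-1.02.tar.gz
Baz            0.5    B/BO/BOB/Baz-0.5.tar.gz
END

# Hypothetical helper: returns { packages => [...], dists => {...} }.
sub parse_index {
    my ($text) = @_;

    # Everything before the first blank line is the header.
    my (undef, $body) = split /\n\n/, $text, 2;

    my (@packages, %dists);
    for my $line (split /\n/, $body // '') {
        my ($pkg, $version, $path) = split ' ', $line;
        next unless defined $path;

        # Path layout is A/AU/AUTHOR/Dist-Name-1.23.tar.gz
        next unless $path =~ m{^[^/]/[^/]{2}/([^/]+)/(.+)$};
        my ($author, $file) = ($1, $2);

        # Naive dist name: strip the trailing version and archive extension.
        (my $dist = $file) =~ s/-v?[0-9][0-9._]*\.(?:tar\.gz|tar\.bz2|tgz|zip)$//;

        push @packages, { name => $pkg, version => $version, author => $author };
        $dists{$dist} //= { author => $author, file => $file };
    }
    return { packages => \@packages, dists => \%dists };
}

my $res = parse_index($index);
printf "%-8s %s\n", $_, $res->{dists}{$_}{author}
    for sort keys %{ $res->{dists} };
```

Note that several packages can point at the same release file, which is why the dist list is shorter than the package list.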

There are CLI wrappers for these functions, released in the App-XPANQueryUtils distribution: list-xpan-packages, list-xpan-authors, list-xpan-modules, list-xpan-dists.

The XPAN querying functions can be an alternative to querying the MetaCPAN API, although the XPAN querying facility is currently quite simple. First, it works with your local MiniCPAN and DarkPANs, not only with a live CPAN mirror. Second, it can reduce the number of API calls if you want to grab a lot of data.

For example, this is a script that ranks CPAN authors by the number of distributions installed on your system:


use 5.010;
use strict;
use warnings;

use Dist::Util qw(list_dists);
use Perinci::CmdLine::Lite;
use XPAN::Query qw(list_xpan_dists);

our %SPEC;
$SPEC{list} = {
    v => 1.1,
    args => {
        url => {
            schema  => 'str*',
            default => 'http://www.cpan.org/',
        },
    },
};
sub list {
    my %args = @_;

    # get all installed dists
    my @dists = list_dists();

    # get all dists info from CPAN, along with their author, etc
    my %authors = map { $_->{name} => $_->{author} }
        @{ list_xpan_dists(url => $args{url}, detail => 1) };

    # count installed dists per author
    my %num_dists; # key=PAUSE ID, val=dist count
    for my $dist (@dists) {
        $num_dists{ $authors{$dist} // '(unknown)' }++;
    }

    # return authors sorted by dist count, highest first
    [200, "OK", [map { {author=>$_, num_dists=>$num_dists{$_}} }
                 sort { $num_dists{$b} <=> $num_dists{$a} }
                 keys %num_dists]];
}

# expose list() as a command-line program
Perinci::CmdLine::Lite->new(url => '/main/list')->run;
