Getting CPAN module tarball from local MiniCPAN mirror
This blog post is just for sharing a simple script.
Like many Perl programmers, I keep a mini CPAN mirror on my local hard drive (mine lives in /cpan). Besides installing CPAN modules offline with cpanminus, I sometimes want to peek into a dist's files (especially the contents of t/). So what's a quick way to get the path of a module's tarball, or to extract it?
Normally cpanminus can do this via cpanm --mirror /cpan --mirror-only --look Module::Name, but that doesn't work on my setup (on either my PC or my laptop). So I turned to CPAN.
There are at least three libraries on CPAN that deal with this. Parse::CPAN::Packages is a huge library that does many things, but it is very heavy on startup: it loads 110k+ lines of source code from over 300 files, and just loading the module takes 0.5s on my laptop. Getting the path of a single module takes almost 8s!
Apparently the heaviness of Parse::CPAN::Packages annoyed Slaven Rezic enough that he wrote Parse::CPAN::Packages::Fast (he said that initializing Parse::CPAN::Packages alone takes 10s(!) on his system :-) ). But I find that on my laptop it still takes an average of about 2s to get the path of a module using this library.
Then there's Neil Bowers' module. But its interface is not great for my use case: you first need to download the index via HTTP from www.cpan.org, because you can't directly feed it a local 02packages.details.txt(.gz).
I ended up writing a script that opens 02packages.details.txt.gz through zcat using Perl's "unsafe" 2-param open(), parses it line by line, and exits immediately after finding a match. How long it takes depends on where the module you are looking for is listed in 02packages.details.txt.gz, but on average the script takes about 0.25s on my laptop. I'm happy again.
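A minimal sketch of that approach might look like the following. It assumes the usual 02packages.details.txt layout (a header block ended by a blank line, then one "Module version path" line per package) and that the index sits at modules/02packages.details.txt.gz under the mirror root. The sub name find_tarball_path and the optional mirror-dir argument are made up for illustration. Unlike the script described above, this sketch uses the list form of a piped open instead of the 2-param form, so the file name never passes through a shell.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return the full tarball path for $module under the mirror root $cpan,
# or undef if the module is not listed in the index.
# (find_tarball_path is a hypothetical name used for this sketch.)
sub find_tarball_path {
    my ($cpan, $module) = @_;
    my $index = "$cpan/modules/02packages.details.txt.gz";

    # List-form piped open: zcat is run directly, without a shell.
    open(my $fh, '-|', 'zcat', $index)
        or die "Cannot run zcat on $index: $!\n";

    my $in_body = 0;
    my $path;
    while (my $line = <$fh>) {
        # Skip the header, which ends at the first blank line.
        unless ($in_body) {
            $in_body = 1 if $line =~ /^\s*$/;
            next;
        }
        # Each body line is: Module::Name  version  A/AU/AUTHOR/Dist-1.0.tar.gz
        my ($name, $version, $dist) = split ' ', $line;
        next unless defined $dist && $name eq $module;
        $path = "$cpan/authors/id/$dist";
        last;    # stop reading as soon as the module is found
    }
    close $fh;   # zcat may be cut off early here; its exit status is ignored
    return $path;
}

# Command-line entry point; skipped when the file is loaded via require/do.
unless (caller) {
    my ($module, $cpan) = @ARGV;
    if (defined $module) {
        # /cpan is the mirror location used in this post.
        my $path = find_tarball_path($cpan // '/cpan', $module);
        die "$module not found\n" unless defined $path;
        print "$path\n";
    }
    else {
        warn "Usage: $0 Module::Name [mirror-dir]\n";
    }
}
```

The early `last` is what keeps the lookup fast: the rest of the index is never decompressed or parsed once the module has been found.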
To get a tarball path:
% get-tarball-path-from-local-cpan Module::Path
/cpan/authors/id/N/NE/NEILB/Module-Path-0.13.tar.gz
To extract the tarball:
% tar xfz `get-tarball-path-from-local-cpan Module::Path`