A Dead Simple CPAN API

Since the crowdfunding campaign for Pinto is already off to such a great start, I figured I had better roll up my sleeves and start writing some code. The primary goal of the campaign is to enable Pinto users to pull specific versions of modules into their repository without having to know precisely which distribution they came from. Read on to hear what I've done so far...

The first step toward solving this problem is to create (or find) a mapping from module/version to distribution archive (i.e., the tarball). Fortunately, MetaCPAN has tons of information about every module that has ever been released to CPAN.

With some help from the wonderful crew on the #metacpan channel, I was able to put together a query that would tell me which distributions contained any particular version of a module. And by taking a bit of code from cpanm, I could even use notation like ">=1.34, !=1.56, < 2.45".
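To make that notation concrete, here is a minimal Python sketch of how such a requirement string could be checked against a candidate version. It is not the code borrowed from cpanm. The names `parse_requirement` and `satisfies` are made up for illustration, and it assumes plain decimal versions (no v-strings or underscores) and that a bare version such as "1.34" means "at least 1.34".

```python
from decimal import Decimal
import operator
import re

# Comparison operators allowed in a requirement clause.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}

# One clause: an optional operator, optional whitespace, a decimal version.
CLAUSE = re.compile(r"^(==|!=|>=|<=|>|<)?\s*(\d+(?:\.\d+)?)$")


def parse_requirement(spec):
    """Turn '>=1.34, !=1.56, < 2.45' into a list of (compare, version) pairs."""
    clauses = []
    for part in spec.split(","):
        match = CLAUSE.match(part.strip())
        if match is None:
            raise ValueError(f"cannot parse clause: {part!r}")
        op, version = match.groups()
        # Assumption: a bare version means 'at least this version'.
        clauses.append((OPS[op or ">="], Decimal(version)))
    return clauses


def satisfies(version, spec):
    """Return True if the version passes every clause in the requirement."""
    candidate = Decimal(version)
    return all(compare(candidate, wanted)
               for compare, wanted in parse_requirement(spec))
```

Comparing versions as decimals is a simplification. Real Perl version comparison has more corner cases than this sketch handles.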

I'm still a novice with the MetaCPAN API, so the query felt cumbersome to me. Plus, it required two trips to the server to get the final answer. Since I have a Pinto repository that also contains all of BackPAN, I tried to roll my own minimal CPAN API.

Each of the queries below returns a JSON array containing the AUTHOR/Dist-File for every distribution that contains the requested version(s) of the Plack module. The results are sorted most recent first. Feel free to try these at home with your favorite module (but beware: the database is about 2 weeks old now)...

Where is Plack?
http://lookup.stratopan.com/Plack

Where is version 1.0016 of Plack?
http://lookup.stratopan.com/Plack==1.0016

Where is a version of Plack that is at least 0.9949 and at most 1.0006, but not 1.0002?
http://lookup.stratopan.com/Plack>=0.9949,<=1.0006,!=1.0002
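If you would rather script these lookups than paste URLs into a browser, a rough Python sketch might look like the following. The function names are hypothetical, and it assumes the service accepts percent-encoded `<` and `>` characters and answers with the JSON array of AUTHOR/Dist-File strings described above.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://lookup.stratopan.com/"


def lookup_url(module, requirement=""):
    """Build the lookup URL for a module and an optional version requirement."""
    # Percent-encode characters such as '<' and '>' that are not URL-safe.
    # ':' stays literal so names like Foo::Bar remain readable.
    return BASE_URL + quote(module + requirement, safe="=,!:")


def parse_response(body):
    """Decode a JSON array of 'AUTHOR/Dist-File' strings into (author, file) pairs."""
    return [tuple(entry.split("/", 1)) for entry in json.loads(body)]


def lookup(module, requirement=""):
    """Fetch and decode the distributions for a module (needs network access)."""
    with urlopen(lookup_url(module, requirement)) as response:
        return parse_response(response.read().decode("utf-8"))
```

Calling `lookup("Plack", "==1.0016")` would request the second URL shown above.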

But I've already run into one stumbling block: unauthorized distributions. Pinto doesn't know about PAUSE's permission system, so it will report distributions that don't contain "official" versions of the module. But MetaCPAN does know about the permissions, so you can filter unauthorized modules out of the results.
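As a rough illustration of that filtering step, the Python sketch below drops unauthorized entries from a list of release records that has already been fetched. The record layout (`author`, `archive`, `authorized` and `date` keys) is my assumption about what such records might look like, not a verified description of the MetaCPAN response format, and `official_distributions` is a made-up name.

```python
def official_distributions(records):
    """Return 'AUTHOR/archive' paths for authorized records, most recent first."""
    # Assumption: each record carries a boolean 'authorized' flag.
    # Records without the flag are treated as unauthorized.
    authorized = [r for r in records if r.get("authorized", False)]
    # Assumption: 'date' is an ISO 8601 string, so plain string order is
    # also chronological order.
    authorized.sort(key=lambda r: r["date"], reverse=True)
    return [f"{r['author']}/{r['archive']}" for r in authorized]
```

Treating a missing flag as unauthorized is the cautious choice here: it is better to miss a distribution than to pull in one that PAUSE never blessed.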

So in the end, building on the MetaCPAN API is probably the right way to go here. But it was an interesting exercise to see what I could do with a very large Pinto repository. And it only took about 20 lines of code!

About Jeffrey Ryan Thalhammer

Hacker, speaker, author, dad.