MyCPAN indexes 97% of BackPAN

A long time ago, my goal was to index about 90 to 95% of BackPAN, thinking that if I didn't get some ancient distributions, that would be just fine and no one would miss them. There are about 140,000 distributions to index, and I'm figuring out why I can't get the last 4,200. That means I'm indexing about 97% of BackPAN.


I've been back on my BackPAN indexing project since I stopped traveling to all of these conferences. Now I'm looking at the edge cases. Here's a breakdown of what I can't index just yet. Most of this is my fault: my goal is to index all of BackPAN, trying many methods to get the right answer, and most of the missing 3% are edge cases I don't handle yet. Some of that missing 3% I'll never be able to index, like the 0-byte distros.
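The 97% figure can be checked from the rounded counts in the post. This is only back-of-the-envelope arithmetic in Python, not part of MyCPAN itself; the variable names are made up for the illustration.

```python
# Rough coverage estimate from the rounded figures quoted in the post.
total_dists = 140_000   # approximate number of distributions on BackPAN
unindexed = 4_200       # approximate number that could not be indexed

indexed = total_dists - unindexed
coverage = indexed / total_dists * 100

print(f"indexed {indexed:,} of {total_dists:,} ({coverage:.1f}%)")
```

With these rounded inputs, that leaves about 135,800 distributions indexed.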

| Problem | Count | Notes |
|---|---|---|
| Could not find distro files | 1707 | I'm probably doing something wrong. |
| Could not unpack dist | 749 | Some tarballs seem to not like my tar, or aren't even valid. |
| Could not find file list | 735 | Some things don't unpack normally, or aren't actually Perl dists. |
| Could not find distribution directory | 307 | I expect everything to be in a directory. Some distros unpack to the current dir. |
| Could not find module list | 276 | Some distros don't have modules. |
| Could not parse META.yml | 163 | Haven't figured this out. Not all YAML formats and parsers are compatible. |
| No idea | 139 | This is a catch-all for things I couldn't classify. |
| Could not run build file | 91 | Some of these are missing a library, etc., so the build file dies. |
| Unparseable YAML files | 45 | Same as the META.yml, but for the report I created. Something didn't store correctly. |
| Other YAML errors | 13 | A parser started to parse something, then gave up. |
| Dist has 0 size | 8 | Some tarballs are 0 bytes. Two of them are mine. |
| Permission denied | 8 | I can't read some files in some dists because their permissions are wacky. |

I might decide not to even try to index some of these. Remember, BackPAN goes back to the early 1990s, and if I miss some of those older dists, the world isn't going to end.
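To make a few of those rows concrete, here is a minimal sketch of how an archive might be sorted into three of the categories: zero size, not unpackable, and no single top-level directory. It uses Python's standard `tarfile` module purely for illustration. The function name `triage_archive` and the checks themselves are assumptions, not MyCPAN's actual logic, which is written in Perl and tries many more methods.

```python
import os
import tarfile


def triage_archive(path):
    """Return a rough failure label for a tar archive, or "ok".

    The labels mirror three rows of the breakdown; the checks are
    illustrative assumptions, not how MyCPAN decides.
    """
    if os.path.getsize(path) == 0:
        return "Dist has 0 size"

    try:
        # Default mode "r" detects gzip/bzip2/xz compression transparently.
        with tarfile.open(path) as tar:
            members = tar.getmembers()
    except (tarfile.TarError, EOFError, OSError):
        return "Could not unpack dist"

    tops = set()
    for member in members:
        # Split the member path, ignoring empty parts and a leading "./".
        parts = [p for p in member.name.split("/") if p not in ("", ".")]
        if not parts:
            continue
        if len(parts) == 1 and not member.isdir():
            # A plain file at the top level: unpacks into the current dir.
            return "Could not find distribution directory"
        tops.add(parts[0])

    # Expect exactly one top-level directory, such as Foo-1.0/.
    if len(tops) != 1:
        return "Could not find distribution directory"
    return "ok"
```

In this sketch, an archive whose members all live under `Foo-1.0/` comes back as `"ok"`, while one that drops `Makefile.PL` straight into the current directory is flagged. A real indexer would also have to handle zip files and the other rows in the breakdown.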

About brian d foy

I'm the author of Mastering Perl, and the co-author of Learning Perl (6th Edition), Intermediate Perl, Programming Perl (4th Edition) and Effective Perl Programming (2nd Edition).