CPAN {Spring|Autumn} cleaning time again

Get rid of your old distributions on CPAN! A couple of years ago, I asked CPAN authors to visit the "delete files" page on PAUSE to "increase their Schwartz". Sadly, the use.Perl page has disappeared. The Schwartz Factor is the ratio of latest distros to the total size of CPAN. I named it after Randal Schwartz, who invented MiniCPAN.
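As a rough illustration of that ratio, here is a minimal Python sketch. It assumes "size" means the number of release files rather than bytes, and the function name `schwartz_factor` and the input shape (a dict mapping each distribution name to the versions still on CPAN) are made up for the example; this is not how PAUSE or any existing tool computes it.

```python
def schwartz_factor(releases_by_dist):
    """Return (number of latest releases) / (total number of releases).

    Assumption: each distribution with at least one release contributes
    exactly one "latest" release, and size is counted in files, not bytes.
    """
    total = sum(len(versions) for versions in releases_by_dist.values())
    if total == 0:
        raise ValueError("no releases to measure")
    latest = sum(1 for versions in releases_by_dist.values() if versions)
    return latest / total


# Hypothetical author directory: three releases of one dist, one of another.
example = {"Foo-Bar": ["0.01", "0.02", "0.03"], "Baz": ["1.0"]}
print(schwartz_factor(example))
```

Under this reading, an author who keeps only the newest release of each distribution has a factor of 1, and every old release left behind pulls it toward 0.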

Wendy noted that deleting distros was a topic at the recent Lancaster QA workshop. She didn't say much, but it sure sounded like people were looking to create policies and work to police something.

CPAN has never been curated, and that's by design. Let's keep it that way so the rules committees leave it alone. :)

7 Comments

use.perl didn't go away but, sadly, the URL scheme changed soon after pudge closed it down.

Your post is still available though.

We could have a checkbox on PAUSE for authors to say "I'm happy for old superseded dists to be auto-purged" (with a clear definition for that). Lots of (past) CPAN authors won't do this though, so ...

A quarterly check for "old superseded dists" could be done and 'offending' authors emailed a list for action. This email would be a useful liveness check for their contact email address.

Part of the PAUSE contract for CPAN authors (which I know doesn't exist, but think should) could be that if you don't maintain your contactability, then the PAUSE admins reserve the right to tidy up your CPAN area.

During the hackathon, Aaron and I worked on using acme's WWW::UsePerl::Server to serve the old URLs (as pointed to by old bits of the Internet). use.perl.org is history worth preserving (other than by manually pasting broken links into the Web Archive).

Given how long the static version of use.perl.org has been up, it's probably also worth preserving these new URLs. :-)

I occasionally find it useful to have the latest-but-one release off CPAN as well. Sometimes the latest release introduces a bug that I can avoid by using the previous release.

Personally I tend to keep a rotation of three versions of every module. Whenever I upload one, I delete the now fourth-oldest version. Though occasionally I forget, so sometimes I just read down the "delete files" list looking for dists with more than three versions.

I guess it's something crying out for some sort of automated tool that could at least tell me what I ought to delete.
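A tool like that could start as small as the sketch below. It is Python rather than Perl, it is not an existing PAUSE or CPAN utility, and the names `files_to_delete` and `DIST_RE` are made up for illustration. It assumes the filenames arrive in upload order (oldest first) and follow the usual `Name-Version.tar.gz` pattern; anything it can't parse is left alone.

```python
import re
from collections import defaultdict

# Hypothetical pattern for "Dist-Name-1.23.tar.gz"-style filenames.
DIST_RE = re.compile(r"^(?P<name>.+)-v?\d[\d._]*\.(?:tar\.gz|tar\.bz2|tgz|zip)$")


def files_to_delete(files_oldest_first, keep=3):
    """Return the files that fall outside the newest `keep` uploads per dist."""
    by_dist = defaultdict(list)
    for filename in files_oldest_first:
        match = DIST_RE.match(filename)
        if match:  # skip names that don't look like a release archive
            by_dist[match.group("name")].append(filename)

    doomed = []
    for uploads in by_dist.values():
        # Everything except the last `keep` uploads is a deletion candidate.
        doomed.extend(uploads[: max(len(uploads) - keep, 0)])
    return doomed


# Made-up listing, oldest upload first.
listing = [
    "Foo-Bar-0.01.tar.gz",
    "Foo-Bar-0.02.tar.gz",
    "Foo-Bar-0.03.tar.gz",
    "Foo-Bar-0.04.tar.gz",
    "Baz-1.0.tar.gz",
]
print(files_to_delete(listing))
```

Relying on upload order instead of comparing version strings sidesteps the question of how Perl version numbers sort, at the cost of needing the listing in the right order. The sketch only reports candidates; it deletes nothing.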

BooK,

I saw that when reading the QA Hackathon wiki just now. This made me very excited as it was only a couple of days ago that I considered doing that myself.

So, how far did you get?

> I occasionally find it useful to have the latest-but-one release off CPAN as well. Sometimes the latest release introduces a bug that I can avoid by using the previous release.

Previous releases are still available after deletion via BackPAN -- and links to there are available on the metacpan page for the distribution. The only thing that wouldn't work the same as before is cpan clients (e.g. new cpanm) that seek to install a specific version, and even then, the clients could be updated to fall back to BackPAN if the dist is not found on PAUSE.

Also, this plugin is available for Dist::Zilla users to see what their existing Schwartz ratio is (and the list of old undeleted dists): https://metacpan.org/module/Dist::Zilla::Plugin::SchwartzRatio

Since old releases are preserved by BackPAN, I tend to keep only the latest releases in my CPAN directory. The reason for that is... there is an existing script/module to do that sort of cleanup for me: WWW::PAUSE::CleanUpHomeDir. I just use the script provided in the Synopsis with slight modification:

https://github.com/sharyanto/scripts/blob/master/cleanup-pause-homedir


About brian d foy

I'm the author of Mastering Perl, and the co-author of Learning Perl (6th Edition), Intermediate Perl, Programming Perl (4th Edition) and Effective Perl Programming (2nd Edition).