Don't release experiments to CPAN
I'm proposing an explicit community convention where experimental code isn't released to CPAN, but is shared on GitHub, perhaps with an associated blog post, or discussion on PrePAN.
This addresses just one of CPAN's problems, which have also been raised today by Brendan Byrd.
There are many modules on CPAN which appear to be the result of some experimentation. Once the author has demonstrated their point, (s)he loses interest, and the module lurks on CPAN, waiting to catch out the unwary.
I've reviewed a number of module categories now, and in several of them I've hit such experiments. I've exchanged email with some of the authors, who've admitted the heritage, and who often comment "oh, I forgot about that module".
The problem with these modules is that they just reduce the signal-to-noise ratio of CPAN, and make life harder for users, particularly those new to Perl and CPAN. Consider the following:
- If you want to define some constants, there are 21 modules.
- If you want to perform some kind of run-time loading, for example of plugins, there are at least 43 modules you might consider, for about a dozen use cases.
- If you want a module to find dependencies of your code, there are 26 modules, but only about 5 distinct use cases.
In all of these categories there are modules that were experiments that never went anywhere.
Let me be clear: I don't want to suppress experimentation - it's a key contributor to progress. I imagine Moose started off as a wee experiment.
If you've been experimenting and would like to share, either (a) to show "hey, look what I did", or (b) because you're just not sure whether anyone else would use it:
- Put the module on GitHub. Sure, there are other places you could put it, but GitHub is fairly well linked into the Perl ecosphere, and cpanminus 1.6+ can install modules direct from GitHub.
- Write a blog post about it. If you don't have a blog, or your blog doesn't have a wide readership, then consider posting to blogs.perl.org as well. If you link to your blog, you might even gain some more readers.
- Discuss your module on PrePAN.
- If there are existing modules on CPAN which are similar, you could email their authors, asking whether they might link to your GitHub repository in the SEE ALSO section of their doc. OK, I admit your success rate might not be good here.
- You could annotate these modules on annocpan.
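As a rough sketch of the "install direct from GitHub" step, assuming a hypothetical repository (the username and module name below are placeholders, not a real module), cpanm can be given a git URL in place of a module name. The command is only echoed here so that nothing gets installed by accident; drop the echo to run it for real.

```shell
# Hypothetical repository URL: replace with your own GitHub repo.
repo_url="https://github.com/your-username/Your-Module.git"

# cpanm (1.6 or later) treats an argument ending in .git as a git
# repository: it clones it, then builds, tests and installs the dist.
# Printed rather than executed; remove "echo" to actually install.
echo cpanm "$repo_url"
```

Anyone who wants to play with your experiment then needs nothing more than the repository URL, which is also the obvious thing to put in your blog post.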
But if many modules start as experiments, how do you decide whether / when to release to CPAN? Just apply some common sense. For example, if you or anyone else starts relying on the module, then it's time for CPAN.
How might this idea evolve?
- If the name weren't already taken, PrePAN could be a place for uploading experimental modules. Maybe PrePAN could evolve to become that as well?
- If MetaCPAN indexed all Perl dists on GitHub, it might not include them in search results by default, but could say "N additional things on GitHub matched your query, click here to include them".
You could also consider deleting any experiments you already have on CPAN, since cpanminus can still install them from BackPAN. Check the reverse dependencies service first, to see whether anything relies on them. Deleting modules from CPAN is worthy of a separate post, but Brendan Byrd might beat me to it.
Revisited, 2 days later
I think I should have more clearly defined what I mean(t) by experiment!
There are a number of situations when I'd consider a module to be an experiment, but the classic example (for me) is where you're writing a module with no intention to use it in any real code. This might be to see whether something is possible, possibly trying to (ab)use Perl in some unexpected way. Such an experiment may, of course, lead to something unexpected and useful.
Another category, but less clear-cut for me, is when you're creating something, but you're not sure exactly what it is you're creating, and whether there might be a module on CPAN already. Often the namespace(s) will change, and you may drop it anyway. Typically at this stage I just don't share it, because I don't want to worry about namespaces (even though you can free them after you rename, I know), but I've had a couple of cases where people wanted to play with the code anyway. See below.
Some examples might clarify, and might help me refine what the hell I'm talking / thinking about!
Not experiments
I do not see the following as experiments:
- Someone new to Perl writes a module which they're using at work. They proudly upload it to CPAN. They might have no idea whether it would be of interest to anyone else.
As an aside, one problem (with the current toolchain) is that authors aren't encouraged / helped to find modules on CPAN that might serve their needs, or which they might be able to take over and evolve (commented on by brian d foy in a comment on Brendan Byrd's post on problems with CPAN). A topic for another day.
- Karen's post describing a dev release of Test::Warnings. Karen describes the implementation as experimental, but I don't see the module as an experiment — it's clearly written to meet a need, and will be used.
- Someone writes a module that's a complete hack (deadlines, we've all been there), but which they imagine they'll probably get around to doing a 'proper' version of eventually.
Aside: it's not very easy to tag your dist with maturity, as others have noted. You can't just look at the version: you'd miss Net::HTTP::Tiny 0.001. 10.7% of dists on CPAN have version 0.01 or 0.001. And you can't just look at reverse dependencies: Net::HTTP::Tiny has none — I tend to use it in some of my scripts, and HTTP::Tiny in modules.
- Someone has a (possibly slightly crazy) idea for a module, which addresses something they see / have as a very real need, and which they think might pan out. I think this comes down to personal definition of experimental, and how you like to play that out, but if there are others already interested in joining in, then I'd always err on the side of CPAN.
Where you draw the line between experimental and not is a personal call. From now on before I upload a new module (and as you can see, it's not something I've done many times), I'll just ask myself whether this is experimental, by my personal definition. If so, I'll release it to GitHub, and possibly describe it in a blog post. And if someone else starts using it, I'll put it on CPAN.
I consider the following to be experiments:
- I was looking for a tool to graph dependencies, and started searching for modules. I didn't find anything that met my needs, so I knocked up a module. I'd already found a handful of modules, so I started writing a review, and decided I wouldn't put my module on CPAN until I finished the review, in case I found a module I was essentially duplicating. As I progressed I kept finding more modules, and others pointed out modules in namespaces I'd not even considered. Still I found nothing in direct competition with my module. But the 23rd module was, so I may never release my module, and will instead either submit changes to that one or refactor mine as a helper module. If I find myself in this situation again, I'll put it on GitHub.
- I have various modules and scripts I use when writing reviews. At some point I plan on releasing them to CPAN, but until recently, I seemed to tear them apart every time I wrote a review, including a change of namespace. I'm hoping to stabilise them soon, and will then put them on GitHub, and then ask Leo (and anyone else interested) to comment on them. I probably won't put them on CPAN unless / until someone else writes a review using them, which may be never, and that's fine.
- I suspect that Module::Hash is an experiment. Someone new to Perl might come across this module and think "This Toby guy seems to know what he's doing, the module is recent, it's well documented, so maybe this is the modern/new way to load modules at runtime". To me this seems like a good candidate for "GitHub and blog post", and I can see the blog post generating comments, which may in turn lead to a CPAN release, or not. But if it doesn't meet Toby's definition of experiment, then fine, and my apologies to Toby.
- A number of modules in the Acme namespace :-)
Furthermore, I'm not imagining that such code would be forever banished to GitHub, never allowed to sully CPAN. Once code isn't an experiment, then I see it being uploaded to CPAN as well.
This "experiments on GitHub not CPAN" idea addresses just one small part of the "problems with CPAN"; I was thinking about a much smaller percentage of CPAN modules than many of you seemed to think. Mea culpa.
My personal process when creating modules will now be something like:
- Put the module on GitHub.
- Possibly register the namespace. I haven't always done this, but when I have, I've had thoughtful and helpful comments.
- Search for similar modules, and link to them in the SEE ALSO section.
- If there's an existing module close enough, see if I can contribute to that rather than release my module, or if it's gone stale, whether I can take it over.
- If it's not experimental, and I got to this point, then release to CPAN.
- Possibly write a blog post on it.
Most (if not all?) of the people who've commented on this, and related posts, are not part of the demographic I'm worrying about (read: screw the lot of you! ;-). That demographic is new or casual Perl programmers and CPAN users. CPAN is currently a seriously sub-optimal experience for such users.
A few years ago I was a born-again Perl newbie, and often when I turned to CPAN for "a module to do X", I'd find a handful of modules and no easy way to determine which was the right one to use. I decided I'd do a quick (ha!) review whenever I hit this, so (a) I'd make an informed decision, (b) it might help others, and (c) the peanut gallery might point out gaps / flaws in my reviews, and improve the end result. After doing a few reviews, I gave a talk on CPAN Curation at LPW 2011, where I listed some of the problems I saw with CPAN, and thoughts for how they might be addressed. I'll revisit that in a separate post, as I've ended up thinking about it a lot over the last 2 or 3 days...