CPAN candidates for adoption
Following on from a blog post here last year, I've come up with an improved measure for identifying CPAN distributions that are potential candidates for adoption. I've put a list of the top 1000 adoption candidates online, and you can read more about the scoring measure on my blog. What else could I factor into the score?
Update: I've had lots of good feedback, and am working on the next version, which will hopefully have fewer false positives.
I've added a Questhub quest stencil for adopting a module.
Can I add modules to that list? PDF-Template and Excel-Template (and Graph-Template, for that matter) need to be adopted.
Hi Neil
On Adopting a Module, please add that reference to CPAN::Changes::Spec as on the QuestHub.io page.
Also, phew! Glad to see none of my modules are on the list. But that's probably because no-one's using them ;-).
Rob, I just remembered BDFOY's proposal for how to flag this: give ADOPTME permissions on the modules in those dists. I'm already scanning 06perms.txt, so can look for that as well.
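A minimal sketch of what scanning for that flag might look like, in Python rather than whatever the real script uses. It assumes 06perms.txt has a header block ending at the first blank line, followed by comma-separated rows of package, user ID and permission; the function name and the sample data are made up for illustration.

```python
import csv
import io

# Pseudo-users that signal a dist is up for grabs. ADOPTME is named above;
# HANDOFF is mentioned later in this thread.
FLAG_USERS = {"ADOPTME", "HANDOFF"}


def flagged_modules(perms_text):
    """Return {module: set of flag users} from 06perms-style text.

    Assumption: a header block ends at the first blank line, and every
    data row after it is 'module,userid,permission'.
    """
    _header, _sep, body = perms_text.partition("\n\n")
    flagged = {}
    for row in csv.reader(io.StringIO(body)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        module, user, _perm = row
        if user.upper() in FLAG_USERS:
            flagged.setdefault(module, set()).add(user.upper())
    return flagged


# Hypothetical sample, not real CPAN data.
SAMPLE = """File: 06perms.txt
Columns: package,userid,best-permission

Foo::Bar,ALICE,f
Foo::Bar,ADOPTME,c
Baz::Qux,BOB,f
"""

if __name__ == "__main__":
    print(flagged_modules(SAMPLE))
```

A dist could then get its extra point if any of its modules shows up in the returned mapping.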
A suggestion for additional scoring is to add a point for modules that are mentioned in the core documentation. Perhaps one for modules that are included in core.
This came to mind when wondering why Math::TrulyRandom wasn't on the list. It's mentioned in the core documentation for rand. It's also broken in many ways and has no owner. However it has no dependent distributions so ends up with a score of 6 (my estimate). In this particular case I think the correct answer is to patch the core documentation to remove any mention of the module.
Said module brings up another point, which is that arguably the module should go away entirely.
Spot on Dana: Math-TrulyRandom has a score of 6.
I'll have a think about your two thoughts — have added them to the list.
Modules going away entirely is the subject for another blog post ... :-)
Nice work.
IMHO the scoring should be improved.
The current scoring does not give a high score if:
- the author releases cosmetic changes
- the author maintains many packages
- the module has no dependents
- the module has a small community
The scoring mechanism doesn't seem to have much of a resolution. Many of those dists don't need adoption at all. One module of mine is in there because it has two wishlist tickets…
Leon: good point. I'll ignore tickets with severity "wishlist".
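As a rough sketch of that change (not the actual script): count open tickets while skipping any whose severity is "wishlist". The ticket dictionaries and the `severity` key are an assumed shape for illustration, not the real bug tracker data format.

```python
def count_scorable_tickets(tickets, ignored_severities=("wishlist",)):
    """Count tickets that should contribute to a dist's score.

    Assumption: each ticket is a dict with an optional 'severity' string.
    Tickets with no severity recorded are still counted.
    """
    ignored = {s.lower() for s in ignored_severities}
    return sum(
        1
        for ticket in tickets
        if (ticket.get("severity") or "").lower() not in ignored
    )


# Hypothetical tickets for one dist.
EXAMPLE_TICKETS = [
    {"id": 1, "severity": "Wishlist"},
    {"id": 2, "severity": "wishlist"},
    {"id": 3, "severity": "Important"},
    {"id": 4, "severity": None},
]

if __name__ == "__main__":
    print(count_scorable_tickets(EXAMPLE_TICKETS))
```

With this filter a dist whose only open tickets are wishlist items contributes nothing to the bug part of the score.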
Yeah, I was thinking about Test-MockModule when I opened that page. And here it is, item 2. Good scoring probably.
Thanks Victor. It's definitely identifying some of the modules that could do with some attention, but there are also some false positives. I've had some good ideas from people, and thought of a few improvements myself today. It should get better.
Does this mean you're going to adopt something? :-)
Definitely! Some day. But for now I have less-intrusive ways to contribute to perlish opensource.
Author activity may be useful to factor in. I'm quite active, and would rather not see anything of mine listed, for example. It only suggests that I'll receive "May I adopt?" requests to which I'll have to reply. ;)
It already is (to an extent), since inactive authors receive a higher score.
Also, just because an author is active, it doesn't mean that they support all their modules equally - there are quite a few active authors with modules they've not updated for years.
I see one of my modules on the list. The one I maintain the most. It is also the most used, and has quite a few open bugs. So it looks like your criteria don't really work.
@rjbs: a good point, and I'll try and think of a way this could be factored in, but as @mjemmeson points out, there are cases where someone loses interest in one or more of their old dists, while happily releasing their other dists. I've adopted a couple of such dists while doing reviews.
@mirod: yes, I'm overweighting on the bugs vs release date at the moment. XML-Twig is making the list because it has bugs reported since the last release, a lot of open bugs, and a lot of dependent dists.
Hopefully the next iteration will be a more accurate / useful indicator. XML-Twig will be a useful test-case for me!
I think this is a brilliant bit of work, but I think it could do more to distinguish between two goals. Is it trying to identify modules which need adoption or is it a general indicator of the quality of CPAN?
If the goal is identifying modules which need adoption then there should be a qualifying criterion.
I suggest that a module which has been released within the last year should not qualify even if its overall score is very high.
As it stands, a module which was released yesterday with a lot of bugs and dependencies could score quite highly.
@Duncan: thanks! I agree on the qualification: this is the 'gating criteria' mentioned in the more detailed blog post. I think the real test of the next iteration will be looking at the modules it picks, and tuning (without overtuning).
I think there could be modules most recently released 10 months ago, with very recent bugs, and they should probably be included. At the moment I'm thinking 6 months as a cut-off, but with other factors included.
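To make the gating idea concrete, here is a small sketch under stated assumptions: a dist only qualifies if its last release is older than a cut-off (about six months here) and its score reaches a threshold. The function name, the 183-day figure and the threshold parameter are illustrative choices, not the values the real list uses.

```python
from datetime import date, timedelta


def is_candidate(score, threshold, last_release, today, cutoff_days=183):
    """Apply a release-age gate before looking at the score.

    Assumption: 'cutoff_days' approximates the six-month cut-off discussed
    above; a dist released more recently never qualifies, however high it
    scores.
    """
    old_enough = (today - last_release) >= timedelta(days=cutoff_days)
    return old_enough and score >= threshold


if __name__ == "__main__":
    today = date(2013, 11, 1)
    # Released ten months ago with a high score: qualifies.
    print(is_candidate(9, 7, date(2013, 1, 1), today))
    # Released yesterday with the same score: gated out.
    print(is_candidate(9, 7, date(2013, 10, 31), today))
```

Passing `cutoff_days=365` would give the stricter one-year gate suggested earlier in the thread.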
I have plenty of modules that have not been updated (except for minor cosmetic updates, like getting the changelog into CPAN::Changes::Spec format) in years. However, that doesn't mean I've abandoned them; it more likely means that currently, they do everything I need them to do.
If my needs change in the future, they are likely to get another round of attention; ditto if somebody reports a serious bug. But if everything seems to be working fine, then there's no point gratuitously updating them just so they seem more active.
I imagine there are plenty of other authors in similar situations.
Hash-Util-FieldHash-Compat (which is not mine, but seems a good example) has been sat on version 0.03 for over 5 years; and the changes since 0.01 have been fairly minor. Yet it works well; OX uses it; KiokuDB uses it. If something went wrong with it, I'd be fairly confident that it would be fixed quickly; not least because its current maintainer is also a maintainer of KiokuDB so has a vested interest in doing so.
Agreed with Toby. I'd increase the weight of # of open bugs (-wishlist) and number of CPANTesters failure reports, and pay less attention to release date.
@tobyink: hopefully the next iteration will improve the list in this area.
@stevenh: good point about CPAN Testers — I've added that to the list. Might not make it into the next iteration, as I need to find an easy way to get the data.
A metric that took into account the percentage of failures on recent perl versions would be quite useful -- it's definitely a sign that a module has been neglected and needs some love.
@ether: good thought. I was thinking about general percentage of fails vs passes, but maybe I should either only take the latest perl, or at least weight that.
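One way the weighting could work, as a sketch only: compute a fail rate over pass and fail reports, counting reports from the latest perl more heavily. The `(perl_version, grade)` tuples, the grade strings and the weight of 3 are assumptions for illustration; this does not reflect how CPAN Testers data is actually delivered.

```python
def weighted_fail_rate(reports, latest_perl, latest_weight=3.0):
    """Return the weighted share of failing reports, between 0.0 and 1.0.

    Assumption: 'reports' is an iterable of (perl_version, grade) pairs,
    where grade is 'pass' or 'fail'; other grades are ignored.
    """
    total = 0.0
    fails = 0.0
    for perl_version, grade in reports:
        if grade not in ("pass", "fail"):
            continue
        weight = latest_weight if perl_version == latest_perl else 1.0
        total += weight
        if grade == "fail":
            fails += weight
    return fails / total if total else 0.0


# Hypothetical reports for one dist.
EXAMPLE_REPORTS = [
    ("5.18.1", "fail"),
    ("5.16.3", "pass"),
    ("5.14.4", "pass"),
    ("5.16.3", "unknown"),
]

if __name__ == "__main__":
    print(weighted_fail_rate(EXAMPLE_REPORTS, latest_perl="5.18.1"))
```

Setting `latest_weight=1.0` gives the plain percentage of fails vs passes, so the two approaches can be compared on the same data.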
It occurs to me that it might be best to rename this list to something like "Modules Which Could Benefit From More Attention". Calling them candidates for adoption is a bit inflammatory, and some authors may get angry when they see their modules on such a list.
But if the list just highlighted modules that could use some love, that could lead to adoption. I for one regularly ask bug reporters for some of my modules to adopt them because I no longer use them (I've not had a great deal of success there though ;)
Dave: good points. When I first started working on this, I called the script 'needlove'. I'll rewrite the blurb at the top of the list as well.
@mirod: I've changed the way the scoring works. As a result, XML-Twig is no longer listed.
@leont: your dist with only two wishlist tickets no longer appears. You have other dists which do appear though ...
@rkinyon: if you give HANDOFF or ADOPTME permissions on your modules, that will give your unloved dists a +1. They're scoring low though, so with my new version they still wouldn't make the cut. I'm thinking about another column for "adoptability", and also including modules that score high on that, even if the current score doesn't pass the threshold.