The CPAN client version-less dependency problem
An interesting topic came up on #distzilla. Most modules depend on other modules, but some don't declare an explicit version for those dependencies. So if you use the module with an ancient version of one of its dependencies it'll break, because the author never tested that version.
We've probably all run into problems because of this and grudgingly upgraded the dependencies, but could CPAN clients handle this better?
Technically they're doing the right thing already. The CPAN clients are unable to distinguish between "explicit 0", "any version will do" and "author didn't say".
- Some authors don't specify versions at all. I mostly fall into this camp.
- Even if you specify a version for every dependency it's really hard to get it right. You might accidentally use some API feature in a dependency that doesn't match the feature set of your declared version.
- Even if you get it 100% right it's all for naught if one of your dependencies isn't as careful about its dependencies.
So what would be a better heuristic? Some suggestions:
Make sure that dependencies aren't much older than the module that requires them. Something in the spirit of:
Are you sure you want to install this shiny new version of Web::Some::Thingy but still use your 4 year old copy of DBD::SQLite? Install the new DBD::SQLite? [Y/n]
Query some publicly available submission system like the CPAN metabase:
You're trying to install Web::Some::Thingy with a 2 year old DBD::SQLite, but test results show that you need the 6 month old version not to get test failures. Install the new DBD::SQLite? [Y/n]
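The first suggestion could be sketched roughly like this. It's a minimal sketch, not how any existing CPAN client works: the function name, the one-year threshold and the release dates below are all made up for illustration, and a real client would have to look the dates up somewhere.

```python
from datetime import date

# Hypothetical threshold: how much older than the requiring module an
# installed dependency may be before the client asks about upgrading.
MAX_AGE_GAP_DAYS = 365


def stale_dependencies(module_released, installed_deps, max_gap_days=MAX_AGE_GAP_DAYS):
    """Return the names of installed dependencies that were released more
    than max_gap_days before the module that requires them."""
    stale = []
    for name, dep_released in installed_deps.items():
        gap = (module_released - dep_released).days
        if gap > max_gap_days:
            stale.append(name)
    return sorted(stale)


# Illustrative release dates only, not real CPAN data.
installed = {
    "DBD::SQLite": date(2006, 9, 1),
    "Test::More": date(2010, 6, 1),
}
to_confirm = stale_dependencies(date(2010, 9, 15), installed)
for name in to_confirm:
    print(f"{name} is much older than the module that requires it. "
          f"Install the new {name}? [Y/n]")
```

The point is only that the check needs nothing from the author: release dates are enough, so it would also work for distributions that are already on the CPAN.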
I should reiterate that CPAN clients are doing the right thing already with regard to the relevant standards. But it can be useful to also account for the human factor.
Update: chromatic pointed out on the Modern Perl Books blog that you could set the recommended minimum version to whatever the developer happened to have installed when testing the module (and I'm replying here since it seems that his OpenID method is broken).
I think that alternative is probably the worst of the lot. Some CPAN authors do that, and it causes all sorts of problems for people using downstream distributors of the CPAN.
For example, if you develop on v5.13.5 as I do and depend on Test::More, you'd implicitly depend on version 0.97_01 using this method, even though the 0.92 version included with perl v5.10.1 would probably do just fine.
By depending on the latest version, you ensure that someone trying to use your program with Debian, RHEL or other third-party package systems will run into problems. Users show up on IRC all the time with this problem, and more often than not the answer is "ignore the prerequisite versions, your old module will work just fine". At best they're using the CPAN directly, and will have to needlessly upgrade & test a lot of their dependencies.
Adding automatic soft dependencies as one commenter points out would probably be more useful.
But either solution would require awareness and diligence by CPAN authors, and new releases of the offending distros. Given the human factor of this problem a change in the CPAN clients would probably work better, and would trickle down to existing releases.
One of my long-term goals for the next generation of CPAN Testers clients is to capture granular prereq information that can be used for some of this kind of analysis. That doesn't solve the problem but might make it easier to detect.
One of the things that I know David Golden and I have talked about for our mythical new CPAN client is a version range. I think this almost works in Module::Build:
The syntax isn't the important thing there, but that's the idea.
There are other things to consider with that too. How do you specify a hard dependency (it really won't work with other versions) and a softer one (I only tested it with these versions)?
I haven't thought about this for a bit, but David and I had the idea of a mutable installation plan in the client. Instead of trying to install the latest version of a dependency, failing, and giving up, try to install earlier versions that are known to work (either by explicit information in the build file or CPAN tester results).
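A minimal sketch of what such a mutable installation plan might look like. Everything here is hypothetical: `install_with_fallback`, the `try_install` callback and the version numbers are invented for illustration, and where the "known to work" set comes from (build file or CPAN tester results) is left open.

```python
def install_with_fallback(candidates, known_good, try_install):
    """Try the newest candidate first; if that fails, fall back to older
    versions, but only to ones that are known to work.

    candidates:  version strings, newest first.
    known_good:  set of versions reported to work (from the build file
                 or from tester results; the source is an assumption).
    try_install: callable taking a version, returning True on success.
    Returns the version that was installed, or None if nothing worked.
    """
    if not candidates:
        return None
    latest, *older = candidates
    if try_install(latest):
        return latest
    for version in older:
        if version in known_good and try_install(version):
            return version
    return None


attempts = []


def fake_install(version):
    """Stand-in for a real build and test run; pretends 2.00 is broken."""
    attempts.append(version)
    return version != "2.00"


chosen = install_with_fallback(
    ["2.00", "1.99", "1.50", "1.23"], {"1.50", "1.23"}, fake_install
)
print(chosen, attempts)
```

Here 1.99 is skipped because nobody has vouched for it, which is the difference from blindly walking backwards through every release.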
The trick, then, is to figure out if Miyagawa is still working on jam, which is where I left off on my thinking. I wasn't going to make my new CPAN client if he was going to make his. :)
The problem I'm describing has to do with people (including myself) who simply can't be bothered to supply even "you need at least version X", because doing so involves a lot more work than just coding up something that works and passes all tests, and uploading it to the CPAN.
Being able to specify even more information than that, while certainly neat, isn't going to be taken up by the lazy crowd (although it'd be really useful for others).
So you're back to the problem of how you fail gracefully with incomplete and perhaps incorrect metadata.
Having some default smarts in the CPAN client would probably help a lot with that, and so would being able to query some external database for the dependencies. E.g. CPAN Testers, who might have found through brute-force testing that Some::Module really should have >1.23 <=1.99 as the prerequisite version.
Hi Folks
I too struggle with this issue.
Recently I adopted the policy of using mversion - comes with Module::Version - to get the versions I actually have installed, and to put those version #s in Build.PL and Makefile.PL.
But while reading this post and comments I thought: (a) Module::Build has an action called prereq_report. (b) The author could run this and include the output in the dist. (c) CPAN could put this report on-line. (d) End users could query it to at least know what the author tested the code with.
Limitations: (a) It's just a guideline. (b) It only covers the 1 version the author had installed. (c) It reveals info about the author's setup.
But, could it be useful?
PS: Why is the font size in this edit box (not the previewed text) smaller than it was when I first entered the comment?
It’s not just laziness. No one writes software specifically toward outdated libraries, so there are almost never “at most version X of Y” requirements at the time of release. They only become apparent as new versions of Y get released later. And if your module has many dependencies, it’s a tough job to stay on top of CPAN and keep testing it with newer versions.
Why not simply follow the lead of the guys who write 365 RT tickets per year? The recipe seems to me quite easy to follow:
As an author you start out by writing down as many dependencies as you know about and set the version you require to 0 (unless you know better). Then you wait for cpantesters reports and RT tickets. If some dependency turns out to be too lax, you correct and make a new release. Rinse, repeat.
As a user you see a software problem, maybe a failing test. When that happens, you visit RT first and then cpantesters. Note that this has to be your course of action in any case, be it a dependency problem or something else. At that point you actually need help from the author or the community.
Once it turns out that the problem is a dependency problem, you write an RT ticket of the sort "undeclared dependency..." or of the sort "minimum required version of...". Usually you get help from the author or the community or you can solve the problem yourself. The important point is to share. But you knew that already.
Shameless plug ahead. This simple circuit of mutual help via RT and cpantesters is well established and it works. One way you (as an author) can make it work even better is by declaring even second and third level dependencies. Don't be shy about declaring the dependencies of the dependencies. It will probably turn out that one of your tests is good at detecting breakage in the dependencies between your dependencies.
How does declaring more dependencies help? Cpantesters working with CPAN::Reporter will always report all versions of all declared dependencies in a well defined format, be it PASS or FAIL (CPANPLUS will have this same feature some day as well). And http://analysis.cpantesters.org picks up all these data and applies statistical analyses to them. You can watch it in action. From time to time it discovers version dependencies (right column on homepage contains entries of the form "mod:Module::Name"). Some human should always verify them before submitting to RT. That would probably be me, I still have to write 135 tickets this year to reach the goal :)
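To make the idea concrete, here is a deliberately naive sketch of guessing a minimum version from PASS/FAIL reports. It is not what analysis.cpantesters.org actually does (that site applies real statistics); the function name and the report data are invented, and it assumes plain decimal version strings with no underscores or dotted-decimal forms.

```python
from decimal import Decimal


def suggest_minimum_version(reports):
    """Guess a minimum dependency version from test reports.

    reports: iterable of (dependency_version, grade) pairs, where grade
             is "PASS" or "FAIL".
    Returns the lowest passing version that is newer than every failing
    one, or None if the reports give no such evidence.
    """
    parsed = [(Decimal(version), version, grade) for version, grade in reports]
    failing = [num for num, _, grade in parsed if grade == "FAIL"]
    if not failing:
        return None  # no failures, so no sign of a version problem
    highest_fail = max(failing)
    passing_above = [
        (num, version)
        for num, version, grade in parsed
        if grade == "PASS" and num > highest_fail
    ]
    if not passing_above:
        return None  # failures are not explained by the version alone
    return min(passing_above)[1]


# Made-up reports for a hypothetical dependency.
reports = [
    ("1.10", "FAIL"),
    ("1.23", "FAIL"),
    ("1.30", "PASS"),
    ("1.99", "PASS"),
]
suggestion = suggest_minimum_version(reports)
print(suggestion)
```

A result like this is only a hint. With a handful of reports the failures could easily have another cause, which is why a human should still verify before writing the ticket.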
With regard to the "Report what versions of things the author was using when they made this release" aspect of this problem, I did write a Dist::Zilla plugin (MetaData::BuiltWith) for this purpose:
http://search.cpan.org/dist/Dist-Zilla-Plugin-MetaData-BuiltWith/
The most verbose incantation (MetaData::BuiltWith::All) is probably a bit excessive, but you get a somewhat reliable report of what versions of everything can be used to attain a working install of the author's module.
The idea was that, in the event the author failed to report a correct "minimal version", a quick perusal of this metadata would be able to find things that could/should be updated to make it work.
You could present this info as a possible remedy in case some of the tests fail.
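Something along these lines, as a rough sketch. It assumes the "built with" information has already been read into a plain module-to-version mapping (the real metadata layout isn't shown here), and the function name, module names and versions are hypothetical. Versions are again assumed to be plain decimal strings.

```python
from decimal import Decimal


def older_than_built_with(built_with, installed):
    """Return {module: (installed_version, author_version)} for every
    module that is missing locally or older than what the author had."""
    suspects = {}
    for module, author_version in built_with.items():
        have = installed.get(module)
        if have is None or Decimal(have) < Decimal(author_version):
            suspects[module] = (have, author_version)
    return suspects


# Hypothetical data: what the author built with vs. what is installed.
built_with = {"Some::Module": "1.23", "Other::Module": "0.50"}
installed = {"Some::Module": "1.10", "Other::Module": "0.50"}

suspects = older_than_built_with(built_with, installed)
for module, (have, wanted) in suspects.items():
    print(f"Tests failed. {module} is {have} here, the author had {wanted}. "
          f"Upgrade {module}? [Y/n]")
```

This would only run after a test failure, so users whose old modules work just fine are never bothered.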
But if the author didn't include a single test that can demonstrate that there is a problem: bad author, no cookie!