Test::More has lots of crazy new development that's breaking my modules

I still wish we had a way to remove reports from CPAN Testers. The case of a broken Test::More is a really good reason for this.

I received many fail reports for Business::ISBN, which I've been working on lately. However, they came from a test I hadn't touched, for things I wasn't working on.

The failure looked odd; I'd never heard of Test::More::DeepCheck:

Modification of non-creatable array value attempted, subscript -1 at .../Test/More/DeepCheck.pm line 82.
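
That message is Perl complaining about an attempt to assign through a negative subscript on an empty array. A minimal sketch that triggers the same error (this is not the Test::More code itself, just an illustration):

    # Not the Test::More code, just the smallest way to trigger the error:
    # you can't create an element before the start of an empty array.
    my @stack = ();
    $stack[-1] = 'x';    # dies: Modification of non-creatable array value
                         #       attempted, subscript -1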

Then I noticed that all of the fail reports reported the same development version of Test::More:

build_requires:

    Module               Need     Have
    -------------------- -------- ------------
    ExtUtils::MakeMaker  0        6.99_14
    Test::More           0.95     1.301001_045
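
Those Need numbers come straight from the distribution's declared prerequisites. As a minimal sketch (illustrative, not the actual Business::ISBN Makefile.PL), the declaration looks something like this:

    # Illustrative sketch only; the real Makefile.PL declares more than this.
    use ExtUtils::MakeMaker;

    WriteMakefile(
        NAME           => 'Business::ISBN',
        BUILD_REQUIRES => {
            'Test::More' => '0.95',  # the "Need"; the smoker satisfied it
                                     # with the dev release 1.301001_045
        },
    );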

There are a few other bad tests for things happening in v5.21, but for the most part I get the black eye on MetaCPAN or CPAN Search from the CPAN Testers using a bad version of Test::More. It happens. Schwern once said that he could break all of CPAN.

The problem, though, is not a broken Test::More but a development version of one that we don't trust yet. I know that Andreas and Slaven want to test everything, but perhaps these sorts of tests don't need to go to CPAN Testers. In the last month I've spent a couple of hours tracking down problems that had nothing to do with anything I did and everything to do with what I think is some overzealous new development in Test::More. If you want to write a better is_deeply, make a separate module and let people play with it.

Fortunately, normal users won't have this broken Test::More. Unfortunately, they still get to see the red bars for my module.

9 Comments

Exodist is making a lot of changes to the Test::More guts, which will hopefully lead to a much better Test::More. The aim is that the result should eventually be API-compatible with older Test::More. Because of the major changes, he's rightly releasing a lot of trial versions.

As far as I'm concerned, this is CPAN Testers doing its job correctly. You notice the fails; either you're using Test::More in a way that isn't as advertised, in which case you should adjust your usage to match the documented API, or you're using it properly, and you push back on Exodist and he tweaks things again.

In the latter case it's not really your distribution's "fault" that the test suite did not pass. But while it's nice to see a big field of green, CPAN Testers isn't about assigning blame. It's about finding bugs. Those bugs might be in your distribution, or they might be in your dependencies. (And if you're using it for testing, Test::More is a dependency, even if you don't usually think of non-runtime dependencies as such.) It's just part of the price we pay for not having to reinvent wheels.

I want to be notified if a prereq of my module, any prereq, changes in an incompatible way. But I don't think that should affect the apparent stability of my module.

With metacpan emphasizing test results in its sidebar, I don't think a -TRIAL version of a prereq should influence that. CPANTS might not be about assigning blame, but the users of metacpan only see the numbers, not the reason.

So give me the reports so I can fix my module before -TRIAL becomes stable, but don't punish me for them.

CPAN Testers isn’t about assigning blame. It’s about finding bugs.

But the PASS/FAIL stats get shown as PASS/FAIL in Business::ISBN, not in Test::More. You could still make a strained argument that they do belong to Business::ISBN, but they clearly aren’t relevant to anyone except those who are using that TRIAL version of Test::More, which is a tiny audience. To the vastly larger group of other CPAN users this is confounding information.

So your best argument amounts to putting the toolchain module maintainer’s convenience in finding bugs ahead of everyone else’s need to make their own development choices effectively and efficiently. To be persuasive you will need to argue that this remains a net benefit for everyone. A tall order, if you ask me… but the floor is yours: take it away.

If this is meant to be a serious question, then what I will say is this: the PASS/FAIL results under an unstable toolchain should be aggregated separately and only shown to people who go looking. There is obvious value in these results; they just mustn’t be mixed up with the QA for the module itself. That is already how unstable releases of perl are handled – surprise – because, well, the exact same considerations apply.
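
A rough sketch of that check for perl itself, relying on the convention that an odd minor version (5.19.x, 5.21.x, ...) marks a development release:

    # Sketch: segregate reports from development perls, which by
    # convention have an odd minor version number.
    my ($minor) = ( $] =~ /^\d+\.(\d{3})/ );   # e.g. 5.021005 -> "021"
    my $is_dev_perl = $minor % 2;              # odd minor => dev perl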

If the tester is not testing Test::More, why is it using a trial version of Test::More? Shouldn't the cpantester node be using the production version of everything that is not actively under test?

Just my $0.02.

As a module author, I also don't want to see fails for spurious reasons for my modules.

But when I'm choosing a module to use, I look closely at the failures to see why they happened before deciding against it. I care more about the presence of pass reports than the absence of fail reports.

I think the only time I've used that as a reason not to use a module was when the module hadn't been updated in several years and consistently failed on recent versions of Perl.

I believe that CPAN Testers has a way to mark specific tests as not applying, for just such cases, but I've never used it.

> I don't think CPAN Testers is doing the right thing by using broken Test::More's that aren't available for normal installations.

I strongly disagree. What if these Test::More changes *were* production ready? Receiving these failing reports before Test::More sees a stable release would be vital to identifying issues with your code, so you have a chance to fix it before that stable release lands and everyone is affected. The smokers can't know whether this trial is nearing production readiness or not.

CPAN test reports are my best indicator of whether my code is working for everyone, or just for me. I don't want to lessen the number of these reports.

However, the way the statistics are aggregated and reported *is* indeed misleading, and can improperly indicate that there is a problem with the module. In https://github.com/Test-More/test-more/issues/442#issuecomment-56428862 I suggested that reports that tested a dist where *any* listed prereq was in a dev release should be reported differently, and left out of the FAIL statistics. I think that tackles the correct problem, without undermining the efficacy of the test system.
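
The detection itself is cheap. A sketch, relying on the CPAN convention that an underscore in a version string (like the 1.301001_045 above) marks a development release; prereq_is_dev_release is a hypothetical helper, not an existing API:

    # Hypothetical helper: flag prereq versions that follow the CPAN
    # convention of marking development releases with an underscore.
    sub prereq_is_dev_release {
        my ($version) = @_;
        return defined $version && $version =~ /_/ ? 1 : 0;
    }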

PS. This thread should be taken to (or cc'd on) the cpan-testers-discuss mailing list, so Barbie can see it.

Firstly, @preaction: CPANTS != CPAN Testers. They are two very different projects.

Secondly, this is a conversation that has cropped up before, and I'm still in two minds about it. Short answer: I tend to side with brian. The tester platform shouldn't be doing any testing with trial/development releases of pre-requisites, unless the tester is going to manually filter the results and send them to the pre-requisite author if appropriate. I do understand Ether's perspective too, and there is merit in having these reports, but they more often target the wrong author.

Longer answer: the difficulty we have once the reports hit the Metabase is that all the parsing into the CPAN Testers databases cannot know whether the test failures are because of the pre-requisite or the distribution being tested. The fault could be in the pre-requisite, but equally, as Ether notes, the pre-requisite could be the next release and the distribution being tested makes assumptions that are no longer true. The analysis site, run by Andreas, could possibly highlight this with a suitable corpus of reports, but it too wouldn't necessarily know whether the pre-requisite or the distribution being tested is at fault. It takes a human to determine that.

I want CPAN Testers to be helpful to authors, and in this situation it isn't. The wrong author is being alerted, and in order for the right author to be notified, the wrong author has to take the time to pass feedback to the pre-requisite author. It's a step I don't think we should be asking authors to take. In the short term, the solution is to ask testers not to test with trial/development pre-requisites. This could mean ensuring smokers test in a clean environment, or excluding any trial/development releases if you test in batches.

One longer term solution would be to add more logic in the smoker client to re-attribute the report to the pre-requisite. This would require quite a bit of work, and I'm not sure how easy that would be, but it might be worth someone investigating one of the clients to see what could be done. Another solution would be to add logic to the backend parser, which I plan to abstract out to make it easier to use in more applications. However, both of these solutions could still be attributing reports to the wrong author, and we would be back in the same position, just from a different perspective.

The Admin site now allows authors/testers to mark reports to be ignored by the system. But I will look to see what can be done so that the system can re-attribute reports rather than ignore them. It's still fixing the problem in the wrong place, in my opinion, and we're still relying on authors to do the work, but at least we won't lose the reports.

In summary, I think testers have a responsibility to direct reports to the right author, and if they don't, they should stop using trial/development releases as pre-requisites.

As the person making the Test::More changes, I have to agree with this idea:

Failures due to alpha versions of Test::More should not be red marks against OTHER modules that in no way took action to cause the problem.

If I had known they showed up there, I probably would have been reluctant to release these alphas. The alphas are valuable and help me find a lot of issues... but I don't want other people to be dinged by them. I myself judge modules based on the pass/fail ratio in CPAN Testers.

At the very least these should be unknown, not fail, in the metacpan/cpan display. At best they would be a fourth category that probably should not even be displayed without asking for it.

As for calling my changes overzealous: time will tell. I have no intention of marking this stable until the perl-qa and toolchain gangs are on board; hopefully that level of approval works for everyone. I never had any plans to release this as stable without lots and lots of eyes double-checking me.


About brian d foy

I'm the author of Mastering Perl, and the co-author of Learning Perl (6th Edition), Intermediate Perl, Programming Perl (4th Edition) and Effective Perl Programming (2nd Edition).