Test2/Test::Builder Update from the QAH

By Chad 'Exodist' Granum on April 22, 2016 12:33 PM

Yesterday was the first day of the QA Hackathon in Rugby, UK. The first item on the agenda was a discussion about Test2 going stable. This blog post will cover the important points of that discussion.

For the impatient, here is a summary:

Test2 is going to be part of the Test-Simple dist. It will not be a standalone dist.
The next stable Test-Simple release will include Test2 and a Test::Builder that runs on Test2.
The next stable Test-Simple, which includes Test2, will be released no sooner than Friday, May 6th, which is our planned release date.

The QAH discussion focused around a single question: "What is the best path forward for Test::Builder when we consider both end-users and test tool authors?"

Arguments for updating Test::Builder to use Test2:

It avoids the need to split the testing ecosystem in two, with every Test:: module forked into a Test::Builder and Test2 version.
It reduces the number of possible configurations to maintain and test. Some of these configurations were known to be problematic.

Arguments for keeping Test2 and Test::Builder separate:

We are changing things out from under people; the only way to opt out is to not upgrade.
DarkPAN is a black box; we cannot predict potential breakages.
There are a small number of CPAN modules we know will break.

When the room came to a vote the result was overwhelmingly in favor of updating Test::Builder.

The discussion also resulted in a few action items that prevent a stable release during the QAH:

Test2 will be integrated into the Test-Simple distribution.
A test will run that reports any modules that are broken, or need to be updated as a result of the Test2 upgrade.
Test2 will have a post-testing check that warns you if you have a Test2/Test::Builder version mismatch.
Test2 will have a post-testing check that warns you about any known-broken module versions loaded during testing (but only if the tests failed).
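The two post-testing checks listed above could be sketched roughly as follows. This is only an illustration, not the code that ships in Test-Simple: the function name post_test_warnings, the %KNOWN_BROKEN table and the module listed in it are all hypothetical, and the sketch assumes that Test2::API and Test::Builder carry the same version number when they come from the same release.

```perl
use strict;
use warnings;
use version;

# Hypothetical table of known-broken releases. The module name and
# version are placeholders, not real data.
my %KNOWN_BROKEN = (
    'Test::Example::Tool' => '0.05',    # this version and older are broken
);

# Takes a { module => version } map of loaded modules and a flag saying
# whether the tests failed. Returns a list of warning strings.
sub post_test_warnings {
    my ($loaded, $tests_failed) = @_;
    my @warnings;

    # Check 1: Test2 and Test::Builder should come from the same release
    # (assumption: both modules share one version number per release).
    my ($t2, $tb) = @{$loaded}{qw(Test2::API Test::Builder)};
    if (defined $t2 && defined $tb && $t2 ne $tb) {
        push @warnings, "Test2::API $t2 does not match Test::Builder $tb";
    }

    # Check 2: only mention known-broken modules when the tests failed.
    if ($tests_failed) {
        for my $mod (sort keys %KNOWN_BROKEN) {
            my $have = $loaded->{$mod};
            next unless defined $have;
            next if version->parse($have) > version->parse($KNOWN_BROKEN{$mod});
            push @warnings, "$mod $have is known to be broken; please upgrade";
        }
    }

    return @warnings;
}
```

A real check would gather the loaded versions from %INC and each module's $VERSION once testing has finished; passing them in as a plain hash keeps the sketch easy to exercise.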

These are important action items that help to alleviate the concerns of those who were against updating Test::Builder. All of these items are now done, and a dev release has been uploaded to CPAN. Assuming the dev release does not turn up any show-stopping bugs, it will be released as stable no sooner than Friday, May 6th, which is the planned release date.

Thank you to DreamHost for sending me to the QAH!

Tagged as:

QAH, QAH 2016, test-simple, Test2, Test::Builder, Test::Stream

21 Comments

Peter Rabbitson | April 22, 2016 2:21 PM

Just about a year ago I wrote down my thoughts on the Test::More situation. My position remains entirely unchanged: a codebase which until very recently contained gems like this is unfit for the unfathomably wide use Test::More enjoys.

I continue to maintain that Chad Granum had no direct fault in this, as he can not be blamed for the "exploratory personality" which is an integral part of who he is.

I also recognize that this train has inexorably left the station. For posterity the ones responsible for the impending largely invisible wave of breakage are:

Michael Schwern, who architected (either knowingly or not) the current situation by transferring maintainership to a "non-boring" developer.
Ricardo Signes, who neglected his responsibilities to the wider user base and did not exercise his veto power available to him via being a co-owner of the namespace by virtue of being pumpking.
A number of participants of the 2015 and 2016 QA Hackathons, who did not voice their reservations for political reasons (you know who you are, I elect to not embarrass you by listing your names).
The Perl Foundation Grant Committee, who did not perform due diligence in investigating the dissent around the work they were funding.

Dark times lie ahead :/

Ether | April 22, 2016 3:11 PM

It's too bad you chose not to attend this year. We actually had quite a lengthy discussion and debate about the different paths forward, and I think the final decision does accurately reflect all the concerns that were raised and the preparations made to mitigate them.

Matt S Trout (mst) | April 22, 2016 3:13 PM

So, while I've definitely disagreed with Exodist at various stages of the process (FVO 'disagreed' that included at least one fairly lengthy yelling session which, in hindsight, might have been more likely to make him quit than to listen to me, which I present not because I'm particularly proud of it but because I want to make clear that I've hardly been a soft touch here), I do believe the following to be true:

1) Test2 is shaping up to the point where while there will probably be some teething issues, it's going to do an impressively good job of stability/compatibility given how hard the problem at hand is.

2) Test2 is better than what we currently have, and nobody else is working on anything that I expect to be better than Test2.

3) Having everything basically get Test2 automatically will make it much easier to gradually improve things without modifying other currently stable code.

4) We'll be trying Test2 in blead again at some point, and if that snows p5p under with 'blead breaks cpan' reports then we'll back it out again and go figure out what went wrong.

5) A notable theme of conversations at this QAH has been "we miss our friend ribasushi from before he slipped into burnout and became disproportionately bitter and negative about everything, and hope once he finally takes some time away we'll eventually get our friend back".

-- mst

Sawyer X | April 22, 2016 3:28 PM

Having attended the mentioned meeting that took place yesterday at this hackathon, I can safely say that the conversation revolved around "What do people care about? What worried anyone? What other issues do people believe are left unresolved?"

People raised issues. A discussion occurred. Ideas were raised. A decision was made.

It's crucial to note that disagreement does not equal no resolution. Having a dissenting opinion does not mean nothing happens, or that the dissent's opinion is now the resolution.

Just as in court, and as with friends, and as in any group, you can disagree and offer something else, and it will still not get accepted. The point is to address, not necessarily to accept. I do believe issues were thoroughly and respectfully addressed with a serious tone. I'm sorry they weren't what you wanted.

Aristotle | April 22, 2016 6:07 PM

> I think the final decision does accurately reflect all the concerns that were raised and the preparations made to mitigate them.

> The point is to address, not necessarily to accept. I do believe issues were thoroughly and respectfully addressed with a serious tone.

I don’t know what it means for a decision to reflect all concerns. As the lone vote of dissent at the table, I know the issues Leon and I raised were certainly not addressed, much less mitigated. They were heard, and their gravity debated to some extent. A single proposal that would have addressed them was heard, and was rejected.

What the decision does reflect is the consensus opinion of the near-unanimous majority at that table to move forward without addressing those concerns, accepting any consequences as a cost of doing business.

It came down to differing beliefs about the magnitude of those costs. (See mst's stated beliefs above, for example.) Facts will now be the judge of those beliefs.

Aristotle | April 22, 2016 6:08 PM

> Facts will now be the judge of those beliefs.

(Not the degree of agreement. Not the level of respect. Not the tone of the discussion.)

Matt S Trout (mst) | April 22, 2016 6:46 PM

So far as I can see, in the absence of a viable third way, the options on the table were, basically,

1) Don't merge it

2) Merge it and see what happens, and if necessary unmerge it until we can resolve the resulting problems

I'd note we did exactly 'unmerge it until' once already, and people are fully prepared to do so again if required.

I believe when sawyer said 'addressed' he meant that he believed the concerns had been acknowledged and factored into the decision making process - however the end result was a majority-albeit-not-universal decision that, on the whole, option 2 is the better way to move forwards, and that we expect the benefits to outweigh the costs. By doing the merge on CPAN with a year's lead time before the next stable release of perl core, we should have the maximum chance to more accurately measure those costs, and if they do turn out to be significantly worse than we currently believe them to be, the merge can be reversed while that's looked into.

I agree entirely that it's "differing beliefs about the magnitude of those costs", but with the additional caveat that we're intending to take the time to collect more information in order to validate those beliefs in so far as possible before committing core to this path, but that at this stage "suck it and see" is fundamentally the only route to getting an effective read on how the costs will play out in the wild rather than as simulated in our heads.

xdg | April 24, 2016 2:24 AM

Sadly, I wasn't able to attend the QAH this year. I'm disappointed that this is the outcome for Test2/Test::Builder, and, had I been there, I would have argued against it.

I think the pendulum has swung too far towards an acceptance of risk for "way upriver" distributions. Even testing Test2 against CPAN is insufficient; the amount of DarkPAN code is vastly greater.

Notwithstanding the tremendous work Chad has put into Test2 and compatibility, I think Test2 should be opt-in. I would prefer to see Test2 kept separate, uncoupled from Test::Builder. It could even ship with the Perl core. Distributions that had problems with the old Test-Simple framework could migrate to it, but everything else that works today could remain unaffected.

KENTNL | April 24, 2016 10:43 AM

> So far as I can see, in the absence of a viable third way, the options on the table were, basically

I recall there were moves to propose a 3rd way, the "don't integrate yet, but have a way to smooth over our transition more progressively before we finally commit".

I regret somewhat not being there to see that discussion myself, because from where I sit, based on the amount of chatter about that proposal, it's like it never happened.

( Though I'm assured it very much did happen, I haven't heard of any talk of it outside the few people involved with its proposal , which is strange for me )

The idea of that proposal being:

We can delay a final commitment to make sure we've got it right
We can add integration tests to make sure it works before it's "everyone gets this"
People who want this "now" can "Get it" without "everyone gets it"

And that feels like it covers all the bases for "real people who want to use TB based code and T2 based code together".

The only residual that that proposal omitted, is the "force everyone to use T2" stage, which we can /still do/ under that proposal, we just don't have to be in such a rush.

Matt S Trout (mst) | April 24, 2016 1:56 PM

> I recall there was moves to propose a 3rd way, the "don't integrate yet, but have a way to smooth over our transition more progressively before we finally commit".

We've already spent an extra year moving progressively and smoking out issues. We definitely needed that extra year, but it's already happened. Further attempts to do so are likely, sadly, to result in more delay than gain, I think.

> And that feels like it covers all the bases for "real people who want to use TB based code and T2 based code together".

Dev releases have been useful but at this stage having a proper TB-on-T2 release on CPAN is going, I suspect, to be necessary to shake out the remainder of the problems.

> The only residual that that proposal omitted, is the "force everyone to use T2" stage, which we can /still do/ under that proposal, we just don't have to be in such a rush.

We're not in a rush. We've been so much not in a rush people are starting to make perl6 jokes about the damn thing. But over the course of the process Exodist has learned at least a decent percentage of the correct paranoia level of a decent toolchainer, and it wasn't ready last year, and it's hopefully ready now.

So we're going to try it and find out, and if everything catches fire we'll re-upload the pair of independent tarballs that were already prepared specifically in case it turned out to be the wrong thing to do, look at what happened, and then figure out what needs to happen before we consider trying again.

Christian Hansen | April 24, 2016 3:37 PM

1) If it ain't broke, don't fix it!
2) Opt-in not opt-out!
3) Don't punish upstream authors/maintainers for your new fancy framework! Opt-in not Opt-out!

--
chansen

Christian Hansen | April 24, 2016 3:43 PM

These are fundamental concepts in any R&D, why should we be different?

Matt S Trout (mst) | April 24, 2016 6:10 PM

Oh, chansen, I see you're having a 'yelling at people based on incomplete information' day again (though this comment is still better than your IRC comment telling the new pumpking to 'get your shit together' on his first day when he hasn't even really started yet).

1) It is broke
2) The 'explicit opt-in' approach works only until people start porting existing extensions on CPAN, which a number of Test:: authors have expressed an interest in, and then users would see far, far weirder breakages than what we're currently doing is likely to cause
3) The Test::Builder API continues to work, so I fail to see how we're 'punishing' anybody.
4) You could've got involved in this process ages ago, coming along after the fact and shouting at people is not, actually, going to change or improve anything.
5) If you want to keep ranting, there's always /r/cperl, but please don't do it here.

blog.urth.org | April 24, 2016 7:48 PM

Several people have suggested making Test2 opt-in, but I don't see how that could possibly achieve anything of value.

I work on two test modules that will greatly benefit from Test2. These are Test::Class::Moose (TCM) and TAP::Formatter::TeamCity (TeamCity).

Neither of these modules would see much (if any) benefit from an opt-in approach.

When using TCM, you need to use other test modules like Test::More, Test::Fatal, etc. to actually do any testing. TCM is essentially a harness, but it doesn't provide the low-level "is this the expected value" type testing pieces.

For TCM to work with Test2, every single module you use to output tests must also use Test2. This means that a TCM built on top of Test2 would only be useful to people who were willing to write their entire test suite with other modules that had opted into Test2.

This would be viable for green field code bases that could choose to use Test2::Suite (which is pretty damn great), but presumably most existing TCM users are using Test::More and friends.

So to use Test2 in TCM (which has allowed me to _hugely_ improve the internals and fix many bugs with parallel testing), I'd either have to make a new distro (T2CM?) or tell everyone using TCM that the next release will break their entire test suite. Neither of these options seem very appealing, though I suppose the new T2CM distro is a better bad option. That said, I don't really want to maintain two separate distros!

The TeamCity formatter is in a similar situation. Unless the entirety of a test suite is emitting T2 events, I cannot write a decent TeamCity formatter. At $WORK, where we use this formatter, we already have a huge test suite built on top of TCM, Test::More, etc. We are really not likely to rewrite it all to use Test2::Suite any time soon (though I hope to at least use it for _new_ code going forward).

So that leaves us back with the two options that mst brought up. Do nothing or merge it and see what happens (with the option of unmerging in the face of catastrophe). T2 brings many huge improvements for test module authors. It's a huge step forward for the Perl testing ecosystem. I think the benefits it offers are worth some risk.

Leon Timmermans | April 24, 2016 11:55 PM

I think most of the discussion on that table, as well as here, was about the wrong question entirely. As such

The fundamental question is "is it better to have a united or a split ecosystem, and for whom".

For the end-users, a split is clearly the less risky option, and there doesn't seem to be any direct advantage to them in keeping the ecosystems unified. On the short term I only see negatives, though on the long term there may be positives (if Test2 manages to give better reporting, and harnesses catch up to it).

For module authors, a similar thing applies: No short term advantages for current usages, though it does seem to make new usages possible.

And then there are the testing library authors. For most of them, a split is nothing less than a pain in the ass. While some modules may be obsoleted by Test2::Suite (and hence not necessarily need a port), many will not be.

In the grand scheme, I still think a split is the best path forward. Quite frankly I don't quite understand what's so bad about a new framework after 15 years. Yet the discussion did make me realize a new dimension that I hadn't fully grasped yet before: a split will need support and diligence from the testing library authors and module authors. That level of support is a second unknown variable in this equation (the first being the amount of stuff that breaks now), and we have even less data on this.

This second unknown left me with less certainty than I started that discussion with, though I still think the merge isn't the best we can do; but I think I utterly failed to clearly express this nuance of what I was meaning to say.

blog.urth.org | April 25, 2016 4:06 AM

Leon Timmermans said:

> For the end-users, a split is clearly the less risky option, and there doesn't seem to be any direct advantage to them in keeping the ecosystems unified.

I think the advantage for end users is that they can use things that test module authors produce. For example, with the TeamCity formatter I referenced in my earlier comment, it is nearly impossible to make a better version in the current ecosystem. Parsing TAP is incredibly painful, because TAP sucks (I need to write a blog post on this) and simply doesn't have enough information in it.

With Test2, I can create a much less buggy, much more useful TeamCity formatter. This will directly benefit anyone who wants to use existing test tools like Test::More under TeamCity.
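As a rough illustration of why structured events are easier to consume than TAP text, here is a minimal sketch that captures Test2 events and summarises them without any parsing. context and intercept are exported by Test2::API; the my_ok assertion, the summarize helper and the output format are hypothetical, and a real formatter would plug into Test2 differently rather than calling intercept.

```perl
use strict;
use warnings;
use Test2::API qw(context intercept);

# Hypothetical assertion built directly on Test2.
sub my_ok {
    my ($bool, $name) = @_;
    my $ctx = context();        # acquire the current testing context
    $ctx->ok($bool, $name);     # send an Ok event to the hub
    $ctx->release;              # contexts must always be released
    return $bool;
}

# Capture the events instead of letting them be rendered as TAP.
my $events = intercept {
    my_ok(1, 'database is reachable');
    my_ok(0, 'schema is current');
};

# Summarise each assertion straight from the event objects.
sub summarize {
    my ($events) = @_;
    return map  { sprintf '%s: %s', ($_->pass ? 'PASS' : 'FAIL'), $_->name }
           grep { $_->isa('Test2::Event::Ok') } @$events;
}

print "$_\n" for summarize($events);
```

The failing assertion also produces a diagnostic event, which is why the summary filters on the event class instead of assuming one event per assertion.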

This logic applies to many other testing tools that will benefit from Test2. Being able to write more reliable, more flexible, better tested test tools with less work is a win for everyone who wants to use those tools.

Aristotle replied to comment from Matt S Trout (mst) | April 26, 2016 3:36 AM

> The 'explicit opt-in' approach works only until people start porting existing extensions on CPAN, which a number of Test:: authors have expressed an interest in, and then users would see far, far weirder breakages than what we're currently doing is likely to cause

Just so this statement is not left unchallenged, let me be on record that in verbatim form this argument is nonsense; although it may be meant as a shorthand summary of a longer argument which does make sense.

Aristotle replied to comment from blog.urth.org | April 26, 2016 3:42 AM

> For TCM to work with Test2, every single module you use to output tests must also use Test2.

This, however, I will call unqualified nonsense. Let me demonstrate to you why:

Test::More will not be changing at all. Yet it will be able to coöperate with Test2-based modules just fine, by using a Test::Builder whose guts have been swapped for a Test2 wrapper.
If it were true, that would mean that Test2 could not be released as part of Test::Builder without a simultaneous re-release of every single test module on CPAN, ported to Test2. Otherwise CPAN would break the moment that Test2-based Test::Builder was released.
Making that statement as untrue as possible is why one entire extra year went by.

Thus the claim that Test modules would have to be written specifically against Test2, in order to be able to work with other Test2-based modules, is simply factually incorrect.

There are many test modules that easily work against Test::Builder in both its old incarnation and its Test2-based form, with no code changes.
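For illustration, a test tool of that kind might look like the minimal sketch below. The is_even assertion is hypothetical; it touches only the long-standing public Test::Builder interface (new and ok), which is why the same code would be expected to run unchanged whichever guts Test::Builder has.

```perl
use strict;
use warnings;
use Test::Builder;

# Hypothetical assertion written only against the public Test::Builder API.
sub is_even {
    my ($n, $name) = @_;
    my $tb = Test::Builder->new;           # the shared singleton
    return $tb->ok($n % 2 == 0, $name);    # records one ok / not ok result
}
```

A .t file would call is_even(4, 'four is even') alongside Test::More functions, and both would report through the same Test::Builder object.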

There are very few test modules that truly need the new infrastructure provided by Test2.

So technically it would easily be possible to let the .t pick whether it wants the old or new form of Test::Builder, based on what modules it intends to load. Most test modules would require no changes of any kind to be able to work in either environment.

Only a handful of modules would need to be restricted to run on just the traditional Test::Builder or just its Test2-based reincarnation.

(And so as I just belatedly realise, the proposal shouldn’t even have been summarised with “bifurcating the ecosystem”-type phrasing. That is in fact an incorrect, misrepresentative framing. At the table, Rik asked me if that was a fair phrasing and I acceded; my bad. I should have objected.)

blog.urth.org | April 26, 2016 4:29 PM

Aristotle said:

> Thus the claim that Test modules would have to be written specifically against Test2, in order to be able to work with other Test2-based modules, is simply factually incorrect.

That wasn't what I was trying to claim. Maybe I don't understand what you and Leon wanted to happen.

My point was simply that Test2-using tools will not cooperate with code that uses the existing pre-Test2 Test::Builder. Of course, if Test::Builder is using Test2 under the hood, then tools using Test::Builder do not also have to change (but AFAICT no one is saying they should).

Aristotle replied to comment from blog.urth.org | April 26, 2016 7:37 PM

> Maybe I don't understand what you and Leon wanted to happen.

Yes, sorry, I’ll be publishing a long-form, detailed version of the proposal very soon. I’ve realised the proposal is (either largely or entirely) non-understood – the abridged version I presented on the fly at that table was insufficient. (I had hoped but failed to have it out there sufficiently early before the QAH that people could digest it in detail.)

> My point was simply that Test2-using tools will not cooperate with code that uses the existing pre-Test2 Test::Builder.

Yes, I realise that now upon re-reading with fresh eyes. I see that “every single module you use to output tests” can be understood to mean “any module such as Test::Builder”.

It was the ambiguity of “output tests” combined with the “every single” quantifier that threw me off. “Every TAP emitter” or something would have been clearer to me.

Anyway, I think we’re on the same page now.

> Of course, if Test::Builder is using Test2 under the hood, then tools using Test::Builder do not also have to change (but AFAICT no one is saying they should).

Right. The core point of the proposal is that every test library remains built on top of Test::Builder (unless it has more specific needs), and .t files can then individually pick whether Test::Builder’s guts are the old ones (default) or the Test2 wrapper shim. There are more details about how to make this two-pronged approach work at the ecosystem level (and I’ll be publishing them ASAP), but that is the gist: that test files get to pick.

Chad 'Exodist' Granum replied to comment from Aristotle | April 26, 2016 8:17 PM

One of the initial options discussed as early as last year was this:

Test2 and Test::Builder remain separate. Tools pick one and build with it. To use both together you load a shim such as Test2::Legacy, which replaces Test::Builder with the alternative guts.

This was the alternative being discussed in the Test2 meeting on day 1 of the 2016 hackathon. We decided it was not the right plan. My main reason for opposing this plan is that nothing stops people from updating old modules to use Test2, and also auto-loading the shim to ensure they do not break things that depend on them.

In this scenario you can never be sure that updating your test modules will not make the choice for you. If you use Test::Moose::More, for instance, it is currently on Test::Builder. Let's say the author updates Test::Moose::More to use Test2. If he does so he will break a lot of tests that also use other Test::Builder modules. So in order to avoid the problem he loads Test2::Legacy automatically. Now when you update Test::Moose::More, suddenly all your .t files load Test2::Legacy even though you did not directly ask for it.

In this scenario you have no idea when or if the switch will happen unless you watch all your test deps like a hawk every time you update from cpan. You have no idea which pebble will fall causing many of your test files to avalanche into Test2.

The proposal we went with instead was to make the new Test::Builder use Test2. This means that there is only 1 point for the switch to happen. If you update Test-Simple then you get new guts. If you do not update you get old guts. If you pin your Test::Builder version then any downstream deps that require Test2 will fail to install and you do not need to worry about accidental updates.
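With a single switch point, a quick way to see which side of it an installation is on is to ask the installed Test::Builder. The sketch below rests on a heuristic that is an assumption here, not a documented interface: a Test2-based Test::Builder is expected to load Test2::API as a side effect, so its presence in %INC is used as the signal. The test_builder_guts function is hypothetical.

```perl
use strict;
use warnings;
use Test::Builder;

# Heuristic (an assumption, not an official API): a Test2-based
# Test::Builder pulls in Test2::API when it is loaded, while the old
# implementation does not.
sub test_builder_guts {
    return $INC{'Test2/API.pm'} ? 'Test2' : 'legacy';
}

printf "Test::Builder %s is running on %s guts\n",
    Test::Builder->VERSION, test_builder_guts();
```

The check has to run before anything else loads Test2::API directly, otherwise it would report Test2 regardless of what Test::Builder uses internally.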

The proposal we adopted means there is a single point of concern: which Test-Simple version you have installed. With the alternative that I myself mentioned back in 2015 in Berlin, and which was still on the table last week, you have no idea when or if your deps will change and suddenly switch you unwillingly.

This goes further. If we update Test::Builder to use Test2 (the current plan) then there is one thing for test tool authors to target. If we keep them separate they need to ensure their modules work with both the old guts and the compatibility shim, or they can just choose to break without the shim. This would lead to any number of possible combinations of test modules that can just explode one day due to simple changes.

Making Test::Builder use Test2 means we have 1 breakage event. Using an optional shim means we have any number of breakage events in the future as people maintain the test tools. The only way to prevent that would be to not have a shim, and force people to maintain both namespaces. But if I don't release the shim I am certain someone else eventually will, and then we have the same problem, possibly with even more things breaking if the shim is not as good.

About Chad 'Exodist' Granum

I blog about Perl.
