A decade in CPAN toolchain
I’m not going to object to Module::Build leaving the core. I’m sure there are good reasons, I just wish I knew what they are. I am, however, slightly disappointed to find that Schwern was wrong ten years ago and that ExtUtils::MakeMaker wasn’t doomed.
Schwern wasn’t wrong and MakeMaker remains doomed all these years later. It’s still around only because there hasn’t been anything to take its place. Module::Build looked like it was going to be that usurper – but it didn’t work out.
Note that the reason that, between EUMM and M::B, M::B is the one leaving the core, is that EUMM is necessary to build the core and M::B is not. The reason for that is that no one bothered to port the existing MakeMaker-dependent infrastructure to Module::Build. And that never happened because M::B never gained the necessary features (XS support, mainly) fast enough for anyone to want to – because it wasn’t sufficiently much better than EUMM for anyone to want it enough to add the features.
However, EUMM is about as marginally maintained nowadays as M::B. Both are doomed, though their type of doomedness is one that’s accompanied by remarkable staying power. (Break-the-CPAN status tends to have that effect.) RJBS is on record that, should EUMM ever become unnecessary to building the core, it will make its exit stage left much the same as M::B is making now.
So… what happened?
In short, M::B never truly delivered on its promises. The idea behind M::B was to make it easier for authors to customize the jobs that EUMM was used for:
- helping with packaging distributions for release
- performing the build and installation
Customising EUMM was difficult because it required writing code to generate bits of Makefile. (Worse: portable Makefile. Which implies: portable shell.) And how do you make several EUMM extensions cooperate in hooking into the right bits of EUMM to do their job? Not well.
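To make the pain concrete, here is a minimal sketch of what "writing code to generate bits of Makefile" looks like. The `make_rule` helper and the `build_docs.pl` script are hypothetical, invented for illustration; `MY::postamble` is the usual EUMM override point, but in this sketch it is simply called by hand rather than by `WriteMakefile`.

```perl
use strict;
use warnings;

# Hypothetical helper: render a single make rule as text.
sub make_rule {
    my ($target, @commands) = @_;
    my $rule = "$target :\n";
    $rule .= "\t$_\n" for @commands;    # recipe lines must begin with a literal tab
    return $rule;
}

# In a real Makefile.PL, EUMM appends whatever MY::postamble returns to the
# Makefile it generates. Here we only build the text and print it.
sub MY::postamble {
    return make_rule(
        'docs',
        '$(NOECHO) $(ECHO) "Building docs"',
        '$(PERLRUN) build_docs.pl',     # build_docs.pl is a made-up script name
    );
}

print MY::postamble();
```

Note that the Perl code never does the work itself: it only emits text that `make` and the shell will later interpret, which is where the portability burden comes from.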
Thus suggested itself the premise of M::B: ditch make – write the entire thing in Perl, where it’s easy to write portable code. Other than that? Follow the design of EUMM so the tools can continue to work the same way.
Now, when these jobs – managing the release process, and managing the build+installation process – are entrusted to make, it is quite reasonable to put them together in a single Makefile – otherwise you have to fight make. (If your first thought for such a task goes to make, then splitting the Makefile probably isn’t even going to occur to you.) And in order to abstract away common Makefile code between distributions, it is quite reasonable to write a library for generating this single Makefile. Therefore, the entire design of EUMM, in which a script generates another script that contains all the functionality needed by both the author and the user of a module, follows logically from reliance on make.
But M::B? Its premise was to ditch make, yet it otherwise copied this design verbatim. The central constraint driving the design vanished, yet M::B remained beholden to it – merely replacing generated Makefiles with generated Perl programs.
(OK, that is a big “merely”. Let us take a moment here to acknowledge how much of an improvement that alone is.)
The consequence of that is, among many things, that in M::B, just as in EUMM, customizations must be made indirectly, within the first script which generates the other script. You cannot write plugins that provide extra functionality directly. In M::B, much like EUMM, you have to write code that cooperates with the generator of the other script such that the generated script ends up containing the functionality you want. M::B improves on EUMM primarily by not making you generate portable Makefiles, as well as by giving you better hooks into the generation process. The ultimate shape its architecture took, however, is rather strangely factored and unreasonably difficult to extend in a composable fashion, for many of the same reasons that EUMM’s is. M::B is much easier than EUMM to extend in minor ways, yet it suffers the same low ceiling.
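The script-that-generates-a-script shape can be shown in miniature. The following is a toy sketch, not how Module::Build is actually implemented: `generate_build_script` and `run_generated` are hypothetical names, and the generated script does nothing but report the configuration that was baked into it.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempdir);

# Stage 1 (configure time): bake the configuration into a generated script.
sub generate_build_script {
    my ($path, %config) = @_;
    local $Data::Dumper::Terse    = 1;   # emit a bare data structure, no "$VAR1 ="
    local $Data::Dumper::Sortkeys = 1;
    my $frozen = Dumper(\%config);
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} <<"END_SCRIPT";
use strict;
use warnings;
my \$config = $frozen;
my \$action = shift(\@ARGV) || 'build';
print "\$action \$config->{name} \$config->{version}\\n";
END_SCRIPT
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Stage 2 (build/test/install time): run the generated script with an action.
sub run_generated {
    my ($script, @args) = @_;
    open my $ph, '-|', $^X, $script, @args or die "Cannot run $script: $!";
    my $out = do { local $/; <$ph> };
    close $ph;
    return $out;
}

my $dir    = tempdir(CLEANUP => 1);
my $script = generate_build_script("$dir/Build", name => 'My-Dist', version => '0.01');
print run_generated($script, 'test');
```

Anything a customization wants to happen in stage 2 has to be smuggled through stage 1 as data or as generated code, which is the indirection described above.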
This failure of reimagination left a vacuum for some tool to fill.
And there things remained for quite a long time.
Now they have changed: Dist::Zilla and friends have taken the stage. All tools of this type abandon the generated-code architecture (thus making them far easier to extend directly) as well as any aspirations to the build process (thus making it much less costly to use CPAN modules within them and in plugins, plus making it much easier for authors to adopt them – essentially taking them out of the toolchain). That road led straight into an explosion of plugins on CPAN like nothing there ever was for M::B.
Meanwhile, they can boilerplate an installer for a distribution, based on either EUMM or M::B, making the distinction moot. And so EUMM sticks around. Its throne remains up for grabs but a challenger has yet to appear. I think there are a number of reasons for this, but the major one is that few people actually care to do clever things during build and installation: for most authors, simply shipping a default Makefile.PL (preferably, by letting Dist::Zilla spit it out) is all they need. That means that while authors mostly don’t need EUMM per se (because anything that can do the basic job will do), they also have no finding-an-alternative itch to scratch, even as the tools shamble on in life support mode.
EUMM was not displaced so much as disrupted.
M::B failed to break EUMM’s dominance because it never offered anything fundamentally better, but Dist::Zilla and friends simply sidestepped the issue and made it irrelevant. The old tools live on, relegated to a marginal role.
It would just still be nice if we could get rid of EUMM someday.
And at this point, it should be noted that M::B braved many of the obstacles waiting for whoever was going to try to displace EUMM. The establishment of configure_requires it motivated was the prerequisite for its very ejection from core. M::B succeeded in just about every aspect – except its core value proposition, bitter though that is. Thus, even as M::B-the-module failed, M::B-the-effort was a resounding success, and the ecosystem has a lot to thank that effort for.
In a sense, then, M::B’s failure was its greatest success: though it only dented EUMM’s dominance, it created the conditions to break it. A recent example is the specification of a communication protocol between CPAN shells and distribution installer scripts, which has opened the door for any number of alternative installers to make their entrance. There is at least one contender already, Module::Build::Tiny, even though that one is expressly not looking to be it – nor probably has to be. (Nevertheless: it just added basic support for building XS!)
Maybe something will happen in due time.
(Thanks to Joel Berger, Tatsuhiko Miyagawa, David Golden and Ricardo Signes for reading drafts of this.)
P.S.: I deliberately omitted all discussion of Module::Install in this article. While it played a role in all these developments, conceptually it was on an evolutionary sideline of its own that does not ultimately affect the aspects of the matter that I wanted to concentrate on.
Well said sir!
Moving forward, there remains one problem: people expect to build, test and install in separate phases. Because of this, some amount of state must be serialized, waiting between phases. One of our key tools in Perl is the subroutine reference or, better yet, the closure. These are used to perform many tricks, carry out complicated processes and act as callbacks for events; their weakness, however, is that they cannot be easily serialized. Until the community embraces single-step installation or until closures can be well serialized, no Pure-Perl tool will emerge over EUMM, because we cannot use Perl to its fullest.
-- my hard-won $0.02
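The serialization problem is easy to reproduce with the core Storable module. A minimal sketch follows; `can_freeze` is a hypothetical helper, and it relies on Storable's default settings, under which code references are refused.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical helper: report whether a data structure can be frozen
# with Storable's default settings.
sub can_freeze {
    my ($data) = @_;
    return eval { freeze($data); 1 } ? 1 : 0;
}

# Plain data survives the trip between phases...
my %state = (name => 'My-Dist', built => 1);
my $copy  = thaw(freeze(\%state));
print "restored: $copy->{name}\n";

# ...but state that holds a closure does not, because Storable
# refuses code references unless told otherwise.
my $counter = 0;
$state{on_test} = sub { return ++$counter };
print "closure state freezable: ", (can_freeze(\%state) ? 'yes' : 'no'), "\n";
```

Storable can be configured to deparse code references, but the variables a closure has captured (here `$counter`) are not carried along, so the callback that comes back is not the one that was stored.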
That explains the need for a three-stage install, but Module::Build performs a four-stage install for seemingly no good reason other than copying EUMM.
It's not just that people expect it, it's that sysadmins will demand it. Friendliness to the end-user is just as important as friendliness to the author, if not more so.
FYI, I made no comment on whether the toolchain should be in phases, just what the upshot is if that is the case.
I don’t know, Joel.
It’s true that state has to be serialized to disk somehow – though it doesn’t necessarily have to be done explicitly: make derives its state from the filesystem when it starts up, but never (needs to) serialize it to disk. Now I can imagine that the filesystem alone may not be sufficient to infer all relevant state from, in which case some amount of the internal state of the objects in the graph may have to be serialized explicitly. But I certainly don’t see why the wiring of the object graph would need to be serialized.
Case in point, as far as I can tell at least, Dist::Zilla seems to do just fine without the need to serialize any object graph structure, just like make never needs to.
Is the apparent problem here really caused by anything more innate than the generate-another-script architecture of Module::Build?
Of course, Dist::Zilla is itself a "generate-another-script" architecture, writing the Perl in either MB or EUMM form. It's true that this is possibly sufficient; however, it just changes EUMM's process of creating makefiles from Perl into making Perl files from Perl. Yes, this is an improvement, but it's still building before building.
The system I would envision is closer to MB's original goal or even M::B::Pluggable (which is a great starting point for a future system). Imagine the ability to add around modifiers to build processes or dynamically hook to events in parts of the chain. dzil achieves the feel of this by creating this magic before build time, but the promise of Module::Build was to act at build time in a dynamic way.
I think that EUMM’s continued domination is evidence that acting at build time in a dynamic way is something that almost nobody actually needs. (You are, by the nature of your interest in the toolchain, one of the few exceptions.)
So to my thinking, why isn’t the CPAN client itself the installer, with a built-in implementation of the standard convention? For most distributions, it would suffice to ship a static install.ini or some such. (Probably META.json would take this job, come to think of it.) That configuration could then also specify that installation has a dynamic aspect, for which, would the installer please invoke e.g. Alien::Base (if that is what install.ini calls for) to handle that part? Thanks.
This might not have been reasonable when CPAN was very new and the conventions of how things are installed were only just being established. They are well established at this point.
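To give the static-configuration idea some shape, here is a rough sketch. Everything in it is hypothetical: no install.ini format exists, and the section names, keys, and the `parse_ini` and `installer_plan` helpers are invented purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical install.ini content; the sections and keys are made up.
my $INI = <<'END_INI';
; most distributions would need nothing beyond this section
[install]
layout = standard

[dynamic]
handler = Alien::Base
END_INI

# Minimal INI reader: [section] headers and key = value pairs only.
sub parse_ini {
    my ($text) = @_;
    my (%data, $section);
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(?:[;#].*)?$/;    # skip blank lines and comments
        if ($line =~ /^\s*\[([^\]]+)\]\s*$/) {
            $section = $1;
            next;
        }
        if ($line =~ /^\s*([^=\s]+)\s*=\s*(.*?)\s*$/) {
            die "Key outside of a section: $line\n" unless defined $section;
            $data{$section}{$1} = $2;
            next;
        }
        die "Cannot parse line: $line\n";
    }
    return \%data;
}

# What a client might decide to do; it only names the handler, it loads nothing.
sub installer_plan {
    my ($config) = @_;
    my $handler = $config->{dynamic}{handler};
    return defined $handler ? "delegate to $handler" : 'static install';
}

print installer_plan(parse_ini($INI)), "\n";
```

The point of the sketch is that the distribution ships data only; all the code that acts on it lives in the client or in a handler the client chooses to load.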
Thus, while it is true that Dist::Zilla generates scripts, the fact that these files contain code rather than configuration is, to my mind, an accident of history. Even though Dist::Zilla is a generate-…something architecture, it is so in a materially different way from Build.PL and Makefile.PL. It operates during the release process only, during which it does not generate scripts for its user to run during the release process – it generates scripts for someone entirely different to run at some entirely different time and place, and due to this separation of concerns, its own design is agnostic to the structure of those outputs in a way that the design of EUMM and M::B cannot be about theirs.
Now, to get back to the question, even in a dynamic build process, why would such an object graph’s wiring need to be serialized to disk? I don’t see why it wouldn’t suffice to do exactly as Dist::Zilla does, and wire up the objects from scratch every time based on static configuration, and then have them derive their state from either the state of the filesystem or some explicit data store, as needs may dictate.
Am I missing something?
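The wire-it-up-from-scratch approach can be sketched in a few lines. This is a toy under stated assumptions, not a proposal for a real tool: the plugin names, the `%PLUGINS` registry, the `built` marker file and `run_phase` are all hypothetical.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical static configuration: plugin names only, no code, no object state.
my @CONFIG = qw(Compile Test);

# Hypothetical plugin registry: each entry builds a handler, closure and all, on demand.
my %PLUGINS = (
    Compile => sub {
        my ($dir) = @_;
        return {
            phase => 'build',
            run   => sub {
                my $marker = File::Spec->catfile($dir, 'built');
                open my $fh, '>', $marker or die "Cannot write $marker: $!";
                close $fh or die "Cannot close $marker: $!";
                return 'built';
            },
        };
    },
    Test => sub {
        my ($dir) = @_;
        return {
            phase => 'test',
            run   => sub {
                # State is inferred from the filesystem, the way make does it.
                my $marker = File::Spec->catfile($dir, 'built');
                return -e $marker ? 'tested' : 'nothing to test';
            },
        };
    },
);

# Every invocation rebuilds the whole object graph from the configuration.
sub run_phase {
    my ($phase, $dir) = @_;
    my @handlers = map { $PLUGINS{$_}->($dir) } @CONFIG;
    return map { $_->{run}->() } grep { $_->{phase} eq $phase } @handlers;
}

my $dir = tempdir(CLEANUP => 1);
print join(',', run_phase(test  => $dir)), "\n";
print join(',', run_phase(build => $dir)), "\n";
print join(',', run_phase(test  => $dir)), "\n";
```

The closures here are never written to disk. Each phase is a fresh process in spirit: it reconstructs the handlers from the plugin list and asks the filesystem what has already happened.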
I think a reason I would want closures (or really event callbacks) would be partly to avoid searching for state on the filesystem or using a data store. I think probably the reason we are not meeting on a concept here is that the concept I envision does not exist except inside my head :-)
That said, I like your first suggestion here. 95% of all modules can probably be configuration only (indeed that is what Alien::Base attempts to allow for Alien:: modules). Miyagawa's cpanfile format for dependencies (or META.*) are certainly enough for most. I agree wholeheartedly, these are things the cpan client could do! Great idea!