A decade in CPAN toolchain

Dave Cross:

I’m not going to object to Module::Build leaving the core. I’m sure there are good reasons, I just wish I knew what they are. I am, however, slightly disappointed to find that Schwern was wrong ten years ago and that ExtUtils::MakeMaker wasn’t doomed.

Schwern wasn’t wrong and MakeMaker remains doomed all these years later. It’s still around only because there hasn’t been anything to take its place. Module::Build looked like it was going to be that usurper – but it didn’t work out.

Note that, of the two, M::B is the one leaving the core because EUMM is necessary to build the core and M::B is not. That, in turn, is because no one bothered to port the existing MakeMaker-dependent infrastructure to Module::Build. And that never happened because M::B never gained the necessary features (XS support, mainly) fast enough for anyone to want to do the port – because it wasn’t sufficiently better than EUMM for anyone to want it enough to add the features.

However, EUMM is about as marginally maintained nowadays as M::B. Both are doomed, though their type of doomedness is one that’s accompanied by remarkable staying power. (Break-the-CPAN status tends to have that effect.) RJBS is on record that, should EUMM ever become unnecessary to building the core, it will make its exit stage left much the same as M::B is making now.

So… what happened?

In short, M::B never truly delivered on its promises. The idea behind M::B was to make it easier for authors to customize the jobs that EUMM was used for:

  1. helping with packaging distributions for release
  2. performing the build and installation

Customizing EUMM was difficult because it required writing code to generate bits of Makefile. (Worse: portable Makefile. Which implies: portable shell.) And how do you make several EUMM extensions cooperate in hooking into the right bits of EUMM to do their job? Not well.
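To make that concrete, here is a minimal sketch of what such a customization tends to look like. It relies on EUMM's MY::postamble override, whose return value is appended to the generated Makefile, and on the $(NOECHO) and $(ECHO) macros that EUMM defines there. The helper names critic_rules and docs_rules, and the targets they emit, are made up for illustration.

```perl
use strict;
use warnings;

# Two hypothetical customizations, each expressed as Makefile text.
# Recipe lines must begin with a literal tab, and every command has to
# work under whatever make and shell the installing machine provides.
sub critic_rules {
    return join '',
        "critic :\n",
        "\t", '$(NOECHO) $(ECHO) "pretend to run a linter"', "\n";
}

sub docs_rules {
    return join '',
        "docs :\n",
        "\t", '$(NOECHO) $(ECHO) "pretend to build documentation"', "\n";
}

# There is only one MY::postamble slot, so the fragments from both
# customizations are stitched together by hand.
sub MY::postamble {
    return join "\n", critic_rules(), docs_rules();
}

# A real Makefile.PL would now load ExtUtils::MakeMaker and call
# WriteMakefile(...). Here the fragment is printed instead, so that the
# sketch runs on its own.
my $fragment = MY::postamble();
print $fragment;
```

Note that the customizations are never Perl code that runs at build time. They are strings of make syntax, and two independently written extensions that both want MY::postamble have to be merged manually.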

Thus suggested itself the premise of M::B: ditch make – write the entire thing in Perl, where it’s easy to write portable code. Other than that? Follow the design of EUMM so the tools can continue to work the same way.

Now, when these jobs – managing the release process, and managing the build+installation process – are entrusted to make, it is quite reasonable to put them together in a single Makefile – otherwise you have to fight make. (If your first thought for such a task goes to make, then splitting the Makefile probably isn’t even going to occur to you.) And in order to abstract away common Makefile code between distributions, it is quite reasonable to write a library for generating this single Makefile. Therefore, the entire design of EUMM, in which a script generates another script that contains all the functionality needed by both the author and the user of a module, follows logically from reliance on make.

But M::B? Its premise was to ditch make, yet it otherwise copied this design verbatim. The central constraint driving the design vanished, yet M::B remained beholden to it – merely replacing generated Makefiles with generated Perl programs.

(OK, that is a big “merely”. Let us take a moment here to acknowledge how much of an improvement that alone is.)

The consequence of that is, among many things, that in M::B, just as in EUMM, customizations must be made indirectly, within the first script which generates the other script. You cannot write plugins that provide extra functionality directly. In M::B, much like EUMM, you have to write code that cooperates with the generator of the other script such that the generated script ends up containing the functionality you want. M::B improves on EUMM primarily by not making you generate portable Makefiles, as well as by giving you better hooks into the generation process. The ultimate shape its architecture took, however, is rather strangely factored and unreasonably difficult to extend in a composable fashion, for many of the same reasons that EUMM’s is. M::B is much easier than EUMM to extend in minor ways, yet it suffers the same low ceiling.
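The two-script shape can be illustrated with a toy generator in plain Perl. To be clear, this is not how Module::Build is implemented internally. The names generate_build_script and %custom_actions are invented for this sketch, and the only point is that a customization has to be handed over as source text so that it ends up inside the script that runs later.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Toy "configure" step. Customizations are source text rather than live
# code, because they must survive into a script run by a later process.
my %custom_actions = (
    hello => q{print "hello from a generated action\n";},
);

# Hypothetical generator: writes a standalone script with one entry per
# action plus a small dispatcher.
sub generate_build_script {
    my ($path, %actions) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "use strict;\nuse warnings;\nmy %ACTION;\n";
    for my $name (sort keys %actions) {
        printf {$out} '$ACTION{%s} = sub { %s };' . "\n",
            $name, $actions{$name};
    }
    print {$out} <<'DISPATCH';
my $name   = shift(@ARGV) // 'build';
my $action = $ACTION{$name} or die "No such action: $name\n";
$action->();
DISPATCH
    close $out or die "Cannot close $path: $!";
    return $path;
}

my $dir    = tempdir(CLEANUP => 1);
my $script = generate_build_script(
    File::Spec->catfile($dir, 'Build'), %custom_actions,
);

# Later phase, separate process: only what was written out is available.
my $output = `"$^X" "$script" hello`;
print $output;
```

Anything an extension wants to do at build time has to pass through the generator first. The extension cannot simply register a callback and be called.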

This failure of reimagination left a vacuum for some tool to fill.

And there things remained for quite a long time.

Now they have changed: Dist::Zilla and friends have taken the stage. All tools of this type abandon the generated-code architecture (thus making them far easier to extend directly) as well as any aspirations to the build process (thus making it much less costly to use CPAN modules within them and in plugins, plus making it much easier for authors to adopt them – essentially taking them out of the toolchain). That road led straight into an explosion of plugins on CPAN like nothing there ever was for M::B.

Meanwhile, they can boilerplate an installer for a distribution, based on either EUMM or M::B, making the distinction moot. And so EUMM sticks around. Its throne remains up for grabs but a challenger has yet to appear. I think there are a number of reasons for this, but the major one is that few people actually care to do clever things during build and installation: for most authors, simply shipping a default Makefile.PL (preferably, by letting Dist::Zilla spit it out) is all they need. That means that while authors mostly don’t need EUMM per se (because anything that can do the basic job will do), they also have no finding-an-alternative itch to scratch, even as the tools shamble on in life support mode.

EUMM was not displaced so much as disrupted.

M::B failed to break EUMM’s dominance because it never offered anything fundamentally better, but Dist::Zilla and friends simply sidestepped the issue and made it irrelevant. The old tools live on, relegated to a marginal role.

It would still be nice if we could get rid of EUMM someday.

And at this point, it should be noted that M::B braved many of the obstacles waiting for whoever was going to try to displace EUMM. The establishment of configure_requires, which M::B motivated, was the prerequisite for its very ejection from core. M::B succeeded in just about every aspect – except its core value proposition, bitter though that is. Thus, even as M::B-the-module failed, M::B-the-effort was a resounding success, and the ecosystem has a lot to thank that effort for.

In a sense, then, M::B’s failure was its greatest success: though it only dented EUMM’s dominance, it created the conditions to break it. A recent example is the specification of a communication protocol between CPAN shells and distribution installer scripts, which has opened the door for any number of alternative installers to make their entrance. There is at least one contender already, Module::Build::Tiny, even though that one is expressly not looking to be it – nor probably has to be. (Nevertheless: it just added basic support for building XS!)

Maybe something will happen in due time.


(Thanks to Joel Berger, Tatsuhiko Miyagawa, David Golden and Ricardo Signes for reading drafts of this.)

P.S.: I deliberately omitted all discussion of Module::Install in this article. While it played a role in all these developments, conceptually it was on an evolutionary sideline of its own that does not ultimately affect the aspects of the matter that I wanted to concentrate on.

8 Comments

Well said sir!

Moving forward, there remains one problem: people expect to build, test and install in separate phases. Because of this, some amount of state must be serialized and kept waiting between phases. One of our key tools in Perl is the subroutine reference or, better yet, the closure. These are used to perform many tricks, drive complicated processes and act as callbacks to events. Their weakness, however, is that they cannot be easily serialized. Until the community embraces single-step installation or until closures can be well serialized, no Pure-Perl tool will emerge over EUMM, because we cannot use Perl to its fullest.

-- my hard-won $0.02

"people expect to build, test and install in separate phases"

That explains the need for a three-stage install, but Module::Build performs a four-stage install for seemingly no good reason other than copying EUMM.

"people expect to build, test and install in separate phases"

It's not just that people expect it, it's that sysadmins will demand it. Friendliness to the end-user is just as important as friendliness to the author, if not more so.

FYI, I made no comment on whether the toolchain should be in phases, just on what the upshot is if that is the case.

Of course, Dist::Zilla is itself a "generate-another-script" architecture, writing the Perl in either MB or EUMM form. It's true that this is possibly sufficient; however, it just changes EUMM's process of creating makefiles from Perl into making Perl files from Perl. Yes, this is an improvement, but it's still building before building.

The system I would envision is closer to MB's original goal or even M::B::Pluggable (which is a great starting point for a future system). Imagine the ability to add around modifiers to build processes or dynamically hook to events in parts of the chain. dzil achieves the feel of this by creating this magic before build time, but the promise of Module::Build was to act at build time in a dynamic way.

I think a reason I would want closures (or really event callbacks) would be partly to avoid searching for state on the filesystem or using a data store. I think probably the reason we are not meeting on a concept here is that the concept I envision does not exist except inside my head :-)

That said, I like your first suggestion here. 95% of all modules can probably be configuration only (indeed that is what Alien::Base attempts to allow for Alien:: modules). Miyagawa's cpanfile format for dependencies (or META.*) is certainly enough for most. I agree wholeheartedly, these are things the cpan client could do! Great idea!


About Aristotle

Waxing philosophical