What could a completely different CPAN client do?

I'm about to leave for Vienna for the 2010 Perl QA Workshop, so now it's time to start thinking about my secret project. I've saved it especially for something to do on the plane.

A couple months ago, David Golden and I got together to talk about what a new CPAN.pm client would look like. Now, remember that both of us have our noses deep in the CPAN.pm source, and both of us have thought, on several occasions, that we should refactor CPAN.pm. Gabor, who is also going to be in Vienna, even went so far as to separate each package into its own file back in October 2008.

This was shortly after cpanminus, and that new client certainly is bright and shiny and removes quite a bit of pain for many people. The problem is that it really doesn't do anything differently. There are slight differences, and hardcoded decisions, but it's basically CPAN.pm or CPANPLUS without the knobs and dials. It's sorta like my iPhone: it's amazingly great if it's exactly what you want, not so great if it isn't, and it isn't designed for normal people to go mucking around in the innards. You have to trust that Steve Jobs or Miyagawa made the right decisions for you (and statistically that's a pretty good bet).

So, David and I first looked at the design decisions of CPAN.pm without thinking about its replacements. What are those decisions, intentional or otherwise, good or bad, forced by history or not?

  • There is only one archive, although copied and replicated
  • The repository is trusted
  • PAUSE creates the index files, and everyone downloads them
  • Once you choose a repository, that's what you'll use for everything
  • It has to work out of the box with a standard Perl distribution
  • The client installs modules for other programs to use immediately
  • Distributions install into your Perl library directories as soon as the client can do so
  • The client handles work as soon as it can
  • Distributions come in archive files
  • You only need to know about the latest distribution
  • Modules are grouped into distributions, and you install all modules in a distribution to get any one of the modules
  • Newer dependency distributions trump already installed versions and users never specify versions
  • For the most part, installations are interactive
  • You'd only ever be running one client instance at a time
  • The code mostly supports a particular client, so there isn't a general API for making completely different clients
  • Everything is based on a filesystem structure
  • The client is a command-line program

Now, there are plenty of clients to handle the situations with those constraints. Do we have to keep those same constraints? Of course we don't, especially since we aren't considering that another client would supersede any other client. However, some of them have been sacred cows because the clients assume that they will always be true.

You may have already heard about some of what a new repository might look like. Mark Overmeer has done quite a bit of work for CPAN6 (or C6PAN) and has a very impressive plan for what a new repository will look like (and thus, what a new client would have to support). I think his plan is quite comprehensive and a good top-down design, but a bit more than I want to handle. By the way, even if none of his plans work out, we still got XML::Compile out of it. That has to be one of the hidden gems of CPAN.

You may have also been following my work on MyCPAN, DPAN, or BackPAN Archeology. Those projects are based around those design decisions too because they are descriptive rather than prescriptive. They work with today's reality instead of the future's promise.

Once David and I thought about the reasons the current clients are like they are, we moved on to think about what a new client would be built on. What would our design decisions be? That night we came up with a short list of possibilities:

  • We can use non-core modules. This is the most important decision because it enables so many others without a lot of work. We get to not use XML because we don't want to, not because we can't.
  • We don't have to make a client that we want everyone to use.
  • We can use multiple repositories in the same run as a normal, non-failure feature
  • We can use any version of any distribution we can get
  • We can pull from non-archive sources, such as source control
  • We can have transactions so you can stop a botched install without leaving behind half of a broken installation
  • We can use web services to pull just the information we need, perhaps just-in-time
  • Before we do any work, we can build a tree of what we think we'll have to do to install a distribution (yes, some of this will be decorated during the run)
  • We can have source control as a major feature
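As a rough illustration of two items on that list (building a tree of the work before doing any of it, and pulling from more than one repository in the same run), here is a minimal sketch. It's in Python only to keep it short. The `REPOSITORIES` table, the distribution names, and every function in it are made up for illustration; none of this is part of CPAN.pm, CPANPLUS, cpanminus, or any planned client.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory "repositories": repo name -> {distribution: [dependencies]}.
# A real client would consult index files or a web service instead.
REPOSITORIES = {
    "public": {"App-Foo": ["Lib-Bar", "Lib-Baz"], "Lib-Bar": ["Lib-Baz"]},
    "company": {"Lib-Baz": []},
}

@dataclass
class PlanNode:
    dist: str                                  # distribution name
    source: str                                # repository that provides it
    deps: list = field(default_factory=list)   # child PlanNode objects

def find_source(dist):
    """Return the name of the first repository that lists dist."""
    for repo_name, index in REPOSITORIES.items():
        if dist in index:
            return repo_name
    raise LookupError(f"no repository provides {dist}")

def build_plan(dist, path=frozenset()):
    """Build the dependency tree for dist without installing anything."""
    if dist in path:
        raise ValueError(f"circular dependency at {dist}")
    source = find_source(dist)
    node = PlanNode(dist, source)
    for dep in REPOSITORIES[source][dist]:
        node.deps.append(build_plan(dep, path | {dist}))
    return node

def install_order(node, order=None):
    """Flatten the tree depth-first so dependencies precede dependents."""
    order = [] if order is None else order
    for dep in node.deps:
        install_order(dep, order)
    if node.dist not in order:
        order.append(node.dist)
    return order

plan = build_plan("App-Foo")
print(install_order(plan))
```

The point of the sketch is only that the whole plan exists as data before anything is installed, so a client could show it, check it, or annotate it during the run. Taking the first repository that lists a distribution is just one possible policy.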

That's as much as I'm going to think about before I get on the plane tomorrow. My goal by the end of my time in Vienna is a good plan for implementing a new CPAN client with a clear understanding of its philosophy and the target user. I want to know that before I type any code (although I'll probably end up typing code).

8 Comments

Thanks for the mention of cpanminus -- and actually, that "CPAN client that works for normal people without any configuration" vs. "CPAN client for hackers who want to hack and extend, aka Jailbreak :)" has been on my TODO list for cpanminus.

You can read more about it at http://bulknews.typepad.com/blog/2010/03/the-future-of-cpanminus.html but basically cpanminus has various hooks and a plugin architecture, and there are already a bunch of plugins that allow you to, say, download a distribution from git and configure/build/test using Dist::Zilla. But probably that's not something normal people need for cpanminus.

To borrow your iPhone analogy, I've been thinking about the "Android" CPAN client, extracted out of the current cpanminus codebase + hooks. I'm happy to work with you guys designing, thinking about the requirements and actually coding it.

See you in Vienna.

my wishlist:

  • integrated with the native package manager
  • or at least separation of compilation and installation phases so I can compile on one machine and install on several others (that usually are not connected to the internet and/or don't have a full tool chain).

The one thing I dislike about the current cpan infrastructure is the way it deals with meta data (it should be more static when possible, more dynamic when required). I think that will have to change for a new client to be truly more useful than the existing ones.

Salvador++

We hear repeatedly that CPAN.pm is not a package manager... then why is it installing things? It should really have a mode designed to interact with dpkg or apt or yum or rpm or tar or whatever.

Hi brian! An uninstaller perhaps...?

I would love to have some standard for installing tests, too. That way, we can work towards optionally being able to run tests against the entire installation. Not all tests would be able to be run and I assume some tests might be accidentally destructive. Plus, if test X fails (with the same failure result) both before and after a different module is installed, then we can at least know that there is no change.

dh-make-perl has a flag for using CPAN as your source for building and installing a perl module. Personally I feel it is wiser that a CPAN client is OS agnostic and those who work with perl in the OS adapt the client to their platform. This provides for better separation of concerns.

For the sake of clarity, dh-make-perl is a tool to build deb packages on Debian and Debian-derived systems.


About brian d foy

I'm the author of Mastering Perl, and the co-author of Learning Perl (6th Edition), Intermediate Perl, Programming Perl (4th Edition) and Effective Perl Programming (2nd Edition).