What could a completely different CPAN client do?
I'm about to leave for Vienna for the 2010 Perl QA Workshop, so now it's time to start thinking about my secret project. I've saved it especially for something to do on the plane.
A couple months ago, David Golden and I got together to talk about what a new CPAN.pm client would look like. Now, remember that both of us have our noses deep in the CPAN.pm source, and both of us have thought, on several occasions, that we should refactor CPAN.pm. Gabor, who is also going to be in Vienna, even went so far as to separate each package into its own file back in October 2008.
This was shortly after cpanminus, and that new client certainly is bright and shiny and removes quite a bit of pain for many people. The problem is that it really doesn't do anything differently. There are slight differences, and hardcoded decisions, but it's basically CPAN.pm or CPANPLUS without the knobs and dials. It's sorta like my iPhone: it's amazingly great if it's exactly what you want, not so great if it isn't, and it isn't designed for normal people to go mucking around in the innards. You have to trust that Steve Jobs or Miyagawa made the right decisions for you (and statistically that's a pretty good bet).
So, David and I first looked at the design decisions of CPAN.pm without thinking about its replacements. Here are those decisions, intentional or otherwise, good or bad, forced by history or not:
- There is only one archive, although copied and replicated
- The repository is trusted
- PAUSE creates the index files, and everyone downloads them
- Once you choose a repository, that's what you'll use for everything
- It has to work out of the box with a standard Perl distribution
- The client installs modules for other programs to use immediately
- Distributions install into your Perl library directories as soon as the client can do so
- The client handles work as soon as it can
- Distributions come in archive files
- You only need to know about the latest distribution
- Modules are grouped into distributions, and you install all modules in a distribution to get any one of the modules
- Newer dependency distributions trump already installed versions and users never specify versions
- For the most part, installations are interactive
- You'd only ever be running one client instance at a time
- The code mostly supports a particular client, so there isn't a general API for making completely different clients
- Everything is based on a filesystem structure
- The client is a command-line program
Now, there are plenty of clients to handle the situations with those constraints. Do we have to keep those same constraints? Of course we don't, especially since we aren't assuming that another client would supersede any existing one. However, some of them have been sacred cows because the clients assume that they will always be true.
You may have already heard about some of what a new repository might look like. Mark Overmeer has done quite a bit of work for CPAN6 (or C6PAN) and has a very impressive plan for what a new repository will look like (and thus, what a new client would have to support). I think his plan is quite comprehensive and a good top-down design, but a bit more than I want to handle. By the way, even if none of his plans work out, we still got XML::Compile out of it. That has to be one of the hidden gems of CPAN.
You may have also been following my work on MyCPAN, DPAN, or BackPAN Archeology. Those projects are based around those design decisions too because they are descriptive rather than prescriptive. They work with today's reality instead of the future's promise.
Once David and I had thought about the reasons the current clients are the way they are, we moved on to think about what a new client could be built on. What would our design decisions be? That night we came up with a short list of possibilities:
- We can use non-core modules. This is the most important decision because it enables so many others without a lot of work. We get to not use XML because we don't want to, not because we can't.
- We don't have to make a client that we want everyone to use.
- We can use multiple repositories in the same run as a normal, non-failure feature
- We can use any version of any distribution we can get
- We can pull from non-archive sources, such as source control
- We can have transactions so you can stop a botched install without leaving behind half of a broken installation
- We can use web services to pull just the information we need, perhaps just-in-time
- Before we do any work, we can build a tree of what we think we'll have to do to install a distribution (yes, some of this will be decorated during the run)
- We can have source control as a major feature
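To make two of those ideas concrete (several repositories in one run, and building the whole tree of work before doing any of it), here is a minimal sketch. It is in Python purely for illustration, and everything in it is an assumption: the `Dist` record, the `find` and `build_plan` helpers, the repository dictionaries, and the distribution names are all hypothetical and are not part of CPAN.pm or any existing client.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dist:
    """A hypothetical record for one version of one distribution."""
    name: str
    version: str
    source: str            # repository label or source control URL
    requires: tuple = ()   # names of distributions this one depends on

def find(name, repositories):
    """Return the first Dist called `name`, searching repositories in order."""
    for repo in repositories:
        if name in repo:
            return repo[name]
    raise LookupError(f"no repository provides {name}")

def build_plan(target, repositories):
    """Return the distributions to install, dependencies first.

    Nothing is downloaded or installed here; this only builds the plan.
    """
    plan, done, in_progress = [], set(), set()

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            raise ValueError(f"circular dependency at {name}")
        in_progress.add(name)
        dist = find(name, repositories)
        for dep in dist.requires:
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        plan.append(dist)

    visit(target)
    return plan

# Two made-up repositories: a public one and a local one that overrides
# a single distribution with an older, patched version from source control.
public = {
    "App-Example": Dist("App-Example", "1.02", "public", ("Lib-Core", "Lib-Net")),
    "Lib-Core": Dist("Lib-Core", "0.50", "public"),
    "Lib-Net": Dist("Lib-Net", "2.10", "public", ("Lib-Core",)),
}
local = {
    "Lib-Net": Dist("Lib-Net", "2.09-patched", "git://example.invalid/lib-net",
                    ("Lib-Core",)),
}

plan = build_plan("App-Example", [local, public])
for dist in plan:
    print(dist.name, dist.version, "from", dist.source)
```

Because the local repository is searched first, the plan picks the patched Lib-Net even though the public repository has a newer version. That is exactly the kind of choice the "latest version always wins" assumption rules out. A real client would also have to decorate this plan during the run, since some dependencies only become known after a distribution is unpacked and configured.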
That's as much as I'm going to think about before I get on the plane tomorrow. My goal by the end of my time in Vienna is a good plan for implementing a new CPAN client, with a clear understanding of its philosophy and its target user. I want to know that before I type any code (although I'll probably end up typing code anyway).