April 2010 Archives
I'm only a year late with this post; I wrote it in the airport in Moscow on the way home, then it never made it onto the blog. I need to publish it before the next YAPC Russia, which is June 12-14 in Kiev this year.
This might be the only YAPC::Russia 2009 report you read, if only because I was the only presenter in English and maybe the only one writing about it in English. While everyone else's English got better over the conference, I think my Russian got worse.
The best reason to go to a conference is to meet new people, to see new places, and, oh, sometimes to see new code. The more interactions that we can get between the different parts of the physical world the better. If you always go to the same conference, you might be limiting the people you can meet.
Andrew Shitov, Ruslan Zakirov, and Niam Shafiev did a lot of work to get me to Moscow and take care of me while I was there. I hadn't met them before my trip, but they still did quite a bit to help. I like that Perl people can go to many places in the world and instantly have a group of people to hang out with. Curiously, standing in line at passport control coming into Moscow, I saw Chris DiBona in line in front of me. I didn't get to catch up to him at the baggage claim, but I did email him to confirm that he actually was in town. We didn't get a chance to meet up though. That's okay, because there is plenty to do in Moscow, even if you are jetlagged.
Curiously, right around the corner from the entrance to Red Square is a genuine American diner. My hosts didn't quite grok my fascination with the Starlite Diner, but it's something that I miss from New York and can't get in Chicago. Not only that, they had the Giro d'Italia on the TV and I needed my sports fix. It's authentic American diner food too.
DeepText sponsored my Mastering Perl class and everyone got to attend for free. Even though I presented in English, there were about 60 people in the class. The Russians seem to have a sane schedule, like starting a conference at 10am. I don't know who ever told other conferences that programmers want to start any earlier, but that just has to stop.
Russia can be a difficult trip, even before you start. You can't just go to Russia; someone has to invite you. You get the official invitation with the right government stamps then take that to a Russian consulate to get a visa. Surprisingly, there isn't a Russian consulate in Chicago, even though Houston gets one? In the end, a friend turned me on to Travisa. You give them your documents, they get your visa for you. Travisa is worth the money because they take care of all the hassle and most of it happens through FedEx.
However, my trip to Moscow was definitely worth the work. I thought that the locals must be boasting when they talk about Moscow's subways, but they are not only amazingly efficient but also very beautiful, even in my very poor photography. I would have liked to take more pictures, but if you're standing still you're messing up the system.
The US subways have nothing on this.
Since the visas are expensive, a YAPC::EU in Moscow probably won't happen any time soon. Several people mentioned trying a YAPC in Ukraine, which requires no visa and has cheaper flights. I really want to go to Kiev, so I'm hoping that they put in a bid for 2011 (like they did for 2010). Apparently Ukraine has really good sushi because people kept ordering "Ukrainian sushi" at the favored lunch restaurant. Maybe other places have bacon-wrapped sushi. Do other places have green beer every day, though?
There were many interesting talks, but I only really understood some of the pictures. If I closed my eyes, the talks were "русски русски русски CPAN русски русски русски Perl русски русски". There was enough code that I was able to see some new modules and get the basic idea. Most of the talks were understandable from the code on the slides, and I adjusted my own slides for "Making My Own CPAN" to be mostly pictures, using text very sparingly and mostly only for things such as file paths.
At the end of the day, they played this curious game which translated as "interactive session". I think it's a tradition, and everyone was really into it. People shouted out things that they thought might happen in Perl, but probably wouldn't. The idea was to come up with a long list to seed the upcoming game:
- collect a list of probable events
- pare the list to a reasonable number
- randomly partition the set
- assign each set to a group
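The dealing step is easy to sketch in Perl. This is only an illustration of "randomly partition the set", not how the organizers actually did it: the `partition_events` name and the round-robin dealing are my own assumptions, and the sample events are the ones my team ended up with.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Shuffle the events, then deal them out round-robin so that
# each team gets a random subset of roughly equal size.
# (Hypothetical helper; the name and approach are assumptions.)
sub partition_events {
    my( $team_count, @events ) = @_;
    my @teams = map { [] } 1 .. $team_count;

    my $i = 0;
    push @{ $teams[ $i++ % $team_count ] }, $_ for shuffle @events;

    return @teams;
}

my @events = (
    'Russian companies start promoting Perl',
    'Perl is taught in Russian elementary schools',
    'Someone writes a Perl CMS that needs no developer support',
    'CPAN becomes simple and understandable',
);

my @teams = partition_events( 2, @events );
for my $n ( 0 .. $#teams ) {
    print "Team ", $n + 1, ":\n";
    print "  - $_\n" for @{ $teams[$n] };
}
```

Every run deals a different hand, which is the point of the game.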
The conference split into teams, and each team got one partition whose events would together represent an unlikely situation if only through the combination of their probabilities. Each team then had to construct the story that would incorporate each event.
For instance, I suggested these events that I thought have about even odds:
- Programming Perl gets a 4th Edition
- Moose is core in Perl 5.12
- Adam Kennedy releases his 400th distribution
People thought that Adam releasing his 400th module was actually quite likely, so it didn't make it into the final list.
The set of events my team got was:
- Russian companies start promoting Perl
- Perl is taught in Russian elementary schools
- Someone writes a Perl CMS that doesn't need developer support to use
- CPAN becomes simple and understandable
I forget the story that we invented since I merely contributed and most of the discussion was in Russian, but it was an interesting exercise. I at first concocted a story of wild, Terminator-like computer world dominance and Russian programming prodigies from Siberia, but the goal was to come up with something more thoughtful. If you wanted those things to come true, what would you have to do to make them happen?
Besides that, Ruslan took a look at my MyCPAN code and suggested several helpful changes and even submitted patches after I returned from my trip.
I can't make it to Iceland for the Nordic Perl Workshop, which I think is all but officially cancelled now while Eyjafjallajökull does its thing. NPW might reschedule for later in the year. Maybe we can turn this into a virtual conference, though. I'll see what I can do to give my talk here and upload the audio and video. Maybe other people can do the same, or at least upload their slides.
Most of the problem is in the rerouting of flights. I was flying from Boston to Keflavík on Icelandair, but now those direct flights are cancelled. The reroutings are through Glasgow and other odd places with extremely long transfers. By the time I got to Reykjavík, it would be time to come home again. Curiously, I think this would have affected my travel to anywhere in Europe since the backlog of passengers is clogging planes, and flights from the US to anywhere in the Nordic area are still affected.
Icelandair is offering full refunds, although it looks like they have to process them by hand. The Hotel Loftleiðir has a full refund policy if you cancel 24 hours before arrival, but their website is wonky so I had to write to them by email at email@example.com.
I don't have time this week to say more on this, but I need to get my notes off my mind:
Lots of people ask about the difference between Miyagawa's new CPAN client and mine. The short answer is that his exists and mine doesn't, and his might do everything that I need so I'm going to wait.
Schwern wants to make a Perl version of Git accomplishments, or whatever that thing is called. You get badges for doing things in Git.
I want to make a cpan(1) profile that does the same thing as cpanminus. That is, a CPAN.pm configuration file that makes the same decisions. Anyone want to make that for me? I'm trying not to get distracted so I can work on other things.
Marcel is making a PAUSE web service. It's going to be integrated with the live PAUSE databases and start as a read only service.
I'm working on a similar web service with a much broader scope. I hope to have a demo thing up very soon. It's going to include the PAUSE info, but much more too.
Andy Armstrong (I think) and BooK talked quite a bit about different ways to manage Perl library directories with Git. There was lots of branching, etc, and you could commit to the central repo and other machines could clone. I didn't take notes, but there were many use cases that were interesting. Essentially, you can have a virtually infinite number of module combinations, and each gets its own SHA-1 that you can check out any time that you like.
I had been looking around for the ancient Perl sources to add to my collection, but Schwern told me that they are already in the Perl git repository. I just have to check out the correct tag:

    $ git tag -l
    perl-1.0
    perl-1.0.15
    perl-1.0.16
    perl-2.0
    perl-2.001
    perl-3.000
    perl-3.044
    perl-4.0.00
    perl-4.0.36

Now I just need to get these to compile on my MacBook Air.
I've suggested that Schwern needs to make a git archive of the internets now.
This is the video from the morning stand-up on everyone's progress for the second day of the 2010 Vienna Perl QA Workshop.
This is the video from the morning stand-up on everyone's progress for the first day of the 2010 Vienna Perl QA Workshop.
I'm about to leave for Vienna for the 2010 Perl QA Workshop, so now it's time to start thinking about my secret project. I've saved it especially for something to do on the plane.
A couple months ago, David Golden and I got together to talk about what a new CPAN.pm client would look like. Now, remember that both of us have our noses deep in the CPAN.pm source, and both of us have thought, on several occasions, that we should refactor CPAN.pm. Gabor, who is also going to be in Vienna, even went so far as to separate each package into its own file back in October 2008.
This was shortly after cpanminus, and that new client certainly is bright and shiny and removes quite a bit of pain for many people. The problem is that it really doesn't do anything differently. There are slight differences, and hardcoded decisions, but it's basically CPAN.pm or CPANPLUS without the knobs and dials. It's sorta like my iPhone: it's amazingly great if it's exactly what you want, not so great if it isn't, and it isn't designed for normal people to go mucking around in the innards. You have to trust that Steve Jobs or Miyagawa made the right decisions for you (and statistically that's a pretty good bet).
So, David and I first looked at the design decisions of CPAN.pm without thinking about its replacements. What are those decisions, intentional or otherwise, good or bad, forced by history or not:
- There is only one archive, although copied and replicated
- The repository is trusted
- PAUSE creates the index files, and everyone downloads them
- Once you choose a repository, that's what you'll use for everything
- It has to work out of the box with a standard Perl distribution
- The client installs modules for other programs to use immediately
- Distributions install into your Perl library directories as soon as the client can do so
- The client handles work as soon as it can
- Distributions come in archive files
- You only need to know about the latest distribution
- Modules are grouped into distributions, and you install all modules in a distribution to get any one of the modules
- Newer dependency distributions trump already installed versions and users never specify versions
- For the most part, installations are interactive
- You'd only ever be running one client instance at a time
- The code mostly supports a particular client, so there isn't a general API for making completely different clients
- Everything is based on a filesystem structure
- The client is a command-line program
Now, there are plenty of clients to handle the situations with those constraints. Do we have to keep those same constraints? Of course we don't, especially since we aren't considering that another client would supersede any other client. However, some of them have been sacred cows because the clients assume that they will always be true.
You may have already heard about some of what a new repository might look like. Mark Overmeer has done quite a bit of work for CPAN6 (or C6PAN) and has a very impressive plan for what a new repository will look like (and thus, what a new client would have to support). I think his plan is quite comprehensive and a good top-down design, but a bit more than I want to handle. By the way, even if none of his plans work out, we still got XML::Compile out of it. That has to be one of the hidden gems of CPAN.
You may have also been following my work on MyCPAN, DPAN, or BackPAN Archeology. Those projects are based around those design decisions too because they are descriptive rather than prescriptive. They work with today's reality instead of the future's promise.
Once David and I thought about the reasons the current clients are like they are, we moved on to think about what a new client would be built on. What would our design decisions be? That night we came up with a short list of possibilities:
- We can use non-core modules. This is the most important decision because it enables so many others without a lot of work. We get to not use XML because we don't want to, not because we can't.
- We don't have to make a client that we want everyone to use.
- We can use multiple repositories in the same run as a normal, non-failure feature
- We can use any version of any distribution we can get
- We can pull from non-archive sources, such as source control
- We can have transactions so you can stop a botched install without leaving behind half of a broken installation
- We can use web services to pull just the information we need, perhaps just-in-time
- Before we do any work, we can build a tree of what we think we'll have to do to install a distribution (yes, some of this will be decorated during the run)
- We can have source control as a major feature
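To make the "build a tree before we do any work" item concrete, here is a minimal sketch of planning an install order up front. Nothing here is from a real client: the `install_plan` subroutine, the `%prereqs` data, and the distribution names are all made up for illustration, and a real client would have to discover prerequisites from metadata or a web service and decorate the plan during the run.

```perl
use strict;
use warnings;

# Hypothetical prerequisite data: distribution => [ its prerequisites ].
my %prereqs = (
    'My-App'     => [ 'Web-Thing', 'Log-Thing' ],
    'Web-Thing'  => [ 'HTTP-Thing', 'Log-Thing' ],
    'HTTP-Thing' => [],
    'Log-Thing'  => [],
);

# Walk the prerequisites depth-first and return the distributions in
# an order where each one comes after everything it depends on.
# Dies if it finds a circular dependency.
sub install_plan {
    my( $prereqs, $dist, $state, $plan ) = @_;
    $state ||= {};
    $plan  ||= [];

    my $seen = $state->{$dist} || '';
    return @$plan if $seen eq 'done';
    die "Circular dependency at $dist\n" if $seen eq 'visiting';

    $state->{$dist} = 'visiting';
    install_plan( $prereqs, $_, $state, $plan )
        for @{ $prereqs->{$dist} || [] };
    $state->{$dist} = 'done';
    push @$plan, $dist;

    return @$plan;
}

print "$_\n" for install_plan( \%prereqs, 'My-App' );
```

Since the whole plan exists before anything is installed, the client could show it to the user, check it for conflicts, or wrap it in a transaction.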
That's as much as I'm going to think about before I get on the plane tomorrow. My goal by the end of my time in Vienna is a good plan for implementing a new CPAN client with a clear understanding of its philosophy and the target user. I want to know that before I type any code (although I'll end up typing code probably).
At the end of last week, I sent mail to all CPAN authors letting them know that they could delete older distributions from author directories. CPAN is pretty large: last week it was almost 8 GB, and today it's almost down to 7 GB. That puts the Schwartz Factor at about 1/7th. We can do better.
It's not the size that's the problem so much: disk is cheap (still, no need to waste it). The more distributions CPAN stores, the more the masters need to check when a mirror wants to rsync. For the most part, mirrors only need to delete the files that disappeared and want the ones that appeared. The other files (aside from the PAUSE files such as CHECKSUMS) should not have changed. The rsync program doesn't know anything special about CPAN though, so it still does all its work.
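As a rough illustration of how little a CPAN-aware sync would need to compute, this sketch compares two file listings and reports only the appeared and disappeared files. It is not how rsync works; the `mirror_changes` subroutine and the example paths are hypothetical.

```perl
use strict;
use warnings;

# Compare two listings (hash references of path => 1) and return
# what a mirror would have to fetch and what it would have to delete.
sub mirror_changes {
    my( $master, $mirror ) = @_;

    my @fetch  = sort grep { ! exists $mirror->{$_} } keys %$master;
    my @delete = sort grep { ! exists $master->{$_} } keys %$mirror;

    return ( \@fetch, \@delete );
}

# Made-up listings: the author deleted Foo-0.01 and uploaded Foo-1.01
my %master = map { $_ => 1 } qw(
    authors/id/A/AU/AUTHOR/Foo-1.00.tar.gz
    authors/id/A/AU/AUTHOR/Foo-1.01.tar.gz
);
my %mirror = map { $_ => 1 } qw(
    authors/id/A/AU/AUTHOR/Foo-0.01.tar.gz
    authors/id/A/AU/AUTHOR/Foo-1.00.tar.gz
);

my( $fetch, $delete ) = mirror_changes( \%master, \%mirror );
print "fetch:  $_\n" for @$fetch;
print "delete: $_\n" for @$delete;
```

The fewer old distributions there are in the listings, the less there is to compare in the first place.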
Tim Bunce, inadvertently I think, started a long thread about possibly automating the PAUSE purging process. There are all sorts of technical suggestions and fights about filesystems. While they figure all that out, though, you can help those mirrors by cleaning up the ancient distributions in your PAUSE directory.
And, if you didn't get my email, sent individually to every CPAN author (asking the Perl NOC first, of course), you might also take the opportunity to update your PAUSE profile.
If you've already done your bit to make PAUSE smaller, you can still help by spreading the word. I'm sure a lot of CPAN authors didn't read my mail. :)
A Stack Overflow answer that encourages the questioner to use Moose has a long code example, mostly because declaring each attribute takes a lot of repeated typing and formatting:

    has 'PacketName' => (
        is       => 'rw',
        isa      => 'Str',
        required => 1,
    );

    has 'Platform' => (
        is       => 'rw',
        isa      => 'Str',
        required => 0,
    );

    has 'Version' => (
        is       => 'ro',
        isa      => 'Int',
        required => 1,
    );

I'd like to see that look something like this little language instead, perhaps in an INI-style format. Assume that the attributes are required (or whatever the defaults should be) and only express the deviations:

    # file is fooclass.moose or something
    [has]
    PacketName
    Platform not_required
    Version isa Int is ro

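A parser for something like that doesn't have to be big. This is a minimal sketch in core Perl, not a finished design: the `parse_class_description` name, the exact grammar (one attribute per line under `[has]`, with optional `not_required`, `isa TYPE`, and `is MODE` words), and the defaults of `rw`, `Str`, and required are all my assumptions here.

```perl
use strict;
use warnings;

# Parse the little language into a list of [ name, { options } ] pairs.
# Assumed defaults: is => 'rw', isa => 'Str', required => 1.
sub parse_class_description {
    my( $text ) = @_;
    my( $section, @attributes ) = ( '' );

    for my $line ( split /\n/, $text ) {
        $line =~ s/#.*//;              # strip comments
        next unless $line =~ /\S/;     # skip blank lines

        if( $line =~ /^\s*\[(\w+)\]\s*$/ ) { $section = $1; next }
        next unless $section eq 'has'; # this sketch only knows [has]

        my( $name, @words ) = split ' ', $line;
        my %spec = ( is => 'rw', isa => 'Str', required => 1 );

        while( defined( my $word = shift @words ) ) {
            if( $word eq 'not_required' ) { $spec{required} = 0 }
            elsif( $word eq 'isa' or $word eq 'is' ) {
                die "Missing value after '$word' for $name\n" unless @words;
                $spec{$word} = shift @words;
            }
            else { die "Unknown word '$word' for $name\n" }
        }

        push @attributes, [ $name, \%spec ];
    }

    return @attributes;
}

my $description = <<'HERE';
# file is fooclass.moose or something
[has]
PacketName
Platform not_required
Version isa Int is ro
HERE

for my $attribute ( parse_class_description( $description ) ) {
    my( $name, $spec ) = @$attribute;
    # Inside a Moose class, this is where you might call:
    #     has $name => %$spec;
    print "$name: ",
        join( ', ', map { "$_ => $spec->{$_}" } sort keys %$spec ), "\n";
}
```

The sketch stops at printing the specifications so that it runs without Moose installed; handing each pair to `has` inside a class is the step that would turn the description into real attributes.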
Ultimately, I'd like to see a non-code text description of a class come from configuration rather than code. I've talked to random Moose people around this and they've never shot it down (to my face :). I've done this for non-Moose things, some of which I talk about in Mastering Perl, but that you can also see in some of my work on Data::Constraint and Brick.
I know there are a lot more fancy things that you can do and you might use code to figure some of that out, so you wouldn't use this in those cases. However, in the newbie, simple class case, it's a lot less scary to see a little code than a lot of code.
It's not very high on my to do list, though, so steal the idea if you like it.