Meta::Hack - MetaCPAN Upgrade

A small group of us got together for a 4-day hackathon (17th to 21st of Nov 2016) in Chicago, with the goal of switching https://metacpan.org over to the new https://fastapi.metacpan.org/v1/. This has been a massive project, upgrading our core backend from Elasticsearch 0.20.2 to 2.4.0, and it has taken over two and a half years to get to where we are now.

Oh, I should point out that we achieved our goal on day 2! After that we fixed issues, added more features and made plans.

For MetaCPAN I take on a sysadmin-style role, even though in my day job I'm mostly a manager / code reviewer.

I had already set up Elasticsearch 2.4.0 on a cluster (we had the old version on a single node before!) and at Meta::Hack reviewed the configuration with Brad (seems I mostly got it right!).

Day -1 (on the plane)

I've been working on a branch for a few months which makes much better use of the caching features of Fastly, our content distribution network (CDN), for both the API and the website. I finished most of the initial set up somewhere over the Atlantic Ocean, ready to be reviewed...

Day 1 (adding CDN caching)

My Fastly branches were integrated and deployed... this then led to several realisations and further releases of CatalystX::Fastly::Role::Response, MooseX::Fastly::Role and MetaCPAN::Role, all of which had been developed for MetaCPAN specifically.

I also built a purge utility script for the rest of the team to use. It can purge just the text/html files, which is useful when we release new versions of our CSS. At the same time I switched from API keys to tokens, which will help our security.

Day 2 (deployment day)

I spent quite a bit of time rehearsing the go-live (migrating data from the old server to the new) and building a playbook.

I built new self-signed certificates for our web and API backends and, working with Fastly (who were great at helping debug a config issue), got them deployed and integrated.

We did the migration (Fastly means it's a two-click operation to switch back, but that wasn't required) and spent the rest of the day looking at error logs and making minor fixes.

Day 3 (logging)

I worked with Mickey on getting https://clientinfo.metacpan.org set up as he needed it for MetaCPAN::Client 2.0.

Now that we are running Elasticsearch in a cluster, we want to be able to run the API and web servers with automatic failover too. The first step towards this is the simple task of automatically collating our logs:

  • I converted our Elasticsearch Puppet module to take arguments such as the number of nodes to expect, and then set up one of our spare servers to act as a single-node cluster to stream our logs into.
  • Working with Brad I set up rsyslog client and server, including SSL certs.
  • Brad is finishing his rsyslog->Elasticsearch module (which he released to CPAN 10 mins before he had to leave, so we'll carry on working on that together online).

I gave a quick talk on our new Fastly set up, so others can integrate it into their future work.

Day 4 (more logging and content-type fixes)

I spent most of the day looking at Elasticsearch snapshots, so that we can not only create full backups but also restore them to a test environment when we are doing work on the API that requires a full dataset.
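As a rough illustration of that backup-and-restore cycle, here is a minimal Python sketch that only builds the REST requests involved; it doesn't send anything. The repository name, snapshot name and path are made up for the example and are not what we use. It assumes a shared filesystem ("fs") repository, whose location on Elasticsearch 2.x also has to be listed under path.repo in elasticsearch.yml.

```python
def snapshot_requests(repo, snapshot, location):
    """Build (method, path, body) triples for a snapshot and restore cycle.

    All names passed in are hypothetical; nothing is sent over the network.
    """
    return [
        # 1. Register a shared-filesystem snapshot repository.
        ("PUT", f"/_snapshot/{repo}",
         {"type": "fs", "settings": {"location": location}}),
        # 2. Snapshot all indices and block until the snapshot completes.
        ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true", None),
        # 3. Restore the snapshot (run against the test cluster, which
        #    needs the same repository registered).
        ("POST", f"/_snapshot/{repo}/{snapshot}/_restore", None),
    ]


if __name__ == "__main__":
    for method, path, body in snapshot_requests("backups", "nightly", "/mnt/es-backups"):
        print(method, path, body or "")
```

Each triple can then be sent with whatever HTTP client is to hand.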

We also spent quite some time discussing:

  • Where our focus should go next... better search results.
  • How we should get there... Joel has almost finished moving the web search to an API end point which will make it a whole lot easier to test and is making many of the options configurable (as to how much weight they have in the search).
  • What we need to do... include further metrics (dev only releases/river/favs/release date/lowering anything with DEPRECATED etc) that can be set to contribute a lot, or a little to the search result ordering.
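To make the "contribute a lot, or a little" idea concrete, here is a small Python sketch of configurable weighting. It is purely illustrative and not how the MetaCPAN search is implemented; the signal names and weight values are invented for the example.

```python
def rank_score(text_score, signals, weights):
    """Add weighted metric signals on top of the base text relevance score.

    `signals` maps a metric name to its value for one distribution;
    `weights` says how strongly each metric should count (negative values
    push a result down). Both are hypothetical for this sketch.
    """
    score = text_score
    for name, weight in weights.items():
        score += weight * signals.get(name, 0.0)
    return score


# Example weights: river position counts a lot, favs a little,
# and DEPRECATED or dev-only releases are pushed down the results.
weights = {"river": 2.0, "favs": 0.5, "deprecated": -5.0, "dev_release": -1.0}

if __name__ == "__main__":
    print(rank_score(1.0, {"river": 2, "favs": 4}, weights))
    print(rank_score(1.0, {"river": 2, "deprecated": 1}, weights))
```

Keeping the weights in one place like this is what makes it easy to experiment with the ordering and test the effect.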

There were a whole bunch of other things that I worked on, from altering content-type headers (converting x-script.perl to text/plain so browsers don't start downloading the files), fixing up https://explorer.metacpan.org/ to use the new backends, taking part in discussions and adding proxy URL paths to make MetaCPAN::Client work better.
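The content-type change boils down to something like the following Python sketch. It is not the actual MetaCPAN code; it assumes the type can be recognised by the "x-script.perl" ending of its media type, and the function name is made up.

```python
def browser_safe_content_type(content_type):
    """Serve Perl script types as text/plain so browsers display them.

    Any parameters (such as charset) are kept as they were.
    """
    media_type, sep, params = content_type.partition(";")
    if media_type.strip().lower().endswith("x-script.perl"):
        return "text/plain" + sep + params
    return content_type


if __name__ == "__main__":
    print(browser_safe_content_type("text/x-script.perl; charset=utf-8"))
    print(browser_safe_content_type("text/html"))
```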

Overall

It was great to meet up with everyone, many of whom I'd never met in person before. Being in the same room made this a very productive few days, and even though I wasn't coding I feel much more confident about where we need to take the code base.

Our fantastic hosts Joel and Doug work for ServerCentral. ServerCentral was kind enough to give us office space for the 4 days and paid for several meals whilst we were in the office, so we could just carry on working.

We got to have Chicago pizza, and although we didn't make the visit to a very tall building or a jazz club, we did achieve a huge amount in 4 days. We also have a good plan for the next phase of the project, which I'm very much looking forward to seeing happen.

Our sponsors have been amazing: Meta::Hack, and everything it achieved, wouldn't have been possible without their generosity. If you get a chance, please take a moment to thank them for helping our community. Our platinum sponsors were Booking.com and cPanel. Our gold sponsors were Elastic, FastMail, and Perl Careers. Our silver sponsors were ActiveState, Perl Services, and ServerCentral. Our bronze sponsors were Vienna.pm, easyname, and the Enlightened Perl Organisation (EPO). Thank you all.


About Ranguard

London Perl developer