CPAN Testers Summary - September 2011 - Construction Time Again
September has turned out to be a very difficult and traumatic month for CPAN Testers. Our server problems have now been resolved, but it has meant that more time than intended has been devoted to rebuilding the website ecosystem. It's still not complete, but we are getting there.
The first sites to be reinstated were the CPAN and BACKPAN mirrors. Soon after, the Statistics and Development sites were back online. The database rebuilds took considerably longer than expected, partly because the archive of the original NNTP reports had become corrupted at some point. The checks to ensure all interconnected parts of the databases were correctly referenced also took time. A full backup was then taken so we could start from a known point once all the moving parts were restarted. Just in case anything went wrong again!
The uploads and metabase feed parsers were turned back on, and although neither was running the same code as on the old server when it died, they both highlighted problems with the new server. At first I thought one of the disks was not performing as well as it should have done, but after investigation by the guys at Bytemark, we discovered that the kernel that had been installed came with an IO blocking fault. Once it was upgraded we started to see a much better response time.
Following the reintroduction of the performance tuning code, the metabase feed parser is now processing 4 days' worth of reports in a single day. We're currently less than 15 days behind, so hopefully by next week we should be almost real time.
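As a rough check on that estimate: while the parser works through 4 days' worth of reports each day, another day's worth arrives, so the backlog shrinks by about 3 days per day. The sketch below assumes both rates stay constant, which they won't quite do in practice; the function name is purely illustrative.

```python
def days_to_catch_up(backlog_days: float, processing_rate: float) -> float:
    """Estimate the days needed to clear a backlog of reports.

    backlog_days:    how many days of reports are waiting to be parsed
    processing_rate: days' worth of reports parsed per calendar day

    Assumes new reports keep arriving at one day's worth per day, so the
    net gain on the backlog is (processing_rate - 1) days per day.
    """
    if processing_rate <= 1:
        raise ValueError("the parser must outpace incoming reports to catch up")
    return backlog_days / (processing_rate - 1)


# 15 days behind, parsing 4 days' worth per day
print(days_to_catch_up(15, 4))  # 5.0
```

With a backlog of under 15 days, that works out to roughly 5 days, which fits the hope of being close to real time by next week.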
The builder for the Reports site was switched on to get through the several million requests we had stacked up from the rebuild processes, but as these came down to only just over 30,000 unique requests, it only took a short while to catch up. As of right now I'm turning the builder on only for short periods, to give the metabase feed parser as much processing resource and disk access as possible. There are currently fewer than 5,000 unique requests waiting, which the builder should be able to process in less than a day once we're fully operational.
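The reason several million stacked requests shrink to around 30,000 is that most of them are repeats asking for the same page to be rebuilt. A minimal sketch of the idea follows; the request strings are made up for illustration and are not the builder's actual queue format.

```python
def unique_requests(requests):
    """Collapse repeated page requests, keeping first-seen order.

    Each request is assumed to be a hashable key naming the page to
    rebuild (a hypothetical format, e.g. "distro:Foo-Bar").
    """
    # dict preserves insertion order, so fromkeys drops later duplicates
    return list(dict.fromkeys(requests))


queue = [
    "distro:Foo-Bar",
    "author:EXAMPLE",
    "distro:Foo-Bar",
    "distro:Foo-Bar",
    "author:EXAMPLE",
]
print(unique_requests(queue))
```

However many times a page is requested, it only needs to be built once per pass.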
As we are getting much closer to being back to full speed, you might have noticed that the Reports site is back online as of this morning. Both the dynamic and static sites are now available; however, the dynamic site no longer accepts requests from robots and spiders. As some don't follow the robots.txt file, this is being done within the Apache config files. For the time being the static site will accept requests from bots, but if there is any abuse, they will be denied access there too.
Once upon a time I had hoped the requests from bots would help to build the site, but they have ended up being more of a hindrance than a help. Also, the idea that the search engines would help the ranking for Perl within some dubious lists hasn't really materialised. As such, losing the bots is likely to be more of a benefit than keeping them on side. Interestingly, I have noted that Microsoft have stayed true to their word and removed us from the bot lists. It was just sad that it took such high profile posts to get them to take stock.
The reports emails and the release database are still to do, as are some other small corners of the ecosystem. I hope to get to all of these as soon as possible over the next month, and will keep you posted as these come back online.
Many thanks to everyone for their support and understanding. It's been a traumatic exercise, but the encouragement and help we have received has been very much appreciated. If you spot any problems in the next month, it is probably something we are already aware of. If it isn't resolved by the end of the month, please let me know or post to the CPAN Testers Discussion list.
A final bit of news, which has vindicated my prophecy during YAPC::Europe, is that we did indeed break the 1 million reports barrier for reports posted in a single month. Looking at the figures so far, I'm not expecting this to be repeated for September, as I suspect some testers have not been so active due to the downtime for CPAN Testers last month. However, it does show that we can indeed cope with such a high volume of reports. It will be interesting to see whether October picks up the pace again :)
Cross-posted from the CPAN Testers Blog