CPAN Testers Server Update - 29/09/2011
The last couple of weeks have been distracting for reasons that have nothing to do with CPAN Testers. Sadly Facebook and Google decided to enact their Naming Policy on my accounts, as apparently people in the real world can't possibly exist with mononyms! In the discussions that followed online, Peter Edwards sent me a link to a rather interesting article. Prior to the CPAN Testers server problems, I had started to add the Facebook Like and Google +1 buttons to pages across the CPAN Testers sites. Having read this article, I have now decided to remove them. Although the initial idea behind these buttons seemed to be quite a nice promotional tool, the subversive way they are being used is an invasion of your privacy that I don't wish to be a part of. Just to be clear, this decision has nothing to do with my issues with the respective Naming Policies, but everything to do with the invasion of privacy. I recommend you read the article for further information.
Having struggled with the server over the last few weeks, with the database build and reindexing taking weeks rather than days, I asked for help in my last update. Ioan Rogers stepped forward and did some analysis of our set-up. He helped to pinpoint some system apps we didn't need to have running, but in general the set-up did seem fine. However, the disk IO was still a problem. Using 'htop' and 'iotop' we could see that the RAID management software was at 99% or more most of the time, even when there wasn't much being written to disk. We then installed 'atop', which made the issue much clearer. One of the disks was running at over 100% of resources to deal with IO, while the other was fluctuating around 4%.
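For anyone wanting to check for a similar imbalance between disks without installing extra tools, here is a minimal sketch in Python. It is not how 'atop' arrives at its figures. It assumes a Linux box where /proc/diskstats uses the standard layout, in which the tenth statistic after the device name is the cumulative number of milliseconds spent doing IO. The function names and the five-second interval are my own choices for illustration.

```python
import time


def parse_busy_ms(diskstats_text):
    """Map device name -> cumulative milliseconds spent doing IO.

    Assumes the standard /proc/diskstats layout: major, minor, name,
    followed by at least 11 counters, the 10th being 'ms doing IO'.
    """
    busy = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpected lines
        busy[fields[2]] = int(fields[12])
    return busy


def utilisation(before_text, after_text, interval_s):
    """Percentage of the interval each device spent busy with IO."""
    before = parse_busy_ms(before_text)
    after = parse_busy_ms(after_text)
    interval_ms = interval_s * 1000.0
    return {
        dev: 100.0 * (after[dev] - before[dev]) / interval_ms
        for dev in after
        if dev in before
    }


def sample(interval_s=5.0, path="/proc/diskstats"):
    """Take two snapshots interval_s apart and print per-device busy %."""
    with open(path) as fh:
        first = fh.read()
    time.sleep(interval_s)
    with open(path) as fh:
        second = fh.read()
    for dev, pct in sorted(utilisation(first, second, interval_s).items()):
        print(f"{dev:10s} {pct:6.1f}%")
```

Calling sample() on the affected machine should show one member of a mirrored pair sitting near saturation while the other stays almost idle, if the same kind of imbalance is present.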
Thinking the disk itself was a problem, I contacted the guys at Bytemark to see whether they could see a problem with the physical disk. On investigation they identified a kernel bug that was blocking IO unnecessarily. A kernel upgrade and reboot successfully cured the problem. My thanks to Ioan Rogers for the initial help and advice, and a big thank you to the guys at Bytemark, James Lawrie & Chris Cottam.
Having got the server back on track, I then turned my attention to the performance fixes I had started to add to the feed parser. I am pleased to say that, once again thanks to Devel::NYTProf, the feeder code is now parsing 3 hours' worth of reports in roughly 20 minutes. Having restarted a week prior to the disk crash, the feeder started from the reports posted on the 20th August. At this rate I expect us to be fully up to date by next week. I'm already planning to have the Reports website back online, possibly over the weekend.
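As a rough check on that estimate, the arithmetic can be sketched in Python. This assumes the worst case, that the whole backlog from the 20th August is still outstanding on the date of this post, that the feeder runs continuously at the rate quoted above, and that new reports keep arriving in real time while it catches up. The function name is purely illustrative.

```python
from datetime import date


def days_to_catch_up(backlog_days, hours_parsed, minutes_taken):
    """Days of continuous running needed to clear a backlog.

    The feeder processes (hours_parsed * 60 / minutes_taken) minutes of
    reports per minute of wall-clock time, but one minute of new reports
    also arrives in that time, so the backlog only shrinks at
    (speedup - 1).
    """
    speedup = (hours_parsed * 60.0) / minutes_taken
    if speedup <= 1.0:
        raise ValueError("feeder is not faster than real time")
    return backlog_days / (speedup - 1.0)


# Assumption: backlog measured from 20 August to the date of this post.
backlog = (date(2011, 9, 29) - date(2011, 8, 20)).days
estimate = days_to_catch_up(backlog, hours_parsed=3, minutes_taken=20)
print(f"Backlog of {backlog} days, roughly {estimate:.1f} days to clear")
```

Under those assumptions the feeder runs at nine times real time and needs about five days, which is consistent with being up to date by next week.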
The Statistics and Development websites are back online, although there are still links and data that aren't up to date. The Preferences site is likely to take a little longer as I now have to apply again for an SSL certificate, but as I'm not switching the emails just yet, this shouldn't be too much of a problem at the moment.
All being well we should be almost fully operational next week. Apologies it's taken so long to get everything back online, but rest assured no reports have been lost.
Cross-posted from the CPAN Testers Blog