Parisienne Walkways - 2012 QA Hackathon (Part 2)
And so to the final part of my notes from the 2012 QA Hackathon.
CPAN Testers Report Status
After asking several times, Andreas thought he finally understood what the dates mean on the Status page for the CPAN Testers Reports. He started watching and making page requests to see whether his requests were actioned. On Day 3 he pointed out that the date went backwards! Once he'd shown me, I understood why the first date is confusing. And for anyone else who has been confused by it, you can blame Amazon. SimpleDB sucks. It's why the Metabase is moving to another NoSQL DB.
The date references the update date of the report as it entered the Metabase. The last processed report is the last report that was extracted from the Metabase and entered into the cpanstats DB. Unfortunately, SimpleDB has a broken concept of searching. It will return results before the date requested, and regularly return the sorted results in an unsorted order. As such the dates you see on the Status page may go backwards in time! I'm not going to try and fix this, as it will all work as intended with the new system.
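To make the symptom concrete, here is a minimal sketch of the kind of client-side tidy-up that would hide it. This is not what the live system does (as noted, I'm leaving it alone until the new system arrives), and the record layout (`guid`, `updated`) and both function names are assumptions made up for illustration.

```python
from datetime import datetime

def normalise_batch(results, since):
    """Drop entries older than `since` and return the rest in date order."""
    kept = [r for r in results if r["updated"] >= since]
    return sorted(kept, key=lambda r: r["updated"])

def next_status_date(current, batch):
    """Pick the date to display, never letting it move backwards."""
    if not batch:
        return current
    return max(current, batch[-1]["updated"])

# A batch as the search might return it: unsorted, with one entry
# from before the requested date.
since = datetime(2012, 4, 1, 12, 0)
raw = [
    {"guid": "c", "updated": datetime(2012, 4, 1, 12, 30)},
    {"guid": "a", "updated": datetime(2012, 4, 1, 11, 45)},  # before `since`
    {"guid": "b", "updated": datetime(2012, 4, 1, 12, 10)},
]
batch = normalise_batch(raw, since)
```

Taking the maximum of the current date and the newest entry in the batch is what stops the displayed date from slipping back when a batch arrives out of order.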
Missing Reports
There have been several questions relating to missing reports over the past few years. Sometimes it just needs me to refresh the indices, but in other cases it may be due to the fact that SimpleDB omits reports from a request. Did I mention SimpleDB sucks? In a request to the Metabase, I will ask for all the reports from a given date. The results are limited to 2500, due to Amazon's own restriction. In the returned list it will often omit entries, due to its ignorance of sorting in the search request. I have gone through the Metabase code on several occasions and can verify it does the right thing. SimpleDB just chooses to ignore the complete search request and returns what it *thinks* you want to know.
Ribasushi questioned me about one of his modules that had been released recently, which still had no Cygwin reports listed, even though he sent a few himself. Further investigation revealed that they are indeed missing from the cpanstats DB. Although they did enter the Metabase, they never came out again.
To resolve this I have been revisiting the Generator code, reworking the reparse and regenerate routines to enable search requests for missing periods, in the hope that this will retrieve most of the missing results. If it doesn't, then I will be asking David to produce a definitive list for me, and I will make specific requests for any missing reports. The Generator code has also been updated in GitHub to include all the performance improvements that have been live for some time.
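The idea of re-requesting missing periods can be sketched roughly as follows. This is an illustration only, not the Generator code: `windows`, `backfill` and the `fetch` callable are hypothetical names, and `fetch(lower, upper)` stands in for whatever search the Metabase actually exposes.

```python
from datetime import datetime, timedelta

def windows(start, end, step):
    """Yield consecutive (lower, upper) pairs covering start..end."""
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper

def backfill(known_guids, fetch, start, end, step=timedelta(hours=1)):
    """Re-request each window and collect reports not already stored."""
    missing = {}
    for lower, upper in windows(start, end, step):
        for report in fetch(lower, upper):
            guid = report["guid"]
            if guid not in known_guids and guid not in missing:
                missing[guid] = report
    return list(missing.values())

def fake_fetch(lower, upper):
    # Stand-in for a Metabase search; a real one may be unsorted or incomplete.
    sample = [
        {"guid": "g1", "updated": datetime(2012, 4, 1, 0, 15)},
        {"guid": "g2", "updated": datetime(2012, 4, 1, 1, 5)},
        {"guid": "g3", "updated": datetime(2012, 4, 1, 1, 40)},
    ]
    return [r for r in sample if lower <= r["updated"] < upper]

# g1 and g3 are already in the local DB, so only g2 should come back.
found = backfill({"g1", "g3"}, fake_fetch,
                 datetime(2012, 4, 1), datetime(2012, 4, 1, 3))
```

Keeping the windows small should make it less likely that any single request runs into the 2500 result cap, and keying on the GUID means overlapping or repeated results do no harm.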
Erroneously Parsed Reports
Every so often the parsing mechanism fails and stores the wrong data within the cpanstats DB. These days it seems to only affect the platform, OS version and OS name. I'm not quite sure what is happening, as reparsing the report locally produces the correct results. This uses the same routine to parse the report, so why it occasionally fails remains a mystery. However, to combat this, I now have a script that can run periodically to search for this erroneous data and attempt to reparse the results. It can then alert me when it can't fix it and I can investigate manually. There have been occasions where the report can't be parsed due to the output being corrupted on the test machine, which unfortunately we can't always resolve. Sometimes there are enough clues within other parts of the report that point to a particular OS, but sometimes we just have to leave it blank.
It seems in putting some of this code live before leaving the hackathon, I accidentally reintroduced a bug. Slaven was quick to spot it and tell me about it, but unfortunately it was too late for me to fix it, as I needed to leave and catch my flight home. It should be fixed by the time you read this though, so all should be back to your regular viewing pleasure :) With the new script I've written, it should hopefully find and fix these errors in the future, as well as alerting me to fix the bug again!
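In outline, the check-and-reparse loop looks something like the sketch below. It is not the actual script: the field names, the `reparse` callable and the rule that a row is suspect when any OS field is empty are all assumptions chosen to keep the example short.

```python
OS_FIELDS = ("platform", "osname", "osvers")

def is_suspect(row):
    """Assumed rule: a row is suspect if any OS field is empty or missing."""
    return any(not row.get(field) for field in OS_FIELDS)

def repair(rows, reparse):
    """Reparse suspect rows; return (fixed rows, guids still unresolved)."""
    fixed, unresolved = [], []
    for row in rows:
        if not is_suspect(row):
            continue
        candidate = {**row, **reparse(row["guid"])}
        if is_suspect(candidate):
            unresolved.append(row["guid"])  # needs a manual look
        else:
            fixed.append(candidate)
    return fixed, unresolved

def fake_reparse(guid):
    # Stand-in for re-running the report parser on the stored report.
    parsed = {
        "r2": {"platform": "i686-cygwin", "osname": "cygwin", "osvers": "1.7.11"},
    }
    return parsed.get(guid, {})

rows = [
    {"guid": "r1", "platform": "x86_64-linux", "osname": "linux", "osvers": "3.2.0"},
    {"guid": "r2", "platform": "", "osname": "", "osvers": ""},
    {"guid": "r3", "platform": "", "osname": "", "osvers": ""},
]
fixed, unresolved = repair(rows, fake_reparse)
```

Here `r2` is repaired by the second parse, while `r3` stays blank and ends up in the list that would trigger an alert.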
Thanks Again
So that was the 2012 QA Hackathon. The show ended with a group photo. A few were missing due to their early departures home, but I think we got most of us in, including Miyagawa, who was taking the picture. The traditional thank yous and goodbyes ensued and then Andreas and I headed off to begin our adventure getting to the airport! The next hackathon, the 2013 QA Hackathon, will be in London. We'll have the domain pointed to the right place just as soon as Andy gets the website up and running. I look forward to a lot more involvement for next year, as we have been steadily growing in numbers each year. There has already been some significant output, but the event is much more than that. It's a chance to talk to people face to face, discuss ideas and plan for the future. Expect more news for CPAN Testers soon.
Once again I would like to thank Shadowcat Systems for getting me here, and for being a great supporter of the QA Hackathons, as well as many other Perl events over the years. Thanks too to Laurent Boivin (elbeho), Philippe Bruhat (BooK) and the French Perl Mongers for making the 2012 QA Hackathon happen. The Hackathon wouldn't have happened without the generosity of corporations and the communities that donate funds. So thank you to: The City of Science and Industry, Diabolo.com, Dijkmat, DuckDuckGo, Dyn, Freeside Internet Services, Hedera Technology, Jaguar Network, Mongueurs de Perl, Shadowcat Systems Limited, SPLIO, TECLIB', Weborama, and $foo Magazine. We have several individuals to thank too, who all made personal contributions to the event, so many thanks to Martin Evans, Mark Keating, Prakash Kailasa, Neil Bowers, 加藤 敦 (Ktat), Karen Pauley, Chad Davis, Franck Cuny, 近藤嘉雪, Tomohiro Hosaka, Syohei Yoshida, 牧 大輔 (lestrrat), and Laurent Boivin.
Meanwhile, Dan & Ethne would also like to thank Booking.com for their silly putty ;)
Cross-posted from Memoirs of a Roadie