Perl Toolchain Summit 2017
For the second year, I have had the great privilege of attending the Perl Toolchain Summit (PTS, formerly called the QA Hackathon, or QAH). This year it was held in Lyon, France, and three cheers for the organizers; it was an amazing event!
Last year I unexpectedly became involved in the Meta::CPAN project, even to the point of hosting the first (annual?) Meta::Hack a few months ago. This year I continued to work with them; however, the need was greater in the CPAN Testers realm, and so there I went.
CPAN Test Reporter
On the first day I answered a challenge from Breno G. de Oliveira (garu). He is the author of several test reporting tools, including the client for the popular cpanminus install tool. Currently the reporter must parse the output log from cpanm to understand what happened during the tests. Indeed, that is only possible if cpanm writes a log, which it does not do in verbose mode (the output simply goes to the console). Given my history (exaggerated to be sure :-P) of involvement with the new Test2 testing architecture, which is based on events, he wondered whether it was possible to get a dump of all the events generated during an installation.
I am a web developer, and a Mojolicious user, so as that is my hammer, this problem looked like it needed a socket for each test to report to. Using some truly horrific hacks, by the end of the day I had a mechanism by which cpanminus would collect a dump of all events produced, serialized to JSON. The worst of these hacks was that I had to overload much of the external process control so that, rather than simply waiting for the child process, it waits in a non-blocking fashion, allowing the server to accept connections.
On the second day, and after a good night’s rest, I realized that the approach I had used was terribly over-engineered. I spent the majority of the day paring down the hacks and replacing bits until each test file now opens a new self-named file in a temporary directory and dumps its events as JSON there. The outer process then slurps all the files and aggregates them into a single JSON document as before. Importantly, by doing it this way, I only need to inject what is essentially an “around test” hook into cpanminus. All of the process-control overloads became unnecessary.
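The file-per-test approach described above can be sketched roughly as follows. This is not the code written at the summit (that lives in Perl, inside cpanminus and Test2); it is a minimal Python illustration of the same shape. Every name here (`dump_events`, `aggregate_events`, `run_with_event_dump`, `fake_test_run`) and the event fields are hypothetical, chosen only for the example.

```python
import json
import tempfile
from pathlib import Path


def dump_events(dump_dir, test_name, events):
    """Write one test file's events to its own JSON file in dump_dir."""
    # Flatten path separators so "t/basic.t" becomes a single file name.
    safe_name = test_name.replace("/", "_") + ".json"
    path = Path(dump_dir) / safe_name
    path.write_text(json.dumps({"test": test_name, "events": events}))
    return path


def aggregate_events(dump_dir):
    """Slurp every per-test dump and merge them into one document."""
    merged = {}
    for path in sorted(Path(dump_dir).glob("*.json")):
        record = json.loads(path.read_text())
        merged[record["test"]] = record["events"]
    return merged


def run_with_event_dump(run_tests):
    """'Around test' hook: create the dump directory, run, then aggregate."""
    with tempfile.TemporaryDirectory() as dump_dir:
        run_tests(dump_dir)
        return aggregate_events(dump_dir)


def fake_test_run(dump_dir):
    """Stand-in for the real test run; the event shapes are made up."""
    dump_events(dump_dir, "t/basic.t", [{"type": "ok", "name": "loads"}])
    dump_events(dump_dir, "t/api.t", [{"type": "fail", "name": "returns 200"}])


if __name__ == "__main__":
    print(json.dumps(run_with_event_dump(fake_test_run), indent=2))
```

The point of the design is that the outer process needs no sockets and no event loop: it only has to wrap the test run, then read whatever files appeared.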
With this proof of concept complete, all that is needed is the ability to enable the dump of the events and the “around test” hook in cpanm. I reached out to Test2 author Chad Granum, who told me that such a feature is essentially in the works anyway, but I hope to coordinate with him to ensure that the end result is useful for this reporter. Chad has shown spectacular devotion to providing features that the community needs, so I have no doubt that this should be easily accomplished.
The cpanm hook is perhaps a little more tricky. The application used to present hooks, but they have since been removed. Further, cpanm development effort has essentially been directed to its upcoming successor, Menlo. Menlo promises to be more extensible, but it isn’t ready for prime time yet. All the same, Shoichi Kaji graciously offered to investigate the prospects of adding such a hook. For now, however, it seems that this project might have to rely on the hack (as pseudo-sanctioned in the Menlo documentation) of a (fairly innocuous) monkey patch.
CPAN Testers Backend
Thanks to the new leadership of Doug Bell (preaction), the CPAN Testers service is getting a thorough cleaning on both the front end and the back end. No slight intended to Barbie, but the age of the infrastructure was starting to show. The service was originally backed by e-mail, then later was improved to use Amazon SimpleDB, a choice that was made to support throughput. In the intervening years other technologies have advanced greatly, and SimpleDB is now an expensive bottleneck in both time and money. To save on read costs (yes, reads from SimpleDB cost money), the data is copied into MySQL for use locally.
The decision was reached (while I was working on the above) to remove SimpleDB; however, it isn’t that easy. It isn’t known whether anyone uses that data or, if they do, who and why. Therefore, in the short to medium term, the data can be replicated. As I mentioned before, reads were literally expensive, so end users haven’t had direct access to SimpleDB anyway; their access was via a MySQL-backed cache. This means removing SimpleDB is an exercise in creating services that mimic the cache’s current behavior. Later, once we have metrics on usage, these services may (and likely eventually will) be removed as users move to newer/friendlier APIs.
This task was handed to me, as I surfaced from cpanm hacking right at this point. For the remainder of the summit I completed two of the three boxes I needed to check, while the third is waiting on me finding a few free hours to complete. Hopefully end users won’t see any effect from this work, but it will help simplify the backend and open the door for more interesting stuff. If you are interested, read more at preaction’s PTS 2017 reaction post.
CPAN Cover Job Queue
Finally, while I was working on that, Paul Johnson (pjcj), of Devel::Cover fame, asked if I could help build a job queue to back the generation of the test coverage reports that he provides on http://cpancover.com. His intention is to quickly detect a release and have coverage reports run within minutes rather than the hours that this process currently takes. As I dug into his current process, it seemed that a stumbling block is how he gets his list of distributions to be tested. It isn’t (as I had hoped) a progressive list of recent uploads, but instead a list of all the current modules on CPAN. I now suspect (but can’t quite confirm) that the reason it takes so long to update is that it is running all the test coverage of the current state of CPAN on each iteration. That seems like it would need even more time than he gives it though, so perhaps I am wrong.
In any case I hope to be able to help him with such a queue and make his process more efficient. I’m only just starting in on this, however, so watch this space.
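To make the idea concrete, here is a very rough sketch of the kind of queue I have in mind, written in Python purely for illustration. It is not Paul’s design and nothing like it exists yet; the class name `CoverageQueue` and its methods are hypothetical. It assumes the input is still a full list of current distributions, and shows how remembering what has already been seen turns that into a queue of only the new releases.

```python
from collections import deque


class CoverageQueue:
    """FIFO queue that enqueues each release at most once."""

    def __init__(self):
        self._seen = set()       # every release ever offered
        self._pending = deque()  # releases still waiting for a coverage run

    def offer(self, releases):
        """Enqueue releases not seen before; return how many were added."""
        added = 0
        for release in releases:
            if release not in self._seen:
                self._seen.add(release)
                self._pending.append(release)
                added += 1
        return added

    def next_job(self):
        """Return the oldest pending release, or None when idle."""
        return self._pending.popleft() if self._pending else None
```

Offering the same full listing twice adds nothing the second time, so each iteration only costs as much as the releases that actually arrived since the last one.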
I can’t completely express how valuable this summit is to the Perl toolchain. I’m sure nearly everyone will agree that having four days blocked out of our busy schedules to work on these important infrastructure pieces would be reason enough. The fact that we have so many skilled developers, holders of arcane (and not so arcane) knowledge, commit bit/push access holders, PAUSE admins, etc. in the same room can allow development that might take months of asynchronous work to instead take days, hours or even minutes.
As I am not key to any one toolchain project, I am humbled to have been invited. Yet because of my free agency I am exposed to many different projects; last year Meta::CPAN, this year CPAN Testers. Were I not in the room with the people who can get me from zero to contributing, I might not have gotten involved, or might have done so much more slowly. I am deeply indebted to this event for letting me take part, allowing me to help in this way.
Of course there are many, many people to thank. I want to thank my employer ServerCentral for allowing me the paid time to attend. I’d like to thank the organizers again: Neil Bowers, Philippe Bruhat, and Laurent Boivin! And of course this event couldn’t take place without the support of the sponsors; thank you so very much!