No program is an island, entire of itself; every program is a piece of the system, a part of the main. I don't think it's important to be certain that I can run a Perl program I wrote in 1998, unmodified, on an operating system written for hardware that hasn't been manufactured in 20 years. Every other part of that system also needs to work, not just the Perl interpreter. When the EDO RAM chips die, I'll have to find replacements or my Perl program is useless. When the ISA video card and network cards die, I'll have to find replacements or my Perl program is useless. When the OS or some other program on the machine stops working, my Perl program is useless. I don't need to be certain that I could run a useless program.
So, in truth, there is no certainty. We can keep maintaining our outdated technology and dance to obsolescence while the ecosystem moves on without us. Or, we can accept that uncertainty is all that exists in technology and take measured, considered steps into that uncertain future, leaving some parts of the past behind.
Thanks to cPanel and Booking.com for their continued sponsorship of this event!
To be quite honest, I didn't know what to get started on. Last year at the MetaCPAN Hackathon, I built GitHub user authentication for the new site in order to start letting users link their PAUSE accounts, authorizing them to manage the reports for their distributions. Then at the Toolchain Summit, I built out a Docker-based development environment so I could test that system. But it felt a lot like putting the cart before the horse: users can't manage reports until they can see reports.

Then the site had some trouble with the database that took entirely too long to sort out (story of my life). Frankly, the database is just too big, and a lot of the data stored in it is not read often enough to need a relational database. I decided that a lot of things would be easier if the database were a bit smaller, and posted about my plans to the mailing list. I got good feedback, and came up with a plan to simply remove the most static data from the database. But to do that, the website would need to be able to read the data from its new home...
So I turned the landing page mockup from the http://beta.cpantesters.org site into a live demo: http://beta.cpantesters.org/web. This page shows the latest uploads to CPAN and the pass/fail counts received so far. This is the only page that is not a mockup currently, but I'll be working on the other pages over the coming months.
Additionally, I created a simple Docker workflow for hacking on CPAN Testers. My hopes for this are:
It's number 3 that I hope will help fill in any gaps in reporting that might be created when I start pruning what data is stored in the database: anyone could easily sync the raw reports and run whatever local reporting they would like.
If you'd like to help out, try out the new Docker development environment and give me some feedback. Then, take a look at our open projects for ideas on where you can help.
1. My auto-correct "helped" me by turning $i into $I. This site says that can be turned off by adding autocomplete="off" autocorrect="off" autocapitalize="off" spellcheck="false" to the textarea: https://davidwalsh.name/disable-autocorrect
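Applied to the comment box, those attributes would look something like this (note that autocorrect and autocapitalize are non-standard attributes honored mainly by Safari/iOS, while autocomplete and spellcheck are standard HTML):

```html
<!-- Disable the various auto-"help" features on a code-friendly textarea.
     autocorrect/autocapitalize are Safari/iOS extensions; autocomplete
     and spellcheck are standard HTML attributes. -->
<textarea autocomplete="off" autocorrect="off"
          autocapitalize="off" spellcheck="false"></textarea>
```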
2. The Tab key changes focus. There are in-browser editors you could use to fix that, but I find that the friendliest thing to do is just to allow the Tab key to insert a tab. This StackOverflow post has a few ways to do it: https://stackoverflow.com/questions/6140632/how-to-handle-tab-in-textarea
Great job so far! I look forward to seeing what else you come up with!
This talk comes from my series of blog posts for the 2018 Mojolicious Advent Calendar: A Website For Yancy, A View To A POD, and You Only Export Twice.
If someone were to make a Yancy::Backend::MongoDB (for the official driver) or a Yancy::Backend::Mango (for the Mango driver), I would gladly add it to the core distribution. There is a standard set of tests that all backends should pass, and I can help anyone who wants to write a backend get their tests passing, either via a GitHub pull request, via e-mail (doug@preaction.me), or via IRC on irc.perl.org #yancy.
A MongoDB backend would enable very simple collections to be edited: a document could have only simple field/value pairs, not complex values like objects or arrays inside. To allow editing more complex data, the editor would likely need to be enhanced to support it. To start, the editor could just allow hand-editing the JSON of the object/array, but in the future I would want a full form for editing nested objects. This is on the roadmap, but it is not started (I will likely start it as part of adding relationships to the data).
First, I used ysql to query the database for a count of the records in the cpanstats table for the "forks" distribution matching version "0.36".
doug@cpantesters3:~$ export DSN="dbi:mysql:dbname=cpanstats;mysql_read_default_file=$HOME/.cpanstats.cnf"
doug@cpantesters3:~$ ysql --dsn "$DSN" --count cpanstats --where 'dist="forks" and version="0.36"'
---
value: 3293
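For readers who don't have ysql handy: the --count option above boils down to a plain SQL COUNT(*). Here's a sketch of the equivalent query, run against a throwaway in-memory SQLite table with made-up rows (the real cpanstats table lives in MySQL):

```shell
# Build a tiny stand-in cpanstats table and run the same COUNT(*) query
# that ysql --count generates. The three rows here are invented sample data.
count=$(python3 - <<'PY'
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cpanstats (dist TEXT, version TEXT)")
db.executemany("INSERT INTO cpanstats (dist, version) VALUES (?, ?)",
               [("forks", "0.36"), ("forks", "0.35"), ("forks", "0.36")])
print(db.execute("SELECT COUNT(*) FROM cpanstats"
                 " WHERE dist = 'forks' AND version = '0.36'").fetchone()[0])
PY
)
echo "value: $count"
```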
Then I needed to compare that to the API that the Matrix is using. I fetched the JSON array from the API using curl, parsed it into YAML using yfrom json, and then counted the number of records in the array using yq's length function:
doug@laptop:~$ curl http://www.cpantesters.org/distro/f/forks.json | yfrom json | yq '{ value: length }'
---
value: 321
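The length trick is easy to reproduce without Yertl, too. As a sanity check, here's the same count done with python3 on a made-up three-element array (not real CPAN Testers data):

```shell
# Count the elements of a JSON array on stdin, like yq's length function.
json='[{"id":1},{"id":2},{"id":3}]'
count=$(printf '%s' "$json" | python3 -c 'import json, sys; print(len(json.load(sys.stdin)))')
echo "value: $count"
```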
So there's a discrepancy: The database has 3293 records, but the API only has 321 records. Let's also check the new API from http://api.cpantesters.org to try to narrow the scope of the problem:
doug@laptop:~$ curl http://api.cpantesters.org/v3/summary/forks/0.36 | yfrom json | yq '{ value: length }'
---
value: 3293
Since the new API has the same number of records as the database, now I know it's a problem specifically with the API that the Matrix is using. This API is a bunch of statically-generated files from back when disk space was cheap, CPU and memory were extremely expensive, and loading CGI applications was costly. So some part of this data is not being written to the static files.
Slaven's e-mail says that there were no records from before 2018-02-10. Maybe a specific report had a problem and caused only the reports after it to be generated. I found which report it was by downloading the file, parsing the JSON into YAML, flattening the array (.[]), slicing the hashes inside to get only the fulldate, id, and guid fields, turning that into a CSV with yto csv, and then sorting by the first field (fulldate).
doug@laptop:~$ curl http://www.cpantesters.org/distro/f/forks.json | yfrom json | yq '.[] | { fulldate: .fulldate, id: .id, guid: .guid }' | yto csv | sort -t, -k1 | head -3
201802101425,3ae9e620-0e6e-11e8-a10e-9d908e183046,91949984
201802101443,ce1d059c-0e70-11e8-a10e-9d908e183046,91950270
201802121326,e7611678-6c71-1014-a17f-d9730956514b,92013272
A quick glance at that summary row doesn't reveal anything clearly wrong with the data, nor does looking at the row before it.
In the end, the most likely scenario here is that the file got deleted at some point and then when it was regenerated, only some of the data made it. Now I just need to find out how to regenerate these files, and the problem is fixed!
The Yertl toolkit isn't finished, but it has some useful tools for simple, dirty data-plumbing operations. The more I use Yertl, the more I see it could help me, and the more ideas I get for making it even more useful (clearly I need a better hash slice syntax for yq, for example, and probably a sort function).
Try Yertl out yourself and let me know what I can do to make it more useful to you!
while ( defined( my $line = <$fh> ) ) { ... }
becomes simply
for my $line ( <$fh> ) { ... }
(with no performance slowdown from Perl trying to read the entire file into an array).
I tried making this as a tie, but the internals do not allow it to work in all the places that arrays need to work (for performance reasons, mostly): https://github.com/preaction/Tie-Iterator#why-a-core-iterator
For me, this year's Toolchain Summit was wildly productive, and as always, for every task I completed, two new tasks are revealed to take their place. If anyone would like to help, we could use web developers, backend data developers, devops help, API documentation help, and more. There are little tasks to do over a weekend, or big tasks to take ownership of. Contact me at doug@preaction.me and let me know what you'd be interested in.
Before I get into the full report of what I completed at the summit, I'd like to thank all of the sponsors of this event: NUUG Foundation, Teknologihuset, Booking.com, cPanel, FastMail, Elastic, ZipRecruiter, MaxMind, MongoDB, SureVoIP, Campus Explorer, Bytemark, Infinity Interactive, OpusVL, Eligo, Perl Services, and Oetiker+Partner. Without sponsorship, this important work could not get done.
New APIs

While CPAN Testers' primary goal is testing CPAN modules on a variety of different OSes and Perl versions, it is also used to test development Perl versions to see if they are backwards-compatible with existing CPAN code. This project is called "Blead Perl Breaks CPAN", and has been helpful in keeping Perl 5 stable.
Before the summit, Todd Rinaldo announced that he and Nicolas R. were building a dashboard to show the current state of Blead Perl against CPAN, and track the work being done. In order to help make this dashboard work, I added some new APIs to http://api.cpantesters.org:
These APIs, along with all APIs on api.cpantesters.org, are hosted on multiple servers behind a load balancer, and so are more reliable.
For new code, I will always recommend using api.cpantesters.org, but I am determined to ensure that the existing CPAN Testers ecosystem continues working so that the real work of keeping Perl and CPAN stable and useful can continue.
To that end, I built a new Mojolicious application to serve the individual test reports (like this example report for Minion-Backend-mysql). This has greatly improved performance for this page, which was the most expensive and most heavily-trafficked page on the site.
I plan to continue improving the performance of some of the current website's pages, including any JSON, XML, and RSS pages. I also have a plan to start banning the robots that are widely known to be poorly-behaved (ignoring robots.txt for example), which should also help system stability and performance.
Stability maintenance is still my primary role on the CPAN Testers project. The first day of the summit, Todd and Nicolas helped me clear out some of the error messages the database has been getting. I spent some time on the second and third days cleaning things up.
One fix was calling POSIX::_exit() in Minion::Job to prevent DESTROY blocks from running; I patched the Minion code on CPAN Testers, but upstream is not yet fixed.

The CPAN::Testers::Report Metabase JSON object contains strings that contain JSON objects, which in turn contain strings that contain JSON objects (Metabase::Facts are recursive data structures). The amount of backslash escaping required to deserialize this JSON was making Mojo::JSON take up gigabytes of memory to parse it, eventually getting killed by the system's out-of-memory killer. The Cpanel::JSON::XS module does not have this limitation, and now neither does CPAN Testers.
Additionally, the process being killed by the OOM killer was leaving multi-megabyte /tmp files around, which led to an outage a few weeks ago when the /tmp directory filled up (800 gigabytes of 30-megabyte files). One outstanding task is to add monitoring of disk-space usage on all the CPAN Testers machines to prevent this problem from happening again.
All told, the event was a great success for me. A special thanks to all the organizers, whose tireless work in the months before the event makes sure everything goes smoothly, and whose work during the event makes sure everyone is fed, comfortable, and productive. I'm looking forward to the next one!
Giving feedback on PrePAN is difficult: feedback on building good CPAN-style distributions is automated by CPANTS (https://cpants.cpanauthors.org/), and PAUSE itself will not index dists it can't understand. The value PrePAN adds is in design feedback and solutioneering. That kind of feedback requires people to have domain knowledge or to have used other solutions.
I still watch PrePAN, but I find I don't have the domain knowledge necessary to comment on most of the modules published there: I don't use AWS, so I can't know whether a new module solves a problem better. Often, I find that feedback comes when I've already released my code and written a bunch of announcement blog posts and getting-started tutorials.
So, I'd say don't sweat it. It's useful to you, and you're offering it in the hope that it might be useful to another, as all open source software hopes to be.
Put them in a resources folder, and then add them to the static file and template lookup paths using __FILE__ and Mojo::File.
This is the same thing the Minion admin app does (a resources folder, and adding to static/template paths).