Virtual Spring Cleaning Interlude: A herd of yaks, all waiting to be shaved

In my long-term quest to host all of my data on my own systems, one of the major points is to replace the note-taking app Google Keep with something that lets me keep my notes under my own control. I've looked at various open-source apps for taking and synchronizing notes, but they either feel like overdesigned monsters that don't fit my workflow (Laverna) or don't have good synchronization from the mobile phone to the server.

So, I've been writing my own, which has been an interesting journey into JavaScript, HTML and offline JavaScript applications running in the browser. Wanting to put up an online demo has made me paranoid with regard to (g)zip bombs in HTTP requests and responses, and I found no easy way to limit the request size before decompression.

My first target is HTTP::Message, because it is used by LWP. I implemented a rough and ugly prototype for limiting the size of a response before decompressing it. I'm not a big fan of the API, but in my limited tests it was easy to set a global or per-message limit on the size of responses, and the decompression would die when the resource limit was reached.
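The underlying idea of capping the output while decompressing can be sketched independently of HTTP::Message. The following is a minimal illustration in Python, not the prototype patch described above: the names `bounded_gunzip`, `DecompressionLimitExceeded` and `max_output` are made up for this example. It assumes a single-member gzip body that is already fully in memory, and it asks zlib for at most one byte more than the remaining budget, so a bomb is detected without ever inflating it completely.

```python
import gzip
import zlib


class DecompressionLimitExceeded(Exception):
    """Raised when the decompressed output would exceed the configured limit."""


def bounded_gunzip(data: bytes, max_output: int, chunk_size: int = 64 * 1024) -> bytes:
    """Decompress a gzip body, refusing to produce more than max_output bytes."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    out = bytearray()
    pending = data
    while not decomp.eof:
        # Allow one byte beyond the limit so an overrun can be detected.
        budget = max_output - len(out) + 1
        chunk = decomp.decompress(pending, min(chunk_size, budget))
        out += chunk
        if len(out) > max_output:
            raise DecompressionLimitExceeded(
                f"decompressed body exceeds {max_output} bytes"
            )
        # Input that was not consumed because the output cap was hit.
        pending = decomp.unconsumed_tail
        if not chunk and not pending and not decomp.eof:
            raise ValueError("truncated gzip stream")
    return bytes(out)


if __name__ == "__main__":
    # Ten megabytes of zeros compress to a few kilobytes.
    bomb = gzip.compress(b"\0" * 10_000_000)
    try:
        bounded_gunzip(bomb, max_output=1_000_000)
    except DecompressionLimitExceeded as err:
        print("rejected:", err)
```

Dying (here: raising an exception) as soon as the limit is crossed mirrors the behaviour described above. A global limit could be modelled as a default value for `max_output`, with a per-message value overriding it.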

The patch has not been applied yet, but I'm considering simply releasing it as HTTP::Message::Paranoid or HTTP::Message::ResourceLimit, just to get the code out onto CPAN. Many modules allow specifying the package used for request handling, and that module can monkey-patch HTTP::Message anyway.

Another slow-burning project I have is a module for convenient, automatic content extraction from web pages. An interesting project for that is ftr-site-config, whose maintainers keep a set of web page descriptions and extraction rules. I contributed some syntax corrections to some of their files, because my parser choked on them.

Maybe I should automate the syntax check for the project, to give it some kind of continuous integration. One problem with web scraping is that sites constantly change, so a test suite that talks to the outside world requires constant attention and review.

About Max Maischein

I'm the Treasurer for the Frankfurt Perlmongers e.V. I have organized Perl events including 7 German Perl Workshops and one YAPC::Europe.