May 2011 Archives

Bundlefly - Make Your Bundles Fly

Bundlefly is a hack I've written to build a graph of a bundle's distributions and install them in optimal order. It accelerates the installation of entire library suites for new Perl builds and perlbrew instances. As with App::PipeFilter, it may end up on CPAN if there's interest.

Autobundle snapshots are comprehensive by design. They list all installed modules at a particular point in time. We should rarely be asked to confirm "unsatisfied dependencies" while installing them. The dependencies are almost always somewhere in there.

To compound the suck, we're often asked to install the same fundamental dependencies repeatedly. ExtUtils::MakeMaker and Test::More immediately come to mind. We shouldn't be asked even once, yet we're asked several times by the end of the day.

One problem is that autobundle snapshots list distributions alphabetically, and CPAN's shell installs them in that order. Test::More, a distribution used to test a large portion of CPAN, is installed relatively late—after it's already been prepended to the install queue as a dependency of several other distributions.

Bundlefly's dependency graph allows it to install dependencies before dependents. The only "unsatisfied dependencies" one should ever see are those that were introduced since the last CPANDB build and aren't listed in the autobundle snapshot.
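
Here's a minimal sketch of the core idea: a depth-first topological sort that emits dependencies before their dependents. The %deps hash below is made up for illustration, and this isn't Bundlefly's actual code; the real graph comes from CPANDB data.

#!perl

use warnings;
use strict;

# Which distributions each distribution depends on (illustrative only).
my %deps = (
    'ExtUtils-MakeMaker' => [],
    'Test-Simple'        => ['ExtUtils-MakeMaker'],
    'JSON-PP'            => ['ExtUtils-MakeMaker', 'Test-Simple'],
    'App-PipeFilter'     => ['JSON-PP', 'Test-Simple'],
);

my (%seen, @order);

sub visit {
    my $dist = shift;
    return if $seen{$dist}++;
    visit($_) for @{ $deps{$dist} || [] };  # dependencies first
    push @order, $dist;                     # then the dependent itself
}

visit($_) for sort keys %deps;

# Prints an install order with no forward references.
print "$_\n" for @order;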

App::PipeFilter 0.001 on its way to CPAN

Thanks to everyone who was interested in App::PipeFilter. Your interest motivated me to clean it up and send the very first release to PAUSE. It'll be on CPAN once your favorite mirror catches up. Check out the snazzy documentation, which consumed most of my time since my last announcement.

Patches and pull requests welcome!

App::PipeFilters gets multiline parsing and JSON::Path

Tonight I added support for multiline JSON input to all the App::PipeFilter tools. This is great for data sources that are beyond one's control, such as those found on the web. But I haven't found a multiline source to use as an example, so you get this instead:

% curl -s 'http://search.twitter.com/search.json?q=pipefilters' |
jpath -o '$..from_user' -o '$..text' |
jmap -i col0 -o from -i col1 -o text |
json2yaml

... produces output like this:

--- 
from: perlironman
text: "Rocco Caputo (rcaputo): App::PipeFilters - JSON in the Shell http://bit.ly/mGgnOX"

See what I did there with JSON::Path expressions? The jpath filter can extract fields from deep within JSON objects (but jcut will be faster for simple JSON objects).
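
For the curious, here's roughly what a JSON::Path lookup looks like in plain Perl, using the JSON::Path module directly on the same Twitter search JSON as above. It's a toy illustration, not jpath's actual code, and pairing the two result lists by index is a simplification.

#!perl

use warnings;
use strict;
use JSON::PP;
use JSON::Path;

# Assumes the Twitter search JSON from the example above on STDIN.
my $data = JSON::PP->new->decode(join '', <STDIN>);

# '$..from_user' matches every from_user field, however deeply nested.
my @from = JSON::Path->new('$..from_user')->values($data);
my @text = JSON::Path->new('$..text')->values($data);

# Pair them up by position (good enough for this data shape).
print "$from[$_]: $text[$_]\n" for 0 .. $#from;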

Here's another example:

% curl -s 'http://api.duckduckgo.com/?q=poe&o=json' |
jpath -o '$..Topics.*.FirstURL' -o '$..Topics.*.Text' |
grep -i perl |
jmap -i col0 -o url -i col1 -o title |
json2yaml

... produces:

---
title: Perl Object Environment, a library for event driven multitasking for the Perl programming language
url: http://duckduckgo.com/Perl_Object_Environment
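
As for the multiline parsing itself, the trick is that input can no longer be split naively on newlines. Here's a minimal sketch of one way to handle it with JSON::PP's incremental parser; it illustrates the approach, not App::PipeFilter's actual implementation.

#!perl

use warnings;
use strict;
use JSON::PP;

my $json = JSON::PP->new;

while (my $chunk = <STDIN>) {
    # Feed each line to the incremental parser. Complete objects come
    # back as soon as enough text has accumulated, no matter how many
    # lines each one spanned.
    for my $record ($json->incr_parse($chunk)) {
        # Re-emit each record on a single line for downstream filters.
        print $json->encode($record), "\n";
    }
}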

App::PipeFilters - JSON in the Shell

I've just put App::PipeFilters on github for review before I inflict them on CPAN.

They're a small set of UNIX command line tools for working with structured data, in particular JSON objects, one per line. They play nicely with standard tools like sort(1) and uniq(1). From the README:

% head -1 sample.json
{"network":"freenode","channel":"#perl","nick":"dngor","karma":"120"}

% jcut -o network -o channel < sample.json | sort | uniq
{"network":"efnet","channel":"#perl"}
{"network":"efnet","channel":"#poe"}
{"network":"efnet","channel":"#reflex"}
{"network":"freenode","channel":"#perl"}
{"network":"freenode","channel":"#poe"}
{"network":"freenode","channel":"#reflex"}
{"network":"magnet","channel":"#perl"}
{"network":"magnet","channel":"#poe"}
{"network":"magnet","channel":"#reflex"}

The new repository contains just a few tools for working on JSON data. The goal is to let UNIX do most of the work. ☺

  • jcut is like cut(1) but understands named fields
  • jmap renames JSON fields
  • json2yaml reads JSON and writes YAML, which may be easier for some to read
  • jsort is like sort(1) but -k names JSON fields
  • mysql2json reads mysql(1) batch files (-B flag) and writes JSON
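
All of these filters share the same shape: read a line, decode it as JSON, transform the record, and write JSON back out. Here's a minimal sketch of that pattern with a jcut-like field selection; it's an illustration, not the shipped code.

#!perl

use warnings;
use strict;
use JSON::PP;

# canonical() keeps key order stable, so sort and uniq behave sensibly.
my $json = JSON::PP->new->canonical;

while (my $line = <STDIN>) {
    my $record = $json->decode($line);

    # A jcut-like step: keep only the named fields.
    my %cut = map { $_ => $record->{$_} } qw(network channel);

    print $json->encode(\%cut), "\n";
}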

Sleep Sort with POE

Sleep sort is described in this silly 4chan thread, which I can't guarantee is safe for work.

It's essentially an insertion sort into time itself. A timer is set for each numeric value to be sorted, and the order in which the timers fire determines the output order.

I've implemented a parallel version using POE.

#!perl

use warnings;
use strict;
use POE;

POE::Session->create(
    inline_states => {
        # Set a "ding" timer for each value. The value doubles as the
        # delay in seconds and as the payload to print when it fires.
        _start => sub {
            $_[KERNEL]->delay_add(ding => $_, $_) foreach @ARGV;
        },
        # Timers fire in delay order, so the values print sorted.
        ding => sub {
            print "$_[ARG0]\n";
        }
    }
);

POE::Kernel->run();

And here's some sample output, using time(1) to show that the run sleeps about as long as the largest value being sorted.

% time perl sleep-sort.pl 9 4 2 8 3 7 1 0 4 2 8 
0
1
2
2
3
4
4
7
8
8
9
real   9.11
user   0.08
sys    0.02

About Rocco Caputo

Among other things I write software, a lot of which is in Perl.