You have nothing to lose but your chains!

Perhaps a misleading title. Seeing as this is not a political blog but a Perl one, I’m going to talk about method chaining, not workers’ unions.

Method chaining is the practice of calling a method directly on the return value of a previous method call. It comes in two main flavors. The first, which isn’t as common in Perl though it is used extensively in Mojolicious, applies when a method has nothing useful to return: it can return its invocant instead. This allows, say, chaining setter methods, as in $self->set_foo("FOO")->set_bar("baz"), or chaining related test methods:

my $t = Test::Mojo->new;
$t->get_ok('/page/1/')
   ->status_is(200)
   ->text_like('#id' => qr/foo/);
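
The setter variant can be sketched in plain Perl. This is only a minimal illustration, not how Mojolicious implements it: My::Widget and its accessors are hypothetical names, and the only trick is that each setter ends with return $self.

```perl
use strict;
use warnings;

# Hypothetical class: setters have nothing useful to return,
# so they return the invocant to allow further chaining.
package My::Widget {
    sub new { my $class = shift; return bless {}, $class }

    sub set_foo { my ($self, $value) = @_; $self->{foo} = $value; return $self }
    sub set_bar { my ($self, $value) = @_; $self->{bar} = $value; return $self }

    # Plain getters, used only to inspect the result.
    sub foo { return $_[0]{foo} }
    sub bar { return $_[0]{bar} }
}

package main;

# Each call hands back the same object, so the calls line up in one chain.
my $widget = My::Widget->new->set_foo('FOO')->set_bar('baz');

print $widget->foo, ' ', $widget->bar, "\n";
```

Because every setter returns the same object, the chain can be broken at any point and resumed later without changing behavior.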

While this is useful, it’s not my topic today. I’m going to talk about the simpler form: calling a method that returns an object, then calling a method on that object, and so on.

This type of chaining can be seen when doing a complicated download/extraction/transform using Mojo::UserAgent, Mojo::DOM and others. In a StackOverflow question, a user asks about parsing HTML to find URLs and make them absolute. You might notice that the OP doesn’t use chaining in places where it would fit. This is understandable; it is not a common paradigm in Perl. You can see that the OP transforms the results into a list and processes them using a for-loop.

Sure, you might stop me here and ask: what does this question of style have to do with the OP’s question? Nothing, on the surface. But looking deeper, he is using the URI library to process the results of a call made via Mojo::UserAgent. He then notes that although he is aware of Mojo::URL, he doesn’t see the benefit of using it.

Part of the utility is the chaining nature itself! Most Mojo classes implement an API that is compatible with a chaining style. As you can see in my response, this is how I might implement a script to get absolute URLs for all the stylesheets on the http://mojolicio.us homepage:

use Mojo::Base -strict;
use Mojo::UserAgent;

my $url = 'http://mojolicio.us';

my $ua = Mojo::UserAgent->new->max_redirects(10);
my $tx = $ua->get($url);
my $base = $tx->req->url;

$tx->res
  ->dom
  ->find('link[rel=stylesheet]')
  ->map(sub{$base->new($_->{href})->to_abs($base)})
  ->each(sub{say});

Notice that my script handles the possibility of redirects before the response is received, which would otherwise make the absolute links incorrect: $base is taken from the request of the transaction that get returns, not from the URL I started with. The chaining works as follows: from the Transaction I get a Response, and from that a DOM. find returns a Mojo::Collection object, which is a blessed array reference; this one contains multiple DOM objects. map creates a new Collection with the result of its callback applied to each item. each then runs a callback on each item (much like map, though its intent is different).

The body of the map gets the item itself as $_. Calling new on $base is a shortcut for constructing a new URL object from the href attribute; to_abs then returns another new URL object, made absolute relative to $base.
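
The map and each steps can be tried in isolation, without any network access. The sketch below assumes Mojolicious is installed and uses the c constructor from Mojo::Collection; the stylesheet paths and the http://example.com prefix are placeholder data, not anything taken from the real homepage.

```perl
use strict;
use warnings;
use Mojo::Collection 'c';

# A small collection of placeholder paths (hypothetical data).
my $hrefs = c('/a.css', '/b.css');

# map returns a NEW collection holding the callback's results;
# each item is available as $_ inside the callback.
my $absolute = $hrefs->map(sub { "http://example.com$_" });

# each runs the callback for its side effect on every item.
$absolute->each(sub { print "$_\n" });

# The original collection is left untouched by map.
my $joined = $absolute->join(',');
```

Since map hands back a fresh collection, the chain can keep going, while each is the natural last link when only the side effect matters.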

There is a calm beauty in that style, and yet it could be better. See the way that I must use the URL constructor ($base->new(...)) almost as a circumfix operator? Wouldn’t it be nice if that could be chained too? But $_->{href} returns a string, not a URL object. It would be possible to put logic into Mojo::DOM to extract certain attributes and return URL objects, but that is getting into the realm of being overly specific in an API that is very general. Tabling that distasteful option, there is still one avenue remaining.

We can simply add a url method to the Perl string type itself! In this way we could simply call $_->{href}->url->to_abs($base), thus preserving the chain. Can this be done? Indeed, it is easier than you might think!
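
To give a feel for the mechanism, here is a minimal sketch using the autobox module from CPAN, which lets method calls on plain strings be dispatched to a class of your choosing. It is an illustration under stated assumptions, not the implementation of any particular module: My::String, My::Greeting and their methods are hypothetical names invented for this example.

```perl
use strict;
use warnings;

# Hypothetical object type that the string method will return.
package My::Greeting {
    sub new  { my ($class, $text) = @_; return bless { text => $text }, $class }
    sub loud { my $self = shift; return uc $self->{text} }
}

# Hypothetical class holding the methods that strings will gain.
package My::String {
    # The string itself arrives as the invocant ($_[0]).
    sub greeting { return My::Greeting->new($_[0]) }
}

package main;

# Lexically scoped: from here on in this file, method calls on
# plain (unblessed) strings are looked up in My::String.
use autobox STRING => 'My::String';

my %attrs = (title => 'hello');

# A hash value is just a string, yet the chain continues through it.
my $result = $attrs{title}->greeting->loud;

print "$result\n";
```

The effect is lexically scoped, so code outside the scope of the use autobox line still sees strings behaving as ordinary Perl strings.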

This article serves as a (rather long-winded!) announcement of Mojo::Autobox and its companion for one-liners, ojoBox.

use Mojo::Base -strict;
use Mojo::UserAgent;
use Mojo::Autobox;

my $url = 'http://mojolicio.us';

my $ua = Mojo::UserAgent->new->max_redirects(10);
my $tx = $ua->get($url);
my $base = $tx->req->url;

$tx->res
  ->dom
  ->find('link[rel=stylesheet]')
  ->map(sub{$_->{href}->url->to_abs($base)})
  ->each(sub{say});

With this module, the above modifications are possible. Other handy methods are added for parsing and emitting JSON via Mojo::JSON, string operations via Mojo::ByteStream, and functional transforms using Mojo::Collection. The most clever real-world case I could think of was extracting HTML from a JSON document, parsing it to find the first <a> tag, extracting its href attribute, parsing that as a URL, extracting its host, and then saying the result via ByteStream, if only so that I don’t have to go back to the top and put say at the beginning when, in reality, it is the final thing to be done.

use Mojo::Base -strict;
use Mojo::Autobox;

# "site.com\n"
'{"html": "<a href=\"http://site.com\"></a>"}'
  ->json('/html')
  ->dom->at('a')->{href}
  ->url->host
  ->byte_stream->say;

I can do a similar operation on the command line, extracting all the <a> tags from the Mojo homepage, and printing the hosts line by line:

$ perl -MojoBox -E 'g("http://mojolicio.us")->dom->find("a")->each(sub{$_->{href}->url->host->b->say})'

Here the g function is provided by the ojo module, which ojoBox loads for you, and the b “string method” is an alias for the byte_stream method seen above.

I hope this has whetted your appetite for chaining method calls, and now that methods that return strings and array references no longer bind you, your chains are more freeing than ever!

Happy Perling!

2 Comments

Hi Joel

Nice article. But tell me, in the 3rd code example, did you accidentally copy the 2nd code example? They look identical to me.


About Joel Berger

As I delve into the deeper Perl magic I like to share what I can.