Mapping the MOP to Moose

I spent much of last week on vacation with the family, so very little actual coding got done on the p5-mop; instead I did a lot of thinking. My next major goal for the p5-mop is to port a module written in Moose, in particular one that uses many different Moose features. The module I have chosen to port is Bread::Board, and I chose it for two reasons: first, it was the first real module that I wrote using Moose, and second, it makes heavy use of a lot of Moose's features. I actually expect this port to be much more involved, and to require more actual design changes, than the other ports have, simply because not all the Moose features used will translate easily into MOP idioms. So in preparation I have been doing a lot of thinking about mapping the MOP to Moose, and have even started writing some documentation for it.

One of the first things I did while writing/thinking about this was (after much discussion on #p5-mop) to remove the built_by trait in favor of just using the normal builder syntax. After that, the lazy trait was modified so that it no longer takes the name of a method, but instead just internally modifies the default value of the attribute. As I have mentioned in the past, the goal is to keep things simple and minimal, so the removal of unnecessary features is always a good thing.

This eventually led to a discussion about what lazy actually means and how it should be handled internally. In Moose, we make a distinction between an attribute that is undefined and one that has not yet been initialized. In the MOP, however, we decided that variables declared with has should behave much like variables declared with my or our, neither of which makes a distinction between undefined and not-yet-assigned-to. For this reason we have decided that an attribute marked as lazy will trigger initialization either if it is undef or if it has not yet been initialized. The really nice thing about this is that it made two other Moose features, predicate and clearer, so easy to implement as methods that there was no need to make traits for them.

Of course, the best way to really illustrate all this is with code. Here is a simple Moose class.

package Cache;
use Moose;

has fetcher => (is => 'ro', required => 1);
has data => (
    is        => 'ro',
    lazy      => 1,
    builder   => '_fetch_data',
    predicate => 'has_data',
    clearer   => 'clear',
);

# builder for 'data': call the fetcher CODE ref
sub _fetch_data {
    (shift)->fetcher->()
}

It implements a (very naive) cache. It takes a CODE ref that feeds the cache, has a predicate to test if there is any data in the cache and a clearer to flush the cache, and because the 'data' attribute is lazy, it will automatically grab the next value once the cache has been cleared. There is practically zero code here; everything is handled by Moose's code generation features.

Now, here is what that same code would look like using p5-mop.

class Cache {
    has $fetcher = die '$fetcher is required';
    has $data is ro, lazy = $_->_fetch_data;

    method has_data { defined $data }
    method clear { undef $data }

    submethod _fetch_data { $fetcher->() }
}

The $fetcher attribute is required (same as in the Moose class) because if a value is not supplied to the constructor, the builder code will run and die. The $data attribute is lazy and has a read-only accessor on it, and its builder code will call the _fetch_data submethod. (Note the use of $_, the "topic" variable, in the builder to represent the current instance.) And lastly you can see the very simple clearer and predicate methods.
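For anyone who wants to poke at these semantics without p5-mop installed, here is a rough plain-Perl sketch of the same behaviour. It is only an approximation: the Sketch::Cache package name and the hash-based storage are made up for illustration and are not how p5-mop implements attributes. The point it tries to show is that, once undef and not-yet-initialized are treated the same, the lazy accessor, predicate and clearer each collapse to a one-liner.

```perl
use strict;
use warnings;

# Plain-Perl approximation of the Cache class (hypothetical names,
# not p5-mop or Moose code).
package Sketch::Cache;

sub new {
    my ($class, %args) = @_;
    # "required": die when no fetcher is supplied, like a builder that dies.
    my $fetcher = $args{fetcher} // die '$fetcher is required';
    return bless { fetcher => $fetcher, data => $args{data} }, $class;
}

# Lazy read-only accessor: undef and never-assigned look the same,
# so either one triggers the builder.
sub data {
    my $self = shift;
    $self->{data} //= $self->_fetch_data;
    return $self->{data};
}

sub has_data { defined $_[0]{data} }         # predicate
sub clear    { undef $_[0]{data}; return }   # clearer

sub _fetch_data { $_[0]{fetcher}->() }

package main;

my $calls = 0;
my $cache = Sketch::Cache->new(fetcher => sub { ++$calls });

print $cache->has_data ? "has data\n" : "empty\n";   # nothing fetched yet
print $cache->data, "\n";                            # first access runs the fetcher
print $cache->data, "\n";                            # second access is served from the cache
$cache->clear;
print $cache->data, "\n";                            # cleared, so the fetcher runs again
```

Run as a script, this should print "empty" followed by 1, 1 and 2, since the fetcher is only invoked on the first access and again after the clear.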

Overall I prefer the aesthetics of the p5-mop version over the Moose version; it will be interesting to see if this still holds true for a real-world codebase.


lazy = $_->_fetch_data;

that just seems weird to me.. what is $_ in this case? I find this API a little weird compared to, say, lazy => 'fetch_data' or lazy => \&_fetch_data;

submethod? is this going to be a strange way of saying private method?

will this new syntax perhaps allow parameters to lazy builders? I would love to have parameters to lazy builders. Motivation for this would be something like ... pass a database id, lazy load the object from the database, passing the id would not cause the lazy loader to be called, only an attempt to retrieve the object. (this may be a bad idea, impossible/impractical)

so "lazy" means: don't do this work until this value is requested (you know this, but just for clarity's sake). Also there are many ways to solve this particular example, I'm just thinking of yet another one.

So let's say I have an API integration with CouchDB or some other ROA+RESTful web service.

Here's how I could do this in Moose

package Order;
use Moose;
use JSON::PP;    # provides decode_json

has _lwp => (is => 'ro', required => 1);

has _item_id => ( is => 'ro', isa => 'ArrayRef' );

has item => (
    is      => 'ro',
    lazy    => 1,
    default => sub { #uber shortened
        my $self = shift;
        my $res  = $self->_lwp->get( $self->_item_id );
        Item->new( decode_json( $res->content ) );
    },
);

so of course we can do

my $o = Order->new({ _item_id => 1 });

my $item = $o->item;

but it might be nice to be able to write

my $o = Order->new({ item => 1 });
my $item_obj = $o->item;
# or
my $item_obj = $o->item( 1 );

This looks remarkably similar to coerce from, except I don't want the coercion to be proactive. Don't do it until I ask for the value. Maybe what I want would be better described as lazy coercion. Although it seems to me that at some point I had a case for wanting an attribute that takes a parameter: an attribute so I could cache the value easily, and a parameter because I wanted to switch the response based on it, but not necessarily ever attempt to reload again.

Again, this may be a bad idea or out of p5-mop scope.

I notice in your example you do do { ... $item_id ... }. No need to call $self then? and $item_id is lexically scoped to the object instance?

Concerning the alternate approach to define a "required" attribute I must say that I really dislike the readability of this.

has $fetcher = die '$fetcher is required';

This line of code will suggest to the reader that we will die here, no matter what; nowhere did we say that this attribute is required.

I really prefer the more explicit approach of using a "is required" declaration, making obvious what the coder wants to achieve with this section of the code.

"has $fetcher = die '$fetcher is required';"

This works for me. I did get it right off the top, but then again I've lurked on the p5mop channel and looked at the test case checkins. I see the value of not inventing new syntax, plus I think (and Stevan can correct me if I am wrong) one of the purposes of this blog post was to explain how p5mop is taking a clearer approach to what required means. In Moose it was often an edge case to distinguish between not initialized and initialized but undefined. For p5mop it seems the choice was made to try and solve this off the top by using a more clearly declared intention. At first blush I think this is a good thing, given the number of times I've hit the edge case described. I do think though that if we can find actual code reasons why the approach is in question, Stevan would certainly hear it.

My first thought was to propose a trait:

class Cache {
    has $fetcher is default { die '$fetcher is required' };
    has $data is ro, lazy, default { $_->_fetch_data };

    method has_data { defined $data }
    method clear { undef $data }

    submethod _fetch_data { $fetcher->() }
}

And I like that a lot. I don’t think anyone could be confused over what this means. If you really chafe at the verbosity, maybe simply add special syntax for the builder, by making it an optional trailing code block?

class Cache {
    has $fetcher { die '$fetcher is required' };
    has $data is ro, lazy { $_->_fetch_data };

    method has_data { defined $data }
    method clear { undef $data }

    submethod _fetch_data { $fetcher->() }
}

Though in this case, binding $self within the builder rather than $_ does start to seem more natural…

Either way, using = to mean closure construction rather than assignment just weirds me out, because of the way Perl mixes compilation and execution. Consider the syntactically analogous …

package Cache {
    my $fetcher = die '$fetcher is required';
    my $data = $_->_fetch_data;

    sub has_data { defined $data }
    sub clear { undef $data }

    sub _fetch_data { $fetcher->() }
}

… in which Perl would throw an exception “on sight”. But inside class, on has lines, somehow assignment isn’t actually assignment.

Now I’d get used to it, sure. But I don’t think it’ll ever feel natural. Marking the code as a block, OTOH, IMO makes it naturally understood that it’ll run “at some point” and not necessarily when the declaration is executed.
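To make the contrast concrete, here is a small sketch in plain Perl (not p5-mop; the variable names are just for illustration). It assumes nothing beyond core Perl and only tries to show that an expression on the right of = runs as soon as the statement executes, while the same expression inside a block builds a closure that runs when it is called.

```perl
use strict;
use warnings;

# Eager: the right-hand side runs as soon as the statement executes,
# so the die fires immediately (trapped here with eval to keep going).
my $eager_ok = eval { my $fetcher = die "fetcher is required\n"; 1 };
print "eager initializer died: $@" unless $eager_ok;

# Deferred: wrapping the same expression in a block only builds a closure.
my $builder = sub { die "fetcher is required\n" };
print "closure created, nothing has died yet\n";

# The die only happens when the block is finally invoked.
my $deferred_ok = eval { $builder->(); 1 };
print "deferred initializer died on call: $@" unless $deferred_ok;
```

The block form carries its "runs later" meaning on its face, which is the property I would like the has syntax to have too.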

So, I hope it's not too much off topic, but what I liked in Ruby is that attribute variables have a leading sigil, '@'. I don't like the syntax (the sigil), but I like the idea: it makes it obvious to the reader that the variable is not a simple lexical one. Can we (pretty please) have something like that?

You know how some coders lack imagination regarding variable names:

has $fetcher = ...

... 200 lines later ...

method foo {
    my ($fetchers, $fetching_data, $data_fetched, $fletch) = ...;
    $fetchers->($fletch, $fetcher->($fetching_data));
}

The example is stupid, but you see how it can be difficult to pinpoint the attribute, and being able to do so is IMO a very important feature of an OO syntax. Also, companies using Perl will then enforce "coding styles" that imply prefixing all attributes with "attr_" or something... Any thoughts?

Rather than
"has $fetcher = die '$fetcher is required';"
I was hoping for
"has $fetcher || die '$fetcher is required';"

Is it true that that won't work because you have to allow for undef as an acceptable value?

About Stevan Little

I blog about Perl.