Mapping the MOP to Moose

By Stevan Little on August 4, 2013 1:12 AM

I spent much of last week on vacation with the family, so very little actual coding got done on the p5-mop; instead I did a lot of thinking. My next major goal for the p5-mop is to port a module written in Moose, in particular one that uses many different Moose features. The module I have chosen to port is Bread::Board, and I chose it for two reasons: first, it was the first real module that I wrote using Moose, and second, it makes heavy use of a lot of Moose's features. I actually expect this port to be much more involved, and to require more actual design changes, than the other ports have, simply because not all the Moose features used will translate easily into MOP idioms. So in preparation for this I have been doing a lot of thinking about mapping the MOP to Moose and have even started writing some documentation for it.

One of the first things I did while writing/thinking about this was (after much discussion on #p5-mop) to remove the built_by trait in favor of just using the normal builder syntax. After that, the lazy trait was modified so that it no longer takes the name of a method, but instead just internally modifies the default value of the attribute. As I have mentioned in the past, the goal is to keep things simple and minimal, so the removal of unnecessary features is always a good thing.

This eventually led to a discussion about what lazy actually meant and how it should be handled internally. In Moose, we make a distinction between an attribute that is undefined and one that has not yet been initialized. In the MOP, however, we decided that variables declared with has should behave much like variables declared with my or our, neither of which makes a distinction between undefined and not-yet-assigned-to. For this reason we have decided that an attribute marked as lazy will trigger initialization if it is either undef or not yet initialized. The really nice thing about this is that it made two other Moose features, predicate and clearer, so easy to implement as methods that there was no need to make traits for them.

Of course, the best way to really illustrate all this is with code. Here is a simple Moose class.

package Cache;
use Moose;

has fetcher => (is => 'ro', required => 1);
has data => (
    is        => 'ro', 
    lazy      => 1,
    builder   => '_fetch_data',
    predicate => 'has_data', 
    clearer   => 'clear'
);

sub _fetch_data {
    (shift)->fetcher->()
}

It implements a (very naive) cache: it takes a CODE ref that feeds the cache, has a predicate to test whether there is any data in the cache, has a clearer to flush the cache, and, because the 'data' attribute is lazy, will automatically grab the next value once the cache has been cleared. There is practically zero code here; everything is handled by Moose's code generation features.

Now, here is what that same code would look like using p5-mop.

class Cache {
    has $fetcher = die '$fetcher is required';
    has $data is ro, lazy = $_->_fetch_data;

    method has_data { defined $data }
    method clear { undef $data }

    submethod _fetch_data { $fetcher->() }
}

The $fetcher attribute is required (same as in the Moose class) because, if a value is not supplied to the constructor, the builder code will run and die. The $data attribute is lazy and has a read-only accessor on it, and its builder code will call the _fetch_data submethod (note the use of $_, the "topic" variable, in the builder to represent the current instance). And lastly you can see the very simple clearer and predicate methods.
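
For readers who want to poke at these semantics without installing the p5-mop prototype, here is a rough plain-Perl sketch of the same behaviour. It is only an emulation under stated assumptions, not how p5-mop is implemented: the hash-based object, the hand-written constructor and the $calls counter are all made up for illustration. The point is that the lazy accessor re-runs the builder whenever the slot is undef, which is why the predicate and clearer can be one-liners.

```perl
use strict;
use warnings;
use 5.010;    # for the //= (defined-or assignment) operator

package Cache;

# Hand-rolled constructor (an assumption for this sketch): "required"
# is emulated by dying when no fetcher key was passed at all.
sub new {
    my ( $class, %args ) = @_;
    die '$fetcher is required' unless exists $args{fetcher};
    return bless { fetcher => $args{fetcher}, data => undef }, $class;
}

# Lazy read-only accessor: run the builder whenever the slot is undef,
# with no distinction between "never set" and "cleared".
sub data {
    my ($self) = @_;
    return $self->{data} //= $self->_fetch_data;
}

sub has_data    { my ($self) = @_; return defined $self->{data} }
sub clear       { my ($self) = @_; undef $self->{data}; return }
sub _fetch_data { my ($self) = @_; return $self->{fetcher}->() }

package main;

my $calls = 0;    # counts how often the fetcher actually runs
my $cache = Cache->new( fetcher => sub { return ++$calls } );

my $before  = $cache->has_data;    # nothing fetched yet
my $first   = $cache->data;        # triggers the first fetch
my $again   = $cache->data;        # served from the cache, no new fetch
my $after   = $cache->has_data;
$cache->clear;
my $cleared = $cache->has_data;    # flushed again
my $second  = $cache->data;        # lazy attribute refetches on demand
```

One consequence of this design shows up in the sketch: a fetcher that legitimately returns undef will be called again on every access, since undef and "not yet initialized" are treated the same.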

Overall I prefer the aesthetics of the p5-mop version over the Moose version; it will be interesting to see if this still holds true for a real-world codebase.

15 Comments

Caleb Cushing ( xenoterracide ) | August 5, 2013 1:26 AM

lazy = $_->_fetch_data;

that just seems weird to me.. what is $_ in this case? I find this api to be a little weird compared to say lazy => 'fetch_data', or lazy => \&_fetch_data;

submethod? is this going to be a strange way of saying private method?

will this new syntax perhaps allow parameters to lazy builders? I would love to have parameters to lazy builders. Motivation for this would be something like ... pass a database id, lazy load the object from the database, passing the id would not cause the lazy loader to be called, only an attempt to retrieve the object. (this may be a bad idea, impossible/impractical)

Stevan Little | August 5, 2013 2:06 AM

In the builder, $_ is the current instance. We were originally thinking of making $self available to builders, but $_ is much much easier to implement and if you think of $_ as being the "topic" variable, then the "topic" of a builder is the current instance (similar to $_ being the "topic" of a map iteration).

Also, lazy and the builder are actually two independent concepts. The builder is just general initialization code; anything to the right of the = is basically wrapped up in a CODE ref and stored as the builder. Lazy in this case is just a trait that looks at the attribute, removes its default and then schedules the default for later execution. So you can see that they work together, but they are actually two separate mechanisms.

As for submethod, this is borrowed from Perl 6; it basically means a method that does not get inherited. It is useful for things like BUILD and DEMOLISH, which you would never actually want to be inherited since they are class specific. In this case I used it for the _fetch_data builder because I didn't see the need for that to be inherited (although it can be overridden).

As for the idea of "parameters to lazy builders" I am not sure I understand, can you provide a more detailed example, perhaps with code?

Caleb Cushing ( xenoterracide ) | August 5, 2013 4:09 AM

so "lazy" means, don't do this work until this value is requested (you know this but just for clarity's sake). Also there are many ways to solve this particular example, I'm just thinking of yet another one.

So let's say I have an API integration with CouchDB or some other ROA+RESTful web service.

Here's how I could do this in Moose


package Order;
use Moose;

has _lwp => (is => 'ro', required => 1);

has _item_id => ( is => 'ro', isa => 'ArrayRef' );

has item => (
    is => 'ro',
    lazy => 1,
    default => sub { #uber shortened
       my $self = shift;
       my $res = $self->_lwp->get( $self->_item_id );
       Item->new( json_decode( $res->content ) );
    },
);

so of course we can do


my $o = Order->new({ _item_id => 1 });

my $item = $o->item;

but it might be nice to be able to write


my $o = Order->new({ item => 1 });
my $item_obj = $o->item;
# or
my $item_obj = $o->item( 1 );

This looks remarkably similar to coerce from, except I don't want the coercion to be proactive. Don't do it until I ask for the value. Maybe what I want would be better described as lazy coercion. Although it seems to me that at some point I had a case for wanting an attribute (so I could cache the value easily) to have a parameter (because I wanted to switch the response based on the parameter, but not necessarily ever attempt to reload again).

Again, this may be a bad idea or out of p5-mop scope.

Stevan Little | August 5, 2013 4:29 AM

The equivalent of your Moose class would be ...

class Order {
    has $lwp = die '$lwp is required';
    has $item_id;

    has $item is ro, lazy = do {
        Item->new( json_decode( $lwp->get( $item_id )->content ) );
    };
}

... and would function pretty much exactly the same as what you wrote. In order to produce something that functions like you are looking for, you would only need to write a few more lines of code ...

class Order {
    has $lwp = die '$lwp is required';
    has $_item;

    submethod BUILD (%args) {
        $self->item( $args{'item'} )
            if exists $args{'item'};
    }

    method item ($id) {
        $_item //= Item->new( json_decode( $lwp->get( $id )->content ) );
    }
}

... this would handle both the lazy accessor and passing in an item id via the constructor.

Caleb Cushing ( xenoterracide ) | August 5, 2013 5:11 AM

I notice in your example you do do { ... $item_id ... }. No need to call $self then? and $item_id is lexically scoped to the object instance?

Roland Lammel | August 5, 2013 7:58 AM

Concerning the alternate approach to define a "required" attribute I must say that I really dislike the readability of this.

has $fetcher = die '$fetcher is required';

This line will suggest to the reader of the code that we will die here, no matter what; we did not say anywhere that this attribute is required.

I really prefer the more explicit approach of using a "is required" declaration, making obvious what the coder wants to achieve with this section of the code.

john napiorkowski | August 5, 2013 1:38 PM

"has $fetcher = die '$fetcher is required';"

This works for me, I did get it right off the top, but then again I've lurked on the p5mop channel and look at the test case checkins. I see the value of not inventing new syntax plus I think (and stevan can correct me if I am wrong) one of the purposes of this blog was to explain how p5mop is taking a more clear approach to what required means. In Moose it was often an edge case to distinguish between not initialized and initialized but undefined. For p5mop it seems the choice was made to try and solve this off the top by using a more clearly declared intention. At first blush I think this is a good thing, given the number of times I've hit the edge case described. I do think though if we can find actual code reasons the approach is in question, Stevan would certainly hear it.

Stevan Little replied to comment from Caleb Cushing ( xenoterracide ) | August 5, 2013 1:56 PM

Caleb – Yes, all the attributes are available in method bodies as pseudo-lexical variables. I say "pseudo" because they are really what I call "instance" scope, meaning they will always have the value that corresponds to the current instance, but otherwise behave just like lexicals.

Stevan Little replied to comment from Roland Lammel | August 5, 2013 2:11 PM

Roland – to be honest, I am not always 100% comfortable with the bare code either; however, it does make sense once you think about it a little. For instance, this (I assume) would be obvious ...
has $foo = 10;
The value of $foo would be 10 in all your instances. So from here it shouldn't be too much of a stretch to see that this code ...
has $foo = Bar->new;
would result in $foo having a unique instance of Bar in each of your instances. At first glance you might think that they will all have the same instance of Bar, but if you map it back to the literal, they would not all have the same "instance" of 10, would they? No, it would be a new copy of it each time. At this point, I think it is obvious that anything on the right hand side of the = will be captured and "replayed" for each instance creation. What is actually done is that it gets wrapped inside a CODE ref, so technically it is equivalent to ...
has $foo = sub { Bar->new };
So, now, taking this one step further, this maybe doesn't seem too odd now ...
has $foo = die '$foo is required';
since it really is just this ...
has $foo = sub { die '$foo is required' };
That all said, you are welcome to write a 'required' trait, it would be quite simple. But I will point out that having a ton of traits on an attribute tends to start looking really crowded very quickly.
has $foo is ro, lazy, required, handles({ foo => 'bar' }), ...;
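
The "captured and replayed" behaviour described in this reply can be mimicked in plain Perl. The sketch below is only an illustration under assumptions, not p5-mop's actual internals: the %default_for registry, the new_instance function and the tiny Bar class are hypothetical names invented here. It shows each default stored as a CODE ref and run once per instance, and only when no value was passed in.

```perl
use strict;
use warnings;

{
    # Minimal stand-in class so the default has something to construct.
    package Bar;
    sub new { return bless {}, shift }
}

# Hypothetical registry: attribute name => CODE ref wrapping the
# expression that appeared to the right of the "=".
my %default_for = (
    foo => sub { Bar->new },                  # like: has $foo = Bar->new;
    bar => sub { die '$bar is required' },    # like: has $bar = die '...';
);

# Hypothetical constructor: replay a default only when the caller
# passed no value for that attribute.
sub new_instance {
    my (%args) = @_;
    my %slots;
    for my $name ( sort keys %default_for ) {
        $slots{$name} = exists $args{$name}
            ? $args{$name}
            : $default_for{$name}->();
    }
    return \%slots;
}

my $one = new_instance( bar => 1 );
my $two = new_instance( bar => 2 );

# Each instance got its own Bar, because the CODE ref ran twice.
my $same_bar = $one->{foo} == $two->{foo};

# Leaving out bar replays its default, which dies.
my $ok    = eval { new_instance(); 1 };
my $error = $ok ? '' : $@;
```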

Aristotle | August 5, 2013 8:25 PM

My first thought was to propose a trait:

class Cache {
    has $fetcher is default { die '$fetcher is required' };
    has $data is ro, lazy, default { $_->_fetch_data };

    method has_data { defined $data }
    method clear { undef $data }

    submethod _fetch_data { $fetcher->() }
}

And I like that a lot. I don’t think anyone could be confused over what this means. If you really chafe at the verbosity, maybe simply add special syntax for the builder, by making it an optional trailing code block?

class Cache {
    has $fetcher { die '$fetcher is required' };
    has $data is ro, lazy { $_->_fetch_data };

    method has_data { defined $data }
    method clear { undef $data }

    submethod _fetch_data { $fetcher->() }
}

Though in this case, binding $self within the builder rather than $_ does start to seem more natural…

Either way, using = to mean closure construction rather than assignment just weirds me out, because of the way Perl mixes compilation and execution. Consider the syntactically analogous …

package Cache {
    my $fetcher = die '$fetcher is required';
    my $data = $_->_fetch_data;

    sub has_data { defined $data }
    sub clear { undef $data }

    sub _fetch_data { $fetcher->() }
}

… in which Perl would throw an exception “on sight”. But inside class, on has lines, somehow assignment isn’t actually assignment.

Now I’d get used to it, sure. But I don’t think it’ll ever feel natural. Marking the code as a block, OTOH, IMO makes it naturally understood that it’ll run “at some point” and not necessarily when the declaration is executed.

Stevan Little | August 5, 2013 9:41 PM

Aristotle – My issue with the default BLOCK approach is that it would require special parsing, which is something we are trying to avoid. And before you say it, we cannot put a & prototype on the 'default' trait subroutine, because trait subroutines get the meta-object they apply to as their first argument, so that won't work.

My issue with the bare block syntax is that it too would immediately execute since that is what bare blocks do in Perl. And if that is not the case, then once again we are introducing special parsing and then to add to that, a change in semantics.

I will point out that there is a TON of prior art for my decision to use the `=` sign and bare code on its right hand side. To start with, this is how Perl 6 does it; secondly, this is how languages like Java, C#, Scala, etc. all handle their attributes' default values. Your package example really is in no way analogous, because a package is not a class and a has declaration is not a my declaration; they are syntactically and semantically very different things.

As for "getting used to it" and it "ever feeling natural", I suspect that existing Perl programmers will ultimately get used to it, but (and perhaps more importantly) new Perl programmers coming from other languages will find it much more comfortable than having to wrap everything in an explicit closure.

Damien "dams" Krotkine | August 6, 2013 1:09 PM

So, I hope it's not too much off topic, but what I liked in Ruby is that attribute variables have a leading sigil '@'. I don't like the syntax (the sigil), but I like the idea: it makes it obvious to the reader that the variable is not a simple lexical one. Can we (pretty-please) have something like that?

You know how some coders lack imagination regarding variable names

has $fetcher = ...

... 200 lines later ...

method foo {
  my ($fetchers, $fetching_data, $data_fetched, $fletch) = ...;
  $fetchers->($fletch, $fetcher->($fetching_data));
}

The example is stupid, but you see how it can be difficult to pinpoint the attribute, and being able to do so is IMO a very important feature of an OO syntax. Also, companies using Perl will then enforce "coding styles" that imply prefixing all attributes with "attr_" or something... Any thoughts?

Stevan Little | August 6, 2013 3:18 PM

Damien – I agree, I would love to have twigils like Perl 6 ($.foo and $!foo), but I am not sure that is easily done with the Perl 5 parser; it is certainly not possible in the prototype.

I agree that this is tricky, it actually bit me a few times during the Plack port. Many other languages (I will bring up Java, C# again) have this issue, which they solve by having a way to disambiguate, typically by using the pseudo variable this, which is something we could look into.

Ron Savage | August 6, 2013 11:47 PM

Rather than
"has $fetcher = die '$fetcher is required';"
I was hoping for
"has $fetcher || die '$fetcher is required';"

Is it true that that won't work because you have to allow for undef as an acceptable value?

Stevan Little replied to comment from Ron Savage | August 7, 2013 2:45 PM

Ron – Yes, that wouldn't work for that reason. The reason the '=' version works is simply because when no value (undef or otherwise) is passed into the constructor, the default value is evaluated. In this case, the evaluation of that default value results in it die-ing.

I am starting to wonder if perhaps this shouldn't be a trait (which under the hood would actually just do this) simply because the syntax seems to be so disturbing to some people.
