Moose-iplicity


The other day I ran across a pattern in our codebase at $work, and I was mightily impressed.  It solved a problem I had run into before, but the solution here was much more elegant than mine, so I’m officially stealing it.  And sharing it with you.

The problem that we’re looking at today is building Moose attributes.  Now, in and of itself, there’s nothing exciting going on there.  Perhaps we might have an attribute whose value comes out of a config file, so we might build it like so:

package Thing
{
    use Moose;
    use Config::General;

has foo => ( is => 'ro', isa => 'Str',
lazy => 1, builder => '_foo_from_config' );

sub _foo_from_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my %config = Config::General::ParseConfig($config_file);
return $config{'Foo'};
}

# more stuff here
}

That’s certainly simple enough.  Now, if our client accesses the foo attribute, we hop on out to the config file, read it, parse it, and give the client what they want.  And, if they never call $thing->foo, then we never waste time reading in the config file.  Easy peasy.

But what happens when there’s more than one bit of data in the config that we care about?  Well, we can certainly expand on the basic pattern easily enough:

package Thing
{
    use Moose;
    use Config::General;

has foo => ( is => 'ro', isa => 'Str',
lazy => 1, builder => '_foo_from_config' );

sub _foo_from_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my %config = Config::General::ParseConfig($config_file);
return $config{'Foo'};
}

has bar => ( is => 'ro', isa => 'Str',
lazy => 1, builder => '_bar_from_config' );

sub _bar_from_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my %config = Config::General::ParseConfig($config_file);
return $config{'Bar'};
}

# more stuff here
}

And that will definitely work.  But it’s a bit unsatisfying.  I mean, if the client never asks for foo or bar, that’s fine.  If they ask for one but not the other, still fine.  But what happens if they ask for both?  Suddenly there we are reading the whole config file in twice.  That’s silly.

The solution, of course, is to set up a little system of “cascading lazy attributes”: that is, building one attribute triggers the building of another attribute (and, if you’re lucky, it stops right there ... although you could extend this metaphor out as far as you need to).  Then you can trivially make two attributes both cascade to the same root attribute, meaning it only gets built once.  Probably that “root” attribute is private, since it only really exists to build the other (public) attributes with.  So you end up with something along the lines of this:

package Thing
{
    use Moose;
    use Config::General;

has _config_data => ( is => 'ro', isa => 'HashRef',
lazy => 1, builder => '_read_config' );

sub _read_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my $config = { Config::General::ParseConfig($config_file) };
return $config;
}

has foo => ( is => 'ro', isa => 'Str',
lazy => 1, default => sub { shift->_config_data->{'Foo'} );
has bar => ( is => 'ro', isa => 'Str',
lazy => 1, default => sub { shift->_config_data->{'Bar'} );

# more stuff here
}

Great: now the config file is only read once, and the two attributes are built off the temporary hash that we’re storing.  Of course, a config file often lends itself being stored as a hash, sensibly.  Sometimes your root attribute is just a raw holder of various seemingly unrelated things.  Here’s another example, simplfied from some actual $work code:

package CompanyName::Calculator
{
    use Moose;
    use List::AllUtils qw< apply >;

has _thing_builder => ( is => 'ro', isa => 'ArrayRef',
lazy => 1, builder => '_get_all_things' );

sub _get_all_things
{
my $self = shift;
my $by_cust = {}; # customer_id => [ thing, ... ]
my $sec = {}; # secondary_thing_id => 1
my $things = # thing_id => thing
{
map { $_->thing_id => $_ }
apply { $sec->{$_->secondary_id} = 1 if $_->has_secondary }
apply { push @{$by_cust->{$_->customer_id}}, $_ }
$self->schema->resultset('Thing')->search( {} )
};
return [$things, $by_cust, $sec];
}

has all_things => ( is => 'ro', isa => 'HashRef',
lazy => 1, default => sub { shift->_thing_builder->[0] );
has things_by_customer => ( is => 'ro', isa => 'HashRef',
lazy => 1, default => sub { shift->_thing_builder->[1] );
has secondary_things => ( is => 'ro', isa => 'HashRef',
lazy => 1, default => sub { shift->_thing_builder->[2] );

# more stuff here
}

So things can get messy pretty fast.  It’ll get worse if people start adding code in between the root attribute and the “real” attributes, but it’s already not great.  Here I’ve put the root attribute first, on the grounds that it might be nice to know just WTF _thing_builder is when you’re reading those default subs for all_things et al.  But, on the other hand, the whole time you’re reading about what _thing_builder does, you have no idea why it’s doing it, because you don’t know about all_things etc yet.  Having the attributes have no obvious connection to each other makes these issues inevitable.

So I’ve continued to follow this pattern, even though I keep thinking there must be a better way.  I just never could quite visualize what that better way would look like.  And then I found it.

I’m quite lucky at my current job in many ways, but probably the most important one is that my boss, as well as my boss’s boss (our CTO), are both working programmers.  This provides a lot of advantages, and one of them is that I get to learn quite directly from them by viewing their code in situ.  Our CTO in particular continues to write a healthy chunk of our codebase, and contributes to CPAN as well.  What I’m saying is, the man gots some skillz.  And every once in a while I run across something from him that I haven’t seen before.

Recently, I was analyzing some code with one of our business people.  I wanted to make sure I was explaining to him exactly what the code really was doing, as opposed to what I thought it was probably doing, or what he thought it ought to be doing.  Sometimes you don’t want to do this in front of people, because you’re figuring it out on the fly and of course sometimes you make a mistake about what the code is doing, or perhaps you feel a bit rushed because you don’t want to hold the other person up so you skim too quickly and miss things.  But if you have a good relationship with the person, this can be a great tool for understanding code: as I’m sure we all know, sometimes the best way to understand something is to try to explain it to someone else.  So I was doing that, and I ran across this pattern from our CTO, and I literally stopped right in the middle of my explanation and said “whoa ... man, I gotta remember this, because this is a really elegant solution to this problem I’ve had.”

What our CTO had done was to leverage Moose’s native traits to combine the builder attribute and the target attributes into one structure.  Now, native traits weren’t new to me in general.  I use them all the time, and they’re really wonderful—if you’re not familiar with them, you really should check them out, because they let you do all sorts of really cool things.  For instance, I mentioned that I simplified the code above; the real declaration of all_things looks a lot closer to this:

    has _all_things =>
(
is => 'ro',
isa => 'HashRef[Thing]',
lazy => 1,
traits => ['Hash'],
handles =>
{
all_things => 'values',
lookup_thing => 'get',
},
);

This way all_things returns a list of Thing objects instead of an arrayref, and I can also leverage the standard hash lookup.  So instead of this code:
    my @things = @{ $calculator->all_things };
my $one_thing = $calculator->all_things->{$thing_id};

I can write this code:
    my @things = $calculator->all_things;
my $one_thing = $calculator->lookup_thing($thing_id);

Now, that’s not significantly different—in fact, underneath the prettier syntax, it does exactly the same thing—but it’s clearer and less cluttered, and that makes it easier to read, and that makes it easier to maintain.  So using native traits in Moose is a tool in my toolbox that I use quite often, and I recommend you do as well.


But my CTO was using the native traits in a way that I hadn’t seen before.  To return to my first example, I could steal his pattern and rewrite my code to look like this:

package Thing
{
    use Moose;
    use Config::General;
    use Moose::Meta::Attribute::Native::Trait::Hash;

has _config_data =>
(
is => 'ro',
isa => 'HashRef[Str]',
lazy => 1,
builder => '_read_config',
traits => ['Hash'],
handles =>
{
foo => [ get => 'Foo' ],
bar => [ get => 'Bar' ],
},
);

sub _read_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my $config = { Config::General::ParseConfig($config_file) };
return $config;
}

# more stuff here
}

Now, this isn’t perfect, admittedly: it’s a bit more complex, and not as obvious what’s going on at first glance.  If my dependent attributes are more complex than simple scalars—if they’re hashrefs with their own native trait needs, for instance, like my expanded example for all_thingsthen this isn’t going to be feasible.

But, on the plus side, look at the benefits we’ve gained:

  • No more separation between root attribute and dependent attributes: everything is consolidated into a single attribute.
  • Obvious how to add more dependant attributes, and trivial to do so.
  • Less repeated code means less chance for things to get out of sync.  For instance, if I need to make foo and bar more interesting types than just Str one day, this version gives me a single place to change that instead of two.  And that benefit is multiplied the more dependent attributes I add.

So I was pretty pleased with this trick, and a bit jealous I didn’t think of it first.  Hopefully you’ll find it useful in your Moose hacking as well someday.










7 Comments

Its a neat trick but I wonder if tying the orchestration / object initialization logic in that way isn't a bit too much implementation details in the interface? You could more carefully separate it (and avoid tying yourself to Moose) but just having a factory method:

sub new_from_config { return map { $_ => $conf{$_} } qw/bar foo bat/ }

This still however ties the implementation details of your configuration infrastructure into your object. You can avoid that with either a Factory class to encapsulate instantiation with the configuration implementation (and then you can more easier change both the configuration tool or the target class without have to re-write all your code) or even use a OIC framework like Bread::Board (or if you are doing web development Catalyst has enough of an IOC built in to handle these common cases neatly).


I strongly agree with John Napiorkowski above.

Your "neat trick" leads to a very untidy design. You just created a single class which implements two very different, completely unrelated things:

  • some unspecified action for which foo and bar are needed, which is the sole purpose of Thing instances
  • initialization of your instances from the configuration

The instances of your Thing class now cannot be created the usual and predictable way:

my $thing = Thing->new(foo => ..., bar => ...);

but their attribute initializers are either magically autocreated from configuration (how you pass the configuration file?) or you can passed them like:

my $thing = Thing->new(_config_data => { foo => ..., bar => ...});

which is very cumbersome.

Also you prevented the config file for being read twice (once for foo, once for bar) for one instance, but it is still read once for every instance of Thing created.


You should definitely find some place in your application infrastructure other than Thing class to implement the particular (from configuration) object initialization.


Though I admire Moose there are features which usage is almost certain indicator of a bad design, complicated and dependent attribute initializers are one of them.


The sad thing about such "neat tricks" is that they can be further "improved" (and people will further discuss them). Two examples (it is an irony, do not follow them)

  • you can keep your _config_data attribute lazy (and dependent on other attributes) while forcing it to be evaluated in constructor via BUILD
        sub {
            my $this = shift;
            $this->_config_data;
        }
    

  • you can implement your config based atributes via parameterized roles

k, no universally bad design.

Most of my comment comes from the fact that I am very much dependency injection oriented.
For me the foo, bar are the dependencies of $thing thus they should be passed to it
- most naturally as simple and required arguments of constructor. Where the caller
of Thing->new(..) gets them is another thing.

When I accept this, there is no room for Thing to initialize the arguments and dereference them from the config bunch.
Thus there is no space (in this particular example) for Moose cleverness
like Moose::Meta::Attribute::Native::Trait::Hash, lazy, handles, ...

Sorry for typo. "k, no universally bad design" should be "Ok, no universally bad design"

Leave a comment

About Buddy Burden

user-pic 10 years in California, 21 years in Perl, 30 years in computers, 51 years in bare feet.