Moose-iplicity
The other day I ran across a pattern in our codebase at $work
, and I was mightily impressed. It solved a problem I had run into before, but the solution here was much more elegant than mine, so I’m officially stealing it. And sharing it with you.
The problem that we’re looking at today is building Moose attributes. Now, in and of itself, there’s nothing exciting going on there. Perhaps we might have an attribute whose value comes out of a config file, so we might build it like so:
package Thing
{
use Moose;
use Config::General;
has foo => ( is => 'ro', isa => 'Str',
lazy => 1, builder => '_foo_from_config' );
sub _foo_from_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my %config = Config::General::ParseConfig($config_file);
return $config{'Foo'};
}
# more stuff here
}
That’s certainly simple enough. Now, if our client accesses the foo
attribute, we hop on out to the config file, read it, parse it, and give the client what they want. And, if they never call $thing->foo
, then we never waste time reading in the config file. Easy peasy.
But what happens when there’s more than one bit of data in the config that we care about? Well, we can certainly expand on the basic pattern easily enough:
package Thing
{
use Moose;
use Config::General;
has foo => ( is => 'ro', isa => 'Str',
lazy => 1, builder => '_foo_from_config' );
sub _foo_from_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my %config = Config::General::ParseConfig($config_file);
return $config{'Foo'};
}
has bar => ( is => 'ro', isa => 'Str',
lazy => 1, builder => '_bar_from_config' );
sub _bar_from_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my %config = Config::General::ParseConfig($config_file);
return $config{'Bar'};
}
# more stuff here
}
And that will definitely work. But it’s a bit unsatisfying. I mean, if the client never asks for foo
or bar
, that’s fine. If they ask for one but not the other, still fine. But what happens if they ask for both? Suddenly there we are reading the whole config file in twice. That’s silly.
The solution, of course, is to set up a little system of “cascading lazy attributes”: that is, building one attribute triggers the building of another attribute (and, if you’re lucky, it stops right there ... although you could extend this metaphor out as far as you need to). Then you can trivially make two attributes both cascade to the same root attribute, meaning it only gets built once. Probably that “root” attribute is private, since it only really exists to build the other (public) attributes with. So you end up with something along the lines of this:
package Thing
{
use Moose;
use Config::General;
has _config_data => ( is => 'ro', isa => 'HashRef',
lazy => 1, builder => '_read_config' );
sub _read_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my $config = { Config::General::ParseConfig($config_file) };
return $config;
}
has foo => ( is => 'ro', isa => 'Str',
lazy => 1, default => sub { shift->_config_data->{'Foo'} );
has bar => ( is => 'ro', isa => 'Str',
lazy => 1, default => sub { shift->_config_data->{'Bar'} );
# more stuff here
}
Great: now the config file is only read once, and the two attributes are built off the temporary hash that we’re storing. Of course, a config file often lends itself being stored as a hash, sensibly. Sometimes your root attribute is just a raw holder of various seemingly unrelated things. Here’s another example, simplfied from some actual $work
code:
package CompanyName::Calculator
{
use Moose;
use List::AllUtils qw< apply >;
has _thing_builder => ( is => 'ro', isa => 'ArrayRef',
lazy => 1, builder => '_get_all_things' );
sub _get_all_things
{
my $self = shift;
my $by_cust = {}; # customer_id => [ thing, ... ]
my $sec = {}; # secondary_thing_id => 1
my $things = # thing_id => thing
{
map { $_->thing_id => $_ }
apply { $sec->{$_->secondary_id} = 1 if $_->has_secondary }
apply { push @{$by_cust->{$_->customer_id}}, $_ }
$self->schema->resultset('Thing')->search( {} )
};
return [$things, $by_cust, $sec];
}
has all_things => ( is => 'ro', isa => 'HashRef',
lazy => 1, default => sub { shift->_thing_builder->[0] );
has things_by_customer => ( is => 'ro', isa => 'HashRef',
lazy => 1, default => sub { shift->_thing_builder->[1] );
has secondary_things => ( is => 'ro', isa => 'HashRef',
lazy => 1, default => sub { shift->_thing_builder->[2] );
# more stuff here
}
So things can get messy pretty fast. It’ll get worse if people start adding code in between the root attribute and the “real” attributes, but it’s already not great. Here I’ve put the root attribute first, on the grounds that it might be nice to know just WTF _thing_builder
is when you’re reading those default subs for all_things
et al. But, on the other hand, the whole time you’re reading about what _thing_builder
does, you have no idea why it’s doing it, because you don’t know about all_things
etc yet. Having the attributes have no obvious connection to each other makes these issues inevitable.
So I’ve continued to follow this pattern, even though I keep thinking there must be a better way. I just never could quite visualize what that better way would look like. And then I found it.
I’m quite lucky at my current job in many ways, but probably the most important one is that my boss, as well as my boss’s boss (our CTO), are both working programmers. This provides a lot of advantages, and one of them is that I get to learn quite directly from them by viewing their code in situ. Our CTO in particular continues to write a healthy chunk of our codebase, and contributes to CPAN as well. What I’m saying is, the man gots some skillz. And every once in a while I run across something from him that I haven’t seen before.
Recently, I was analyzing some code with one of our business people. I wanted to make sure I was explaining to him exactly what the code really was doing, as opposed to what I thought it was probably doing, or what he thought it ought to be doing. Sometimes you don’t want to do this in front of people, because you’re figuring it out on the fly and of course sometimes you make a mistake about what the code is doing, or perhaps you feel a bit rushed because you don’t want to hold the other person up so you skim too quickly and miss things. But if you have a good relationship with the person, this can be a great tool for understanding code: as I’m sure we all know, sometimes the best way to understand something is to try to explain it to someone else. So I was doing that, and I ran across this pattern from our CTO, and I literally stopped right in the middle of my explanation and said “whoa ... man, I gotta remember this, because this is a really elegant solution to this problem I’ve had.”
What our CTO had done was to leverage Moose’s native traits to combine the builder attribute and the target attributes into one structure. Now, native traits weren’t new to me in general. I use them all the time, and they’re really wonderful—all_things
looks a lot closer to this:
has _all_things =>
(
is => 'ro',
isa => 'HashRef[Thing]',
lazy => 1,
traits => ['Hash'],
handles =>
{
all_things => 'values',
lookup_thing => 'get',
},
);
This way
all_things
returns a list of Thing
objects instead of an arrayref, and I can also leverage the standard hash lookup. So instead of this code: my @things = @{ $calculator->all_things };
my $one_thing = $calculator->all_things->{$thing_id};
I can write this code:
my @things = $calculator->all_things;
my $one_thing = $calculator->lookup_thing($thing_id);
Now, that’s not significantly different—
But my CTO was using the native traits in a way that I hadn’t seen before. To return to my first example, I could steal his pattern and rewrite my code to look like this:
package Thing
{
use Moose;
use Config::General;
use Moose::Meta::Attribute::Native::Trait::Hash;
has _config_data =>
(
is => 'ro',
isa => 'HashRef[Str]',
lazy => 1,
builder => '_read_config',
traits => ['Hash'],
handles =>
{
foo => [ get => 'Foo' ],
bar => [ get => 'Bar' ],
},
);
sub _read_config
{
my $self = shift;
my $config_file = $self->_get_config_file;
my $config = { Config::General::ParseConfig($config_file) };
return $config;
}
# more stuff here
}
Now, this isn’t perfect, admittedly: it’s a bit more complex, and not as obvious what’s going on at first glance. If my dependent attributes are more complex than simple scalars—all_things
—
But, on the plus side, look at the benefits we’ve gained:
- No more separation between root attribute and dependent attributes: everything is consolidated into a single attribute.
- Obvious how to add more dependant attributes, and trivial to do so.
- Less repeated code means less chance for things to get out of sync. For instance, if I need to make
foo
andbar
more interesting types than justStr
one day, this version gives me a single place to change that instead of two. And that benefit is multiplied the more dependent attributes I add.
So I was pretty pleased with this trick, and a bit jealous I didn’t think of it first. Hopefully you’ll find it useful in your Moose hacking as well someday.
Its a neat trick but I wonder if tying the orchestration / object initialization logic in that way isn't a bit too much implementation details in the interface? You could more carefully separate it (and avoid tying yourself to Moose) but just having a factory method:
sub new_from_config { return map { $_ => $conf{$_} } qw/bar foo bat/ }
This still however ties the implementation details of your configuration infrastructure into your object. You can avoid that with either a Factory class to encapsulate instantiation with the configuration implementation (and then you can more easier change both the configuration tool or the target class without have to re-write all your code) or even use a OIC framework like Bread::Board (or if you are doing web development Catalyst has enough of an IOC built in to handle these common cases neatly).
I strongly agree with John Napiorkowski above.
Your "neat trick" leads to a very untidy design. You just created a single class which implements two very different, completely unrelated things:
The instances of your Thing class now cannot be created the usual and predictable way:
but their attribute initializers are either magically autocreated from configuration (how you pass the configuration file?) or you can passed them like:
which is very cumbersome.
Also you prevented the config file for being read twice (once for foo, once for bar) for one instance, but it is still read once for every instance of Thing created.
You should definitely find some place in your application infrastructure other than Thing class to implement the particular (from configuration) object initialization.
Though I admire Moose there are features which usage is almost certain indicator of a bad design, complicated and dependent attribute initializers are one of them.
The sad thing about such "neat tricks" is that they can be further "improved" (and people will further discuss them). Two examples (it is an irony, do not follow them)
Hi, JNap! Long time no speak; hope things are going well out your way.
> You could more carefully separate it (and avoid tying yourself to Moose) but just having a factory method:
Well, sometimes avoiding tying yourself to Moose is desireable. But, if you're already hitched up to Moose pretty tightly, rewriting features which Moose gives you for free can just give you even more code to worry about having to maintain.
For instance, a
new_from_config
method as you suggest would be fantastic for a class full of attributes that all had to (or at least could) come out of a config file. But my example is two lazy attributes which are presumably just two of many. If the laziness is important, I don't want to write that myself when Moose has already done all the work for me. My example is much better for a class in which the majority of its attributes are passed in via the typicalnew
but a few of them should be drawn from a config file iff the client doesn't supply them.> This still however ties the implementation details of your configuration infrastructure into your object.
Yes, but inside the object. As long as I'm not exposing those implementation details to my client, I'm not as concerned. If my class needs to change, then I'm going to be changing the code of that class anyway. As long none of the client's code has to change, I consider that properly encapsulated.
Thanx for the comment Wayan. I'm always glad to see folks jumping into the discussion.
> Your "neat trick" leads to a very untidy design. You just created a single class which implements two very different, completely unrelated things:
>
> * some unspecified action for which foo and bar are needed, which is the sole purpose of Thing instances
> * initialization of your instances from the configuration
Well a couple of things here:
($1, $2, $3)
from a single regexp match._read_config
were a separate function, or a method in an entirely different class, that wouldn't really invalidate this example.> The instances of your Thing class now cannot be created the usual and predictable way:
Now that is a very good point, and it's probably the number one counterindication against using this pattern. But, on the other hand, sometimes not being able to pass in the attribute values is a plus. It just depends on your use case.
> ... (how you pass the configuration file?) ...
Probably in one of the many other attributes hidden by the "more stuff here" tag. :-)
> Also you prevented the config file for being read twice (once for foo, once for bar) for one instance, but it is still read once for every instance of Thing created.
Also a very good point. But, as I say, moving the reading of the config file out of this class doesn't really change things much. I still have a single data structure that I need to get from somewhere, whether that's a reader method, a singleton class, a service from something like Bread::Board, or whatever. And I only care about that structure because I need it to build the few attributes I really do care about. Maybe I could assume that however I'm getting that structure is a cheap retrieval for all but the first request, as in a typical singleton pattern, but that is an assumption I'd be making, and it means that if things change on the other end and retrieval becomes more expensive then my performance degrades without a single line of code being changed on my side. So that's a tradeoff I need to consider. Might be so unlikely as to be not worth worrying about ... or might not be.
> Though I admire Moose there are features which usage is almost certain indicator of a bad design, complicated and dependent attribute initializers are one of them.
I have to disagree with you there. There are very few things in programming which are universally bad. (I can't think of any off the top of my head, but I'm sure that as soon as I said they don't exist some smart person will toss one out in the comments. ;-> ) Just about every Moose feature is no exception: there are times when they are useful, and avoiding them is going to cost you unnecessarily, and there are times when they are dangerous and you're just buying trouble if you use them. But it's not some features which fall into the one camp and some which fall into the other. It's all features which fall into both camps ... just in different scenarios. You really have to look at every situation individually and analyze your options.
Of course, you can't analyze your options unless you know what all your options are in the first place. That's why it's good to know these little "tricks," even if they're not always a good idea. Because sometimes they are. And sometimes you're going to run across them, even if you wouldn't have done it that way yourself. So you need to have them in your brain—for pattern recognition, if nothing else.
k, no universally bad design.
Most of my comment comes from the fact that I am very much dependency injection oriented.
For me the foo, bar are the dependencies of $thing thus they should be passed to it
- most naturally as simple and required arguments of constructor. Where the caller
of Thing->new(..) gets them is another thing.
When I accept this, there is no room for Thing to initialize the arguments and dereference them from the config bunch.
Thus there is no space (in this particular example) for Moose cleverness
like Moose::Meta::Attribute::Native::Trait::Hash, lazy, handles, ...
Sorry for typo. "k, no universally bad design" should be "Ok, no universally bad design"
> Thus there is no space (in this particular example) for Moose cleverness
> like Moose::Meta::Attribute::Native::Trait::Hash, lazy, handles, ...
Well, sure: if you don't need lazy attributes to solve your problem, then you don't need this pattern. But, also, if you don't need objects, you don't need this pattern. ;-> (And we could continue in that vein forever, basically—if you don't need software, if you don't need a computer, if you don't need technology ...)