Perl101: Encapsulation via Protected Abstract Methods
Imagine you have an employee base class, but you know that the salary calculation will be different per employee type. You might decide to do this:
package Employee;
use strict;
use warnings;
use Carp 'croak';
sub new { bless {} => shift }
sub salary { croak 'You must override salary() in a subclass' }
1;
The idea is that you're providing this class to a bunch of other programmers and they're going to subclass Employee for their specific needs. There are a variety of things you would probably want to do differently for this class but for this post we'll just talk about the risks of making a public abstract method and how to minimize them. But first, some terms need to be defined.
First, an abstract method is simply a method without an implementation. It's designed to be overridden in a subclass by a concrete method. Second, in many OO languages, methods have different access levels. The three most common are:
- Public: anyone can call this method
- Private: only the class can call this method
- Protected: only the class and subclasses can call this method.
Perl's built-in OO doesn't really offer these access levels, but here's how many programmers implement them:
my $private_method = sub {
    # because this sub is a lexical variable, its scope is restricted
    # and it's not possible to call it outside of this scope
    my ( $self, $num ) = @_;
    return $num + 1;
};
sub public_method {
    # because this sub name starts with a letter, it's public
    my ( $self, $num ) = @_;
    # here's how you call private methods:
    return $self->$private_method($num);
}
sub _protected_method {
    # method names which begin with an underscore are understood not to be called
    # publicly, but instead are treated as private or protected by developers,
    # depending on how they're documented.
}
So given that, what do I mean when I say you should make your abstract methods protected? Well, consider how the Employee class should actually be used. You want to give it to other developers to subclass and later your payroll developers are going to do this:
my @employees = $company->get_employees;
my $month = $company->get_fiscal_month;
my $payroll = 0;
foreach my $employee (@employees) {
    $payroll += $employee->salary($month);
}
Obviously this will be a lot more complicated, but that's the core of what you want and that's the core of what is wrong. You may have heard before that subclassing violates encapsulation and here's a perfect example of it. The salary() method is part of the public interface, but now you're letting anyone implement it any way they see fit. If the Employee class is your expert for how to model an employee, it's not much of an expert if it can't control what data it returns. What happens when someone returns a salary object instead of a number? What happens if a bug causes a negative salary to be returned? What if one developer's salary method expects an argument of "hours worked" and another one expects the fiscal month? Neither tests nor documentation are the solution to this problem. Write your code to make sure the problem can't happen.
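To make the risk concrete, here's a minimal sketch. The subclass names (Employee::Hourly, Employee::Buggy), the hourly rate and the plain number standing in for the fiscal month are all invented for illustration. The point is only that nothing in Employee stops either subclass from quietly breaking the payroll loop.

```perl
package Employee;
use strict;
use warnings;
use Carp 'croak';
sub new    { bless {} => shift }
sub salary { croak 'You must override salary() in a subclass' }

# Hypothetical subclass: treats the argument as hours worked, not a fiscal month
package Employee::Hourly;
use parent -norequire, 'Employee';
sub salary {
    my ( $self, $hours ) = @_;
    return $hours * 20;
}

# Hypothetical subclass: a sign error produces a negative salary
package Employee::Buggy;
use parent -norequire, 'Employee';
sub salary { return 3000 - 5000 }

package main;

my $month   = 7;    # stand-in for whatever get_fiscal_month returns
my $payroll = 0;
foreach my $employee ( Employee::Hourly->new, Employee::Buggy->new ) {
    $payroll += $employee->salary($month);
}

# The total is nonsense, yet no error or warning was raised along the way
print "Payroll: $payroll\n";
```

Both subclasses satisfy the "override salary()" contract, so the base class has no chance to notice that one misreads its argument and the other returns a negative amount.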
Languages with richer method signatures can avoid some of these problems, but even they can't adequately deal with the problem of a bug returning a negative salary or some other fundamental failure of business logic. What's important is that we want to be able to respect the Liskov Substitution Principle (LSP). Without all the high-falutin' words, the principle simply means that when you replace a class with a subclass, your code should still work correctly. LSP helps to define how this can occur, but how can we really have this when we're giving our class to others for subclassing? The trick is to have protected abstract methods:
package Employee;
use strict;
use warnings;
use Carp 'croak';
use Scalar::Util qw(blessed looks_like_number);
sub new { bless {} => shift }
sub salary {
    my ( $self, $month ) = @_;
    # validate arguments; blessed() guards against calling isa() on a non-object
    unless ( blessed($month) && $month->isa('Fiscal::Month') ) {
        croak("... with an appropriate error message ...");
    }
    my $salary = $self->_salary($month);
    # validate returned values
    unless ( looks_like_number($salary) && $salary > 0 ) {
        croak("... with an appropriate error message ...");
    }
    return $salary;
}
sub _salary { croak 'You must override _salary() in a subclass' }
1;
Do you see what's happening here? Employee is, once again, an expert for its problem domain. It declares what's important about salary, it lets subclasses calculate the salary, but then verifies that they're not being naughty. Being Perl, we really can't stop someone from overriding the salary() method directly, but by documenting that only protected methods should be overridden and being vigilant in this, we can make our systems more robust and leave classes to their true role of being domain experts. This can't stop all of the bugs, but it can make them easier to catch. Later, when some developer returns a salary of €1,000,000, you can add more sanity checking.
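Here's a minimal sketch of that class in use. Fiscal::Month is stubbed as an almost empty class, and the subclasses Employee::Salaried and Employee::Broken, their figures and the error messages are assumptions made up for illustration, not part of any real payroll system.

```perl
use strict;
use warnings;

# Stub for illustration only; the real Fiscal::Month is not shown in this post
package Fiscal::Month;
sub new { my ( $class, %args ) = @_; return bless( {%args}, $class ) }

package Employee;
use Carp 'croak';
use Scalar::Util qw(blessed looks_like_number);
sub new { bless {} => shift }
sub salary {
    my ( $self, $month ) = @_;
    unless ( blessed($month) && $month->isa('Fiscal::Month') ) {
        croak('salary() requires a Fiscal::Month object');
    }
    my $salary = $self->_salary($month);
    unless ( looks_like_number($salary) && $salary > 0 ) {
        croak( ref($self) . '::_salary() must return a positive number' );
    }
    return $salary;
}
sub _salary { croak 'You must override _salary() in a subclass' }

# Hypothetical well-behaved subclass: overrides only the protected method
package Employee::Salaried;
use parent -norequire, 'Employee';
sub _salary { return 4000 }

# Hypothetical buggy subclass: returns a negative number
package Employee::Broken;
use parent -norequire, 'Employee';
sub _salary { return -4000 }

package main;

my $month = Fiscal::Month->new( number => 7 );

my $ok = Employee::Salaried->new->salary($month);
print "Salaried: $ok\n";

# The bug is caught inside Employee::salary() before payroll ever sees it
my $bad = eval { Employee::Broken->new->salary($month) };
print "Broken: $@" unless defined $bad;

# An argument that is not a Fiscal::Month is rejected as well
my $wrong = eval { Employee::Salaried->new->salary('July') };
print "Wrong argument: $@" unless defined $wrong;
```

The payroll code only ever calls salary(), so every subclass goes through the same argument and return-value checks no matter how _salary() is written.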
This seems like a lot of overkill and if you're the only developer on a small project, maybe it is. Much of what separates experienced developers from newer ones is the ability to make judgement calls like this. However, when you're working on larger systems and you want them to scale, defensive programming is important and protected abstract methods can improve the encapsulation of your classes.
This is a nice technique! I wish you had given it a better name - it is more about validation than about having protected abstract methods. I can see this applied in languages where "protected abstract method" already has a well-defined meaning.
great post -- your 101 series is superb.
from the peanut gallery : what about ending each post with a related "homework" question? That said, I was trying to think of an example that would apply to this post and couldn't :/
Not to distract too much from the meat of your post, but I wanted to suggest that you avoid using looks_like_number in examples for beginners. It accepts, for example, "Inf" and "NaN", which will just lead to heartache when somebody gets allotted an infinite salary.
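To illustrate the point about "Inf" and "NaN", here's a minimal sketch. The helper is_positive_decimal() is a hypothetical name, and its regex is just one possible stricter rule (it also rejects exponent notation such as 1e3), not a recommendation from the post.

```perl
use strict;
use warnings;
use Scalar::Util 'looks_like_number';

# looks_like_number() is true for these strings, so they pass the first check
for my $value ( 'Inf', 'NaN' ) {
    print "$value looks like a number\n" if looks_like_number($value);
}

# Hypothetical stricter check: plain positive decimal numbers only
sub is_positive_decimal {
    my ($value) = @_;
    return 0 unless defined($value) && !ref($value);
    return 0 unless $value =~ /\A[0-9]+(?:\.[0-9]+)?\z/;
    return $value > 0 ? 1 : 0;
}

for my $value ( 1500, '1500.50', 'Inf', 'NaN', -3, '1e3' ) {
    my $verdict = is_positive_decimal($value) ? 'accepted' : 'rejected';
    printf "%-8s %s\n", $value, $verdict;
}
```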
Thought I'd whip up an example using MooseX::Declare: https://github.com/rhesa/example-encapsulation
It doesn't actually use subclassing. I've chosen to make the abstract Employee a role. The input and output validation is done using types.
I'd like to know if my approach makes sense, so feedback is welcomed!
@rhesa: that's an excellent example and really shows the beauty of Moose (and MooseX::Declare).
You could also use Moose's augment/inner functionality to enforce a "Parent knows best" approach to methods. You can read about this in the Moose Manual: http://search.cpan.org/perldoc?Moose::Cookbook::Basics::Recipe6
In my mind it begs the question if we'd like MooseX::Declare to be able to specify a full method signature in the requires. I imagine something like
use MooseX::Declare;
role Employee {
    use Types qw(FiscalMonth Salary);
    requires salary(FiscalMonth) returns(Salary);
}
I don't know how hard it would be to add :)
That's something I've actually brought up on the Perl 6 lists before: requiring a method by name and not by full signature simply isn't enough for a robust, large-scale system. They talked about resolving that in Perl 6 (don't know if they have) and I would love to find a nice Perl 5 solution.
right now when I need to be that anal I'll do something like:
package MyRole;
use Moose::Role;
requires 'some_method';
around 'some_method', sub {
    my ($orig, $self, @args) = @_;
    ## Run code to validate @args
    my @return = $self->$orig(@args);
    ## Run code to validate @return
    return @return;
};
1;
But that's pretty hacky. I wonder if it's not just a matter of extending Moose::Meta::Role::Method::Required to have a signature, then finding the code that checks for method existence and extending that? I just don't know where that is in the code :( I also wonder how that would come into play with the method conflict resolution code?