Encapsulation: recommended practice or sacred cow?

In the p5p discussions of Dave Rolsky's new Perl OO tutorial and OO recommendations, Johan Vromans and others have mentioned that in good OO programming, one should not violate encapsulation by directly accessing an object's underlying data structure.

$self->{foo} = "Look Ma! No encapsulation"; # don't do this, they say

In general that is true, but not always. We should avoid absolutist language, especially in tutorials. I'll come to an example of that.

For Nama, I generally use a setter with this syntax:

$self->set(foo => "bar");

The set method (inherited from a parent class based on Object::Tiny) makes sure the key is a legal one.
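A key-checking setter of this kind can be sketched in a few lines. This is not Nama's actual code: the package names My::Base and My::Track and the _legal_keys hook are made up for illustration, and the sketch uses a hand-rolled base class rather than Object::Tiny so that it runs with core Perl only.

```perl
use strict;
use warnings;

package My::Base;    # hypothetical parent class, not Nama's actual code
use Carp qw(croak);

# Subclasses override this to list their legal attribute names.
sub _legal_keys { () }

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->set(%args);
    return $self;
}

# Refuse any key the class has not declared.
sub set {
    my ($self, %args) = @_;
    my %legal = map { $_ => 1 } $self->_legal_keys;
    for my $key (keys %args) {
        croak "illegal key: $key" unless $legal{$key};
        $self->{$key} = $args{$key};
    }
    return $self;
}

package My::Track;    # hypothetical subclass
our @ISA = ('My::Base');
sub _legal_keys { qw(name target) }
sub name   { $_[0]->{name} }
sub target { $_[0]->{target} }

package main;

my $track = My::Track->new( name => 'song' );
$track->set( target => 'Mixdown' );         # accepted
eval { $track->set( colour => 'red' ) };    # dies: illegal key
print "error: $@" if $@;
```

The read accessors stay plain, so the only way to write an attribute through the public interface is the visually distinct set call.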

Because it looks distinct, I'm not likely to use it unless I really want write access to that attribute.

This simple approach allows me to manage object attributes culturally, i.e. without specifying them as read-only or read-write. In an app of 13k lines, the 'set' method appears just 110 times.

But it's still possible to directly modify an object in other ways:

my $self = Object->new( foo => [qw(this is a pen)] );
my $array_ref = $self->foo;
$array_ref->[3] = 'lobster';
print $self->as_string; # "this is a lobster"

I am living with that.
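If that ever became a problem, one way to close the gap would be for the accessor to hand back a copy instead of the stored reference. A minimal sketch, assuming a hypothetical My::Phrase class with a words attribute (neither exists in Nama):

```perl
use strict;
use warnings;

package My::Phrase;    # hypothetical class for illustration

sub new {
    my ($class, %args) = @_;
    # Copy on the way in, so the caller's array is not shared either.
    return bless { words => [ @{ $args{words} || [] } ] }, $class;
}

# Return a shallow copy so callers cannot modify the stored array.
sub words { [ @{ $_[0]->{words} } ] }

sub as_string { join ' ', @{ $_[0]->{words} } }

package main;

my $phrase    = My::Phrase->new( words => [qw(this is a pen)] );
my $array_ref = $phrase->words;
$array_ref->[3] = 'lobster';       # changes only the copy
print $phrase->as_string, "\n";    # this is a pen
```

The copy is shallow, so nested references inside the array would still be shared; the cost is an extra array allocation on every read.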

The other point is that I believe there can be legitimate reasons to violate encapsulation. I recently found a good example. Nama allows you to create a new track object that refers to WAV files with a basename other than the track's own name.

For an ordinary track, the name matches the WAV file:

my $Mixdown = Track->new( name => 'Mixdown'); # associates with Mixdown.wav

Here is a track created by the 'link_track' command, which also associates with Mixdown.wav:

my $song = Track->new( name => 'song', target => 'Mixdown');

Here is a track created by the 'new_region' command, which indirectly associates with the same file:

my $final_song = Track->new( name => 'final_song', target => 'song' );

Here is the code I use so that $final_song->target returns 'Mixdown':

package Track;

sub target {
    my $self = shift;
    my $parent = $Track::by_index{$self->{target}};
    defined $parent && $parent->target || $self->{target};
}

Now there could be some discussion about my using the track name (as opposed to a Track object) as the value for the 'target' field. In short, this design decision has made it easy to serialize and to debug by dumping objects as YAML.

Regarding encapsulation, the point is that accessing $self->{target} is essential for this code to work.

Accessing the underlying hash provides the behavior I want.

I think it's simplistic to assume that every new user could or should find other ways to achieve this behavior without violating encapsulation.

Our tutorials should respect readers' intelligence. I think we can tell them what is good practice, without prescribing an idealistic straitjacket.

Even if I'm missing an easy solution that doesn't access the hash, why should I be required to find one?

For example, I could do it this way:

package Track;

sub _target { $_[0]->{target} }

sub target {
    my $self = shift;
    my $parent = $Track::by_index{$self->_target};
    defined $parent && $parent->target || $self->_target;
}

But the code is no clearer and the first line still violates encapsulation.

I can see the importance of respecting encapsulation in large projects using others' OO libraries, but not all use cases fall into that category.

2 Comments

In your last example, you create a "private" accessor. That doesn't violate encapsulation because no one outside the class should know about it. It also "violates" encapsulation the fewest times and in the fewest places.

However, it's much easier to subclass things when there is a single place that touches your data structure. When you access the data structure directly all over the code, I can't easily change behavior with respect to that part of the data structure because I have no way to override that (well, unless there's tie black magic). Ideally, most methods get all of their info by calling other methods, so a subclass doesn't have to completely replace methods that it can't affect by changing the one bit it really needs to change. This isn't just OO purity. It makes things such as testing much easier.

Saying "It will never happen to me" is why so many CPAN dists are hard to extend.

With _target, you just have to know the name of the method instead of what it does or what part of the data structure it accesses. When you encode parts of the data structure directly into several methods, you have to make changes in several places if the data structure changes.
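A minimal sketch of that point, with hypothetical names throughout: My::Track stands in for the Track class in the post, %by_name for its registry, and My::Track::Prefixed is an invented subclass that keeps the raw value under a different hash key. Only _target is overridden; target is inherited unchanged.

```perl
use strict;
use warnings;

package My::Track;    # hypothetical stand-in for the Track class in the post
our %by_name;         # registry of tracks keyed by name (an assumption)

sub new {
    my ($class, %args) = @_;
    my $self = bless { %args }, $class;
    $by_name{ $self->{name} } = $self;
    return $self;
}

# The only method that touches the hash slot.
sub _target { $_[0]->{target} }

sub target {
    my $self   = shift;
    my $parent = defined $self->_target ? $by_name{ $self->_target } : undef;
    return defined $parent && $parent->target || $self->_target;
}

package My::Track::Prefixed;    # hypothetical subclass
our @ISA = ('My::Track');

# Keep the raw value under a different key; target() needs no change.
sub _target { $_[0]->{raw_target} }

package main;

My::Track->new( name => 'Mixdown' );
My::Track->new( name => 'song', target => 'Mixdown' );
my $final = My::Track::Prefixed->new( name => 'final_song', raw_target => 'song' );
print $final->target, "\n";    # Mixdown
```

Had target read $self->{target} directly, the subclass would have had to replace the whole method.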


About Joel Roth

I blog about Perl.