Objects are experts

Note: this is part of an occasional series of posts aimed at newer programmers, or newer Perl programmers. I'm going to talk about what objects are, but even programmers who think they know the answer might benefit from some of this. That includes myself. You might take exception to some of what I write, so feel free to chime in with counter-arguments.

For many older programmers such as myself (I'm 42 as of this writing), their first computer program is often a variant of the following BASIC program:

10 INPUT "What's your name? ", N$
20 PRINT "Hello, "; N$
30 GOTO 10

That's already pretty interesting. In just three lines of code we have some idea about:

  • I/O (input/output)
  • Variables (N$)
  • Data types (the '$' indicates a string variable)
  • Flow control (the GOTO)

That's a pretty heady mix and you can cover a lot of fascinating topics with that. Where you go from there is pretty wide open, but it also tends to express what limitations you'll experience. For example, after BASIC, I went on to 6809 Assembler and then C. What's interesting about the progression from BASIC to Assembler to C is that all of these are procedural languages. If you accept a variant of the Sapir-Whorf Hypothesis, the languages you learn determine (strong variant) or influence (weak variant) how you perceive the world. At least with computer languages, I believe this is true. Heavy exposure to procedural languages can make it difficult to understand object-oriented ones (let's not even go to functional or logic languages, lovely though they are). Thus, we find ourselves having to explain to new programmers what an "object" is.

I hear many people trying to describe classes. The two most common descriptions I hear are:

  • Blueprints for objects
  • Descriptions of unifying data and behaviour

Neither of those descriptions is necessarily wrong, but neither is really clear. They don't describe the raison d'être of objects. I'll get there, but I want to talk about people management for a moment. One way of looking at employees is to break them down by skilled/unskilled and motivated/unmotivated categories. For example, a minimum wage employee at a fast food restaurant might be an "unskilled/unmotivated" employee. However, if you hire chromatic you're going to get a "skilled/motivated" employee (who quickly becomes a "skilled/unmotivated" employee when you try to turn him into a Crystal Reports programmer -- sorry they did that to you, chromatic!). You will have a very short management career if you, as a manager, think you can manage them the same way. Unskilled and unmotivated employees need more direction and supervision. Skilled and motivated employees tend to be handed tasks and then are left to their own devices.

A skilled and motivated employee should be thought of as an expert. There are, of course, degrees of expertise. Relative to most readers of this blog, I'm an "expert" in Prolog. Relative to Salvador Fandiño García, I can be classified as "unskilled/motivated". I might be expert enough to help you solve some problems, but I'm not expert enough to build large-scale Prolog systems. Thus, my expertise is relative to the problems you need solved. (Note: the remainder of this analogy will focus on the skilled/unskilled. You might find it interesting to think about motivated/unmotivated in relation to software agents or fuzzy logic).

So what's an object? An object is simply an expert for a problem you need solved. That's all.

Note the two important points I made clear about experts:

  • They're "experts" relative to the problem you need solved.
  • They don't need a lot of supervision.

Both of these points are very important in designing objects. The first point, "experts relative to the problem", simply means "have your objects do what you need and nothing more". Some people call this YAGNI ("You Aren't Gonna Need It").

The second point, "objects don't need a lot of supervision" is the really important point here. Objects are experts and you shouldn't have to tell an object step-by-step what to do. Here's a simple example of how not to create an object, using Moose. I use Moose because I want to focus on objects, not on Perl's syntax.

#!/usr/bin/env perl

use Modern::Perl;
{
    package Avatar;
    use Moose;
    has name     => ( is => 'ro', isa => 'Str' );
    has birthday => ( is => 'rw', isa => 'DateTime' );
    has age      => ( is => 'rw', isa => 'Int' );
    # other fields omitted for simplicity 
}

use DateTime;
my $avatar = Avatar->new({
    name     => 'Bob',
    birthday => DateTime->now,
    age      => 16,
});

say $avatar->name;
say $avatar->birthday;
say $avatar->age;

Right off the bat, you probably see at least one problem with that. Some of you might think you see more than one, but that's not necessarily true.

First, "name" seems odd. We usually have first and last names, right? Well, maybe "Avatar" is an avatar in a game and they only have a single name, not first and last names. Thus, having a single "name" probably satisfies our "expert relative to the problem" criterion. Don't split those names out if your problem domain doesn't require them. That being said, it's often not as clear-cut as in this case.

However, what about "birthday" and "age"? We tend to think of age in years, so clearly someone with a birthday of "now" can't be sixteen years old! While this is probably wrong, it might not be. For example, let's again assume we're talking about a game and each account might only be allowed one avatar, so you make a decision to not have a separate "Account" object per "Avatar" object (I'm not arguing that this is a good or bad idea -- I'm using it for illustration). You've merged the two concepts and "age" refers to the age of the avatar (in game) and "birthday" refers to when the account was created. These are clearly poorly named, but they're not necessarily in conflict.

If, however, your "Account" and "Avatar" classes are separate and "birthday" really does refer to the avatar's birthday, then someone can set the age to anything they want and it won't match the birthday. The Avatar class is clearly not an expert in its problem domain because it gets things wrong.

To fix this, we delete the "age" attribute and turn it into a method:

    sub age {
        my $self = shift;
        # Subtract the birthday from "now" so the duration reads forwards in time
        return ( DateTime->now - $self->birthday )->years;
    }

Now you can no longer set the age and it's always correct. The avatar is now an expert in this area and you can trust its responses.
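To see the derived age in action, here's a minimal sketch of the same idea. To keep it dependency-free, it swaps Moose and DateTime for a plain blessed hash and the core Time::Piece module, so it is not the code from the example above. The optional "as of" argument to age is my own assumption, added only so the result can be checked against a fixed date.

```perl
use strict;
use warnings;
use Time::Piece;

{
    package Avatar;

    sub new {
        my ( $class, $args ) = @_;
        return bless {
            name     => $args->{name},
            birthday => $args->{birthday},    # expected to be a Time::Piece object
        }, $class;
    }

    # Read-only accessors; there is deliberately no way to set the age.
    sub name     { $_[0]{name} }
    sub birthday { $_[0]{birthday} }

    # Age in whole years. The optional $as_of argument is an assumption
    # made for this sketch so the result can be checked against a fixed date.
    sub age {
        my ( $self, $as_of ) = @_;
        $as_of //= Time::Piece->new;    # defaults to the current time
        my $born  = $self->birthday;
        my $years = $as_of->year - $born->year;

        # Not a full year yet if this year's birthday hasn't arrived.
        $years--
            if $as_of->mon < $born->mon
            || ( $as_of->mon == $born->mon && $as_of->mday < $born->mday );
        return $years;
    }
}

my $avatar = Avatar->new({
    name     => 'Bob',
    birthday => Time::Piece->strptime( '2000-06-15', '%Y-%m-%d' ),
});

my $day_before = Time::Piece->strptime( '2016-06-14', '%Y-%m-%d' );
my $on_the_day = Time::Piece->strptime( '2016-06-15', '%Y-%m-%d' );

print $avatar->name, " is ", $avatar->age($day_before), "\n";    # Bob is 15
print $avatar->name, " is ", $avatar->age($on_the_day), "\n";    # Bob is 16
```

Because the age is computed from the birthday every time it's asked for, there's no second copy of the information that can drift out of sync.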

So what does this really mean for objects? Because they're experts, you should let them handle the problems that they should understand. I recently had to fix some code which looked like this (rewritten for this example):

my $version = Version->new({ parent => $episode });
if ( $episode->is_available ) {
    $version->set_broadcasts($broadcasts);
}

This was the only place we were calling this code and thus it was fine. However, the Version should be an expert on whether or not it has broadcasts (a "version" of an episode might be the original episode, edited for violent content, or edited for other reasons). Since it already knows what episode it's a version of, it should know if it has broadcasts. When I had to use this code in another place, I had two choices:

  • Cut 'n paste the code above
  • Let the version be an expert

You already know what I did:

my $version = Version->new({
    parent     => $episode,
    broadcasts => $broadcasts,
});

The version object now returns broadcasts only if the episode is available. If more constraints are added, I don't have to hunt down every place I create a version object. I only have to add it to the Version::broadcasts method. It's an expert; it knows if it has broadcasts or not.
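The real Version class isn't shown here, so the following is only a sketch of how such a broadcasts method might look. The Episode class is a hypothetical stand-in with just enough behaviour to answer is_available, and both classes are plain blessed hashes rather than whatever the production code used.

```perl
use strict;
use warnings;

{
    package Episode;    # hypothetical stand-in for the real episode class

    sub new {
        my ( $class, $args ) = @_;
        return bless { available => $args->{available} ? 1 : 0 }, $class;
    }

    sub is_available { $_[0]{available} }
}

{
    package Version;

    sub new {
        my ( $class, $args ) = @_;
        return bless {
            parent     => $args->{parent},
            broadcasts => $args->{broadcasts} // [],
        }, $class;
    }

    # The version decides for itself whether it has broadcasts.
    # Any future constraint gets added here, and only here.
    sub broadcasts {
        my $self = shift;
        return [] unless $self->{parent}->is_available;
        return $self->{broadcasts};
    }
}

my $broadcasts = [ 'Monday 20:00', 'Tuesday 23:30' ];    # made-up data

my $on_air  = Version->new({ parent => Episode->new({ available => 1 }), broadcasts => $broadcasts });
my $pulled  = Version->new({ parent => Episode->new({ available => 0 }), broadcasts => $broadcasts });

print scalar @{ $on_air->broadcasts }, "\n";    # 2
print scalar @{ $pulled->broadcasts }, "\n";    # 0
```

Callers hand over the same arguments either way and never check is_available themselves.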

This is closely related to the "tell, don't ask" principle of objects. They know what they're doing. Let them do it.

I'll finish this by returning to my management example. Let's say that you've hired a personal shopping assistant and you need some printer paper. Here's the procedural method of telling the assistant to buy the paper:

  • Here's $20.
  • Go to Office Warehouse.
  • Pick up the printer paper which fits the Mark III printer.
  • Take this paper to the cashier.
  • Pay the cashier the appropriate amount of money for the paper.
  • Return to me.
  • Give me the paper, change, and receipt.

That sounds really annoying and it's micro-management. However, it might be the approach to take if you have an unskilled employee.

Here's how you would handle this with a skilled employee (think "objects" here):

  • Here's $20. Go buy me Mark III printer paper.

The employee just knows the rest. Which would you rather program?
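In code, the skilled-employee version might look something like the sketch below. Everything in it is hypothetical and invented for illustration: the ShoppingAssistant class, its buy method, and the made-up price list (kept in cents to avoid floating-point change).

```perl
use strict;
use warnings;

{
    package ShoppingAssistant;    # hypothetical class for this illustration

    # Made-up catalogue: item name => price in cents
    my %PRICE_IN_CENTS = ( 'Mark III printer paper' => 1_250 );

    sub new { return bless {}, shift }

    # The caller says what it wants. Finding the item, paying for it and
    # working out the change all happen in here.
    sub buy {
        my ( $self, %args ) = @_;
        my $price = $PRICE_IN_CENTS{ $args{item} }
            // die "Don't know where to buy '$args{item}'\n";
        die "Not enough money for '$args{item}'\n" if $price > $args{budget};
        return {
            item    => $args{item},
            change  => $args{budget} - $price,
            receipt => "$args{item}: $price cents",
        };
    }
}

# "Here's $20. Go buy me Mark III printer paper."
my $assistant = ShoppingAssistant->new;
my $purchase  = $assistant->buy( item => 'Mark III printer paper', budget => 2_000 );

print "Got $purchase->{item} with $purchase->{change} cents change\n";
```

The calling code is one line long, just like the instruction to the skilled employee. The seven-step procedure still exists, but it lives inside the expert.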

As a quick counter-example, and for bonus points:

use Modern::Perl;
use HTML::TokeParser::Simple;

my $new_folder = 'new_html/';
my @html_docs  = glob("*.html");

foreach my $doc (@html_docs) {
    print "Processing $doc\n";
    my $new_file = "$new_folder$doc";

    open my $file, ">", $new_file or die "Cannot open $new_file for writing: $!";

    my $p = HTML::TokeParser::Simple->new( file => $doc );
    while ( my $token = $p->get_token ) {
        if ( $token->is_start_tag('form') ) {
            my $action = $token->get_attr('action');
            $action =~ s/www\.foo\.com/www.bar.com/;
            $token->set_attr( 'action', $action );
        }
        print $file $token->as_is;
    }
    close $file or die "Cannot close $new_file: $!";
}

Note that I test whether or not something is a start tag. Generally you shouldn't have to give an object that kind of specific direction. Why did I do that and how would you rewrite this? (Hint: I subclassed off of HTML::TokeParser). And yes, this is a trick question :)

For further discussion, read The Authentication Fairy (disclaimer: I wrote that).

5 Comments

Just a nit: For anyone doing internationalized systems having a single name field isn't a sign of trouble. Having first name / last name pairs is.

Unless your program has to deal with some legacy system that demands first/last names you shouldn't be asking for them. Millions of people on this planet have no last name (I happen to be one of them), and get into trouble because of systems like these.

Too much validation is bad and will invariably fail. It's like assuming that nobody has an E-Mail address containing the character +, is living in a postal code you hadn't thought of, or has the character - in their name.

Not only should you let people enter their names as they like, but their real name might be different from how you'd actually address them, whether that's due to a cultural thing, a power relationship, or a nickname.

Having a slightly odd name tends to make me aware of just how bad the situation is for most web things. I'm "brian d" to a lot of things, which probably isn't as annoying as being "Johannes van" or something like that.

In my perfect world, there's an address book where there is a difference between the entered name, the short name, and the displayed name. Most software seems to only have one of these and munges whatever you enter to fit what they think. Even if you don't like what people entered, it should stick around.

It's not really a tough problem, it's only thought to be tough because people set themselves up to fail.

Take E-Mail validation as an example. Unless you're writing an MTA, the right answer to "how should I validate E-Mail addresses" is "don't, just accept any input and pass it on to the MTA".

It's the same with names. If you're accepting names and using them internally in your system, the output of head -c 30 /dev/urandom is usually as good as any. It's only going to be used by the user and their friends to refer to themselves anyway.

When you're passing data onto some external systems you can usually just rely on the external system to barf if you don't give it data that it likes. I.e. don't try to validate credit card numbers, just try to do a transaction and report if it fails.

Just let me be the first one to say this:

Quit objectifying chromatic!

(I'm sorry, I had to!)

Excellent post! :)

About Ovid

Have Perl; Will Travel. Freelance Perl/Testing/Agile consultant. Photo by http://www.circle23.com/. Warning: that site is not safe for work. The photographer is a good friend of mine, though, and it's appropriate to credit his work.