On the object metaphor

The field of computation has many, many metaphors. Objects are one of them.

To be honest, I don't really understand object-oriented programming. I understand procedural programming, aka C.

Procedural programming is like treating your computer as a dumb assistant. You tell it _what to do_, in exact order (the program or script).

Recently I began thinking about OOP in terms of how it can expand the dumb assistant metaphor. So this post is an exercise in trying to articulate that, albeit poorly.

In the real world, objects are dumb things. They sit around and do nothing. It is always a person who finds use for them.

Classically speaking, things have properties and things can be arranged into classes. It is the properties of the *thing* that differentiate it from other things of its class and things of other classes.

A *class* is an ideal form of an object. Classes don't exist in the real world. Objects do. Classes are merely a way in which humans organize things for the sake of convenience (think of book stores). This is why objects sometimes fit into different taxonomies.

Organizing things by class is an innate quality of human beings.

For example, if you have a JSON file and an XML file, you might instinctively name the variables json_file and xml_file to indicate the class of each file.

One important thing happens in the real world due to classification. I am sure you have heard the phrase "just another action movie" or "just another chick flick". Once you start sorting things into classes, the objects lose their individual significance.

Every thing is unique, but in the eyes of the class, they are variations of the ideal form.

Now consider activities. Activities have a purpose, require a person, and can likewise be arranged into classes according to that purpose. Just like objects, once classified, each activity becomes anonymous: just something that someone can do.

Coming back to the programming world, can we use classes as a means of organizing complexity?

code = dumb assistant
code does some algorithms with some data

One can see from the analogy of the book store that classes come with an inbuilt capacity to store things and classes have a name.

Therefore, in the classical sense, it is quite easy to imagine a hash holding a genre of books, or a table holding the books. A relational database table is a kind of hash, so one can view database design as an act of classification.

Similarly, algorithms can be classified by purpose into packages (at least in Perl), which are loaded at runtime. Most likely the purpose is going to be messing with data of some particular type.

This makes sense. The dumb assistant is now aware of classes of data and can perform a package of algorithms wrt each class of data. As programmers, our job is to make sure that this dumb assistant does not fuck up and apply the wrong algorithms to the wrong data, by giving him _very_ precise instructions.
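Here is a minimal sketch of that idea in Perl, assuming a made-up bookstore. A hash classifies the data by genre, and a package groups the algorithms for that class of data. The package name Shelf, its subs, and the titles are all hypothetical, picked only for illustration.

```perl
use strict;
use warnings;

# Data classified by genre: each hash key names a class of books.
# The titles are placeholder sample data.
my %books_by_genre = (
    fiction => [ 'Book B', 'Book A' ],
    cooking => [ 'Book C' ],
);

# A package groups the algorithms meant for one class of data
# (here: an array reference of titles). "Shelf" is a hypothetical name.
package Shelf;

sub count  { my ($titles) = @_; return scalar @$titles }
sub sorted { my ($titles) = @_; return sort @$titles }

package main;

# The programmer still has to hand the right data to the right package.
for my $genre (sort keys %books_by_genre) {
    my $titles = $books_by_genre{$genre};
    printf "%s: %d book(s): %s\n",
        $genre, Shelf::count($titles), join(', ', Shelf::sorted($titles));
}
```

Nothing stops us from passing some unrelated array to Shelf::count, which is exactly the "very precise instructions" burden described above.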

Thus, using classes in a program is a way of making the dumb assistant multi-task.

Modular programming specifically refers to classifying algorithms into packages for reuse, methinks. C-style struct design and database design specifically imply thinking about data classification and reuse, methinks.

Now, let us expand the bookstore metaphor. It makes sense to have a volunteer for each section. Or, on "world book day" or something, we can imagine a volunteer representing her favorite section.

In fact with a bit of imagination we can imagine a volunteer for every book.

The volunteer can provide a number of additional functions. Because she has the knowledge of the section, she can give more information than what is merely visible.

If we were to tell the dumb assistant to get a book, the dumb assistant can now take the help of these volunteers.

Obviously, in the programming world, we or someone else has to write these volunteers. On our instruction, the dumb assistant can make use of as many volunteers as we want.

Thus code = the dumb assistant taking the help of volunteers in the prescribed order
volunteer = someone who is aware of a class of data and can perform algorithms wrt that class of data
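A rough sketch of the volunteer idea, assuming a hypothetical Volunteer class (the names Volunteer, find, recommend and fetch are mine, not from any library). Each volunteer object carries the data of one section plus the algorithms that apply to it, and the "dumb assistant" is just the main script asking volunteers in order.

```perl
use strict;
use warnings;

# A volunteer knows one section (a class of data) and the algorithms for it.
package Volunteer;

sub new {
    my ($class, %args) = @_;
    my $self = { section => $args{section}, books => $args{books} };
    return bless $self, $class;
}

# Look a title up within this volunteer's own section.
sub find {
    my ($self, $title) = @_;
    my $hits = grep { $_ eq $title } @{ $self->{books} };
    return $hits;
}

# Extra knowledge that is not visible from the shelf itself.
sub recommend {
    my ($self) = @_;
    return "$self->{section}: start with $self->{books}[0]";
}

package main;

my @volunteers = (
    Volunteer->new(section => 'fiction', books => [ 'Book A', 'Book B' ]),
    Volunteer->new(section => 'cooking', books => [ 'Book C' ]),
);

# The dumb assistant: it only asks the volunteers, in the prescribed order.
sub fetch {
    my ($title) = @_;
    for my $v (@volunteers) {
        return $v->{section} if $v->find($title);
    }
    return;
}

print "Book C is in: ", fetch('Book C'), "\n";
print $volunteers[0]->recommend, "\n";
```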

Isn't this shell scripting?
Isn't this delegation?
Isn't this message passing?
Isn't this the little people metaphor with more flexibility?

I know I am not the only one who wants to confuse objects with subject metaphors :)

If you like linguistics, you will be aware of SOV. Maybe that's why we feel cozy when we use the $hey->do_something() syntax?

12 Comments

Hi mr foo bar

My way of answering such questions is:

  • An object is a worker which provides access to the services embodied in the code and data of the class. This parallels how a database server provides services (say via a db or stmt handle) to let you access the data in the db. Likewise, it parallels how a web server provides services to let you access the web site’s static pages or CGI scripts. So you can call the class the server, and each object is created to deal with 1 client.

  • To turn that around, a class can only justify its existence by providing services. As an extreme counter example, I recall when C++ hit the deck someone published - seriously - a class to allow you to set and get the value of a variable of type char. Yep - you had to call a method to set, and another to get, that value. Compared to a simple assignment stmt, it’s massive overhead. It would have been OK if the demo was showing the syntax of a class, but it wasn’t.

  • You might prefer to call the object the server, rather than calling the class the server. I don’t mind either way, since it’s the understanding gained, about provision of services, that matters to me.

Cheers
Ron

The first point of objects is really not much more than namespacing. You want to define a set of operations that all require complex state. Instead of letting the caller keep track of dozens of variables for you, you package them into a complex type that you pass around to each of the operations. And now you can also avoid having to prefix all the subs with gtk_button_ to reduce the likelihood of identifier collisions, which makes things easier to read.
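To make the namespacing point concrete, here is a small sketch that puts both styles side by side. The button_ subs and the Button class are hypothetical stand-ins, not the real Gtk API. In both versions the state travels as one complex value, but the class lets the sub names drop their prefix.

```perl
use strict;
use warnings;

# Procedural style: the state is a plain hash, and every sub carries a prefix.
sub button_new      { my ($label) = @_; return { label => $label, clicks => 0 } }
sub button_click    { my ($btn) = @_; $btn->{clicks}++; return }
sub button_describe { my ($btn) = @_; return "$btn->{label}: $btn->{clicks} click(s)" }

# Object style: the package is the namespace, so the prefix disappears.
package Button;

sub new      { my ($class, $label) = @_; return bless { label => $label, clicks => 0 }, $class }
sub click    { my ($self) = @_; $self->{clicks}++; return }
sub describe { my ($self) = @_; return "$self->{label}: $self->{clicks} click(s)" }

package main;

my $plain = button_new('OK');
button_click($plain);
print button_describe($plain), "\n";

my $obj = Button->new('Cancel');
$obj->click for 1 .. 2;
print $obj->describe, "\n";
```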

The second point, which really made OO click for me is Replace Conditional With Polymorphism. This refactor is the essence of OO design. Instead of having dozens of switch blocks scattered around your code, which examine a type flag in some complex type to decide what to do with a complex value, you give the complex type a dispatch table and indirect a call to a method that does that job through the table. Now, adding another type to your program no longer requires finding and changing dozens of switch statements pertaining to it; instead, you can just define a new class, which has all its methods in one place. (This is why Liskov substitution matters. If you have not adhered to that, then you don’t get the full extent of this benefit.)

I commend this summary of two key features of OO! As a recent OO initiate, I found that by converting a procedural coding project to OO, I got rid of conditionals that were cluttering (if not choking) the codebase.


A second benefit has been grouping some 250 global variables into 15 hashes. That helps other (potential) contributors understand what the variables refer to. That huge search-and-replace operation, although somewhat difficult, leaves the existing syntax essentially unchanged. As I am ready, each of these hashes will become a singleton object that will absorb (as methods) the subroutines corresponding to those objects.


Replacing hash accesses with OO accessors will be a future step; for now I have written a static checking tool to check for errors in hash keys that would have previously been caught by 'use strict' as references to undefined variables.
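The static checking tool itself is not shown here. As a loosely related runtime alternative (a different technique, not that tool), the core Hash::Util module can lock a hash's key set so that a misspelled key dies instead of quietly creating a new entry. The hash name and keys below are hypothetical.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);

# Former globals grouped into one hash (hypothetical names).
my %config = ( sample_rate => 44100, channels => 2 );

# Restrict the hash to its current keys; the values stay writable.
lock_keys(%config);

$config{channels} = 1;    # fine: the key already exists

# A typo now raises an error at run time instead of adding a key.
my $ok  = eval { $config{chanels} = 2; 1 };
my $err = $@;
print "typo caught: $err" unless $ok;
```

Unlike a static checker, this only catches the typo when the offending line actually runs.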


These pedestrian considerations have guided the OO-refactoring process for this project.

It’s different from a dispatch table because it’s the opposite, or maybe you’d call it an inside-out dispatch table.

With a dispatch table all the implementations of one operation are defined in one place for all possible types (and the different operations are scattered all around your codebase across different dispatch tables – same as with a switch statement, so a dispatch table differs from it only in name, it’s just a different way of writing a switch).
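For reference, a dispatch table in Perl is typically a hash of code references keyed by type. This is a minimal sketch; Bar, Baz and Quux are placeholder type names, and the sub bodies are stand-ins. Note that this table covers only one operation; a "load" operation would need a second table somewhere else.

```perl
use strict;
use warnings;

# One table per operation: every type's implementation of "save" lives here.
my %save_for = (
    Bar  => sub { my ($thing) = @_; return "saved bar $thing->{id}" },
    Baz  => sub { my ($thing) = @_; return "saved baz $thing->{id}" },
    Quux => sub { my ($thing) = @_; return "saved quux $thing->{id}" },
);

# Look up the implementation by the type flag stored in the value.
sub save {
    my ($thing) = @_;
    my $impl = $save_for{ $thing->{type} }
        or die "no save implementation for type $thing->{type}\n";
    return $impl->($thing);
}

print save({ type => 'Baz', id => 7 }), "\n";
```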

With objects, you implement all the different operations in one place for just one type.

In essence you slice up all those switch statements by their branches, then take those branches and scatter them to the wind… where they separate according to the type they pertain to, class Wheat from class Chaff. I.e. this:

switch foo {
    case Bar:  save_bar
    case Baz:  save_baz
    case Quux: save_quux
}
...
switch foo {
    case Bar:  load_bar
    case Baz:  load_baz
    case Quux: load_quux
}
...
switch foo {
    case Bar:  alter_bar
    case Baz:  alter_baz
    case Quux: alter_quux
}

becomes this:

foo.save
...
foo.load
...
foo.alter
...
class Bar {
    method save  { ... }
    method load  { ... }
    method alter { ... }
}
class Baz {
    method save  { ... }
    method load  { ... }
    method alter { ... }
}
class Quux {
    method save  { ... }
    method load  { ... }
    method alter { ... }
}

Notice what happened: each class contains former branches from all switch statements.
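The same shape as runnable Perl, as a sketch only. The method bodies are placeholders that return a string, and Quux is left out for brevity.

```perl
use strict;
use warnings;

# Each class keeps all of its own operations together.
package Bar;
sub new   { my ($class) = @_; return bless {}, $class }
sub save  { return 'save_bar' }
sub load  { return 'load_bar' }
sub alter { return 'alter_bar' }

package Baz;
sub new   { my ($class) = @_; return bless {}, $class }
sub save  { return 'save_baz' }
sub load  { return 'load_baz' }
sub alter { return 'alter_baz' }

package main;

# The call sites no longer inspect a type flag; method resolution
# picks the implementation from the object's class.
for my $foo (Bar->new, Baz->new) {
    print join(' ', $foo->save, $foo->load, $foo->alter), "\n";
}
```

Adding a Quux type means adding one more package; the loop at the bottom does not change.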

And whether you have static types has nothing to do with the utility of Liskov substitution. In fact, you have to be more careful in a dynamically typed language.

The point is that if you have an abstract Stream class, from which you derive subclasses like Socket and File, then the write method in one of these should not expect a different number, order or meaning of parameters, nor should it do something semantically completely different like self-destructing the computer. Otherwise you cannot give objects of that subclass to someone who says they want an instance of Stream, because you have made random changes to the interface (or its semantics) of the superclass, and if they try to use your object according to their expectations based on that superclass, things will go horribly or – worse! – subtly wrong.
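A small sketch of what adhering to that looks like, with hypothetical classes. I have named the subclasses FileStream and SocketStream so they don't clash with existing module names, and the in-memory buffer is a stand-in for real I/O. The contract assumed here is that write takes one string and returns the number of characters written.

```perl
use strict;
use warnings;

# Stream fixes the contract: write($data) returns the length written.
package Stream;
sub new   { my ($class) = @_; return bless { buffer => '' }, $class }
sub write {
    my ($self, $data) = @_;
    $self->{buffer} .= $data;
    return length $data;
}

# Inherits write unchanged.
package FileStream;
our @ISA = ('Stream');

# Overrides write, but keeps the same parameters and return meaning.
package SocketStream;
our @ISA = ('Stream');
sub write {
    my ($self, $data) = @_;
    return $self->SUPER::write($data);
}

package main;

# Code written against Stream works with any well-behaved subclass.
sub log_line {
    my ($stream, $line) = @_;
    return $stream->write("$line\n");
}

print log_line($_, 'hello'), "\n"
    for Stream->new, FileStream->new, SocketStream->new;
```

If SocketStream::write started expecting a second argument, or returned something other than a length, log_line would break for that subclass only. No type checker is involved, which is why the principle applies in Perl too.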

Duck typing or not is irrelevant to the validity of that principle.

The idea that OO reduces implementation complexity because conditionals are superseded by polymorphism doesn't stack up for me.

For example "Factories" are popular - you have a class which creates objects of different classes depending on what data you give the factory method. The method needs to do some discrimination on input:


sub factory {
    my ($type_wanted) = shift;

    ## What is type_wanted? string? int? Other?
    ## Whatever it is you have to discriminate
    ## whether switch, if..then..else if ... or
    ## dispatch.
    ## Or ... why not make type_wanted a class that
    ## has a method 'discriminate' & put the rest
    ## of @ARGV in there?
    $type_wanted->discriminate(@ARGV);
    ## Sure. So how does 'discriminate' work?

}

And soon enough you end up with this sort of thing
( http://en.wikipedia.org/wiki/Type_introspection#Perl ):


package Animal;
sub new {
    my $class = shift;
    return bless {}, $class;
}

package Dog;
use base 'Animal';

package main;
my $animal = Animal->new();
my $dog = Dog->new();

print "This is an Animal.\n" if ref $animal eq 'Animal';
print "Dog is an Animal.\n" if $dog->isa('Animal');

To eliminate 'if's I'll make 'the_box_i_belong_to' part of the remit of the
inheritance hierarchy that contains Animal and Dog.

sub Animal::the_box_i_belong_to { 'Animal' }
{   package Dog;    # SUPER:: resolves against the enclosing package
    sub the_box_i_belong_to {
        my $self = shift;
        return $self->SUPER::the_box_i_belong_to();
    }
}

## Later ..
print "Dog is an ${\($dog->the_box_i_belong_to())}.\n";

This works so long as your operation (print to whatever stdout is set to in this case) makes sense on the return type of the method. So - I'll make everything belong to a class that inherits from a super class that expects a 'print_the_box_i_belong_to' method.

Yeah - but print to where:
STDOUT?
STDERR?
?

No problem:


sub Animal::print_the_box_i_belong_to {
    my ($thing_ref, $output_dest) = @_;
    $output_dest->print('Animal');
}

{   package Dog;    # SUPER:: resolves against the enclosing package
    sub print_the_box_i_belong_to {
        my $self = shift;
        return $self->SUPER::print_the_box_i_belong_to(@_);
    }
}

## Later ..
print "Dog is an ";
$dog->print_the_box_i_belong_to(\*STDOUT);

Ok - but failure to print to a dvd drive and failure to print to a network socket will have different consequences. Should I then handle the different exceptions? How do I discriminate between them? A polymorphic 'acknowledge_exception' method?

Of course this is a ludicrously over-engineered way to print an arbitrary namespace string. Yet I've seen similarly contorted polymorphicisation of procedural logic. Hell - I've done it myself.

To take it to its logical extreme you'd have to have *all* your behaviour in polymorphic methods to make your OO code completely switchless. Right down to your 'if':


$obj->if(@list_of_condition_obj_clause_obj_pairs);

But this looks sort of functional!

"switchless programming"? To a point.


Since when is “doesn’t stack up” equivalent to “doesn’t work when taken to its absolute extreme (in a language that has cumbersome OO)”?

(Polymorphism is exactly how conditionals are implemented in Smalltalk, FWIW.)
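In Smalltalk, ifTrue:ifFalse: is a message sent to a Boolean object, and the True and False classes each implement it differently. Here is a rough Perl imitation of that idea; the Bool::True and Bool::False packages and the if_true_if_false method are hypothetical names for this sketch, not an existing module.

```perl
use strict;
use warnings;

# The "true" object always runs the first block.
package Bool::True;
sub new              { return bless {}, shift }
sub if_true_if_false { my ($self, $then, $else) = @_; return $then->() }

# The "false" object always runs the second block.
package Bool::False;
sub new              { return bless {}, shift }
sub if_true_if_false { my ($self, $then, $else) = @_; return $else->() }

package main;

# No "if" at the call site: the receiver's class decides which block runs.
for my $flag (Bool::True->new, Bool::False->new) {
    print $flag->if_true_if_false(sub { 'yes' }, sub { 'no' }), "\n";
}
```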

@Aristotle: "The idea that OO reduces implementation complexity because conditionals are superseded by polymorphism .."
vs
"Since when is “doesn’t stack up” ..."

Never said it (polymorphism) doesn't "work". Merely that, in my opinion, the effort to replace conditionals via polymorphism can and does lead to counter-productive contortions. Hence it does not guarantee reduction in implementation complexity. The example is a not completely outlandish caricature of real life code I have written and/or worked with.

Not everyone shares your dim view of Perl's polymorphism support it would seem: (http://en.wikipedia.org/wiki/Polymorphism_in_object-oriented_programming) "Polymorphism in Perl is inherently straightforward to write because of the language's use of sigils and references."


mrstlee:

Hence it does not guarantee reduction in implementation complexity.

No single modelling technique does. You use it where appropriate and don’t where not.

mr foo bar:

I think Smalltalk- and Self-style OOP are completely different from the inside-out dispatch table. It seems to me that they take the dispatch table and replace it with a more general dispatch middleman, with objects as clients.

It’s not fundamentally different. They just make it meta-circular: essentially the inside-out dispatch table is itself an object too. (But don’t take this literally, I only mean it in terms of analogy.) At least, if that’s what you’re talking about. But the principles of why, when and how OO is desirable remain unaffected by that.

