no indirect considered harmful

By Reini Urban on February 26, 2013 8:47 PM

Several p5p members argue that using indirect method call syntax is considered harmful.

I argue that using indirect method call syntax is the best and sometimes only way to extend the language without changing core or the parser rules.

method_name ClassName @args;
method_name $obj @args;

ClassName->method_name(@args);
$obj->method_name(@args);
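
As a minimal sketch of the two forms above: the Counter class is hypothetical and not from the original discussion. The sketch also assumes a perl where the indirect feature is still enabled, which is the default unless a feature bundle such as use v5.36 turns it off. Under those assumptions both spellings should dispatch to the same methods:

```perl
use strict;
use warnings;

# Hypothetical class, used only for illustration.
package Counter {
    sub new {
        my ($class, %args) = @_;
        return bless { count => $args{start} // 0 }, $class;
    }
    sub bump {
        my ($self, $by) = @_;
        return $self->{count} += $by;
    }
}

package main;

# Arrow (direct) syntax and indirect object syntax for the constructor.
my $direct   = Counter->new(start => 1);
my $indirect = new Counter(start => 1);

# The same pair for an instance method.
my $a_total = $direct->bump(5);
my $b_total = bump $indirect 5;
```

The indirect lines only parse as method calls because no sub named new or bump is visible in the current package. That dependence on surrounding code is exactly what the rest of the discussion is about.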

E.g. mst argues in "new Foo bad, 'no indirect' good" that the parser is too dynamic in deciding if something is a valid method call or not.

He gives three examples:

(1) Is bareword a valid class name? If so, this is a method call.

use Foo::Bar;
new Foo::Bar @args; # calls Foo::Bar->new(@args)

(2) Is bareword a known subroutine name? If so, this is a sub call.

sub wotsit { ... }
wotsit { foo => 'bar', baz => 'quux' }; # calls wotsit({ ... })

(3) Stuff it, I'm guessing it's a method call.

wotsit { foo => 'bar', baz => 'quux' }; # tries to call a method on a hashref

BOOM!
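
The difference between cases (2) and (3) can be reproduced in a small sketch. It compiles the same call text twice with string eval, once with the sub already defined and once without. The package names are made up for illustration, and the exact wording of the failure is deliberately not checked, since it depends on how the parser guesses.

```perl
use strict;
use warnings;

# Case (2): wotsit is a known sub when the call is compiled,
# so the braces are parsed as an anonymous hash argument.
my $known = eval q{
    package Demo::Known;
    sub wotsit { return $_[0] }
    wotsit { foo => 'bar', baz => 'quux' };
};

# Case (3): no sub named wotsit is visible, so perl has to guess
# and the identical text no longer works.
my $ok = eval q{
    package Demo::Unknown;
    wotsit { foo => 'bar', baz => 'quux' };
    1;
};
my $error = $@;
```

Here $known should hold the hash reference that was passed in, while $ok stays false and $error carries whatever message the failed guess produced.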

The problem is that those p5p hackers who are new to perl don't understand why Larry created this indirect method call syntax in the first place. It was to free the parser and core from defining new keywords, such as 'new' or 'delete', and to let the user create them at run-time. A parser and VM need to be extensible. And the resulting language still needs to look natural.

The same logic applies to our not-yet-used type system. If you declare a lexical variable with a type between my and the name, perl will dynamically look up whether the type is an existing class. Perl is a dynamic language, in which you can extend classes and types.
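
A short sketch of that class lookup, using a hypothetical Dog class and a deliberately nonexistent type name (both are illustrative assumptions, not part of the original post):

```perl
use strict;
use warnings;

# Hypothetical class standing in for a user-defined type.
package Dog {
    sub new { my ($class) = @_; return bless {}, $class }
}

package main;

# Compiles, because package Dog exists when this line is parsed.
my Dog $spot = Dog->new;

# The same kind of declaration with an unknown class is rejected
# when the string is compiled.
my $ok    = eval q{ my No::Such::Type $x; 1 };
my $error = $@;
```

The second declaration should fail with perl's "No such class" diagnostic, which shows that the type slot is resolved against whatever packages happen to exist at that point.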

People who advocate using Foo::Bar->new(@args) over new Foo::Bar @args make it clear what they want, but they lose the people who look at a language from a broader view and do not care whether such a call is a method, builtin keyword, or function call. They look at the language, and Foo::Bar->new(@args) looks awful and backwards.

Without indirect method syntax you lose the ability to write code in a natural and understandable way. There are corner cases in which the parser throws errors when a class or sub is not known, as in mst's example: Can't use string ("foo") as a subroutine ref while "strict refs" in use at Two.pm line 7, because the parser is dynamic and does not know. The addition of no indirect adds the error message: Indirect call of method "Two::two" on a block at One.pm line 8.

Using no indirect is a great way to get understandable warnings, much like use warnings or use diagnostics. But arguing that the parser is wrong and this syntax should be deprecated is harmful. People should learn a little bit about language history first, before they start destroying the parts they do not understand.

schwern got it right by arguing for use autobox, which extends the notion of indirect method calls by checking the type of each object; you are then able to easily overload and add methods.

In my upcoming functional perl p2 prototype even most keywords are methods, such as if, elsif, when, while. There's no need for the parser to know a keyword if the parser knows the type and structure of the context. 'if' is a method of an expression, and the argument is the next block. print and all other perl keywords are not keywords anymore; they are methods implemented for various types. And they can easily be extended to handle more user types, such as bignum or complex support, PDL, FFI, ...

If you forbid indirect method syntax you cannot extend the language.

You increase coding safety by using unnatural but precise code. But then you can also argue for LISP or Forth or ML, which use a simple parser without too much syntax defined via keywords. With these languages it is at least possible to use macros to extend the language. Not so with perl.

11 Comments

Matt S Trout (mst) | February 26, 2013 10:00 PM

The real life bug that I've created an equivalent to in that post completely screwed over a critical deployment schedule, and cost me one developer to burnout, one client to assumption of incompetence, and one large codebase to perl.

(I would violate NDAs if I explained the things that compounded the problem to the point where it was that bad; also, I don't want to remember!)

After two futile days of not being able to debug it, I worked around it at the time; I didn't even figure it out until about two years later when I was reading the toke.c source yet again for something to do with Devel::Declare.

That scares the living crap out of me.

If you can make it consistent and safe and still add significant expressivity, I'll applaud you.

But the current semantics are the only thing that's come as close to putting us into the red financially as my capacity for tact.

So, yeah, my hatred of this misfeature is kinda personal. But that doesn't mean I don't still have a point.

Samuel Kaufman | February 27, 2013 5:20 AM

One of the largest barriers I've had to training new perl developers is overcoming cryptic error messaging. I loathe indirect syntax, and would love not to have to answer a question about what this error means:
Can't locate object method "Dumper" via package "URI::http" at test.pl line 6.
from:

use strict;
use warnings;
use URI;

my $uri = URI->new("http://google.com");

warn Dumper $uri;

Using 'no indirect' helps a bit in this case, but what would help more is the parser barfing, not indirect.
(Though in this case, maybe I should encourage the newer devs to use more parentheses.)
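
A hedged sketch of the parentheses point: My::URI is a hypothetical stand-in for URI so that nothing from CPAN is needed, and string eval is used only to capture the two error messages side by side.

```perl
use strict;
use warnings;

# Hypothetical stand-in for URI, so the sketch needs no CPAN modules.
package My::URI {
    sub new { my ($class, $str) = @_; return bless { uri => $str }, $class }
}

package main;

# No Dumper sub is known here, so "Dumper $uri" is guessed to be $uri->Dumper.
eval q{
    my $uri  = My::URI->new("http://google.com");
    my $text = Dumper $uri;
    1;
};
my $bare_error = $@;

# With parentheses it is an ordinary sub call, and the failure names the sub.
eval q{
    my $uri  = My::URI->new("http://google.com");
    my $text = Dumper($uri);
    1;
};
my $paren_error = $@;
```

The first message should complain about a missing object method "Dumper" on My::URI, the second about an undefined subroutine main::Dumper. Adding use Data::Dumper; at the top makes both spellings work, because Dumper is then a known sub when the lines are compiled.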

metadoo | February 27, 2013 5:39 PM

Have read both arguments, so, to not waste the effort, want to return back some of my conclusions (just for the sake of discussion):

First, mst's argument revolves entirely around an example that contains a circular dependency bug, and this bug is the real reason for the trouble: it causes indirect method syntax to blow up. But mst draws the far-reaching conclusion that indirect method call syntax is evil and should somehow be banned altogether. I think that is wrong.

To disambiguate that ambiguous "Two::two { (foo => 'bar', baz => 'quux') }" one must ensure that Two is loaded first and has the right code in it. If Two has no right code in it (a bug) or has not been loaded due to One and Two circular dependency bug, as is the case, it is not at all surprising that weird things may happen. And it is a mere coincidence that indirect syntax is what blew up - anything else may as well. So, in this case blaming indirect syntax specifically would be wrong.

Also note, that Perl has nonetheless thrown the exception, and that tells the indirect syntax is safe even in this case.

Interestingly, the example Samuel Kaufman brings is from the same class - given no clues what that "Dumper" thing is (a bug) what on earth that "warn Dumper $uri" is supposed to mean? So Perl tries various interpretations, fails, and raises exception.

Also, objectively speaking, in both cases error messages quite clearly state what has happened. In Samuel's example the error message is plain clear and obvious - as a last resort Perl tried to interpret that ambiguous "Dumper" thing as $uri->Dumper() method call, and failed.

In mst's example, the 'Can't use string ("foo") as a subroutine ref while "strict refs" in use' message clearly tells us that we have a bug (the circular dependency one in this case) lurking in the code that makes things go wrong, and that Perl has been cornered with nothing left but to desperately assume that the programmer somehow meant 'foo' to be a subroutine reference. The only thing left for the programmer to do is to guess what bug in the code has caused it (sure, it may not be easy).

However, to give the programmer more clues, it would be better if the error messages from Perl sounded less convinced that "foo" is a subroutine reference and "Dumper" is a method, since the very fact that Perl has to say this in an error message means such assumptions may well be wrong.

So the problem at most is potentially misleading phrasing of the error messages when indirect syntax assumption fails, not at all the indirect syntax itself.

Now back to indirect method call syntax, which now stands innocent and has nothing to do with the crimes it was accused of. As for the merits of indirect syntax, I would subscribe to Reini's arguments. Most importantly, indirect syntax lets Perl read closer to natural language: "perform that $action" reads far better than the equivalent "that->perform($action)", so programmers have more options to write readable, self-documenting, elegant code.

Matt S Trout (mst) replied to comment from metadoo | February 27, 2013 5:55 PM

metadoo, with respect to "Also, objectively speaking, in both cases error messages quite clearly state what has happened", I have to answer with "sure, once you already know what happened". The problem I'm talking about is that the error messages make it quite difficult to determine what has happened.

Also, "note, that Perl has nonetheless thrown the exception, and that tells the indirect syntax is safe even in this case" glosses over the point that it's a runtime error - and a relatively inscrutable one.

Sure, once you've seen all of the failure cases, corresponding the error messages to what's happened isn't terribly hard.

But I still think it would be nice to not have to teach new perl users all of that in order to debug what should be relatively simple code - and most importantly, code that wasn't even trying to use indirect object syntax in the first place.

metadoo replied to comment from Matt S Trout (mst) | February 27, 2013 8:15 PM

Matt, I agree with that: "The problem I'm talking about is that the error messages make it quite difficult to determine what has happened... it's a run-time error - and a relatively inscrutable one". However, my point is that Perl has stated precisely what really caused her to raise an exception - this is the most we can expect from Perl as it is not very powerful AI yet. Run-time errors cannot be entirely ruled out. As well as cryptic error messages, since uncaught errors (like that circular dependency bug, etc.) often propagate great distance, making eventual error looking completely puzzling. This is not specific to indirect syntax. In your example ANY error caused by the fact that Two has not completely loaded after that evidently prominent "use Two" would be already very puzzling.

Also, it is true that it is the "relatively simple code ... that wasn't even trying to use indirect object syntax in the first place". But that relatively simple code is not at all innocent - the "Two::two { (foo => 'bar', baz => 'quux') }" that assumes both bareword and possible use of prototype immediately alerts me, even before the error message has a chance to puzzle. It already uses quite a bit of magic, so it is probably not for new perl users anyway. Or perhaps it would be easier to disambiguate it with parentheses, than avoid indirect syntax as evil, especially when it is not evil at all.

Ovid | February 27, 2013 9:45 PM

On a historical note, Ben Evans listed indirect method call syntax as one of the main reasons why he couldn't port Perl to the JVM. It's a huge blocker in terms of alternate implementations, so rather than allowing us to extend the language, it's helping to lock us up in our cell.

Frankly, I agree with Matt's arguments and I'll throw in one of my own: the more "magic" a language has which makes it hard to read a single line and know what it means, the harder it is to maintain programs written in said language. And indirect method syntax (and ties, and prototypes, and operator overloading and, and, and ...) contributes to this. Why? Well, let me rephrase this:

The more "action at a distance" a language has which makes it hard to read a single line and know what it means, the harder it is to maintain programs written in said language

People might stand up and argue for indirect method syntax, but I think it's pretty hard to stand up and argue for more action at a distance. For indirect method syntax, depending upon what other portions of code are doing, one line's behavior could change dramatically.

When I write my code, I never use tie, I never use indirect object notation, I try to avoid overloading (but I've got a nasty habit of using prototypes (I should break this)). As a result, when you read a line of my code, even if it's not immediately clear what's going on, at least I'm not hurling "action at a distance" brain melters at you.

And the fact that it helped kill the port to the JVM really frustrates me :(

Joel Berger replied to comment from Ovid | February 27, 2013 11:09 PM

What's wrong with tie (other than being slow)? When it's what you need, little else will help: https://metacpan.org/module/Tie::Array::CSV

Aristotle | February 28, 2013 12:47 AM

I like being able to not use parentheses far more than I like being able to revert the order of method and invocant.

Ovid replied to comment from Joel Berger | February 28, 2013 9:37 AM

Hi Joel,

The issue with tie is simple: it takes an ordinary variable and suddenly makes it magic. When abused (as it often is), it can make it very, very hard to figure out what is going on. It's also a frequent source of bugs as people (including me before I realized I was barking up the wrong tree) find themselves trying to implement a tied interface and getting it wrong. For example, here's the description of STORESIZE for arrays (from perltie):

STORESIZE this, count

Sets the total number of items in the tied array associated with object this to be count. If this makes the array larger then class's mapping of "undef" should be returned for new positions. If the array becomes smaller then entries beyond count should be deleted.

Right. People are going to get that wrong (and that's ignoring the misspellings and poor grammar). Oh, and there's the "untie" issue if your class has a destructor, but who reads the docs?

Or, on the other hand, you can simply create a normal class and return an object rather than try to mysteriously overload the behavior of what looks like a normal variable.

If you must use tie, do so in a limited scope and for very simple things. Your Tie::Array::CSV module is simple and clear and seems like a natural use of tie rather than an abuse (kudos!). Though I note that you don't implement all of the methods that perldoc -f tie says you should. Will this cause problems? Who knows? The tie documentation is, um, a bit lacking at times.

And then there are the well-documented (but not well-known) bugs in tie.

Joel Berger replied to comment from Ovid | March 1, 2013 3:18 PM

True you have to be very careful when designing a tied interface, but one ought to be careful when designing an API as well.

In fact Tie::Array::CSV does implement all the methods, because it inherits from the Tie::Array base class; I have overloaded those which I must or for which I have a better implementation than the default. These base classes make writing those "correct" implementations easier, whether you actually inherit from them, or just inspect their behavior.
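
To illustrate the base-class approach in general terms, here is a minimal sketch of a tied array that inherits from the core Tie::Array module and only supplies the handful of methods that module requires. UpperArray is a hypothetical class invented for this example and has nothing to do with how Tie::Array::CSV is actually written.

```perl
use strict;
use warnings;

# Hypothetical tied array that upper-cases everything stored in it.
package UpperArray {
    use parent 'Tie::Array';    # core module; supplies PUSH, POP, CLEAR, ...

    sub TIEARRAY  { my ($class) = @_; return bless { items => [] }, $class }
    sub FETCHSIZE { my ($self) = @_; return scalar @{ $self->{items} } }
    sub STORESIZE { my ($self, $count) = @_; $#{ $self->{items} } = $count - 1 }
    sub FETCH     { my ($self, $i) = @_; return $self->{items}[$i] }
    sub STORE     { my ($self, $i, $v) = @_; $self->{items}[$i] = uc $v }
}

package main;

tie my @shout, 'UpperArray';
@shout = ('foo', 'bar');    # list assignment goes through CLEAR and STORE
push @shout, 'baz';         # inherited PUSH is built on FETCHSIZE and STORE
my $last   = pop @shout;    # inherited POP is built on FETCH and STORESIZE
my $joined = join ' ', @shout;
```

Because push and pop are inherited, they pick up the overridden STORE without any extra code, which is the convenience the base class is meant to provide.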

mascip | May 29, 2013 6:13 PM

Sounds as if we're standing where TIMTOWTDI can both be useful and harmful. Painful conundrum.

The place where linguistics and computer science don't work in a synergy anymore: optimising for one leads to the deterioration of the other.

Is that where we should prioritise computer science over linguistics, then? I love language and code that reads like English, and I would hate sacrificing this aspect of Perl. But what if it was for the best...?

What do you think, Reini?

About Reini Urban

Working at cPanel on cperl, B::C (the perl-compiler), parrot, B::Generate, cygwin perl and more guts, keeping the system alive.
