Subroutine signatures in Perl are long overdue

By Ovid on March 5, 2014 12:58 PM

The upcoming subroutine signatures in Perl aren't needed so that Perl can be "cool" (we're long past that). They're needed to make your code more correct.

I have a client that I sometimes write code for and they have the layers of their application nicely defined. Their front-end code makes AJAX calls back to an API written in Perl. That, in turn, calls their backend code, also written in Perl. Much of their API code resembles this:

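# validated_list is presumably the one exported by MooseX::Params::Validate
use MooseX::Params::Validate;

sub get_something {
    my $self = shift;
    my ( $id, $foo, $bar ) = validated_list(
        \@_,
        id  => { isa => 'Int' },
        foo => { isa => 'Int|Undef' },
        bar => { isa => 'Str|Undef' },
    );
    ...
}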

I'd much prefer to write:

method get_something(Int $id, Int|Undef $foo, Str|Undef $bar ) {
    ....
}

But I generally can't, so I settle for what I can get.

This client's backend code often relies on their API code calling them with the correct parameters. What they've done is adopt a strategy of putting their validation/sanitation in their API. What happens if they miss some validation in their API? Fortunately, they have decent tests, but in reality, many developers test against the data they expect to receive, not the unexpected data.

So why don't they have tighter validation on their backend? Much of the backend relies on database calls using bind parameters rather than interpolated SQL, so it seems relatively safe compared to most systems. But they don't do as much validation there because it's so time-consuming to write and it can impact performance. They're relying on their clean API to handle this for them.

This is one area in which Perl is very embarrassing.

In Java, you can have method signatures like this:

public <T extends MyClass> Set<T> getSomeThing(int id) {
    ...
}

That tells us that getSomeThing takes a single integer and returns a Set of things that extend MyClass. Note that the return type is not part of the signature, because method overloading can't dispatch on return types, but returning the wrong type is still an error (in most cases the code won't even compile).

So in Perl, we sometimes validate our input data, but usually we don't. We almost never validate our output data.

Why don't we validate our input data?

The developer forgets
The developer doesn't know better
The developer is lazy
The developer is concerned about performance

When the developer does validate their input data, because it's not a core part of Perl:

It's easy to get the validation wrong
It's easy to duplicate validation code all over the place
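
As a small illustration of how easy it is to get hand-rolled validation wrong, here's a sketch with two invented helper names (neither comes from the client's code). The naive check looks fine, but `$` also matches just before a trailing newline, so it lets "42\n" through:

```perl
use strict;
use warnings;

# Hypothetical helper: the check many of us would write first.
# Bug: /^\d+$/ accepts "42\n", because $ matches before a final newline.
sub looks_like_id_naive {
    my ($value) = @_;
    return ( defined $value && $value =~ /^\d+$/ ) ? 1 : 0;
}

# Hypothetical helper: anchored with \A and \z and limited to ASCII digits.
sub looks_like_id_strict {
    my ($value) = @_;
    return ( defined $value && $value =~ /\A[0-9]+\z/ ) ? 1 : 0;
}

for my $input ( "42", "42\n", "abc" ) {
    ( my $shown = $input ) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-6s naive=%d strict=%d\n", $shown,
        looks_like_id_naive($input), looks_like_id_strict($input);
}
```

Now imagine a slightly different copy of that regex in every sub that takes an ID.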

So what happens when a developer is good about abstracting out their validation code, writing a relatively clean API to use it, and using it consistently? That's when someone says "yeah, but Moose is too slow."

With Perl 5.19, we're finally getting subroutine signatures, but they're marked as experimental and may change or even be removed in future versions. They're actually not too bad, though we still can't assert the type of an argument.
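
For the curious, here's a minimal sketch of what the experimental syntax looks like, assuming perl 5.20 or later. The body of get_something and its manual integer check are invented for illustration. Core signatures only check the number of arguments, so any type check is still written by hand:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical stand-in for the API method shown earlier; the body is invented.
sub get_something ( $id, $foo = undef, $bar = undef ) {
    # No type assertions in core signatures, so this part is still manual.
    die "id must be an integer\n"
        unless defined $id && $id =~ /\A-?[0-9]+\z/;
    return { id => $id, foo => $foo, bar => $bar };
}

my $ok = get_something( 42, 7 );

# Too many arguments: perl raises the error before the body even runs.
my $too_many = eval { get_something( 1, 2, 3, 4 ); 1 } ? '' : $@;

# Wrong type: only the hand-written check catches this one.
my $bad_type = eval { get_something('abc'); 1 } ? '' : $@;

print "ok id:    $ok->{id}\n";
print "too many: $too_many";
print "bad type: $bad_type";
```

The arity check comes for free; the type check is exactly the kind of boilerplate we'd like perl to take over.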

I honestly don't know why it's taken this long to even have them experimentally in the core. Why on earth would you want to keep manually doing something when you can have perl (not Perl) do it for you? For example, when I'm working with clients on cleaning up legacy code that doesn't have strict, I sometimes ask what the following line of code does:

$foo = bar;

As it turns out, in that context, you can't possibly know. First, perl will look to see if there's a sub named bar(). If so, it calls bar() and assigns the return value to $foo. However, if it doesn't find that subroutine, it assigns the string bar to $foo!

Note that the assignment of $foo happens at run time, but the evaluation of bar happens at compile time. That means that if you want to assign the return value of bar() to $foo, then the subroutine must be declared before the bareword! This works:

sub bar { 4 }
$foo = bar; # assigns 4 to $foo

This doesn't work:

$foo = bar; # assigns the string 'bar' to $foo
sub bar { 4 }

What a maintenance nightmare! How do you stop that? Well, you use strict, of course (and warnings, but that's a different story). That's why virtually all experienced Perl devs will tell you to use strictures. Removing strictures from a program will generally not cause it to misbehave, but adding them to existing code could break all sorts of things, so you start out with strict rather than bolting it on as an afterthought.
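
Here's a small sketch of the difference, using string eval so that one script can show both behaviours (it assumes no sub named bar exists anywhere in the program):

```perl
use strict;
use warnings;

# With strictures off, the unknown bareword quietly becomes the string 'bar'.
my $without_strict = do {
    no strict 'subs';
    no warnings;
    eval q{ my $foo = bar; $foo };
};

# With strictures on, the very same line refuses to compile.
my $compiled = eval q{ my $foo = bar; 1 };
my $error    = $@;

print "without strict: $without_strict\n";
print "with strict:    $error";
```

The second eval never runs the assignment at all; perl rejects the bareword at compile time and says so in $@.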

In other words, Perl devs are saying "let the computer help you avoid silly mistakes." And we're militant about that!

And that's what method/subroutine signatures do. For a very low cost, a very common class of mistakes is simply eliminated and devs can stop writing duplicated, broken validation code all over the place.

I strongly urge everyone to use the new signatures as much as possible (outside of production code) so that P5P can get strong feedback on what works and what doesn't. We need them. Of course, we also need the p5-mop, but that hasn't been touched in a while; I hope it's not dead.

If you're interested in hiring me or any of the talented developers we have working at All Around The World, drop me a line at ovid@allaroundtheworld.fr.

11 Comments

Stevan Little | March 5, 2014 2:46 PM

Ovid, nope, p5-MOP is most certainly not dead, we have just been too busy with work (and prior to that, the holidays) to do any real work on it. I am currently shoving a ton of C and XS into my head so that once work stuff clears up I can dive right in.

jjn1056 | March 5, 2014 5:54 PM

Ovid,

I'm looking forward to sub signatures as well, but for the validation case I personally find something like HTML::FormHandler a better fit, as I'd prefer to return a validation results object rather than raise an exception (which is what would happen in the subroutine sig method, I believe).

Thanks! -jnap

Anonymous | March 5, 2014 6:12 PM

IMHO it would be better to implement Smalltalk-style method signatures instead of Java/C-style ones. The readability of Smalltalk code is significantly enhanced by forcing arguments to be interspersed inside the method name itself (e.g. "foo setValue:v forKey:k" is much more readable than the Javaesque "foo.set(key, value)"). It seems like a minor difference, but in reality it makes a huge difference in daily use.

Ross Attrill | March 5, 2014 8:42 PM

There are several method signature modules on CPAN. I have been watching Method::Signatures for some time and I can see there is a huge amount of design work that has gone into the signature syntax and features with this module.

Many of these modules seem to have many more features than the 5.19 proposal.

Possibly not the right place to ask this question, but how will Method::Signatures or similar extension modules work with the 5.19 proposal?

And is there any intent to bring more of the features of Method::Signatures into the core of Perl at some stage?

Thank you for the post Ovid.

Toby Inkster | March 5, 2014 9:34 PM

I've actually been doing quite a bit of work to make Kavorka's method signature syntax compatible with (a superset of) the Perl 5.20 sub signature syntax.

I've also filed "wishlist" bugs against Method::Signatures and Function::Parameters to the same end. As an aside, I believe all three of the signature implementations I mention above support the exact syntax mentioned in the post:

method get_something(Int $id, Int|Undef $foo, Str|Undef $bar ) {
    ....
}

Damian Conway | March 6, 2014 9:39 AM

Ross asked:

"how will Method::Signatures or similar extension modules work with the 5.19 proposal?"

For the moment, M::S and the other modules will remain independent of the new signatures feature of 5.19+. They will certainly continue to work correctly. You can even use both mechanisms in the same program. There's no conflict, because M::S uses the declarators func and method, whereas the built-in feature uses good old sub.

The long-term hope is that the built-in signature mechanism may eventually be made pluggable...at which point these more sophisticated signature modules would be able to use those core hooks directly, rather than relying on deep magic like the brilliant-but-terrifying Devel::Declare.

Personally, I would like to see all—or at least most—of the features of M::S integrated into the built-in signature mechanism, but I expect that will be a slow and cautious process. For example, Perl 5 would need to have instituted some kind of native type constraint system (such as p5-mop) before adding built-in parameter type-checking would truly be reasonable.

Stevan Little replied to comment from Damian Conway | March 7, 2014 3:13 PM

Actually p5-mop will not have a built in type constraint system, it will be able to support existing type constraint systems via the trait mechanism. This test shows an example of that.

Aristotle | March 7, 2014 7:59 PM

"I honestly don't know why it's taken this long to even have them experimentally in the core."

Yeah? Well I know. And Ross Attrill said it:

"Many of these modules seem to have many more features than the 5.19 proposal."

That’s why. Everybody wants lots of features in signatures, and everybody wants a different set. So every previous proposal suffered the same fate: people couldn’t quite accept a signature proposal that lacked just the one extra feature that seemed indispensable given the overall minimalism of the proposal in question, and invariably that feature (or two) would throw open a wide array of controversial design choices. Because it always seemed like the barest minimum was just too little, at least to someone, it kept meeting death by p5p megathread.

Quite why this one evaded that trap, I don’t know; whether it was solely Peter Martini’s commitment to stick to the simplest viable thing is not clear to me, since the discussion swayed dangerously and teetered close to doom territory several times. But somehow it emerged unscathed from every incident, possibly because the threads usually died down not long after they started to flirt with disaster, however that managed to happen. Somehow this was the proposal to negotiate all those minefields and live to tell the tale.

Hard to believe sometimes how much luxury can be the enemy of comfort.

mortenb | March 8, 2014 1:36 AM

We use a heavily extended version of Params::Validate along with forced hashref-only parameters. In practice this means adding a rather complex validation lookup hash to every sub, with an entry for every parameter key. A sub signature like the one Ovid proposes will help a lot in reducing this for plain type-checking, but there will always be a need for the more extensive checks like timestamping, object validation, SQL injection, evals, etc.

I had to reimplement the dreaded pseudohashes in perl 5.16.3 when we upgraded a cluster from perl 5.8.9. The intention was to avoid writing blobs that the 5.8.9 nodes still running could not understand. When all nodes were upgraded, we turned off the write_pseudohash flag and were finally rid of it and the storable also. This cluster has not been down, except for a database crash, in the last 5 years. Perl really kicks ass. This customer has said many times that this is the most stable software he has ever come across. We have run out of database tablespace and queued up gigabytes on MQ, but once fixed it has just kept on going without any downtime.

This code had insane parameter checking and validation, but it turned out that in write back-and-forth tests between 5.16.3 and 5.8.9, the 5.16.3 pseudohash was ~20% faster than the 5.8.9 pseudohash. I'm not sure this was due to 5.16.3 being faster than 5.8.9, but it clearly showed that full parameter validation the way Params::Validate does it seems to be quite fast.

Damian Conway | March 8, 2014 10:37 AM

Thanks for clarifying the nature of p5-mop's type support, Stevan. I guess you've got more than enough on your plate at this stage, without adding in a native type system as well. ;-)

Aristotle replied to comment from Aristotle | March 9, 2014 4:16 AM

(Oh and btw, in case my comment seemed berating – it wasn’t supposed to be, not exactly: I was myself one of the people who, in a previous round, thought a minimal signature proposal really can’t work without just that one extra feature. And that’s after I’d seen a previous proposal die from being loaded with obviously too many features in the first place. So the next time around I thought I was arguing for the minimum… which was nonsense. So by the time Peter Martini showed up with his patch, I had finally swallowed my lesson: any viable proposal must have no features. And that’s what we got. And that’s why we got anything at all. Hooray! At long last…)

About Ovid

Freelance Perl/Testing/Agile consultant and trainer. See http://www.allaroundtheworld.fr/ for our services. If you have a problem with Perl, we will solve it for you. And don't forget to buy my book! http://www.amazon.com/Beginning-Perl-Curtis-Poe/dp/1118013840/
