Avoid a Common Software Bug By Using Perl 6

Back in 2001 I was working for a company that had a client in a serious bind: the maker of their point of sale (POS) system had suddenly jacked up the license fee to the point where our client would go out of business. They needed a new POS in 21 days.

We grabbed an open source POS system and identified all of the features it was missing that our client would need. Then it was 21 days of overtime and no days off. Back in the days of use.perl.org, I blogged about this hell almost every day. It was also, interestingly, the first project I wrote software tests for. The other main dev on the project was teaching me how Perl's testing tools worked and as the days went on, I found myself incredibly proud of seeing all of those tests pass and catching bugs I would not have otherwise caught.

Then disaster struck: we tried to actually run the software instead of just testing it. The Tk panels would appear and then instantly crash. Adding some debugging code showed that we had a bit of a mess because we had only unit tested the code. The different parts of the software were isolated in our testing. The individual bits worked fine, but they had no idea how to talk to one another.

Passing bad data around is an incredibly common source of bugs. In Perl 6, this is not only easy to avoid, but the tools to do so are also far more powerful than those in most mainstream languages.

What follows is drawn from my Perl 6 for Mere Mortals talk.

Recursion is often taught with the Fibonacci series. Some people hate that because they don't use the Fibonacci series, but from a teaching standpoint, it's useful because it's dead-simple to understand. Here's the mathematical description:


F_0 = 0
F_1 = 1
F_n = F_{n-1} + F_{n-2}

The recursive version looks like the following Perl 6 code:

sub fib($nth) {
  given $nth {
    when 0  { 0 }
    when 1  { 1 }
    default { fib($nth-1) + fib($nth-2) }
  }
}

Note that the body of the given block effectively mirrors the mathematical description. Thus, recursion can be a great way to cleanly write a function. However, recursion has three very common failure modes:

  1. Failing to validate your arguments (not restricted to recursion, of course)
  2. Failure to provide the base case(s) to break recursion (we have them above)
  3. Excessive recursion depth blowing the stack (we have that in spades)

In the above, you can say fib(3.7) and we have infinite recursion because n-1 and n-2 will never match our base cases (though you'll blow the stack quickly). How would we handle that in Perl 5? Well, there's more than one way to do it:

sub fib {
  die unless $_[0] =~ /^\d+$/;
  ...
}

sub fib {
  die unless $_[0] == int($_[0]);
  ...
}

use Regexp::Common;
sub fib {
  die unless $_[0] =~ $RE{num}{int};
  ...
}

Each of the above has multiple bugs lurking in it. You can have fun seeing if you can identify all of them. What's worse, this is what we often find:

sub foo {
    my ( $self, $this, $aref, $hashref )  = @_;
    # lots and lots of skipped validation ...
}

Validating our data in most dynamic languages is done in a very ad-hoc manner. In fact, it's often not done at all and we just hope that things work. In practice, they usually do. But at 2 AM, when you're frantically trying to chase down where the bad data came from, you might find yourself cursing and wishing you had a stricter type system so that the bad data could be caught as soon as it occurs.

Gradual Typing

Perl 6 has that stricter type system. It's completely optional, and the approach is called gradual typing. Basically, it works like this:

sub fib(Int $nth) {
  given $nth {
    when 0  { 0 }
    when 1  { 1 }
    default { fib($nth-1) + fib($nth-2) }
  }
}

You see that Int $nth in the signature? It's nice and simple. It's there if you want it, but you don't need to use it. It's also more likely to be used because it's so simple.
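To see the constraint at work, here's a minimal sketch. The $bad variable and the use of try are my own choices for illustration: routing the value through an untyped variable means the check happens when the call is made, and try lets the script carry on so we can look at the error.

```perl6
sub fib(Int $nth) {
  given $nth {
    when 0  { 0 }
    when 1  { 1 }
    default { fib($nth-1) + fib($nth-2) }
  }
}

say fib(10);   # 55

# An untyped variable hides the value's type from the compiler,
# so the Int check is made when fib() is actually called.
my $bad = 3.7;
my $result = try fib($bad);
say $!.defined ?? "rejected: {$!.message}" !! "got $result";
```

Instead of recursing forever, the call with 3.7 is refused before the body of fib ever runs.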

(As an aside: in Perl 5, a scalar internally has multiple "slots" to store data in different representations. By declaring the type of a variable, we no longer need to have as many slots and there are plenty of performance optimizations available.)

Subsets

But what if we ask for fib(-3)? There is no -3rd Fibonacci number, so let's toss on a constraint:

sub fib(Int $nth where * >= 0) {
  given $nth {
    when 0  { 0 }
    when 1  { 1 }
    default { fib($nth-1) + fib($nth-2) }
  }
}

It's the where * >= 0 bit we care about here.

The asterisk is an example of the Whatever type. When I see it, I like to read it as "whatever I got". So the argument is "an integer called $nth where whatever I got is greater than or equal to zero".

That's actually pretty easy to read and understand and, more interestingly, most mainstream computer languages simply don't support something like that. This will make it very easy to ensure we don't get passed bad data.
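Here's a small sketch of the constraint in action. The is-valid-index sub is hypothetical and exists only to show that the same condition can be written as a block, where $_ is "whatever I got".

```perl6
sub fib(Int $nth where * >= 0) {
  given $nth {
    when 0  { 0 }
    when 1  { 1 }
    default { fib($nth-1) + fib($nth-2) }
  }
}

# The same constraint written as a block; $_ is the argument.
# (is-valid-index is a hypothetical helper, not part of fib.)
sub is-valid-index(Int $n where { $_ >= 0 }) { True }

my $negative = -3;
say (try fib($negative)).defined;              # False: the constraint rejected -3
say (try is-valid-index($negative)).defined;   # False
say fib(6);                                    # 8
```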

But it gets even better! The above gets clumsy if we have several arguments and it's not reusable. So let's pull that out into a subset that we can reuse wherever we need to declare a type.

subset NonNegativeInt of Int where * >= 0;

sub fib(NonNegativeInt $nth) {
  given $nth {
    when 0  { 0 }
    when 1  { 1 }
    default { fib($nth-1) + fib($nth-2) }
  }
}

That's right. Using subsets, you can easily declare your own types on the fly. Other languages tend to allow this but require you to create a class to define your type. When using Perl 6 subsets, you'll find that having a full-blown class for every type you need is rather clumsy (that being said, everything is an object in Perl 6, but the sugary frosting is awesome).

As an aside, I predict that one of the most common subsets is going to be this one:

subset NonEmptyString of Str where *.chars > 0;

No more checking to see if the string is empty! Now you can just assert it and Perl 6 will check for you.
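As a rough sketch of how that reads in practice: a subset can be smartmatched like any other type, and it can sit in a signature. The greet sub is hypothetical, made up just for this example.

```perl6
subset NonEmptyString of Str where *.chars > 0;

# A subset can be smartmatched like any other type.
say 'Ovid' ~~ NonEmptyString;   # True
say ''     ~~ NonEmptyString;   # False
say 42     ~~ NonEmptyString;   # False: not a Str at all

# greet() is a hypothetical sub, used only to show the type in a signature.
sub greet(NonEmptyString $name) { "Hello, $name!" }

say greet('Ovid');                 # Hello, Ovid!
my $empty = '';
say (try greet($empty)).defined;   # False: the call was rejected
```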

Or here's something that will come in very handy with an advanced ORM. What do you do if you have a field declared as a VARCHAR(255)? Assuming that you don't want to allow empty strings and the data must be shorter than 256 characters:

subset FirstName of Str where 0 < *.chars < 256;

We can now create our own types on the fly to mirror what we have in our database. Somebody forgot to configure MySQL in strict or traditional mode? Who cares! No more silent truncation of our data in the database! I'm looking forward to ORMs which take advantage of this.
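A quick sketch of checking values against that kind of type before they ever reach the database. Note that I've written the constraint in block form here, which is my own variation on the Whatever-star version; the test values are made up.

```perl6
# Same constraint as FirstName, written as a block ($_ is the string).
subset FirstName of Str where { 0 < .chars < 256 };

say 'Curtis'    ~~ FirstName;   # True
say ''          ~~ FirstName;   # False: empty
say ('x' x 255) ~~ FirstName;   # True: the longest value that fits
say ('x' x 256) ~~ FirstName;   # False: would not fit in VARCHAR(255)
```

The 256-character string is rejected by the type rather than being silently truncated later.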

Built-in Memoization

Getting back to our Fibonacci function, that code will blow our stack with larger numbers, so what do we do?

All recursive functions can be rewritten as iterative, but the larger the function, the harder that is to do. Plus, we lose the elegant simplicity of mirroring the mathematical definition and have to walk carefully through the code to see if the behavior is the same.
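For comparison, here's one possible iterative rewrite. It's only a sketch (fib-iter is my own name for it), but it shows the trade-off: no deep recursion, and no resemblance to the mathematical definition either.

```perl6
sub fib-iter(Int $nth where * >= 0) {
  my ($a, $b) = 0, 1;                 # F_0 and F_1
  ($a, $b) = $b, $a + $b for ^$nth;   # advance the pair $nth times
  $a;
}

say fib-iter(0);    # 0
say fib-iter(10);   # 55
say fib-iter(50);   # 12586269025
```

You have to trace the loop by hand to convince yourself that it returns the same values as the recursive version.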

We could rewrite the function to handle the caching, but that gets ugly, is prone to bugs (as I had in my slides), and also obscures the intent of the code:

sub fib(NonNegativeInt $nth) {
  state %fib_for;
  unless %fib_for{$nth}:exists {
    given $nth {
      when 0  { return 0 }
      when 1  { return 1 }
      default { %fib_for{$nth} = fib($nth-1) + fib($nth-2) }
    }
  }
  return %fib_for{$nth};
}

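There are plenty of ways to deal with this, but Perl 6 provides caching natively (in current Rakudo releases the trait is experimental and must be enabled with use experimental :cached;):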

sub fib(NonNegativeInt $nth) is cached {
  given $nth {
    when 0  { 0 }
    when 1  { 1 }
    default { fib($nth-1) + fib($nth-2) }
  }
}

Nice! We now have a fairly powerful type constraint, easy-to-read code, and caching. No rocket science here. It just works.
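Here's a self-contained sketch you can paste into a file and run. It assumes a reasonably recent Rakudo, where the is cached trait has to be switched on with use experimental :cached; the choice of 80 as the argument is mine.

```perl6
use experimental :cached;   # current Rakudo requires this for "is cached"

subset NonNegativeInt of Int where * >= 0;

sub fib(NonNegativeInt $nth) is cached {
  given $nth {
    when 0  { 0 }
    when 1  { 1 }
    default { fib($nth-1) + fib($nth-2) }
  }
}

say fib(80);   # 23416728348467685
```

Without the trait, the naive recursive version makes an exponential number of calls and would not finish fib(80) in any reasonable time.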

Return types

Getting back to the huge class of bugs I was first talking about: different functions talk to each other and it's important that they send and receive the correct data. So let's look at this:

sub will-it-blend (NotAnAnimal $something) returns Bool {
    if $something.does(Blendable) {
        return True;
    }
    else {
        return blend-it($something);
    }
}

What does blend-it() return? If it doesn't return a boolean, this code will throw a runtime error rather than risk silently corrupting your data. Part of the reason tests are more common (and more verbose) in dynamic languages is that it's so hard to check the data being passed around. Integration tests are particularly bad at this. Now, it becomes trivial, though still optional.
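To make that concrete, here's a minimal sketch. The item-count sub is hypothetical: it simply hands back whatever it was given, so the declared return type is the only check in play.

```perl6
# item-count is a hypothetical sub; it returns its argument unchanged,
# so only the "returns Int" declaration guards the result.
sub item-count($raw) returns Int { $raw }

say item-count(42);   # 42

my $outcome = try item-count("lots");
say $!.defined ?? "rejected: {$!.message}" !! "got $outcome";
```

The second call fails at the point where the bad value tries to leave the sub, which is much closer to the source of the problem than wherever the caller would eventually have tripped over it.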

So our Fibonacci function now looks like this:

sub fib(NonNegativeInt $nth) is cached returns NonNegativeInt {
  given $nth {
    when 0  { 0 }
    when 1  { 1 }
    default { fib($nth-1) + fib($nth-2) }
  }
}

That's powerful, expressive, and very, very easy to read. It's not nearly as difficult as people fear.

Conclusion

Many people who are working on Perl 6 are pretty excited about what it can do, but when you read blog posts about using red-black trees in Perl 6, your eyes tend to glaze over the same way mine do when someone starts yammering on about football statistics.

Perl 6 is big and yes, there are complex corners of it, but those are often for the hard tasks. In your day-to-day code, it won't look any more difficult than any other language. In fact, just like any new language you learn, you'll start with "baby Perl 6" and gradually move on to more complicated code.

Grab rakudobrew and check out Perl 6 today.

9 Comments

I'm a fan of gradual typing. I have actually experimented with it in Python and my conclusion is that type checking is only really valuable when it's done statically at compile time. Perl 6 does some compile time type checking but it only works in some very simple cases. I would love to see that improved. My favorite example of this done right is TypeScript. Its type inference engine is really smart, and it doesn't provide runtime checking at all. Dart is also nice, but it lacks structural interfaces. Dart also provides optional runtime type checking, but nobody uses it; it's not very useful.

Perl 6 does some compile time type checking but it only works in some very simple cases.

Can you provide examples or data?

I was thinking that all code had static types that were checked at compile-time and some code, not much, also had dynamic types that were checked at run-time.

In the form of a question/answer series, here's how I thought things worked:

==When are types checked?

Static types are checked at compile-time.

==What's a "static type"?

Class types like Any, Int, Str, and users' classes.

Additionally, static subsets.

==What's a "static subset"?

Subsets are the subsets introduced by Ovid above.

A static subset is one that the compiler has decided to reduce to and treat as a finite set.

The compiler may treat Wday in the following code as a static subset and thus a static type:

my enum Day <Su M Tu W Th F Sa>;

subset Wday of Day where M .. F;

==So what are dynamic types?

Subsets that aren't static are dynamic subsets and are treated as dynamic types. These are the only dynamic types.

==Which code avoids dynamic types and hence is fully type-checked at compile time?

All "untyped" code. Scalar containers ($foo, @bar[1], @bar[2] etc.) and values are assigned the static type Any. Of course this is pretty trivial but it is type checking and it does happen at compile-time. :)

Most if not all the code in the core libraries.

Most of the code in the ecosystem.

============

I'd appreciate correction or wholesale destruction of any wrong ideas from anyone. :)

Hello,

I guess my version of perl6 does not support the where * >=0 verbiage. I am using RHEL 6.6. When I query the rpm database I get this oldish looking version string rakudo-star-0.0.2011.04_3.3.0-1. Do you know the version I could go find where I would have better luck doing this statement?

Just fyi I'm not really a developer, just casually playing around.

Thanks in advance for any advice.

V/r, Bryan

rakudo-star-0.0.2011.04_3.3.0-1

That's from a quite different era in P6's development. ;)

http://irclog.perlgeek.de/perl6/search/?nick=&q=rpm doesn't look promising.

There are docker images: https://registry.hub.docker.com/search?q=perl6+rakudo

If you are willing to build your own (it's typically pretty simple) go to http://rakudo.org/downloads/star/, download the latest, and read the README and INSTALL files.

Or do this with git:

git clone https://github.com/tadzik/rakudobrew ~/.rakudobrew

export PATH=~/.rakudobrew/bin:$PATH

2011.04 is nearly 4 years old at this point; a new version of the rakudo compiler is released every month, and there has been a LOT of development in the last four years.

If your ports system isn't providing an up to date version (and I don't think it'll happen until the "release" later this year) I recommend using rakudobrew in the meantime to keep up to date. See Ovid's article here about getting started with rakudobrew:

http://blogs.perl.org/users/ovid/2014/08/try-rakudobrew-and-play-with-concurrency.html

Thanks for your references. I'll check them out.

V/r, Bryan

Perl 6 does some compile time type checking but it only works in some very simple cases.

Can you provide examples or data?

sub test(Int $num) {
    say $num
}

test("text"); # Throws a compile time error.

my $input = "test"; 
test($input); # Throws a runtime error.

The problem is basically that Perl 6 doesn't have a type inference engine. You can still make that second example error at compile time by explicitly typing the variable:

my Str $input = "test"; 
test($input); # Throws a compile time error.

But it adds unnecessary boilerplate. Good gradual typing requires good type inference. Also, it'd be nice to have an option to remove runtime type checking.

Very nice article, thank you very much Ovid!

One question though, about this code:

die unless $_[0] =~ /^\d+$/;

One obvious bug is that it will accept octal numbers such as "040". Will the Perl 6 "Int" type prevent such numbers?

About Ovid

Freelance Perl/Testing/Agile consultant and trainer. See http://www.allaroundtheworld.fr/ for our services. If you have a problem with Perl, we will solve it for you. And don't forget to buy my book! http://www.amazon.com/Beginning-Perl-Curtis-Poe/dp/1118013840/