Better yet, with knowledge of the algorithm you can conclude that it’s not possible for fib(n-1) to be cached without fib(n-2) also being cached (because the latter must be calculated in order to calculate the former), and by the same token it’s not possible for fib(n-2) to be uncached without fib(n-1) also being uncached. So you can run the calculation out of the cache, and you only ever need to check whether fib(n-1) is cached:

```
use Carp;

my @fib = (undef, 1, 1);

sub fib4 {
    my $n = shift;
    croak "fib($n) undefined" if $n <= 0;
    $fib[$n] //= ( $fib[$n-1] // fib4($n-1) ) + $fib[$n-2];
}
```

This recurses only linearly, and only as far as necessary, never redoing any work.
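To see that concretely, here is an instrumented copy of `fib4` with a call counter bolted on; the counter and the variable names are mine, purely for illustration:

```
use strict;
use warnings;
use feature 'say';
use Carp;

my $calls = 0;
my @fib   = (undef, 1, 1);

sub fib4 {
    my $n = shift;
    $calls++;  # count every invocation, cached or not
    croak "fib($n) undefined" if $n <= 0;
    $fib[$n] //= ( $fib[$n-1] // fib4($n-1) ) + $fib[$n-2];
}

my $value      = fib4(20);
my $cold_calls = $calls;   # 18: one call per uncached value, linear in n

$calls = 0;
fib4(20);
my $warm_calls = $calls;   # 1: pure cache hit, no recursion at all

say "fib(20) = $value; cold: $cold_calls calls, warm: $warm_calls call";
```

Running it shows 18 calls cold and a single call warm: linear work once, constant work thereafter.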

The nice thing, of course, is that a cache amortises all calculations over all subsequent calls. If you change the benchmark to run many calculations, e.g.:

`timefor { $a = fib1(180+$_) for 1..30 } 'backwards';`

you will find that it takes essentially identical time to call `fib4` once or thirty times. It’s hard to make the needle move at all on `fib4`, even as the other caching versions accumulate runtime. In fact, if you compare

`timefor { $a = fib1 $_ for 1..200 } 'backwards';`

with

`timefor { $a = fib1 $_ for reverse 1..200 } 'backwards';`

you will find that calculating progressively larger Fibonacci values in a loop is faster for `fib4` than calculating them in a single call, since it is never forced to recurse even once. In that case it beats the other implementations by a very long shot.

If you prime the caches before benchmarking, you will find that `fib4` is 50% faster than `fib3`, which in turn is 6× faster than `fib1`.

(Needless to say, `fib2` is not even a contender after timing a handful of repeated calls.)
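The `timefor` helper is defined elsewhere in this article; from its use above it is assumed to take a code block and a label and report wall-clock time. Here is a minimal hypothetical sketch of such a helper on top of the core Time::HiRes module, wired up to `fib4`:

```
use strict;
use warnings;
use Time::HiRes qw( gettimeofday tv_interval );

# Hypothetical stand-in for the timefor helper used in this article; it is
# assumed to take a code block and a label, run the block, and print the
# wall-clock time taken.
sub timefor (&$) {
    my ( $code, $label ) = @_;
    my $start = [ gettimeofday ];
    $code->();
    printf "%-10s %.6fs\n", $label, tv_interval( $start );
}

my @fib = (undef, 1, 1);

sub fib4 {
    no warnings 'recursion';  # a single big call may recurse past 100 frames
    my $n = shift;
    die "fib($n) undefined" if $n <= 0;
    $fib[$n] //= ( $fib[$n-1] // fib4($n-1) ) + $fib[$n-2];
}

timefor { $a = fib4 $_ for 1..200         } 'forwards';
# The second loop finds every value already cached by the first,
# so it measures pure cache hits.
timefor { $a = fib4 $_ for reverse 1..200 } 'backwards';
```

The `(&$)` prototype is what lets `timefor { ... } 'label';` be called without the `sub` keyword or a comma, matching the calls shown above.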

All this goes to the point I made above: if you use your knowledge of the algorithm to weave caching into the right place, you get otherwise inconceivable speed-ups.
