Perl Weekly Challenge 049: Smallest Multiple and LRU Cache

Smallest Multiple

Write a script to accept a positive number as command line argument and print the smallest multiple of the given number consists of digits 0 and 1.

For example: For given number 55, the smallest multiple is 110 consisting of digits 0 and 1.

The simplest naive solution is to start from the input number, keep adding it to itself until the result consists of 0's and 1's exclusively.

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

my $x = shift;
say "$x ", smallest_multiple($x);

sub smallest_multiple {
    my ($n) = @_;
    my $r = $n;
    $r += $n until $r =~ /^[01]+$/;
    $r
}

The problem of this solution is that it’s very slow for multiples of 9, as the result of the function is a very large number (111_111_111 for 9, 111_111_111_111_111_111 for 99).

There is a way how to find the number more effectively, though. Instead of the number itself, we’ll start from the smallest binary number of the same length; and we’ll repeatedly add 1 to it, but in binary, until the result read as decimal can be divided by the starting number. This way, we can reach the result for 18 in 0.011 seconds instead of 10 seconds. That’s a massive improvement!

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

my $x = shift;
say "$x ", smallest_multiple($x);

sub smallest_multiple {
    my ($n) = @_;
    return 0 unless $n;

    my $binary = 1 . (0 x (length($n) - 1));
    increment($binary) while $binary % $n;
    $binary
}

sub increment {
    my $pos = rindex $_[0], 0;
    if ($pos > -1) {
        substr $_[0], $pos, 1, '1';
        substr $_[0], $pos + 1, length($_[0]) - $pos - 1,
                         '0' x (length($_[0]) - $pos - 1);
    } else {
        $_[0] = '1' . ('0' x length $_[0]);
    }
}

We have already seen the binary increment in Perl Weekly Challenge 044: One Hundred, Two Hundred, but this time we operate on a string, not an array.

Note that the downside of the optimisation is a longer and less readable code.

LRU Cache

Write a script to demonstrate LRU Cache feature. It should support operations get and set. Accept the capacity of the LRU Cache as command line argument.

I started by converting the example into a test suite. I modified the behaviour a bit, though: returning -1 for elements not found in the cache means we couldn’t store -1 as a cached value. In fact, any value we choose would be impossible to store: we could die on a cache miss, or we can return a reference to the value if it’s been found, and undef otherwise. Then it would be possible to store undef in the cache, as it would be returned as \undef.

use Test::More;

my $c = 'Cache::LRU'->new(3);
$c->set(1, 3);
$c->set(2, 5);
$c->set(3, 7);

is_deeply $c->inspect, [1, 2, 3];

is ${ $c->get(2) }, 5, 'get 2';

is_deeply $c->inspect, [1, 3, 2];

is ${ $c->get(1) }, 3, 'get 1';

is_deeply $c->inspect, [3, 2, 1];

is $c->get(4), undef, 'get 4';

is_deeply $c->inspect, [3, 2, 1];

$c->set(4, 9);

is_deeply $c->inspect, [2, 1, 4];

is $c->get(3), undef, 'get 3';

done_testing();

Note the inspect method. We need to implement it only for the purpose of the test, because it could be non-trivial for some of the implementations of the cache.

When you google for “linked lists” and “perl”, you might find articles explaining that linked lists are not needed in Perl, as its arrays are flexible enough to replace them. So, I implemented the LRU Cache using an array instead of a linked list.

The cache object consists of three slots: a number that represents the capacity, a hash that maps the keys to values, and an array of keys that represents their order.

package Cache::LRU;
use enum qw( CAPACITY HASH ARRAY );

sub new {
    my ($class, $capacity) = @_;
    bless [$capacity, {}, []], $class
}

sub capacity { $_[0][CAPACITY] }

sub _value { $_[0][HASH]{ $_[1] } }

sub _move_to_start {
    my ($self, $key) = @_;
    @{ $self->[ARRAY] } = ($key, grep $_ ne $key, @{ $self->[ARRAY] });
}

sub get {
    my ($self, $key) = @_;
    return undef unless exists $self->[HASH]{$key};

    $self->_move_to_start($key);
    return \$self->_value($key)
}

sub set {
    my ($self, $key, $value) = @_;
    $self->[HASH]{$key} = $value if 3 == @_;
    $self->_move_to_start($key);
    delete $self->[HASH]{ pop @{ $self->[ARRAY] } }
        if @{ $self->[ARRAY] } > $self->capacity;
}

sub inspect {
    [reverse @{ $_[0][ARRAY] }]
}

Do you see the problem? It’s hidden in the move_to_start subroutine. When we need to move an element to the start, we need to iterate over the whole array, because we don’t know its index. Storing the index wouldn’t help, either, as we would have to adjust the stored indices after each change. The problem doesn’t manifest in our toy example, but creating a cache with capacity 2000 and populating and querying it 10 thousand times takes more than 4 seconds.

Therefore, I decided to implement the cache using a real linked list. Let’s start with a class implementing it:

package Linked::List;
use strict;
use warnings;

use enum qw( KEY VALUE PREV NEXT );

sub new {
    my ($class, $key, $value) = @_;
    my $self = [];
    @$self[KEY, VALUE, PREV, NEXT] = ($key, $value, $self, $self);
    bless $self, $class
}

sub extract {
    my ($self) = @_;
    my $prev = $self->[PREV];
    my $next = $self->[NEXT];
    $prev->[NEXT] = $next;
    $next->[PREV] = $prev;
    @$self[PREV, NEXT] = ($self, $self);
}

sub prepend_to {
    my ($self, $list) = @_;
    return unless $list;
    $self->extract if $self->[PREV] != $self;
    $self->[NEXT] = $list // $self;
    $self->[PREV] = $list->[PREV] // $self;
    $list->[PREV][NEXT] = $self;
    $list->[PREV] = $self;
}

# Prevent memory leaks.
sub demolish {
    $_[0][NEXT] = $_[0][PREV] = undef;
}

sub key  { $_[0][KEY] }
sub prev { $_[0][PREV] }
sub next { $_[0][NEXT] }
sub value :lvalue { $_[0][VALUE] }

You can write a test suite for the class as an exercise.

The linked list is represented by its first element, but it’s implemented as a cyclic list, i.e. the first elements points to the last element as its predecessor and vice versa. The extract method detaches an element from the linked list by connecting the previous and next elements.

Note the demolish method. It breaks the cyclic reference to prevent memory leaks, we have to call it in our cache implementation when removing an element from the cache.

The cache implementation will be similar to the previous one. Neither get nor set will require walking the whole list. Again, optimisation makes the code a bit more complex and harder to read. On the other hand, the benchmark test takes 0.16 seconds, which is about 25 times faster!

package Cache::LRU;
use warnings;
use strict;

use enum qw( CAPACITY HASH FIRST );

sub new {
    my ($class, $capacity) = @_;
    my $self = [];
    @$self[CAPACITY, HASH, FIRST]
        = ($capacity, {}, undef);
    bless $self, $class
}

sub get {
    my ($self, $key) = @_;
    return unless exists $self->[HASH]{$key};

    my $element = $self->[HASH]{$key};
    if ($element != ($self->[FIRST] // -1)) {
        $element->extract;
        $element->prepend_to($self->[FIRST]);
        $self->[FIRST] = $element;
    }
    return \$element->value
}

sub set {
    my ($self, $key, $value) = @_;

    my $element;
    if (exists $self->[HASH]{$key}) {
        $element = $self->[HASH]{$key};
    } else {
        $element = 'Linked::List'->new($key, $value);
        $self->[HASH]{$key} = $element;
    }
    $element->prepend_to($self->[FIRST])
        unless $element == ($self->[FIRST] // -1);
    $self->[HASH]{$key}->value = $value;
    $self->[FIRST] = $element;
    if (keys %{ $self->[HASH] } > $self->[CAPACITY]) {
        my $last = $self->[FIRST]->prev;
        $last->extract;
        delete $self->[HASH]{ $last->key };
        $last->demolish;
    }
}

sub inspect {
    my ($self) = @_;
    my $element = $self->[FIRST];
    my @keys;
    for (keys %{ $self->[HASH] }) {
        unshift @keys, $element->key;
        $element = $element->next;
    }
    return [@keys]
}

You can see that both the get and set methods extract the element from the list and prepend it to the first position (set can even populate it if it doesn’t exist yet). They also change the cache object to point to the first element of the list. Moreover, set also removes the last element (the list is cyclic, so it’s just the element preceding the first one) when the capacity has been reached.

Do you see -1 mentioned twice in the code? We are comparing the reference addresses of two objects, if the object doesn’t exist, we use -1 which is an impossible value for an address (it means the cache is empty).

For completeness, here’s the benchmark code. Its output was used to verify both the implementations work identically.

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

srand(1);
my $c = 'Cache::LRU'->new(2000);
$c->set(int rand 5000, rand) for 1 .. 10_000;
$c->get(int rand 5000) for 1 .. 10_000;

my @inspect = $c->inspect;
@inspect = ref $inspect[0] ? @{ $inspect[0] } : @inspect;
say "@inspect"; 

Leave a comment

About E. Choroba

user-pic I blog about Perl.