December 2019 Archives

Perl Weekly Challenge 40: Multiple Arrays Content and Sublist Sorting

These are some answers to the Week 40 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a couple of days (December 29, 2019). This blog post offers some solutions to this challenge, please don’t read on if you intend to complete the challenge on your own.

Challenge # 1: Multiple Arrays Content

You are given two or more arrays. Write a script to display values of each list at a given index.

For example:

Array 1: [ I L O V E Y O U ]
Array 2: [ 2 4 0 3 2 0 1 9 ]
Array 3: [ ! ? £ $ % ^ & * ]

We expect the following output:

I 2 !
L 4 ?
O 0 £
V 3 $
E 2 %
Y 0 ^
O 1 &
U 9 *

Multiple Arrays Content in Perl 5

Since this task seems fairly simple and does not require too much typing, I’ll suggest several solutions, with added features each time.

Considering the example provided with the task, we can see that all three arrays have the same number of items and that each item is just one character long. With such input data, the solution may be as simple as this:

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

my @a1 = qw /I L O V E Y O U/;
my @a2 = qw /2 4 0 3 2 0 1 9/;
my @a3 = qw /! ? £ $ % ^ & */;

say "$a1[$_] $a2[$_] $a3[$_]" for 0..$#a1;

Running this script produces the following output:

$ perl mult_arrays.pl
I 2 !
L 4 ?
O 0 £
V 3 $
E 2 %
Y 0 ^
O 1 &
U 9 *

So job done in just one real code line, it seems, without any attempt at golfing, just normal relatively concise code.

But what if the arrays don’t have the same size? What if we don’t have 3 arrays but, for example, 2 or 4? What if the array’s items don’t have the same length? Of course, in any of these situations, our code may very well break.

Items With Different Sizes

Let’s start with items not having the same size. If we assume that the items all have less than seven characters, we can just use tabulations instead of spaces:

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

my @a1 = qw /I L OOO V E Y O U/;
my @a2 = qw /244 42 0 1233 222 0 11 90/;
my @a3 = qw /! ???? £ $ %% ^ & */;

say "$a1[$_]\t$a2[$_]\t$a3[$_]" for 0..$#a1;

Running the script displays this:

$ perl mult_arrays.pl
I       244     !
L       42      ????
OOO     0       £
V       1233    $
E       222     %%
Y       0       ^
O       11      &
U       90      *

If any item can have a size greater than or equal to 7, then using tabulations is not sufficient. In this case, we can use formatted printing (with printf or sprintf). This is a boring exercise, and therefore best left as an exercise to the reader. (Just kidding, of course. If you want this feature and don’t know how to do it, please refer to the end of the Raku section below, where an example on how to do it is provided.)

Varying Number of Sublists

For going further, we probably want to change our data structure. Rather than having three hard-coded arrays, we will use an array of arrays (AoA), where the number of sub-arrays can be anything.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

my @a = ( [ qw /I L O V E Y O U/ ], 
          [ qw /244 42 0 1233 222 0 11 90/ ],
          [ qw /! ???? £ $ %% ^ & */ ],
          [ qw /a b c d e f g f/ ] 
        );

my $sub_array_size = scalar @{$a[0]};
for my $i (0..$sub_array_size -1) {
    for (0..$#a) {
        print "$a[$_][$i]\t";
    }
    say "";
}

This works:

$ perl mult_arrays.pl
I       244     !       a
L       42      ????    b
O       0       £       c
V       1233    $       d
E       222     %%      e
Y       0       ^       f
O       11      &       g
U       90      *       f

but this starts to be somewhat unwieldy.

Matrix Transposition

At this point, we may want to transpose lines and columns of the @a array, store the transposed version into a @b array, and then simply print line by line the sub-arrays of @b:

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

my @a = ( [ qw /I L O V E Y O U/ ], 
          [ qw /244 42 0 1233 222 0 11 90/ ],
          [ qw /! ???? £ $ %% ^ & */ ],
          [ qw /a b c d e f g f/ ] 
        );
my @b;
my $sub_array_size = scalar @{$a[0]};
for my $i (0..$sub_array_size -1) {
    push @b, [ map { $a[$_][$i]} 0 .. @a - 1];
}
say join "\t", @$_ for @b;

This displays the same as before:

$ perl  mult_arrays.pl
I       244     !       a
L       42      ????    b
O       0       £       c
V       1233    $       d
E       222     %%      e
Y       0       ^       f
O       11      &       g
U       90      *       f

Sublists of Different Sizes

This line-column transposition makes the program only moderately simpler, but it will make the next (and last) step easier. The next step is to handle the case where the sub-arrays don’t have the same number of elements. To handle this case, we first need to loop over the input array to find out the size of largest sub-array (the last one in the example below) and change the range of the main for loop header accordingly. The only additional change required is to handle empty slots in the last code line that prints out the result:

use strict;
use warnings;
use feature qw/say/;

my @a = ( [ qw /I L O V E Y O U / ], 
          [ qw /244 42 0 1233 222 0 / ],
          [ qw /! ???? £ $ %% ^ / ],
          [ qw /a b c d e f g h i j k/ ] 
        );
my $max_size = 0;
for (@a) {
    $max_size = scalar @$_ if @$_ > $max_size;
}
my @b;
for my $i (0..$max_size - 1) {
    push @b, [ map { $a[$_][$i]} 0 .. @a - 1];
}
say join "\t", map {$_ // "" } @$_ for @b;

Our final version displays the following output:

$ perl  mult_arrays.pl
I       244     !       a
L       42      ????    b
O       0       £       c
V       1233    $       d
E       222     %%      e
Y       0       ^       f
O                       g
U                       h
                        i
                        j
                        k

Multiple Arrays Content in Raku (formerly known as Perl 6)

We don’t need to try to solve the problem step by step in the Raku programming language, as the Z Zip operator, used together with the [] reduction metaoperator to act on several sub-arrays, gives us a very easy way to transpose lines and columns of a 2-D array:

use v6;

my @a = < I L O V E Y O U >, 
        < 244 42 0 1233 222 0 11 90 >,
        < ! ???? £ $ %% ^ & * >,
        < a b c d e f g f >;

my @b = [Z] @a;  # performs transposition
say join "\t", map {$_ // "" }, @$_ for @b;

This displays the following output:

$ perl6 mult_arrays.p6
I       244     !       a
L       42      ????    b
O       0       £       c
V       1233    $       d
E       222     %%      e
Y       0       ^       f
O       11      &       g
U       90      *       f

Sublists of Different Sizes

But that doesn’t work if the sub-arrays don’t have the same size, since the Z Zip operator will stop if one of the operands runs out of elements prematurely.

One possibility to solve the problem is to add dummy items (for example empty strings) to the smaller sublists. This means we now need to iterate twice over the input array, once to figure out the longest sublist, and a second time to add the dummy items. For some reason, I wasn’t able to modify the sublists (they appear to be immutable), so I had to create a copy of the @a input array.

use v6;

my @a = < I L O V E Y O U >, 
        < 244 42 0 1233 222 0 11 >,
        < ! ???? £ $ %% ^ & * >,
        < a b c d e f g f i j>;

my $max = max map { .elems }, @a;
my @b = map { (| $_, "" xx $max - .elems).flat }, @a;
my @c = [Z] @b;
say join "\t", map {$_ // "" }, @$_ for @c;

This produces the following output:

$ perl6 mult_arrays.p6
I       244     !       a
L       42      ????    b
O       0       £       c
V       1233    $       d
E       222     %%      e
Y       0       ^       f
O       11      &       g
U               *       f
                        i
                        j

Another way to do it is to use a for loop to copy the array elements one by one. In most programming languages, you would normally need two nested loops, but we can avoid that thanks to Raku’s X Cross operator used over the indices of the array of arrays:

use v6;

my @a = < I L O V E Y O U >, 
        < 244 42 0 1233 222 0 11 >,
        < ! ???? £ $ %% ^ & * >,
        < a b c d e f g f i j>;

my $max = max map { .elems }, @a;
my @b;
for ^$max X ^@a.elems -> ($i, $j) {
    @b[$i][$j] = @a[$j][$i] // "";
}
say join "\t", @$_ for @b;

This produces the same output as the previous implementation immediately above.

Item Lengths Exceeding the Tabulation Size

Now, what if some of the array items have a length exceeding the tabulation size (7 or more character)? Using tabulations is no longer sufficient. We can construct dynamically a formatting string to be used by the sprinf, printf, or fmt built-in functions:

use v6;

my @a = < I L O V E Y O U >, 
        < 244 42 0 123344556677 222 0 11 >,
        < ! ?????? £ $ %% ^ & * >,
        < a b c d e f g f i j>;

my $max = max map { .elems }, @a;
my @max-lengths = map { .map({.chars}).max  }, @a;
my $fmt = [~] map {"%-" ~ @max-lengths[$_] + 2 ~ "s"}, keys @max-lengths;
say "Format: ", $fmt;  # Displaying the resulting formatting string
my @b;
for ^$max X ^@a.elems -> ($i, $j) {
    @b[$i][$j] = @a[$j][$i] // "";
}
printf "$fmt\n", @$_ for @b;

This displays the following output:

$ perl6 mult_arrays.p6
Format: %-3s%-14s%-8s%-3s
I  244           !       a
L  42            ??????  b
O  0             £       c
V  123344556677  $       d
E  222           %%      e
Y  0             ^       f
O  11            &       g
U                *       f
                         i
                         j

Challenge # 2: Sort Sublists

You are given a list of numbers and set of indices belong to the list. Write a script to sort the values belongs to the indices.

For example,

List: [ 10, 4, 1, 8, 12, 3 ]
Indices: 0,2,5

We would sort the values at indices 0, 2 and 5 i.e. 10, 1 and 3.

Final List would look like below:

List: [ 1, 4, 3, 8, 12, 10 ]

Sorting Sublists in Perl 5

This is the perfect example for using array slices, which was the subject of a challenge a few weeks ago. We’ll use slices twice: one to extract from the list the values to be sorted, and once again for inserting the sorted values back into the array at their proper position. And we end up with a single line of code doing all the real work:

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
use Data::Dumper;

my @numbers = (10, 4, 1, 8, 12, 3);
my @indices = (0, 2, 5);

@numbers[@indices] = sort { $a <=> $b } @numbers[@indices];
say "@numbers";

This is the output displayed by the program:

$ perl  sublists.pl
1 4 3 8 12 10

Sorting Sublists in Raku

As in Perl 5, we can use array slices to make things really simple. The program is even simpler in Raku, since we don’t need the { $a <=> $b } code block used in Perl 5 to obtain numeric sort: Raku’s sort procedure is clever enough to discover that it should perform numeric sort when it sees numbers (well, more accurately, it is the default cmp operator used by sort which is smart enough to compare strings with string semantics and numbers with number semantics).

use v6;

my @numbers = 10, 4, 1, 8, 12, 3;
my @indices = 0, 2, 5;

@numbers[@indices] = sort @numbers[@indices];
say @numbers;

This program displays the following output:

$ perl6 sublists.p6
[1 4 3 8 12 10]

Wrapping up

The next week Perl Weekly Challenge is due to start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, January 5, 2020. And, please, also spread the word about the Perl Weekly Challenge if you can.

Perl Weekly Challenge # 39: Guest House and Reverse Polish Notation

These are some answers to the Week 39 of the Perl Weekly Challence organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a couple of days (December 22, 2019). This blog post offers some solutions to this challenge, please don’t read on if you intend to complete the challenge on your own.

Task # 1: Guest House

A guest house had a policy that the light remain ON as long as the at least one guest is in the house. There is guest book which tracks all guest in/out time. Write a script to find out how long in minutes the light were ON.

The guest book looks as follows:

1) Alex    IN: 09:10 OUT: 09:45
2) Arnold  IN: 09:15 OUT: 09:33
3) Bob     IN: 09:22 OUT: 09:55
4) Charlie IN: 09:25 OUT: 10:05
5) Steve   IN: 09:33 OUT: 10:01
6) Roger   IN: 09:44 OUT: 10:12
7) David   IN: 09:57 OUT: 10:23
8) Neil    IN: 10:01 OUT: 10:19
9) Chris   IN: 10:10 OUT: 11:00

First, although the input data provided with the task spans over only 2 hours, I’ll make the computation over a full day, from 00:00 to 23:59. One of the reasons for doing so is that I wanted to add a guest staying over more than two hours, in order to test the case where someone is in the guest house for more than two adjacent hours. Also, I did not want the guests to be male only. So, I added one female guest:

10) Liz    IN: 12:07 OUT: 17:05

I can think of several ways to solve this task. I decided to create a hash of arrays covering every minute in the 00:00-23:59 range. It could have been an array of arrays, but I started with 09:00-11:00 range provided in the task, and that led to an array with empty slots, which I did not like too much because this is likely to generate warnings or require some special care to avoid such warnings (or runtime errors). The program then parses the input data and sets each minute in the presence ranges with 1. Populating the whole range with zeros before starting isn’t strictly necessary, but it makes other things easier, as it is possible at the end to just add values without having to first check for definedness.

We don’t care about the guests’ names, so when reading the input data, we only look at the time intervals.

Note that there is a slight ambiguity in the task description. If one guest arrives at 10:10 and leaves at 10:11, I consider that the light has to be on for 2 minutes, even though it may be argued that, by a simple subtraction, the guest staid only 1 minute.

Guest House in Perl 5

In Perl 5, we just put the input data in the DATAsection.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

my %hm;
for my $hour (0..23) {
    $hm{$hour}[$_] = 0 for 0..59;
}
while (<DATA>) {
    next unless /\S/;
    my ($in_h, $in_m, $out_h, $out_m) = /(\d\d):(\d\d)\D+(\d\d):(\d\d)/;
    if ($out_h eq $in_h) {
        $hm{0+$in_h}[$_] = 1 for $in_m..$out_m;
    } else {
        $hm{0+$in_h}[$_]  = 1 for $in_m..59; # end the first hour
        for my $hour ($in_h + 1 .. $out_h -1) {
            $hm{$hour}[$_] = 1 for 0..59;    # If several hours
        }
        $hm{0+$out_h}[$_] = 1 for 0..$out_m; # Complete last hour
    }
}
my $total_on = 0;
for my $hour (keys %hm) {
        $total_on += $hm{$hour}[$_] for 0..59;
}
say "Total time on: $total_on minutes.";

__DATA__
1) Alex    IN: 09:10 OUT: 09:45
2) Arnold  IN: 09:15 OUT: 09:33
3) Bob     IN: 09:22 OUT: 09:55
4) Charlie IN: 09:25 OUT: 10:05
5) Steve   IN: 09:33 OUT: 10:01
6) Roger   IN: 09:44 OUT: 10:12
7) David   IN: 09:57 OUT: 10:23
8) Neil    IN: 10:01 OUT: 10:19
9) Chris   IN: 10:10 OUT: 11:00
10) Liz    IN: 12:07 OUT: 17:05

With the original input data set, the result was 111 seconds. With my modified data set, I obtain the following output:

$ perl  guesthouse.pl
Total time on: 410 minutes.

Guest House in Raku (formerly known as Perl 6)

There is no DATA section in Raku. Raku should have much more feature-rich capabilities using pod (plain old documentation) sections, but these are not implemented yet. We could use the heredocs feature, but since TIMTOWTDI, we will simply use a multi-line string variable within standard double quote marks.

use v6;

my $input = 
   "1) Alex    IN: 09:10 OUT: 09:45
    2) Arnold  IN: 09:15 OUT: 09:33
    3) Bob     IN: 09:22 OUT: 09:55
    4) Charlie IN: 09:25 OUT: 10:05
    5) Steve   IN: 09:33 OUT: 10:01
    6) Roger   IN: 09:44 OUT: 10:12
    7) David   IN: 09:57 OUT: 10:23
    8) Neil    IN: 10:01 OUT: 10:19
    9) Chris   IN: 10:10 OUT: 11:00
    10) Liz    IN: 12:07 OUT: 17:05";

my %hm;
for 0..23 -> $hour {
    %hm{$hour}[$_] = 0 for 0..59;
}
for $input.lines {
    next unless /\S/;
    my ($in_h, $in_m, $out_h, $out_m) = map { +$_}, $/[0..3] if /(\d\d)':'(\d\d)\D+(\d\d)':'(\d\d)/;
    if ($out_h == $in_h) {
        %hm{$in_h}[$_] = 1 for $in_m..$out_m;
    } else {
        %hm{$in_h}[$_]  = 1 for $in_m..59; # end the first hour
        for $in_h + 1 .. $out_h -1 -> $hour {
            %hm{$hour}[$_] = 1 for 0..59; # If several hours
        }
        %hm{$out_h}[$_] = 1 for 0..$out_m; # Complete last hour
    }
}

my $total_on = 0;
for keys %hm -> $hour {
    $total_on += sum %hm{$hour};
}
say "Total time on: $total_on minutes.";

This program produces the same result as the P5 program:

$ perl6 guesthouse.p6
Total time on: 410 minutes.

Task # 2: Reverse Polish Notation

Write a script to demonstrate Reverse Polish notation (RPN). Checkout the wiki page for more information about RPN.

This task reminds me of the Hewlett-Packard pocket calculators during my teen years in the 1970’s. To tell the truth, I had bought at the time a Texas Instruments programmable calculator using standard infix notation, but a friend of mine has a much more powerful HP hand-held calculator. The most important difference was that my friend’s HP calculator could save programs for later use, while my more basic (and much cheaper) TI calculator would lose everything when switched off. So we worked quite a bit on his calculator for studying assignments and, although I never became as fluent as my friend with RPN, I understood quite well at the time how to use it.

Anyway, the point about RPN, which is also known as postfix notation, is that you first state the operands and then only the operators. If it is a binary operator (the most common case), you just pick up the last two previous operands. For example, to add numbers 7 and 11, instead of typing something like 7 + 11, you would type 7, 11 +. RPN is a bit counter-intuitive at first, but it is quite efficient because it avoids using parentheses for specifying operation precedence. As a result, RPN supposedly requires less typing than usual infix notation. The following Wikipedia example shows the difference. The following inxix notation expression:

((15 ÷ (7 − (1 + 1))) × 3) − (2 + (1 + 1))

can be written as follows in RPN:

15 7 1 1 + − ÷ 3 × 2 1 1 + + − =

The essential idea for processing RPN notation is a stack (a last-in first-out data or LIFO structure): when you read an operand, you just push it onto a stack. And when you read a binary operator, you just pop two values from the stack, apply the operator to them and push the result back onto the stack. We need to be cautious about something for operators which are not commutative such as subtraction or division: the first operand that we pop from the stack has to be the second operand in the operation, and the second popped operand will be the the first one in the operation.

The code for the operations is stored in a dispatch table, i.e. a hash where the keys are the operators and the values are code references to short subroutines performing the arithmetic operations. Note that I encountered unexpected difficulties because some of the RPN expressions that I copied from the Wikipedia page contain special Unicode characters for subtraction, multiplication and division. This was especially tricky for the subtraction operator, since the common ASCII dash or hyphen and the Unicode minus sign really look similar. To fix this, I only needed to add entries with those special characters in the dispatch table (and the use utf8; pragma).

Reverse Polish Notation in Perl 5

I have included five test cases using the Test::More module. For a real life program, we would probably want more tests

#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use Test::More tests => 5;

my %operations = (
    '+' => sub { return $_[0] + $_[1]; },
    '-' => sub { return $_[0] - $_[1]; }, # hyphen
    '−' => sub { return $_[0] - $_[1]; }, # minus
    'x' => sub { return $_[0] * $_[1]; },
    '*' => sub { return $_[0] * $_[1]; },
    '×' => sub { return $_[0] * $_[1]; },
    '/' => sub { return $_[0] / $_[1]; },
    '÷' => sub { return $_[0] / $_[1]; },
);

sub parse_operation {
    my @stack;
    for my $token (split /\s+/, shift) {
        if ($token =~ /^\d+$/) {
            push @stack, $token ;
        } elsif (exists $operations{$token}) {
            return "Invalid RPN expression" if @stack < 2;
            my $op2 = pop @stack;
            my $op1 = pop @stack;
            push @stack, $operations{$token}->($op1, $op2);
        } else {
            die "Invalid token $token.";
        }
    }
    return $stack[0]
}
is parse_operation("1 2 +"), 3, "2 operands";
is parse_operation("1 2 + 4 ×"), 12, "3 operands, a Unicode multiplication operator";
is parse_operation("1 2 + 4 * 5 + 3 -"), 14, "5 operands";
is parse_operation("3 4 5 x -"), -17, "Start with 3 operands and then two operators";
is parse_operation("15 7 1 1 + − ÷ 3 × 2 1 1 + + −"), 5, "8 operands, 4 Unicode operators";

Running the program shows that all tests pass correctly:

$ perl rpn.pl
1..5
ok 1 - 2 operands
ok 2 - 3 operands, unicode multiplication operator
ok 3 - 5 operands
ok 4 - Start with 3 operands and then two operators
ok 5 - 8 operands, 4 Unicode operators

Note that this program does only minimal RPN validity check (only that we get valid tokens and that the stack has at least two values when we want to process an operator). Otherwise, we basically assume the RPN expression is correct. In a real-life program, more validity checks would probably be necessary (for example that the stack has only one value left at the end of the parsing).

Reverse Polish Notation in Raku

Although I have been thinking of various ways of doing it in Raku, notably using a grammar, I think the simplest is to use a stack as in Perl 5. Only a few minor changes are required to have a working Raku program:

use v6;
use Test;

my %operations = (
    '+' => { $^a + $^b; },
    '-' => { $^a - $^b; }, # hyphen
    '−' => { $^a - $^b; }, # dash
    'x' => { $^a * $^b; },
    '*' => { $^a * $^b; },
    '×' => { $^a * $^b; },
    '/' => { $^a / $^b; },
    '÷' => { $^a / $^b; },
);

sub parse_operation (Str $expr) {
    my @stack;
    for $expr.split(/\s+/) -> $token {
        if $token ~~ /^ \d+ $/ {
            push @stack, $token ;
        } elsif (%operations{$token}:exists) {
            return "Invalid RPN expression" if @stack.elems < 2;
            my $op2 = pop @stack;
            my $op1 = pop @stack;
            push @stack, %operations{$token}($op1, $op2);
        } else {
            die "Invalid token $token.";
        }
    }
    return @stack[0]
}
plan 5;
is parse_operation("1 2 +"), 3, "2 operands";
is parse_operation("1 2 + 4 ×"), 12, "3 operands, a Unicode multiplication operator";
is parse_operation("1 2 + 4 * 5 + 3 -"), 14, "5 operands";
is parse_operation("3 4 5 x -"), -17, "Start with 3 operands and then two operators";
is parse_operation("15 7 1 1 + − ÷ 3 × 2 1 1 + + −"), 5, "8 operands, 4 Unicode operators";

Running this program produces the same output as the P5 program:

$ perl6 rpn.p6
1..5
ok 1 - 2 operands
ok 2 - 3 operands, a Unicode multiplication operator
ok 3 - 5 operands
ok 4 - Start with 3 operands and then two operators
ok 5 - 8 operands, 4 Unicode operators

Using the when “switch” statement provided by Raku, we can get rid of the dispatch table and make our code slightly more concise as follows:

use v6;
use Test;

sub perform-op (&op) {
    push @*stack, @*stack.pop R[&op] @*stack.pop;
}
sub parse_operation (Str $expr) {
    my @*stack;
    for $expr.split(/\s+/) {
        when /^ \d+ $/       { @*stack.push($_)}
        when '+'             { perform-op &[+] }
        when '*' | 'x' | '×' { perform-op &[*] }
        when '/' | '÷'       { perform-op &[/] }
        when '-' | '−'       { perform-op &[-] }
        default { die "Invalid token $_."; }
    }
    return @*stack[0]
}
plan 5;
is parse_operation("1 2 +"), 3, "2 operands";
is parse_operation("1 2 + 4 ×"), 12, "3 operands, a Unicode multiplication operator";
is parse_operation("1 2 + 4 * 5 + 3 -"), 14, "5 operands";
is parse_operation("3 4 5 x -"), -17, "Start with 3 operands and then two operators";
is parse_operation("15 7 1 1 + − ÷ 3 × 2 1 1 + + −"), 5, "8 operands, 4 Unicode operators";

This passes all the tests correctly as before.

Wrapping up

The next week Perl Weekly Challenge is due to start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, December 29. And, please, also spread the word about the Perl Weekly Challenge if you can.

Perl Weekly Challenge 38: Date Finder and Word Game

These are some answers to the Week 38 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Challenge # 1: Date Finder

Create a script to accept a 7 digits number, where the first number can only be 1 or 2. The second and third digits can be anything 0-9. The fourth and fifth digits corresponds to the month i.e. 01,02,03…,11,12. And the last 2 digits represents the days in the month i.e. 01,02,03….29,30,31. Your script should validate if the given number is valid as per the rule and then convert into human readable format date.

Rules:

1) If 1st digit is 1, then prepend 20 otherwise 19 to the 2nd and 3rd digits to make it 4-digits year.

2) The 4th and 5th digits together should be a valid month.

3) The 6th and 7th digits together should be a valid day for the above month.

For example, the given number is 2230120, it should print 1923-01-20.

Task 1: Date Finder in Perl 5

This time, rather than concentrating on a test suite, I decided to focus on trying to provide useful warnings and error messages when the input value is not valid, which led me to test the input data piece by piece:

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
use Time::Local;

my $in = shift // '2230120';
die "Input should be seven digits\n" unless $in =~ /^\d{7}$/;

my ($y1, $y2, $m, $d) = $in =~ /^(\d)(\d\d)(\d\d)(\d\d)/;
die "First digit should be 1 or 2\n" if $y1 !~ /[12]/;
my $year = $y1 == 1 ? "20$y2" : "19$y2";
die "Digits 4 and 5 should be a valid month number\n" unless $m =~ /(0\d)|(1[012])/;
die "Digits 6 and 7 should be a valid day in month\n" unless $d =~ /([012]\d)|(3[01])/;
my $test = eval { timelocal 0, 0, 0, $d, $m-1, $year - 1900 };
warn $@ if $@;
die "$in is equivalent to $year-$m-$d, which is an invalid date\n" unless defined $test;
say "$in is equivalent to $year-$m-$d.";

Note that, in the final section, I’m using the Time::Local module to validate a date. I have shown previously (see for example my blog post on the week day task of PWC # 37) how to figure out the number of days in any month of any year, taking into account leap years, without the help of any module.

These are a few examples of tests with various input data:

$ perl date_finder.pl 223022
Input should be seven digits

$ perl date_finder.pl
2230120 is equivalent to 1923-01-20.

$ perl date_finder.pl 2230120
2230120 is equivalent to 1923-01-20.

$ perl date_finder.pl 2230431
Day '31' out of range 1..30 at date_finder.pl line 15.
2230431 is equivalent to 1923-04-31, which is an invalid date

$ perl date_finder.pl 2230229
Day '29' out of range 1..28 at date_finder.pl line 15.
2230229 is equivalent to 1923-02-29, which is an invalid date

$ perl date_finder.pl 1960229
1960229 is equivalent to 2096-02-29.

When something goes wrong with the date, we have two messages (a warning and an error). Of course, we don’t need both, one would be sufficient, but this illustrates two different ways of reporting an invalid date.

Task 1: Date Finder in Raku (formerly known as Perl 6)

Let’s start with a simple port of the P5 program to Raku:

use v6;

sub MAIN ($in where * ~~ /^\d ** 7$/ = '2230120') {
    my ($y1, $y2, $m, $d) = ($in ~~ /^(\d)(\d\d)(\d\d)(\d\d)/)[0..3];
    die "First digit should be 1 or 2\n" if $y1 !~~ /<[12]>/;
    my $year = $y1 == 1 ?? "20$y2" !! "19$y2";
    die "Digits 4 and 5 should be a valid month number\n" unless $m ~~ /(0\d) | (1<[012]>)/;
    die "Digits 6 and 7 should be a valid day in month\n" unless $d ~~ /(<[012]>\d) | (3<[01]>)/;

    try { 
        my $test = Date.new($year, $m, $d);
    }
    die "$in is equivalent to $year-$m-$d, which is an invalid date\n" if $!;
    say "$in is equivalent to $year-$m-$d.";
}

Besides the minor syntax changes between the two languages, the only significant change is that the program attempts to create a Date object within a try block. This program produces essentially the same output as the P5 program.

Another way to do it would be to use a grammar, for example:

use v6;

grammar My-custom-date {
    token TOP { <y1> <y2> <m> <d> }
    token y1  { <[12]> }
    token y2  { \d ** 2}
    token m   { 0\d | 1<[012]> }
    token d   { <[012]> \d | 3<[01]> } 
}

sub MAIN ($in where * ~~ /^\d ** 7$/ = '2230120') {
    my $matched  = so My-custom-date.parse($in);
    say "Invalid input value $in" and exit unless $matched;
    my $year = $<y1> == 1 ?? "20$<y2>" !! "19$<y2>";
    try { 
        my $test = Date.new($year, $<m>, $<d>);
    }
    say "ERROR: $in is equivalent to $year-$<m>-$<d>, which is an invalid date\n" and exit if $!;
    say "$in is equivalent to $year-$<m>-$<d>.";
}

But, in this case, the advantage of using a grammar is not obvious, except for the fact the parsing is possibly slightly clearer. It might even be argued that using a grammar for such a simple case is sort of a technological overkill.

These are some sample runs:

$ perl6 date_finder.p6
2230120 is equivalent to 1923-01-20.

$ perl6 date_finder.p6 2230228
2230228 is equivalent to 1923-02-28.

$ perl6 date_finder.p6 2230229
Use of Nil in string context
ERROR: 2230229 is equivalent to 1923--, which is an invalid date

  in block  at date_finder.p6 line 17
Use of Nil in string context
  in block  at date_finder.p6 line 17

Task2: Word Game

Lets assume we have tiles as listed below, with an alphabet (A..Z) printed on them. Each tile has a value, e.g. A (1 point), B (4 points) etc. You are allowed to draw 7 tiles from the lot randomly. Then try to form a word using the 7 tiles with maximum points altogether. You don’t have to use all the 7 tiles to make a word. You should try to use as many tiles as possible to get the maximum points.

For example, A (x8) means there are 8 tiles with letter A.

1 point

    A (x8), G (x3), I (x5), S (x7), U (x5), X (x2), Z (x5)

2 points

    E (x9), J (x3), L (x3), R (x3), V (x3), Y (x5)

3 points

    F (x3), D (x3), P (x5), W (x5)

4 points

    B (x5), N (x4)

5 points

    T (x5), O (x3), H (x3), M (x4), C (x4)

10 points

    K (x2), Q (x2)

So, the game is essentially similar to Scrabble, except that there is no game board.

I don’t see any way to solve correctly this task without a brute-force approach, i.e. trying all possibilities to find the best score.

Just as for some previous challenges, I will use a words.txt file containing 113,809 lower-case English words usually accepted for crossword puzzles and other word games. The words.txt file can be found on my Github repository. The original list was contributed to the public domain by Internet activist Grady Ward in the context of the Moby Project. This word list is also mirrored at Project Gutenberg.

For the purpose of testing the programs below, the words.txt file is located in my current directory. Obviously, when we will be reading the list, we will need to keep only the words having the same length as the two input words. The word.txt input file only contains words with only lowercase alphabetical ASCII characters.

Word Game in Raku

Given that I had a very busy week and weekend for various personal reasons, it is now late on Sunday, and I’m not sure I’ll be able do solve this Scrabble-like in both Perl 5 and Raku in time for the deadline. So I decided to start with Raku, which has some functionalities that are useful for this task and not existing in Perl 5.

The solution is essentially as follows: read the file of authorized words, normalize the words by putting their letters in alphabetical order and store in a %word-list hash the normalized versions as a key, and the original word as a value.

Store the available letter tiles in a Bag. Then pick seven tiles (or any other number) from the bag, use the combinations method to produce all time combinations from the drawn letters, sort each combination alphabetically and look up for the result in the hash. If the result is found, compute its score and retain the word with the highest score so far. At the end, print the word with the highest score.

use v6;

constant %tile-values =  
    A => 1, B => 4, C => 5, D => 3, E => 2, 
    F => 3, G => 1, H => 5, I => 1, J => 2, 
    K => 10, L => 2, M => 5, N => 4, O => 5, 
    P => 3, Q => 10, R => 2, S => 1, T => 5, 
    U => 1, V => 2, W => 3, X => 1, Y => 2, Z => 1;

constant %tile-count =
    A => 8, B => 5, C => 4, D => 3, E => 9, 
    F => 3, G => 3, H => 3, I => 5, J => 3, 
    K => 2, L => 3, M => 4, N => 4, O => 3, 
    P => 5, Q => 2, R => 3, S => 7, T => 5, 
    U => 5, V => 3, W => 5, X => 2, Y => 5, Z => 5;

my $tile-bag = (map {$_ xx %tile-count{$_}}, keys %tile-count).Bag;

sub MAIN (Int $count = 7) {
    my %word-list;
    for "words.txt".IO.lines -> $line {
        next if $line.chars > $count;
        my $ordered = $line.uc.comb.sort.join("");
        my $line-value = [+] $ordered.comb.map({%tile-values{$_}});
        %word-list{$ordered}<word> = $line;
        # Note we will keep only one word for anagrams, but 
        # that's OK since anagrams have the same value
        %word-list{$ordered}<value> = $line-value;
    }
    for 1..10 {
        my @picked-tiles = $tile-bag.pick($count);
        my $max-combination = "";
        my $max-value = 0;
        for @picked-tiles.combinations -> $candidate {
            my $ordered = $candidate.sort.join("");
            next unless %word-list{$ordered}:exists;
            if %word-list{$ordered}<value> > $max-value {
                $max-value = %word-list{$ordered}<value>;
                $max-combination = $ordered;
            }
        }
        say "The best candidate for list ", @picked-tiles.join(""), " is:"; 
        say "    No word found!" and next unless $max-value;
        say "    %word-list{$max-combination}<word> (score: $max-value)";
    }
}

Note that the program is not very fast (2 to 3 seconds for 7 tiles), but that’s mainly due to time required to read the 113k-word list and store the words into a hash. Once the hash is populated, finding the best solution is quite fast. This is the reason for which I decided to run the solution-finding part ten times once the hash is populated. If I were going to make a real-life solution for the challenge, I would store the hash in a file, as I have shown previously, notably on the word ladder challenge in May of this year.

This is an example run:

$ time perl6  scrabble.p6 7
The best candidate for list MESWAAG is:
    wames (score: 12)
The best candidate for list XPSPBAG is:
    paps (score: 8)
The best candidate for list KISCPAC is:
    spick (score: 20)
The best candidate for list BMRPSPU is:
    bumps (score: 14)
The best candidate for list LZRYVBY is:
    byrl (score: 10)
The best candidate for list KHEBLZP is:
    kelp (score: 17)
The best candidate for list FHIOUHI is:
    foh (score: 13)
The best candidate for list VXAWSJH is:
    wash (score: 10)
The best candidate for list LIXMPGZ is:
    limp (score: 11)
The best candidate for list AJSFBOF is:
    boffs (score: 16)

real    0m2,751s
user    0m0,000s
sys     0m0,031s

I know that some words found above may seem funny or uncommon, but they belong to the words.txt file that is one of the official source of words for word games.

I’m sorry, I just don’t have enough time right now to finish the Perl 5 version of this task for the deadline (although I’ve almost completed it).

Wrapping up

The next week Perl Weekly Challenge is due to start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, December 22. And, please, also spread the word about the Perl Weekly Challenge if you can.

Perl Weekly Challenge 37: Week Days in Each Month and Daylight Gain/loss

These are some answers to the Week 37 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a couple of days (December 8, 2019). This blog post offers some solutions to this challenge, please don’t read on if you intend to complete the challenge on your own.

Challenge # 1: Week Days in Each Month

Write a script to calculate the total number of weekdays (Mon-Fri) in each month of the year 2019.

Jan: 23 days
Feb: 20 days
Mar: 21 days
Apr: 22 days
May: 23 days
Jun: 20 days
Jul: 23 days
Aug: 22 days
Sep: 21 days
Oct: 23 days
Nov: 21 days
Dec: 22 days

Although the challenge speaks only of year 2019, I’ll expand it a bit to compute the total number of weekdays in each month of any year passed as a parameter (defaulted to 2019 if no year is passed).

Days in Week in Perl 5

There are many core or CPAN modules to deal with dates, many of which can be used for the task. However, it is a programming challenge. To me, if you let a module do the work for you, then you don’t really solve the challenge yourself. And you don’t learn much from the challenge. Plus you don’t really get the fun of solving something by yourself.

However, I don’t want to be dogmatic on that: this doesn’t mean that I don’t want to use any module. I just don’t want to let all the work been done by a module. In fact, I’ll use a Perl core module here, Time::timegm, but only to find the day in the week of the first day of the year.

Of course, if you have a module providing the weekday of any date, then you could just iterate over every day of every month of a year and call the appropriate function of that module (such as timegm) for each such day and easily compute the number of week days in any month of any year. But that’s not fun. So, I’ll use the module to find out the weekday of Jan. 1st and then then calculate everything by myself throughout the year under scrutiny.

For this, I need a @months array giving me the number of days in each month. This is straight forward common knowledge, except for the fact that leap years have 29 days in February. So we need to figure out if the year passed as a parameter is leap. In the Julian calendar (named after Julius Caesar), a year is leap if it is evenly divided by 4. Fairly simple. But the Julian calendar is slightly inaccurate and drifts of about 3 days every 4 centuries as compared to astronomy (the date of the equinoxes). That’s why Pope Gregory XIII introduced what is now known as the Gregorian Calendar in October 1582. According to the Gregorian calendar that we commonly use today, years evenly divided by 4 are not leap if they are also evenly divided by 100, unless they are also evenly divided by 400. For example, the years 1700, 1800, 1900, 2100 and 2200 are not leap years, but the years 1600, 2000, and 2400 are leap years. Having said that, I should point out that the Julian rule for leap years is consistent with the Gregorian rule for all years between 1901 and 2099, so that a simple is_leap subroutine implemented as follows:

sub is_leap {
    my $year = shift;
    return $year % 4 == 0;
}

would be appropriate for the 1901-2099 year range. So, this simple Julian rule is probably sufficient for any business application written nowadays. If, however, you want to make sure you’re accurate over a longer range of dates, then you need to implement the Gregorian calendar rule, which could lead to this new is_leap subroutine:

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

sub is_leap {
  my $year = shift;
  return $year % 4   ? 0 :  # not evenly divided by 4  -> not leap
       $year % 100 ? 1 :  # divided evenly by 4 but not 100 -> leap
       $year % 400 ? 0 :  # divided by 100 but not 400 -> not leap
       1;                 # divided by 400
}

say is_leap(shift) ? "Yes" : "No";

Just for fun, we’ll use a slightly different implementation in our program below.

Once we have the number of days for each month of any year (including February) and the weekday of Jan. 1st, then it is fairly easy to compute the day in the week of any date in the year. But we don’t really need to do that for every single date: Any month, including February, has four weeks, and thus 20 weekdays, between the 1st and the 28th day. Thus, we only need to figure out the day in week of days between the 29th day and the month end. And we also know that the day in week of 29th day of a month is the same as the day in week of the 1st of the month.

So we need to iterate over every month, initialize the number of weekdays of that month to 20 and add one for each day after the 28th that falls on a weekday, and at the same time keep track of the weekday for these days, so that we can then start the next month knowing which day is the 1st day of that new month. In the timegm function, days in week (the 7th returned value) are represented by a number between 0 and 6, with 0 being Sunday. So, a date is a weekday of its day in week value is not 0 and not 6. In the inner for loop, we just add one for every new day, so that our calculated day in week might be temporarily 7, 8 of 9. Because of that, a value of 7 will also be a Sunday (and 8 and 9 will be, respectively, Monday and Tuesday. So, we count days in week 0, 6 and 7 as non weekdays. Once we have finished the inner loop, we subtract 7 from the current day in week value if needed to re-normalize the day in week to a 0..6 range.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
use Time::Local;

my $yr = shift // 2019;
my @months = (0, 31, is_leap($yr) ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
my $start_date = timegm( 0, 0, 0, 1, 0, $yr - 1900 ); # Jan 1st
my $day_in_week = (gmtime $start_date)[6];

for my $month (1..12) {
    my $weekdays = 20;
    for my $day (29..$months[$month]) {
        $weekdays ++ unless $day_in_week =~ /[607]/;
        $day_in_week ++;
    }
    printf "%02d/%d has  $weekdays week days.\n", $month, $yr;
    $day_in_week -= 7 if  $day_in_week > 6;
}

sub is_leap {
    my $yr = shift;
    return 0 if $yr % 4;    # no if not divisible by 4
    return 1 if $yr % 100;  # yes if divisible by 4 but not by 100
    return 0 if $yr % 400;  # no if divisible by 100 and not by 400
    return 1;               # yes if divisibe by 400
}

Note that the @months array has 13 items, the first one being 0 (it could be anything). The reason is that we’re using subscripts 1 to 12 for the months, so that the item at index 0 is simply not used.

This program displays the following output:

$ perl weekdays.pl 2019
01/2019 has  23 week days.
02/2019 has  20 week days.
03/2019 has  21 week days.
04/2019 has  22 week days.
05/2019 has  23 week days.
06/2019 has  20 week days.
07/2019 has  23 week days.
08/2019 has  22 week days.
09/2019 has  21 week days.
10/2019 has  23 week days.
11/2019 has  21 week days.
12/2019 has  22 week days.

It also works fine if we pass another year:

$ perl weekdays.pl 2020
01/2020 has  23 week days.
02/2020 has  20 week days.
03/2020 has  22 week days.
04/2020 has  22 week days.
05/2020 has  21 week days.
06/2020 has  22 week days.
07/2020 has  23 week days.
08/2020 has  21 week days.
09/2020 has  22 week days.
10/2020 has  22 week days.
11/2020 has  21 week days.
12/2020 has  23 week days.

And, with no parameter passed, it displays the weekdays for 2019 (the default value).

Days in Week in Raku (formerly known as Perl 6)

We could easily do the same in Raku, but Raku has expressive and efficient built-in features for date manipulations in the Date class.

This is an example under the REPL:

> my $date = Date.new(2019, 1, 1)
2019-01-01
> say $date.month;
1
> say $date.day-of-week;
2

So, Jan., 1st, 2019 fell on a Tuesday (day in week 2), and it is the first month (January).

Thus, using the methods demonstrated above, we could write simple a one-liner (formatted here over 2 lines to make more readable on this blog post) to find the result:

$ perl6 -e 'my @a; for Date.new(2019, 1, 1) .. Date.new(2019, 12, 31) -> $day
> { @a[$day.month]++ if $day.day-of-week == (1..5).any}; say @a[1..12];
'
(23 20 21 22 23 20 23 22 21 23 21 22)

For every date in the year, we increment a counter for the date’s month if that data is a weekday. Note the use of the (1..5).any junction to simplify comparisons with the 1..5 range.

We could even add a little bit of sugar to improve the output:

$ perl6 -e 'my @a; for Date.new(2019, 1, 1) .. Date.new(2019, 12, 31) -> $day
> { @a[$day.month]++ if $day.day-of-week == (1..5).any}; 
>  for @a[1..12].kv -> $k, $v {printf "%02d/2019: %d week days\n", $k+1, $v};
> '
01/2019: 23 week days
02/2019: 20 week days
03/2019: 21 week days
04/2019: 22 week days
05/2019: 23 week days
06/2019: 20 week days
07/2019: 23 week days
08/2019: 22 week days
09/2019: 21 week days
10/2019: 23 week days
11/2019: 21 week days
12/2019: 22 week days

But that’s perhaps getting a bit long for a one-liner. Let’s do a real program.

We will use the same general method as in Perl 5, i.e. iterating on the days after the 28th day of any month to find the number of weekdays in that interval, except that it can be simplified thanks to the Date class numerous method. First, Raku has a is-leap-year methodmethodis-leap-year), so we don’t need to implement it ourselves. But, in fact, we don’t even need to use this is-leap-year method, since the Date class also provides a days-in-month methodmethoddays-in-month) returning directly what we really need: the number of days in a given month.

The program is very simple and significantly shorter than its Perl 5 counterpart:

use v6;

sub MAIN (UInt $yr = 2019) {
    for 1..12 -> $mth {
        my $weekdays = 20;
        for 29..Date.new($yr, $mth, 1).days-in-month -> $day {
            $weekdays++ if 
                Date.new($yr, $mth, $day).day-of-week == (1..5).any;
        }
        printf "%02d/%d has $weekdays week days.\n", $mth, $yr;
    }
}

This program displays the following output:

$ perl6 weekdays.p6 2019
01/2019 has 23 week days.
02/2019 has 20 week days.
03/2019 has 21 week days.
04/2019 has 22 week days.
05/2019 has 23 week days.
06/2019 has 20 week days.
07/2019 has 23 week days.
08/2019 has 22 week days.
09/2019 has 21 week days.
10/2019 has 23 week days.
11/2019 has 21 week days.
12/2019 has 22 week days.

And it works fine with another year passed as an argument. If no argument is passed, the program correctly displays the result for the default input value, year 2019.

Task 2: Daylight Loss or Gain

Write a script to find out the Daylight gain/loss in the month of December 2019 as compared to November 2019 in the city of London. You can find out sunrise and sunset data for November 2019 and December 2019 for London.

A look at the links provided reveals that the linked pages provide not only sunrise and sunset data, but also daylight duration, which is really the input data we’re looking for. Not only is it going to be slightly easier to use directly daylight duration, but daylight values are also sixty times more accurate: sunrise and sunset have an accuracy of a minute, whereas daylight duration are precise to the second (so that, in fact, it won’t really be easier, since our calculations will need to be more accurate (and that’s a bit of a pain in the neck when values are given in sexagesimal or base-60 notation).

Otherwise, the requirement is not very clear, but I’ll take it to mean that we want to compute the daylight gain or loss between each day of December 2019 and the corresponding day in November 2019. Since November has only 30 days, we won’t be able to say anything about December 31, 2019, as there is no corresponding day in November. We will also compute the average daylight gain or loss (well, it’s obviously a loss, but we’ll assume we don’t know and will let the computer program find this out).

My final comment is that I haven’t used a Perl program for scraping data on the Internet for the last 15 years or so, and I don’t want to try to re-learn that in just a couple of days. Therefore, I just copied and pasted the data into a text file and edited it to remove useless data; and I’ll use that text file as input for my programs.

Daylight in Perl 5

In order to compute differences between values in sexagesimal notation (in base 60, i.e. expressed in hours/minutes/seconds), there are at least two strategies: you could implement a sexagesimal subtraction, with special rules for carry or borrow, or you could convert everything to seconds, perform standard arithmetic operations on values in seconds, and convert the result back into HMS sexagesimal values if needed. I chose the second solution (although, thinking again about it, the first solution might have been slightly simpler, but most software implementations of such problems rely on timestamps expressed in seconds elapsed since an arbitrary time origin often called the epoch). Anyway, the sec3hrs and hrs2secsubroutines perform the necessary conversions from and to seconds.

The program first reads through the input data and stores the daylight data into a @nov and a @dec arrays. Then it loops through the 1..30 range and, for each day, subtract the November daylight value from the December daylight value. The program also keeps track of a cumulative $total_diff change to be able to compute an average change over 30 days.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

sub hrs2sec {
    my ($hrs, $min, $sec) = split /:/, shift;
    return $hrs * 3600 + $min * 60 + $sec;
}
sub sec2hrs {
    my $sec = shift;
    my $hrs = int $sec / 3600;
    $sec = $sec % 3600;
    my $min = int $sec / 60;
    $sec = $sec % 60;
    return sprintf "$hrs:%02d:%02d", $min, $sec;
}
my (@nov, @dec);
my $aref = \@nov;
while (<DATA>) {
    chomp;
    $aref = \@dec if /^Dec/;
    next unless /^\d/;
    my ($date, $duration) = (split /\s+/)[0, 3];
    $aref->[$date] = hrs2sec $duration;
}
my $total_diff;
say "Daylight change between:";
for my $i (1..30) {
    my $diff = $dec[$i] - $nov[$i];
    $total_diff += $diff;
    my $dif_hrs = sec2hrs abs $diff;
    $dif_hrs = "- $dif_hrs" if $diff < 0; 
    printf "%02d Nov and %02d Dec: $dif_hrs\n", $i, $i;
}
say "Average change between Nov and Dec: ", $total_diff < 0 ? "- " : "", sec2hrs (abs $total_diff / 30);

__DATA__

    Sunrise Sunset  Length  

Nov 2019

1   06h53   16h34   9:40:44 
2   06h55   16h32   9:37:10 
3   06h56   16h30   9:33:37 
4   06h58   16h28   9:30:07 
5   07h00   16h27   9:26:38 
6   07h02   16h25   9:23:11 
7   07h03   16h23   9:19:45 
8   07h05   16h22   9:16:22 
9   07h07   16h20   9:13:01 
10  07h09   16h18   9:09:42 
11  07h10   16h17   9:06:25 
12  07h12   16h15   9:03:11 
13  07h14   16h14   8:59:59 
14  07h16   16h12   8:56:50 
15  07h17   16h11   8:53:44 
16  07h19   16h10   8:50:40 
17  07h21   16h08   8:47:39 
18  07h22   16h07   8:44:42 
19  07h24   16h06   8:41:48 
20  07h26   16h05   8:38:57 
21  07h27   16h04   8:36:09 
22  07h29   16h03   8:33:25 
23  07h31   16h01   8:30:45 
24  07h32   16h00   8:28:09 
25  07h34   15h59   8:25:36 
26  07h35   15h59   8:23:08 
27  07h37   15h58   8:20:44 
28  07h38   15h57   8:18:24 
29  07h40   15h56   8:16:09 
30  07h41   15h55   8:13:59 

Dec 2019

1   07h43   15h55   8:11:53
2   07h44   15h54   8:09:53
3   07h46   15h53   8:07:57
4   07h47   15h53   8:06:07
5   07h48   15h53   8:04:22
6   07h49   15h52   8:02:42
7   07h51   15h52   8:01:08
8   07h52   15h51   7:59:40
9   07h53   15h51   7:58:17
10  07h54   15h51   7:57:00
11  07h55   15h51   7:55:50
12  07h56   15h51   7:54:45
13  07h57   15h51   7:53:46
14  07h58   15h51   7:52:54
15  07h59   15h51   7:52:07
16  08h00   15h51   7:51:27
17  08h00   15h51   7:50:54
18  08h01   15h52   7:50:27
19  08h02   15h52   7:50:06
20  08h02   15h52   7:49:52
21  08h03   15h53   7:49:44
22  08h04   15h53   7:49:43
23  08h04   15h54   7:49:48
24  08h04   15h54   7:50:00
25  08h05   15h55   7:50:19
26  08h05   15h56   7:50:44
27  08h05   15h57   7:51:15
28  08h06   15h57   7:51:53
29  08h06   15h58   7:52:37
30  08h06   15h59   7:53:27
31  08h06   16h00   7:54:24

The program is fairly straight forward. Note that in order to use the same loop to populate both the @novand @dec arrays, the program uses an aref array ref pointing to either of the arrays, depending the part of the input data we’re reading.

This program displays the following output:

$ perl day_light.pl
Daylight change between:
01 Nov and 01 Dec: - 1:28:51
02 Nov and 02 Dec: - 1:27:17
03 Nov and 03 Dec: - 1:25:40
04 Nov and 04 Dec: - 1:24:00
05 Nov and 05 Dec: - 1:22:16
06 Nov and 06 Dec: - 1:20:29
07 Nov and 07 Dec: - 1:18:37
08 Nov and 08 Dec: - 1:16:42
09 Nov and 09 Dec: - 1:14:44
10 Nov and 10 Dec: - 1:12:42
11 Nov and 11 Dec: - 1:10:35
12 Nov and 12 Dec: - 1:08:26
13 Nov and 13 Dec: - 1:06:13
14 Nov and 14 Dec: - 1:03:56
15 Nov and 15 Dec: - 1:01:37
16 Nov and 16 Dec: - 0:59:13
17 Nov and 17 Dec: - 0:56:45
18 Nov and 18 Dec: - 0:54:15
19 Nov and 19 Dec: - 0:51:42
20 Nov and 20 Dec: - 0:49:05
21 Nov and 21 Dec: - 0:46:25
22 Nov and 22 Dec: - 0:43:42
23 Nov and 23 Dec: - 0:40:57
24 Nov and 24 Dec: - 0:38:09
25 Nov and 25 Dec: - 0:35:17
26 Nov and 26 Dec: - 0:32:24
27 Nov and 27 Dec: - 0:29:29
28 Nov and 28 Dec: - 0:26:31
29 Nov and 29 Dec: - 0:23:32
30 Nov and 30 Dec: - 0:20:32
Average change between Nov and Dec: - 0:58:20

Note that all the change values are negative, meaning that we have a daylight loss for any corresponding day between December and November (I expected it, since the winter solstice, the shortest day in the year, occurs on December 22, i.e. in the last third of December, but it’s better to have hard data proving it).

Daylight in Raku

In order to compute differences between values in sexagesimal notation (in base 60, i.e. expressed in hours/minutes/seconds), we will convert everything to seconds, perform arithmetic operations on values in seconds, and convert the result back into HMS sexagesimal values if needed. The sec3hrs and hrs2sec subroutines perform the necessary conversions from and to seconds.

The program first reads through the input data and stores the daylight data into an @novand an @dec arrays. Then it loops through the 1..30 range and, for each day, subtract the November daylight value from the December daylight value. The program also computes an average change over 30 days.

Compared to the Perl 5 implementation, the data for November and December 2019 are stored into separate text files (same format as above), because Raku doesn’t have the __DATA__ features; it should have much more feature-rich capabilities using pod (plain old documentation) sections, but these are not implemented yet. Otherwise, the sec2hrs subroutine is much simpler because it uses the multiple modulo polymod method to convert directly seconds into hours, minutes and seconds. Also, we used the Z- zip) metaoperator along with the - subtract operator to compute all the duration differences in just one single statement. Overall, these changes make the actual code twice shorter than the P5 implementation:

use v6;
sub hrs2sec ($hms) {
    my ($hrs, $min, $sec) = split /\:/, $hms;
    return $hrs * 60² + $min * 60 + $sec;
}
sub sec2hrs (Numeric $sec) {
    my @duration = $sec.abs.polymod(60, 60);
    my $fmt = ($sec < 0 ?? "-" !! "") ~ "%d:%02d:%02d";
    return sprintf $fmt, @duration[2, 1, 0];
}

my @nov = 'november_2019.txt'.IO.lines[0..29].map({(.split(/\s+/))[3]});
my @dec = 'december_2019.txt'.IO.lines[0..29].map({(.split(/\s+/))[3]});
my @diff = @dec.map({hrs2sec $_}) Z- @nov.map({hrs2sec $_});
say "Daylight changes between Dec and Nov:";
for @diff.kv -> $k, $v { printf "%2d: %s\n", $k + 1, sec2hrs( $v) };
say "\nAverage change between Nov and Dec: ", sec2hrs ([+] @diff) / 30;

This program displays more or less the same output at the P5 program:

$ perl6  day_light.p6
Daylight changes between Dec and Nov:
 1: -1:28:51
 2: -1:27:17
 3: -1:25:40
 4: -1:24:00
 5: -1:22:16
 6: -1:20:29
 7: -1:18:37
 8: -1:16:42
 9: -1:14:44
10: -1:12:42
11: -1:10:35
12: -1:08:26
13: -1:06:13
14: -1:03:56
15: -1:01:37
16: -0:59:13
17: -0:56:45
18: -0:54:15
19: -0:51:42
20: -0:49:05
21: -0:46:25
22: -0:43:42
23: -0:40:57
24: -0:38:09
25: -0:35:17
26: -0:32:24
27: -0:29:29
28: -0:26:31
29: -0:23:32
30: -0:20:32

Average change between Nov and Dec: -0:58:20

Wrapping up

The next week Perl Weekly Challenge is due to start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, December 15, 2019. And, please, also spread the word about the Perl Weekly Challenge if you can.

Perl Weekly Challenge 36: Vehicle Identification Numbers (VIN) and the Knapsack Problem

These are some answers to the Week 36 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Task # 1: Vehicle Identification Numbers (VIN)

Write a program to validate given Vehicle Identification Number (VIN). For more information, please checkout wikipedia.

From the Wikipedia article, it appears that VINs are made up of 17 digits and upper-case letters, with the exception of letters I (i), O (o) and Q (q), to avoid confusion with numerals 0, 1, and 9. There are some additional rules that only applicable to certain areas of the world but are not internationally recognized.

Vehicle Identification Numbers in Perl 5

We write a simple validate subroutine that returns a true value (1) if the passed parameter complies with the above rules for VINs and a false value (0) otherwise.

In addition, we write a test suite using the Test::More core testing framework and containing 16 test cases. The ok function is fine for checking if a Boolean value is true, but for testing that the function returns a 0 (false value), we need to use the is function of Test::More, because it unfortunately has no ko or nok function that would pass a test when the expression returns a false value.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
use Test::More tests => 16;

sub validate {
    my $vin = shift;
    return 0 if $vin =~ /[OIQ]/;
    return 1 if $vin =~ /^[A-Z0-9]{17}$/;
    return 0;
}

ok validate("A" x 17),  "17 A's";
ok validate(1 x 17)  ,  "17 digits";
is validate("AEIOU")   , 0, "Five vowels";
is validate(1234567890), 0, "Ten digits";
is validate("1234AEIOU5678901"),   0, "Sixteen digits or letters";
is validate("12345678901234567"),  1, "17 digits";
is validate("1234567890123456Q"),  0, "16 digits and a Q";
is validate("1234567890123456O"),  0, "16 digits and a O";
is validate("1234567890123456I"),  0, "16 digits and a I";
is validate("Q1234567890123456"),  0, "A Q and 16 digits";
is validate("I1234567890123456"),  0, "An I and 16 digits";
is validate("ABCD4567890123456"),  1, "17 digits and letters";
is validate("ABef4567890123456"),  0, "Digits and some lower case letters";
is validate("ABE?4567890123456"),  0, "A non alphanumerical character";
is validate("ABCD4567 90123456"),  0, "A space";
is validate("ABCD45678901234567"), 0, "More than 17 characters";

Running the program shows that all test pass:

$ perl vin.pl
1..16
ok 1 - 17 A's
ok 2 - 17 digits
ok 3 - Five vowels
ok 4 - Ten digits
ok 5 - Sixteen digits or letters
ok 6 - 17 digits
ok 7 - 16 digits and a Q
ok 8 - 16 digits and a O
ok 9 - 16 digits and a I
ok 10 - A Q and 16 digits
ok 11 - An I and 16 digits
ok 12 - 17 digits and letters
ok 13 - Digits and some lower case letters
ok 14 - A non alphanumerical character
ok 15 - A space
ok 16 - More than 17 characters

In North America, the ninth position in a VIN is a check digit i.e. a number calculated from all other characters. Although this is not explicitly requested in the task, we’ll make a second version of our program also verifying the check digit, as a bonus. The check_digit subroutine splits the input string, translates the characters into numbers, multiplies each number by the weight assigned to its rank, sums up all the results, computes the remainder of its division by 11, and replaces the remainder by “X” if it is found to be 10.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

sub validate {
    my $vin = shift;
    return 0 if $vin =~ /[OIQ]/;
    return 0 unless $vin =~ /^[A-Z0-9]{17}$/;
    return check_digit($vin);
}

sub check_digit {
    my $vin = shift;
    my %translations = (
     A => 1, B => 2, C => 3, D => 4, E => 5, F => 6, G => 7, H => 8,
     J => 1, K => 2, L => 3, M => 4, N => 5, P => 7, R => 9, S => 2,
     T => 3, U => 4, V => 5, W => 6, X => 7, Y => 8, Z => 9, 
    );
    $translations{$_} = $_ for 0..9;

    my @weights = 
        (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2);
    my $i = 0;
    my $sum = 0;
    for my $char (split //, $vin) {
        $sum += $translations{$char} * $weights[$i++];
    }
    my $mod = $sum % 11;
    $mod = 'X' if $mod == 10;
    return 1 if $mod eq substr $vin, 8, 1;
    return 0;
}

my $vin = shift // "1M8GDM9AXKP042788";
say validate($vin) ? "Correct" : "Wrong";

Running this program produces the following output:

$ perl vin.pl 1M8GDM9AXKP042788
Correct

$ perl vin.pl 1M8GDM9AXKP042789
Wrong

$ perl vin.pl
Correct

Vehicle Identification Numbers in Raku (formerly known as Perl 6)

We can just do almost the same thing in Raku.

Note that the Raku Test framework has a nok function that makes it possible to test directly a Boolean value, rather than comparing to some values.

use v6;
use Test;

sub validate ($vin) {
    return False if $vin ~~ /<[OIQ]>/;
    return True if $vin ~~ /^ <[A..Z0..9]> ** 17 $/;
    return False;
}

plan 16;

ok  validate("A" x 17),   "17 A's";
ok  validate(1 x 17),     "17 digits";
nok validate("AEIOU"),    "Five vowels";
nok validate(1234567890), "Ten digits";
nok validate("1234AEIOU5678901"),   "sixteen digits or letters";
ok  validate("12345678901234567"),  "17 digits";
nok validate("1234567890123456Q"),  "16 digits and a Q";
nok validate("1234567890123456O"),  "16 digits and a O";
nok validate("1234567890123456I"),  "16 digits and a I";
nok validate("Q1234567890123456"),  "A Q and 16 digits";
nok validate("I1234567890123456"),  "An I and 16 digits";
ok  validate("ABCD4567890123456"),  "17 digits and letters";
nok validate("ABef4567890123456"),  "Digits and some lower case letters";
nok validate("ABE?4567890123456"),  "A non alphanumerical character";
nok validate("ABCD4567 90123456"),  "A space";
nok validate("ABCD45678901234567"), "More than 17 characters";

Running the program shows that all test pass:

$ perl6 vin.p6
1..16
ok 1 - 17 A's
ok 2 - 17 digits
ok 3 - Five vowels
ok 4 - Ten digits
ok 5 - sixteen digits or letters
ok 6 - 17 digits
ok 7 - 16 digits and a Q
ok 8 - 16 digits and a O
ok 9 - 16 digits and a I
ok 10 - A Q and 16 digits
ok 11 - An I and 16 digits
ok 12 - 17 digits and letters
ok 13 - Digits and some lower case letters
ok 14 - A non alphanumerical character
ok 15 - A space
ok 16 - More than 17 characters

As in Perl 5, we’ll implement the North America check digit feature:

use v6;

sub validate (Str $vin) {
    return False if $vin ~~ /<[OIQ]>/;
    return False unless $vin ~~ /^ <[A..Z0..9]> ** 17 $/;
    return check-digit $vin;
}

sub check-digit (Str $vin) {
    my %translations = 
     A => 1, B => 2, C => 3, D => 4, E => 5, F => 6, G => 7, H => 8,
     J => 1, K => 2, L => 3, M => 4, N => 5, P => 7, R => 9, S => 2,
     T => 3, U => 4, V => 5, W => 6, X => 7, Y => 8, Z => 9;
    %translations{$_} = $_ for 0..9;
    my @weights = 8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2;
    my $i = 0;
    my $sum = sum map { %translations{$_} * @weights[$i++]}, $vin.comb;
    my $mod = $sum % 11;
    $mod = 'X' if $mod == 10;
    return True if $mod eq substr $vin, 8, 1;
    return False;
}

sub MAIN (Str $vin = "1M8GDM9AXKP042788") {
    say validate($vin) ?? "Correct" !! "Wrong"; 
}

Running the program displays the following output:

$ perl6 vin.p6
Correct

$ perl6 vin.p6 1M8GDM9AXKP042788
Correct

$ perl6 vin.p6 1M8GDM9AXKP042789
Wrong

Task # 2: The Knapsack Problem

Write a program to solve Knapsack Problem.

There are 5 color coded boxes with varying weights and amounts in GBP. Which boxes should be chosen to maximize the amount of money while still keeping the overall weight under or equal to 15 kg?

R: (weight = 1 kg, amount = £1)
B: (weight = 1 kg, amount = £2)
G: (weight = 2 kg, amount = £2)
Y: (weight = 12 kg, amount = £4)
P: (weight = 4 kg, amount = £10)

Bonus task, what if you were allowed to pick only 2 boxes or 3 boxes or 4 boxes? Find out which combination of boxes is the most optimal?

The knapsack problem or rucksack problem is a well-known problem in combinatorial optimization: given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most valuable items. In this specific case, this is what is sometimes called the 0-1 knapsack problem, where you can chose only one of each of the listed items.

I will directly take the “bonus” version of the problem, as it seems simpler to take this constraint right from the beginning.

The knapsack problem is known to be a at least an NP-Complete problem (and the optimization problem is NP-Hard). This means that there is no known polynomial algorithm which can tell, given a solution, whether it is optimal. There are, however, some algorithms that can solve the problem in pseudo-polynomial time, using dynamic programming. However, with a set of only five boxes, we can run a so-called brute-force algorithm, that is try all possible solutions to find the best. A better algorithm would probably be needed to manage 30 or more boxes, but we’re given only 5 boxes, and trying to find a better algorithm for only five boxes would be, in my humble view, a case of over-engineering.

The Knapsack Problem in Perl 5

To start with, we populate a %boxes hash of hashes with the box colors as keys, and their respective weights and values.

The most immediate solution to test all boxes combinations would be to use five nested loops, but that’s tedious and ugly, and we would need to neutralize some of the loops for satisfying the bonus task with only 2, 3, or 4 boxes.

I prefer to implement a recursive solution where the parameters to the recursive try_one subroutine govern the number of loops that will be performed. These parameters are as follows: * Current cumulative weight of the selected boxes; * Current total value of the selected boxes; * Maximum number of boxes to be selected (for the bonus) * A string listing the boxes used so far in the current solution; * A list of the boxes still available;

For the first call of try_one, we have the following parameters: 0 for the weight, 0 for the value, the maximum number of boxes to be used is passed as a parameter to the script or, failing a parameter, defaulted to 5, an empty string for the list of boxes, and the list of box colors.

The recursion base case (where recursion should stop) is reached when the current weight exceed 15 or when the number of available boxes left reaches 0.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

my %boxes = (
    "R" => { "w" => 1,  val => 1  },
    "B" => { "w" => 1,  val => 2  },
    "G" => { "w" => 2,  val => 2  },
    "Y" => { "w" => 12, val => 4  },
    "P" => { "w" => 4,  val => 10 },
);
my $start_nb_boxes = shift // 5;
my $max_val = 0;
my $max_boxes;

sub try_one {
    my ($cur_weight, $cur_val, $num_boxes, $boxes_used, @boxes_left) = @_;
    if ($cur_val > $max_val) {
        $max_val = $cur_val;
        $max_boxes = $boxes_used;
    }
    for my $box (@boxes_left) {
        my $new_cur_weight = $cur_weight + $boxes{$box}{w};
        next if $new_cur_weight > 15 or $num_boxes <= 0;
        my @new_boxes_left = grep $_ ne $box, @boxes_left;
        my $new_box_used = $boxes_used ? $boxes_used . "-$box" : $box;
        try_one ($new_cur_weight, $cur_val + $boxes{$box}{val}, $num_boxes -1, $new_box_used, @new_boxes_left);
    }
}
try_one (0, 0, $start_nb_boxes, "", keys %boxes);        
say "Max: $max_val, Boxes:  $max_boxes";

Note that we’re using two global variables to store the maximum value and the corresponding list of boxes. This is often frowned upon as contrary to the best practices, and often rightly so, but I consider that these variables are really global to the program (they keep track of the best solution so far) and not part of any specific recursive call of the subroutine. We could easily pass these variables around back and forth as parameters to and return values of the recursive calls, but that would make the program more complicated (and probably slightly slower) with no obvious benefit. Best practices are good to follow when they make sense, but, in my view, they shouldn’t become a bureaucratic straight jacket, and I don’t think we should try to bend over backward to follow them when they make things significantly more complicated than needed. But that’s only my humble opinion, you may disagree with that.

Running this script with no parameter yields a solution with four boxes (predictably all boxes except “Y” that has a very high weight):

$ perl boxes.pl
Max: 15, Boxes:  R-G-B-P

And, for the bonus we run the same program with parameters between 1 and 4:

$ perl boxes.pl 4
Max: 15, Boxes:  B-P-G-R

$ perl boxes.pl 3
Max: 14, Boxes:  B-P-G

$ perl boxes.pl 2
Max: 12, Boxes:  B-P

$ perl boxes.pl 1
Max: 10, Boxes:  P

One of the weaknesses of the recursive solution above is that we are testing permutations (i.e. subsets of the data where the order in which the boxes are selected matters) of elements, rather than combinations (where the order doesn’t matter), and we of course don’t care about the order in which we pick the boxes. So our program is doing too much work, because it’s testing far too many cases. It doesn’t really matter for a small set of 5 boxes, as we obtain the result in significantly less than a tenth of a second:

$ time perl boxes.pl
Max: 15, Boxes:  P-B-G-R

real    0m0,078s
user    0m0,000s
sys     0m0,015s

But it would still be nicer to test only combinations, as this would scale better for larger data sets. To get combinations, we can just retain only permutations that are in a given order, for example in alphabetic order, and filter out the others. We add one parameter to our recursive subroutine, $last_box_used, to that we can compare each box in the for loop with it and only keep those where box comes after in the alphabetic order. And we make our first call of the try_onesubroutine with an additional dummy parameter, “A”, which comes before any of the boxes.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

my %boxes = (
    "R" => { "w" => 1,  val => 1  },
    "B" => { "w" => 1,  val => 2  },
    "G" => { "w" => 2,  val => 2  },
    "Y" => { "w" => 12, val => 4  },
    "P" => { "w" => 4,  val => 10 },
);
my $start_nb_boxes = shift // 5;
my $max_val = 0;
my $max_boxes;

sub try_one {
    my ($cur_weight, $cur_val, $num_boxes, $boxes_used, $last_box_used, @boxes_left) = @_;
    if ($cur_val > $max_val) {
        $max_val = $cur_val;
        $max_boxes = $boxes_used;
    }
    for my $box (@boxes_left) {
        next if $box lt $last_box_used;
        my $new_cur_weight = $cur_weight + $boxes{$box}{w};
        next if $new_cur_weight > 15 or $num_boxes <= 0;
        my @new_boxes_left = grep $_ ne $box, @boxes_left;
        my $new_box_used = $boxes_used ? $boxes_used . "-$box" : $box;
        try_one ($new_cur_weight, $cur_val + $boxes{$box}{val}, $num_boxes -1, $new_box_used, $box, @new_boxes_left);
    }
}
try_one (0, 0, $start_nb_boxes, "", "A", sort keys %boxes);        
say "Max: $max_val, Boxes:  $max_boxes";

This runs slightly faster that our previous version of the script:

$ time perl boxes.pl 5
Max: 15, Boxes:  B-G-P-R

real    0m0,062s
user    0m0,000s
sys     0m0,030s

Such an optimization is of course useless with such a small input dataset (and such short run times), but it might help quite significantly if we had larger input.

The Knapsack Problem in Raku

As a starting point, we can adapt the improved version of our Perl 5 recursive subroutine to Raku:

use v6;

constant %boxes = (
    "R" => { "w" => 1,  val => 1  },
    "B" => { "w" => 1,  val => 2  },
    "G" => { "w" => 2,  val => 2  },
    "Y" => { "w" => 12, val => 4  },
    "P" => { "w" => 4,  val => 10 },
);

sub MAIN (UInt $start-nb-boxes = 5) {
    my @boxes = keys %boxes;
    my $*max-val = 0;
    my $*max-boxes = "";
    try-one(0, 0, $start-nb-boxes, "", "A", @boxes);        
    say "Max: $*max-val, Boxes:  $*max-boxes";
    say now - INIT now;
}

sub try-one ($cur-weight, $cur-val, $num-boxes, $boxes-used, $last-box-used, @boxes-left) {
    if $cur-val > $*max-val {
        $*max-val = $cur-val;
        $*max-boxes = $boxes-used;
    }
    for @boxes-left -> $box {
        next if $box lt $last-box-used;
        my $new-cur-weight = $cur-weight + %boxes{$box}{'w'};
        next if $new-cur-weight > 15 or $num-boxes <= 0;
        my @new-boxes-left = grep { $_ ne $box}, @boxes-left;
        my $new-box-used = $boxes-used ?? $boxes-used ~ "-$box" !! $box;
        try-one $new-cur-weight, $cur-val + %boxes{$box}{'val'}, $num-boxes -1, $new-box-used, $box, @new-boxes-left;
    }
}

This are some examples of output:

$ perl6 boxes.p6
Max: 15, Boxes:  B-G-P-R
0.0099724

$ perl6 boxes.p6 4
Max: 15, Boxes:  B-G-P-R
0.0209454

$ perl6 boxes.p6 3
Max: 14, Boxes:  B-G-P
0.01895075

$ perl6 boxes.p6 2
Max: 12, Boxes:  B-P
0.0109711

Note that the overall run time (as measured by the Unix time command is about 0.4 second, so much more than the Perl 5 equivalent), but the execution time of the script itself is between 10 and 20 milliseconds, so that most of the overall run time is presumably taken by compilation and start up time.

But Raku offers the built-in combinations routine that can make our program shorter and simpler. It will return a list (really a Seq) of all possible combinations of the input list or array. You can even specify the number of items, or, even better, a range for the numbers of items in each combinations; this will enable us to answer the bonus question by specifying the maximal number of boxes, and also to remove from the output the empty list (which may otherwise generate errors or warnings). The find-best subroutine does most of the work: the first statement populates a @valid-candidates array with combinations not exceeding the maximal weight, along with their total respective values, and the next statement returns the maximal value combination.

use v6;

constant %boxes = (
    "R" => { "w" => 1,  val => 1  },
    "B" => { "w" => 1,  val => 2  },
    "G" => { "w" => 2,  val => 2  },
    "Y" => { "w" => 12, val => 4  },
    "P" => { "w" => 4,  val => 10 },
);
sub MAIN (UInt $max-nb = 5) {
    my ($best, $max) = find-best %boxes.keys.combinations: 1..$max-nb;
    say "Max: $max; ", $best;
}
sub find-best (@candidates) {
    my @valid-candidates = gather for @candidates -> $cand {
        take [ $cand, $cand.map({ %boxes{$_}{'val'}}).sum ] 
            if $cand.map({ %boxes{$_}{'w'}}).sum <= 15;
    }
    return  @valid-candidates.max({$_[1]});
}

The output is the same as before:

$ perl6 boxes2.p6
Max: 15; (R G B P)

$ perl6 boxes2.p6 5
Max: 15; (R P B G)

$ perl6 boxes2.p6 4
Max: 15; (B G P R)

$ perl6 boxes2.p6 3
Max: 14; (B G P)

$ perl6 boxes2.p6 2
Max: 12; (P G)

Wrapping up

The next week Perl Weekly Challenge is due to start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, December 8. And, please, also spread the word about the Perl Weekly Challenge if you can.

About laurent_r

user-pic I am the author of the "Think Perl 6" book (O'Reilly, 2017) and I blog about the Perl 5 and Raku programming languages.