Perl Weekly Challenge 100: Fun Time and Triangle Sum

These are some answers to the Week 100 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Task 1: Fun Time

You are given a time (12 hour / 24 hour).

Write a script to convert the given time from 12 hour format to 24 hour format and vice versa.

Ideally we expect a one-liner.

Example 1:

Input: 05:15 pm or 05:15pm
Output: 17:15

Example 2:

Input: 19:15
Output: 07:15 pm or 07:15pm

Well, I like concise code, but I don’t think it makes real sense to try to pack this task into a one-liner. I have no doubt that it can be done, but it will either be not a real one-liner (like a pipeline chaining multiple statements into one expression), or a difficult to understand golfing attempt. So, I will not try to do a one-liner.

Fun Time in Raku

We use a regex to detect whether the input string matches “am” or “pm”; if so, we remove that “am” or “pm” string and add 12 hours if the matched string was “pm”; otherwise, we subtract 12 from the hour part it if is larger than 12 and add “pm” or “am” depending on the case.

use v6;

my $time = @*ARGS[0];
if $time ~~ /(<[ap]>)m/ {
    if $0 eq 'a'  {
        $time ~~ s/(\d\d':'\d\d).*/$0/;
    } else {
        $time ~~ s/(\d\d)':'(\d\d).*/{$0 + 12}:$1/;
    }
} else {
    $time ~~ /^(\d\d)':'(\d\d)/;
    my ($suffix, $hour) = $0 > 12 ?? ('pm', $0 - 12) !! ('am', $0);
    $time = "$hour:$1 $suffix";
}
say $time;

These are some execution examples with various input strings:

$ raku fun-time.raku '10:54'
10:54 am
-
$ raku fun-time.raku '17:54'
5:54 pm
-
$ raku fun-time.raku '10:54 pm'
22:54
-
$ raku fun-time.raku '10:54 am'
10:54

Fun Time in Perl

This is a port to Perl of the Raku program just above:

use strict;
use warnings;
use feature "say";

my $time = shift;
if ($time =~ /([ap])m/) {
    if ($1 eq 'a')  {
        $time =~ s/(\d\d:\d\d).*/$1/;
    } else {
        $time =~ /(\d\d):(\d\d.).*/;
        my $hour = $1 + 12;
        $time = "$hour:$2";
    }
} else {
    $time =~ /^(\d\d):(\d\d)/;
    my ($suffix, $hour) = $1 > 12 ? ('pm', $1 - 12) : ('am', $1);
    $time = "$hour:$2 $suffix";
}
say $time;

And some execution examples:

$ perl fun-time.pl '10:54 pm'
22:54

$ perl fun-time.pl '10:54 am'
10:54

$ perl fun-time.pl '10:54'
10:54 am

$ perl fun-time.pl '15:54'
3:54 pm

Task 2: Triangle Sum

You are given triangle array.

Write a script to find the minimum path sum from top to bottom.

When you are on index i on the current row then you may move to either index i or index i + 1 on the next row.

Example 1:

Input: Triangle = [ [1], [2,4], [6,4,9], [5,1,7,2] ]
Output: 8

Explanation: The given triangle

            1
           2 4
          6 4 9
         5 1 7 2

The minimum path sum from top to bottom:  1 + 2 + 4 + 1 = 8

             [1]
           [2]  4
           6 [4] 9
          5 [1] 7 2

Example 2:

Input: Triangle = [ [3], [3,1], [5,2,3], [4,3,1,3] ]
Output: 7

Explanation: The given triangle

            3
           3 1
          5 2 3
         4 3 1 3

The minimum path sum from top to bottom: 3 + 1 + 2 + 1 = 7

             [3]
            3  [1]
           5 [2] 3
          4 3 [1] 3

Triangle Sum in Raku

We use the traverse recursive subroutine to find the smallest path through the triangular array:

use v6;

my @triangle = (1), (2,4), (6,4,9), (5,1,7,2);
my @min-path = map { $_[0] }, @triangle; # pick any path
my $min-path-val = @min-path.sum;
my $index = 0;
traverse @triangle, (), $index;
say @min-path;

sub traverse (@triangle, @path, $index) {
    my @first-line = @triangle[0];
    my @new-triangle = @triangle[1 .. @triangle.end];
    say "First-line: ", @first-line;
    my @new-path = | (@path, @first-line[$index]);
    say @new-path, "  ", "\n";
    if @new-triangle.elems > 0 {
        traverse(@new-triangle, @new-path, $index);
        traverse(@new-triangle, @new-path, $index + 1);
    } else {
        my $new-path-val = @new-path.sum;
        if $new-path-val < $min-path-val {
            @min-path = @new-path;
            $min-path-val = $new-path-val
        }
    }
}

Output:

$ raku triangle-sum.raku 1 2 4 1

Triangle Sum in Perl

Except for the fact that we define also a sum subroutine, this essentially the same Raku algorithm ported to Perl:

use strict;
use warnings;
use feature "say";

my $triangle = [ [1], [2,4], [6,4,9], [5,1,7,2] ];
# $triangle = [ [1], [2,4], [6,30,9], [30,30,30,2] ];
my $min_path = [ map { $_->[0] } @$triangle];
my $min_path_val = sum($min_path);
my $index = 0;
find_min_path($triangle, [], $index);
say "@$min_path";

sub sum {
    my $aref = shift;
    my $sum = 0;
    $sum += $_ for @$aref;
    return $sum;
}

sub find_min_path {
    my ($tri_ref, $path, $index) = @_;
    my @triangle = @$tri_ref;
    my @first_row = @{$triangle[0]};
    my @new_triangle = @triangle[1 .. $#triangle];
    my $new_path = [ @$path, $first_row[$index] ];
    if (@new_triangle) {
        find_min_path([@new_triangle], $new_path , $index);
        find_min_path([@new_triangle], $new_path, $index + 1);
    } else { 
        my $new_path_val = sum($new_path);
        if ($new_path_val < $min_path_val) {
            $min_path = $new_path;
            $min_path_val = $new_path_val;
        }
    }
}

Output with the above input data:

$ perl  triangle-sum.pl
1 2 4 1

If you uncomment the second triangle definition, you get the following output:

$ perl  triangle-sum.pl
1 4 9 2

Wrapping up

The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, February 28, 2021. And, please, also spread the word about the Perl Weekly Challenge if you can.

Perl Weekly Challenge 99: Pattern Match and Unique Subsequence

These are some answers to the Week 99 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a couple of days (February 14, 2021). This blog post offers some solutions to this challenge, please don’t read on if you intend to complete the challenge on your own.

Task 1: Pattern Match

You are given a string $S and a pattern $P.

Write a script to check if given pattern validate the entire string. Print 1 if pass otherwise 0.

The patterns can also have the following characters: - ? - Match any single character. - * - Match any sequence of characters.

Example 1:

Input: $S = "abcde" $P = "a*e"
Output: 1

Example 2:

Input: $S = "abcde" $P = "a*d"
Output: 0

Example 3:

Input: $S = "abcde" $P = "?b*d"
Output: 0

Example 4:

Input: $S = "abcde" $P = "a*c?e"
Output: 1

So the pattern are similar to those used in the Unix ls command, with literal match for letters, digits and other characters common used in file names, and two wild-card characters, ? for any single character, and * for any sequence of characters.

Both in Raku and Perl, we will build a regex pattern by replacing * with the .* regex sequence, and ? with the . regex wild-card character. In addition, since we are requested to match the entire string, we will add the ^ start of string and $ end of string regex anchors.

Pattern Match in Raku

We could use regex substitutions to build the regex pattern (as we did below for the pattern match in Perl solution), but, here, the match subroutine loops through the input pattern characters and construct the regex pattern by hand. This subroutine finally applies regex matching to the input string.

use v6;

my $in = "abcde";
my @test-patterns = <a*e a*d ?b*d a*c?e>;
for @test-patterns -> $test {
    say "$test: ", match $test, $in;
}

sub match (Str $pattern, Str $in) {
    my $regex = 
        join "", gather {
            take '^';
            for $pattern.comb {
                when '*' { take '.*' }
                when '?' { take '.'  }
                default  { take $_   }
            }
            take '$';
    }
    return  $in ~~ /<$regex>/ ?? 1 !! 0;
}

With the “abcde” input string and the four input patterns, this program displays the following output:

$ raku pattern-match.raku
a*e: 1
a*d: 0
?b*d: 0
a*c?e: 1

Pattern Match in Perl

As mentioned above, I decided here to use regex substitutions to convert the input pattern into a regex pattern.

use strict;
use warnings;
use feature "say";

my $input = "abcde";
my @test_patterns = qw/a*e a*d ?b*d a*c?e/;
for my $pat (@test_patterns) {
    say "$pat: ", match($input, $pat)
}

sub match {
    my ($in, $pattern) = @_;
    $pattern =~ s/\*/.*/g;
    $pattern =~ s/\?/./g;
    $pattern = "^$pattern\$";
    return $in =~ /$pattern/ ? 1 : 0;
}

This program displays the same output as the Raku program above:

$ perl match-pattern.pl
a*e: 1
a*d: 0
?b*d: 0
a*c?e: 1

Task 2: Unique Subsequence

You are given two strings $S and $T.

Write a script to find out count of different unique subsequences matching $T without changing the position of characters.

UPDATE: 2021-02-08 09:00 AM (UK TIME) suggested by Jonas Berlin, missing entry [5].

Example 1:

Input: $S = "littleit', $T = 'lit'
Output: 5

    1: [lit] tleit
    2: [li] t [t] leit
    3: [li] ttlei [t]
    4: litt [l] e [it]
    5: [l] ittle [it]

Example 2:

Input: $S = "london', $T = 'lon'
Output: 3

    1: [lon] don
    2: [lo] ndo [n]
    3: [l] ond [on]

Dear Mohammad, when you do such updates, would you please be kind enough to send an e-mail informing us of such change. I had loaded the web page before the update, and spent a couple of hours trying to fix my program against your solution until I found that, in fact, my program was correct and your initial solution wrong. Only at that point did I think about re-loading the page and found that you had fixed the solution on the Web page. Every one can make mistakes, I have no problem with that, but please inform us when this happens.

Unique Subsequence in Raku

Although I admit that there may be some more efficient solution in terms of speed performance, I felt that using brute force with the combinations built-in method was better in terms of coding efficiency. The program generates all the input letters combinations having the size of the searched substring, filters out those not matching the input substring, and finally returns the number of matching substrings.

use v6;

my @input-tests = [ "littleit", "lit"], ["london", "lon"];

for @input-tests -> $test {
    my ($in, $substr) = $test[0..1];
    say "$test: ", search-substr $in, $substr;
}
sub search-substr (Str $in, Str $substr) {
    my @results = $in.comb.combinations($substr.\
        chars)>>.join("").grep({$_ eq $substr}).elems;
}

All the real work is done in a single code line (well, formatted here over two lines for a better graphical rendering on this blog page).

This program displays the following output:

$ ./raku subsequence.raku
littleit lit: [5]
london lon: [3]

Unique Subsequence in Perl

The solution in Perl is completely different from the Raku solution. Here, we use the search_substr recursive subroutine to explore all the possible substrings. This program could generate all the letter combinations as the Raku program, except that it cuts the process for any letter combination that will never eventually match the search substring, so that the program is doing much less useless work. Presumably this should make the program faster, especially for longer input strings and substrings.

use strict;
use warnings;
use feature "say";

my @input_tests = ( [ "littleit", "lit"], ["london", "lon"], ["aaaa", "aa"]);
my $count;
for my $in (@input_tests) {
    $count = 0;
    search_substr (@$in);
    say "@$in: $count";
}
sub search_substr {
    my ($in, $searched) = @_; 
    my $start = substr $searched, 0, 1;
    my $index = 0;
    while (1) {
        $index = index $in, $start, $index;
        return if $index < 0;
        $index++;
        ++$count and next  if length $searched == 1;
        search_substr (substr($in, $index), substr($searched, 1));      
    }
}

This programs displays the following output:

$ perl subsequence.pl
littleit lit: 5
london lon: 3
aaaa aa: 6

Wrapping up

The next week Perl Weekly Challenge will start soon and will be the 100th challenge. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, February 21, 2021. And, please, also spread the word about the Perl Weekly Challenge if you can.

Perl Weekly Challenge 98: Read N-Characters and Search Insert Position

These are some answers to the Week 98 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Task 1: Read N-characters

You are given file $FILE.

Create subroutine readN($FILE, $number) that returns the first n-characters and moves the pointer to the (n+1)th character.

Example:

Input: Suppose the file (input.txt) contains "1234567890"
Output:
    print readN("input.txt", 4); # returns "1234"
    print readN("input.txt", 4); # returns "5678"
    print readN("input.txt", 4); # returns "90"

Read N-characters in Raku

This is my first attempt:

use v6;

my $input = @*ARGS[0] // "I_have_a_dream.txt";

given $input.IO.open {
    for 0..2 -> $num {
        my $n = (1..10).pick;
        say "Taking $n characters: ", .readchars: $n;
    }
}

Using Martin Luther King's 1963 famous "I have a dream" speech at the Lincoln Memorial in Washington as an input file, I obtain the following output:

$ raku read_n_chars.raku
Taking 3 characters: I a
Taking 3 characters: m h
Taking 4 characters: appy

$ raku read_n_chars.raku
Taking 3 characters: I a
Taking 5 characters: m hap
Taking 9 characters: py to joi

However, my understanding of the challenge is that, maybe, Mohammad S. Anwar wanted us to implement an iterator, rather than using a Raku built-in subroutines implementing such iterators.

To do that, I wrote a create-iter subroutine iterating over the file contents:

use v6;

my $input = @*ARGS[0] // "I_have_a_dream.txt";

sub create-iter ($file-in) {
    my $counter = 0;
    my $content = $file-in.IO.slurp;
    return sub (Int $length) {
        my $out = substr $content, $counter, $length;
        $counter += $length;
        return $out;
    }
}
my &iterator = create-iter $input;       
for 0..2 -> $num {
    my $n = (1..10).pick;
    say "Taking $n characters: ", &iterator($n);
}

Example output:

$ raku read_n_chars2.raku
Taking 8 characters: I am hap
Taking 6 characters: py to
Taking 9 characters: join with

Read N-characters in Perl

This is a Perl implementation of the first Raku program above:

use strict;
use warnings;
use feature qw/say/;

my $input = shift // "I_have_a_dream.txt";
open my $IN, "<", $input or die "Cannot open $input $!";
for my $n (4..7) {
    say "Taking $n characters: ", map getc $IN, 1..$n;
}

Example output:

$ perl read_n_chars1.pl
Taking 4 characters: I am
Taking 5 characters:  happ
Taking 6 characters: y to j
Taking 7 characters: oin wit

And here is a solution implementing an iterator:

use strict;
use warnings;
use feature qw/say/;

sub create_iter {
    my $input = shift;
    my $counter = 0;
    open my $IN, '<', $input or die "Couldn't open $input $!";
    local $/ = undef;   # enabling "slurping mode"
    my $content = <$IN>; # slurping the file
    return sub {
        my $length = shift;
        my $out = substr $content, $counter, $length;
        $counter += $length;
        return $out;
    }
}
my $in = shift // "I_have_a_dream.txt";
my $iterator = create_iter $in;       
for my $n (3..5) {
    say "Taking $n characters: ", $iterator->($n);
}

Task 21: Search Insert Position

You are given a sorted array of distinct integers @N and a target $N.

Write a script to return the index of the given target if found otherwise place the target in the sorted array and return the index.

Example 1:

Input: @N = (1, 2, 3, 4) and $N = 3
Output: 2 since the target 3 is in the array at the index 2.

Example 2:

Input: @N = (1, 3, 5, 7) and $N = 6
Output: 3 since the target 6 is missing and should be placed at the index 3.

Example 3:

Input: @N = (12, 14, 16, 18) and $N = 10
Output: 0 since the target 10 is missing and should be placed at the index 0.

Example 4:

Input: @N = (11, 13, 15, 17) and $N = 19
Output: 4 since the target 19 is missing and should be placed at the index 4.

Search Insert Position in Raku

This is a simple implementation in Raku:

use v6;

my @tests = [3,  < 1  2  3  4>],
            [6,  < 1  3  5  7>],
            [10, <12 14 16 18>],
            [19, <11 13 15 17>];
for @tests -> $test {
    say $test.gist.fmt("%-20s:\t"), find_insert-pos ($test);
}
sub find_insert-pos ($test) {
    my $target = $test[0];
    my @array = |$test[1];
    for 0..@array.end -> $i {
        return $i if @array[$i] >= $target;
    }
    return @array.end + 1;
}

Output:

$ raku search_insert_pos.raku
[3 (1 2 3 4)]       :   2
[6 (1 3 5 7)]       :   3
[10 (12 14 16 18)]  :   0
[19 (11 13 15 17)]  :   4

This implementation is somewhat inefficient when the input data is large. I made a better binary search implementation, but I am so late that I can't really present it.

Search Insert Position in Perl

Update: I was so late when I originally posted this that I did not have time to include the Perl program for this task. Here it is:

use strict;
use warnings;
use feature "say";

my @tests = ( [3,  qw < 1  2  3  4>],
              [6,  qw < 1  3  5  7>],
              [10, qw <12 14 16 18>],
              [19, qw <11 13 15 17>],
            );
for my $test (@tests) {
    say "$test->[0], @{$test}[1..@{$test}-1]: ", find_insert_pos ($test);
}
sub find_insert_pos {
    my ($target, @array) = @{$_[0]};
    for my $i (0..$#array) {
        return $i if $array[$i] >= $target;
    }
    return $#array + 1;
}

Output:

$ perl  search_insert_pos.pl
3, 1 2 3 4: 2
6, 1 3 5 7: 3
10, 12 14 16 18: 0
19, 11 13 15 17: 4

Search Insert Position in Python

This is a port to Python of the Raku and Perl programs above:

def find_insert_pos(target, list):
    for i in range(len(list)):
        if list[i] >= target:
            return i 
    return len(list)

in_list = [1, 3, 5, 7]
for j in range (8):
    print('Target: ', j, '->', find_insert_pos(j, in_list))

Output:

$ python3 search_insert_pos.py
Target:  0 -> 0
Target:  1 -> 0
Target:  2 -> 1
Target:  3 -> 1
Target:  4 -> 2
Target:  5 -> 2
Target:  6 -> 3
Target:  7 -> 3

Wrapping up

The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, February 14, 2021. And, please, also spread the word about the Perl Weekly Challenge if you can.

Perl Weekly Challenge 97: Caesar Cipher and Binary Substrings

These are some answers to the Week 97 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Task 1: Caesar Cipher

You are given string $S containing only the letters A..Z and a number $N.

Write a script to encrypt the given string $S using a Caesar Cipher with left shift of size $N.

Example:

Input: $S = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG", $N = 3
Output: "QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD"

Plain:    ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher:   XYZABCDEFGHIJKLMNOPQRSTUVW

Plaintext:  THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
Ciphertext: QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD

A Caesar cypher is a weak form of encryption that involves “rotating” each letter of the input string by a fixed number of places. To rotate a letter means to shift it through the alphabet, wrapping around to the end if necessary. In the movie 2001: A Space Odyssey, the spaceship’s computer is called HAL, which is IBM left rotated by 1.

Note that the task description says that the input string contains only the letters A..Z, but the example provided also contains spaces which are not in the encrypted solution. So we need to handle spaces as a special case. Depending on the language, my solutions will either handle spaces as one special case, or decide not to convert any letter outside of the A..Z range in order, for example, to preserve also punctuation marks).

Caesar Cipher in Raku

I decided to implement the solution in a functional style (to make the porting to Scala easier). So almost everything is made in a map block that processes each letter in turn and returns a stream of converted letters that are then join into the cypher string. Note that in the Raku solution, we convert only the letters the A..Z range.

use v6;
constant $default = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG";
constant $min = 'A'.ord;
constant $max = 'Z'.ord;

sub MAIN (Str $in-string = $default, Int $shift = 3) {
    say rotate($in-string.uc, $shift);
    #say $out;
}
sub rotate ($in, $shift) {
    return join "", 
        map { my $let= $_ - $shift; 
              $let +=  26 if $let < $min; 
              $min <= $_ <= $max ?? $let.chr !! $_.chr; 
            }, $in.comb>>.ord;
}

This script displays the following output:

$ raku caesar.raku
QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD

Note that there is a special case of Caesar cipher called ROT13, where each letter is rotated by 13 places. Since 13 is half of the number of letters in our alphabet, applying ROT13 twice returns the original string, so that the same code may be used to encode and decode a string. ROT13 was commonly used on the Internet to weakly hide potentially offensive jokes or solutions to puzzles. With a fixed shift of 13, the code might be much simpler and can be contained in a simple one-liner:

$ raku -e 'my $w = @*ARGS[0]; $w ~~ tr/A..MN..Z/N..ZA..M/; say $w;' FOOBAR
SBBONE

$ raku -e 'my $w = @*ARGS[0]; $w ~~ tr/A..MN..Z/N..ZA..M/; say $w;' SBBONE
FOOBAR

Caesar Cipher in Perl

This is essentially a port to Perl of the Raku program, except that, here, only the space character is handled differently:

use strict;
use warnings;
use feature "say";
use constant MIN => ord 'A';

my $in_string = shift // "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG";
my $shift = shift // 3;
say rotate(uc $in_string, $shift);

sub rotate {
    my ($in, $shift) = @_;
    return join "", 
        map { my $let = ord($_) - $shift; 
              $let +=  26 if $let < MIN; 
              $_ eq " " ? " " : chr $let 
            } split "", $in;
}

This script displays the following output:

$ perl  caesar.pl
QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD

In the specific case of ROT13 (Caesar cipher with a shift of 13 letters), we can also use a simple Perl one-liner:

$ perl -E '$w = shift; $w =~ tr/A-MN-Z/N-ZA-M/; say $w;' FOOBAR
SBBONE

$ perl -E '$w = shift; $w =~ tr/A-MN-Z/N-ZA-M/; say $w;' SBBONE
FOOBAR

Caesar Cipher in Scala

This a simple port to Scala of the Raku and Perl programs above.

object caesar extends App {
  val test = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"
  val shift = 3
  println( test.map(convert(_, shift)))

  def convert(c: Char, shift: Int): Char = {
    val min = 'A'.toByte
    val asc = c.toByte - shift;
    val conv = if (asc < min) asc + 26 else asc
    return if (c == ' ') ' ' else conv.toChar
  }
}

Output:

QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD

Caesar Cipher in Python

Again, a port to Python of the Raku and Perl programs above. Except that functional programming is much less easy in Python, so we use a more conventional procedural approach. Since Python makes it possible to chain comparison operators, it makes it simple to convert only the letters the A..Z range.

ALPHA_COUNT = 26
MIN = ord('A')

input_string = "THE QUICK BROWN FOR JUMPS OVER THE LAZY DOG"
shift = 3
out = ""
for char in input_string:
    if 'A' <= char <= 'Z':
        asc_code = ord(char) - shift
        if asc_code < MIN:
            asc_code += ALPHA_COUNT
        out += chr(asc_code)
    else:
        out += char
print(out)

This script displays the following output:

$ python3 caesar.py
QEB NRFZH YOLTK CLO GRJMP LSBO QEB IXWV ALD

Task #2: Binary Substrings

You are given a binary string $B and an integer $S.

Write a script to split the binary string $B into substrings of size $S and then find the minimum number of flips required to make all substrings the same.

Example 1:

Input: $B = “101100101”, $S = 3
Output: 1

Binary Substrings:
    "101": 0 flip
    "100": 1 flip to make it "101"
    "101": 0 flip

Example 2:

Input $B = “10110111”, $S = 4
Output: 2

Binary Substrings:
    "1011": 0 flip
    "0111": 2 flips to make it "1011"

It isn’t really necessary to actually split the input string. We can iterate over the substrings and, for each position, find the number of 1s (or 0s, it’s your draw). So, in each position, we sum the minimum of the number of 1s and the number of 0s.

Binary Substrings in Raku

With the above explanations, this is hopefully clear:

use v6;
subset Binstr of Str where /^<[01]>*$/;

sub MAIN (Binstr $in-string,  Int $size) {
    my $sub-str-len = $in-string.chars / $size;
    my $flips = 0;
    for 0..^$sub-str-len -> $i {
        my $ones = 0;
        for 0..^$size -> $j {
            my $idx = $j * $sub-str-len + $i;
            $ones++ if substr($in-string, $idx, 1) == 1
        }
        my $zeroes = $size - $ones;
        $flips += min ($zeroes, $ones)
    }
    say $flips;
}

Output:

$ ./raku bin-substrings.raku  101100101 3
1

$ ./raku bin-substrings.raku  10110111 4
2

Binary Substrings in Perl

This is the same idea as above for the Perl version:

use strict;
use warnings;
use feature qw/say/;

my ($in_string, $size) = @ARGV;
my $sub_str_len = length($in_string) / $size;
my $flips = 0;
for my $i (0 .. $sub_str_len - 1) {
    my $ones = 0;
    for my $j (0 .. $size - 1) {
        my $idx = $j * $sub_str_len + $i;
        $ones++ if substr ($in_string, $idx, 1) == 1;
    }
    my $zeroes = $size - $ones;
    $flips += $zeroes > $ones ? $ones : $zeroes;
}
say $flips;

Output:

$ perl  bin-substrings.pl 101100101 3
1

$ perl  bin-substrings.pl  10110111 4
2

Wrapping up

The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, February 7, 2021. And, please, also spread the word about the Perl Weekly Challenge if you can.

Perl Weekly Challenge 96: Reverse Words and Edit Distance (and Decorators in Perl)

These are some answers to the Week 96 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a couple of days (January 24, 2021). This blog post offers some solutions to this challenge, please don’t read on if you intend to complete the challenge on your own.

Task 1: Reverse Words

You are given a string $S.

Write a script to reverse the order of words in the given string. The string may contain leading/trailing spaces. The string may have more than one space between words in the string. Print the result without leading/trailing spaces and there should be only one space between words.

Example 1:

Input: $S = "The Weekly Challenge"
Output: "Challenge Weekly The"

Example 2:

Input: $S = "    Perl and   Raku are  part of the same family  "
Output: "family same the of part are Raku and Perl"

Reverse Words in Raku

we simply chain the words, reverse and join method invocations:

use v6;

my $input = @*ARGS[0] // "    Perl and   Raku are  part of the same family  ";
say $input.words.reverse.join(" ");

Example output:

$ raku reverse-words.raku
family same the of part are Raku and Perl
~
$ raku reverse-words.raku "Don't ask what your country can do for you, ask what you can do for   your country  "
country your for do can you what ask you, for do can country your what ask Don't

Of course, this short script can easily be transformed into a Raku one-liner:

$ raku -e '@*ARGS[0].words.reverse.join(" ").say;' "    Perl and   Raku are  part of the same family  "
family same the of part are Raku and Perl

Reverse Words in Perl

In Perl, we use the same idea, just chaining function calls instead of method invocations:

use strict;
use warnings;
use feature "say";

my $input = shift // "    Perl and   Raku are  part of the same family  ";
say join " ", reverse split /\s+/, $input;

Output:

$ perl reverse-words.pl
family same the of part are Raku and Perl

Of course, we could also make it as a Perl one-liner:

$ perl -E 'say join " ", reverse split /\s+/, shift' "    Perl and   Raku are  part of the same family  "
family same the of part are Raku and Perl

Reverse Words in Scala

Using the same idea as in Raku, i.e. chaining the split, reverse, and mkString method invocations:

object reverseWords extends App {
  val in = "    Perl and   Raku are  part of the same family  "
  println(in.split("\\s+").reverse.mkString(" "))
}

Output:

family same the of part are Raku and Perl

Reverse Words in Python

Whether using a function-call syntax (in Perl), or a method-invocation syntax (in Raku and Scala), our three programs above all use a functional programming approach chaining pure functions and using immutable data, more precisely a data flow or pipeline pattern. In this programming model, each piece of data is channeled through a series of successive transformations in which the returned data elements produced by one of the operations are fed to the next operation, and so on. This is possible because each operation of the pipeline is a “pure function” that takes an input and produces an output to be processed by the next operation.

This is not easily transposable in Python, because some operators acts are pure function as in the three other languages, and some others (such as reverse in our case) are not pure functions and modify the data in-place instead of sending back the modified data as a return value. It would certainly be possible to work around the limitation in Python (using for example maps), but this is much less natural than in the three other languages, and it probably doesn’t make much sense to try to force the data flow model into Python. Therefore, our Python implementation will use intermediate temporary variables, as in traditional procedural programming.

def reverse_words(in_str):
    words = in_str.split()
    words.reverse()
    return " ".join(words)

input = "    Perl and   Raku are  part of the same family  "
print(reverse_words(input))

Output:

$ python3 reverse-words.py
Perl and Raku are part of the same family

Task 2: Edit Distance

You are given two strings $S1 and $S2.

Write a script to find out the minimum operations required to convert $S1 into $S2. The operations can be insert, remove or replace a character. Please check out Wikipedia page for more information.

Example 1:

Input: $S1 = "kitten"; $S2 = "sitting"
Output: 3

Operation 1: replace 'k' with 's'
Operation 2: replace 'e' with 'i'
Operation 3: insert 'g' at the end

Example 2:

Input: $S1 = "sunday"; $S2 = "monday"
Output: 2

Operation 1: replace 's' with 'm'
Operation 2: replace 'u' with 'o'

In computer science, edit distance is a way of quantifying how dissimilar two strings (e.g., words) are to one another by counting the minimum number of operations (usually single character edits) required to transform one string into the other. When the operations permettied are insertion, deletion, or substitution of a character, edit distance is usually called Levenshtein distance, named after the Soviet mathematician Vladimir Levenshtein.

The Levenshtein distance between two strings a, b (of length |a| and |b| respectively) is given by lev ⁡ (*a*, *b*) where

levenstein_dist.jpg

where the tail of some string x is a string of all but the first character of x and *x*[*n*] is the nth character of the string x, starting with character 0.

Note that, in the above formula, the first element in the minimum corresponds to deletion, the second to insertion and the third to replacement.

This definition can lead directly to a naïve recursive implementation. The problem, though, is that such naïve implementation would have an exponential time complexity and would unusable even for moderately large strings (especially if the strings are markedly different). As an example, the naïve (not optimized) version of the Raku recursive subroutine implementation presented below for the pseudo random strings “LMIjkHFSAE” and “dmqkdjfERZG” takes more than one minute:

$ time raku edit-distance.raku
11 LMIjkHFSAE - dmqkdjfERZG
-
real    1m15,592s
user    0m0,015s
sys     0m0,046s

The reason for that is that the recursive subroutine is called many times with the same input in the process. If we can cache (or memoize) the results to avoid having to recompute again and again the same result, then the time to compute the Levenshtein distance falls dramatically and becomes roughly proportional to the product of the two string lengths. This still makes the process quite inefficient for very long strings (such as DNA strands), but it’s usually OK for strings representing words in common human languages.

The recursive approach is a form of top-down dynamic programming, i.e. it breaks a large problem into smaller and smaller subproblems, until the subproblems can be solved. Other solutions use bottom-up dynamic programming, i.e. start from small elementary problems and expand them to larger problems; they often use a matrix or table and iterate over the matrix to expand it. The bottom-up approach also as a time complexity roughly proportional to the product of the two string lengths. So, while the two approaches may not be equivalent (the bottom-up approach is likely to be faster), they have the same time complexity, which means that they essentially scale up essentially the same way when the size of the strings increase. Here, we will use the recursive (top-down) approach.

Note that our implementations will work the other way around, from right to left. The only reason for this is that I had written about four years ago an edit-distance program in Perl, and I found it easier to start from that implementation.

Edit Discance in Raku

We present two implementations of a cached implementation of the recursive solution.

Memoized Version (Raku)

First, we implement a “naïve” edit-distance recursive function and use the Raku Memoize module (written by my friend Elizabeth Mattijsen as a port of Mark-Jason Dominus’s equivalent Perl module) to automatically perform the caching of already computed distances:

use v6;
use Memoize;

sub edit-distance (Str $left, Str $right) {
    # If one of the substrings is empty, return the length of the other
    return $right.chars unless $left;
    return $left.chars unless $right;
    my $shortened-left  = substr $left,  0, *-1;
    my $shortened-right = substr $right, 0, *-1;

    # If the last chars are the same, we ignore them
    # and call edit-distance on shortened strings    
    return edit-distance $shortened-left, $shortened-right 
        if substr($left, *-1) eq substr($right, *-1);

    # Else find the minimum between the three operations
    return 1 + min(
        edit-distance($left,       $shortened-right), #insert
        edit-distance($shortened-left,  $right),      #remove
        edit-distance($shortened-left,  $shortened-right)  #replace
    );
}    

memoize("edit-distance");

my @test-pairs = (
    [ < kitten sitting >],
    [ < Monday Friday > ],
    [ < Sunday Saturday > ],
    [ < January February > ],
    [ < November December > ],
    [ < constitutionally anticonstitutional > ],
    [ < LMIjkHFSAE dmqkdjfERZG >],
);
for @test-pairs -> @test {
    my ($str1, $str2) = @test;
    print edit-distance($str1, $str2), " $str1 - $str2\n";
}

This program produces the following output:

$ time raku edit-distance.raku
3 kitten - sitting
3 Monday - Friday
3 Sunday - Saturday
4 January - February
3 November - December
6 constitutionally - anticonstitutional
11 LMIjkHFSAE - dmqkdjfERZG
-
real    0m1,452s
user    0m0,015s
sys     0m0,046s

Note how much faster this program is (1.45 seconds for 8 tests), compared to the non-optimized version with only the last test discussed above (about 1 min 15 sec for only one string pair).

Implementing A Cache Manually in Raku

I have often said that I eschew using off-the-shelf modules in the context of a programming challenge because I feel that it is sort of cheating. In the case of the above solution, I used the Memoize module because it wasn’t really part of the Levenshtein distance algorithm, but only a performance optimization. In that case in point, however, that performance optimization is crucial (making the difference between a usable and a not usable implementation), so that I feel it is necessary to show a solution that implements the cache manually. This might be more useful for a beginner or a student wishing to understand how caching or memoizing works:

use v6;

my %cache;

sub edit-distance (Str $left, Str $right) {
    sub cache-distance (Str $l, Str $r) {
        %cache{"$l;$r"} = edit-distance($l, $r) unless %cache{"$l;$r"}:exists;
        return %cache{"$l;$r"};
    }

    # If one of the substrings is empty, return the length of the other
    return $right.chars unless $left;
    return $left.chars unless $right;
    my $shortened-left  = substr $left,  0, *-1;
    my $shortened-right = substr $right, 0, *-1;
    # say " $shortened-left  $shortened-right";

    # If the last chars are the same, we ignore them
    # and call edit-distance on shortened strings
    if substr($left, *-1) eq substr($right, *-1) { 
        return cache-distance $shortened-left, $shortened-right;
    }

    # Else find the minimum between the three operations
    return 1 + min(
        cache-distance($left,       $shortened-right), #insert
        cache-distance($shortened-left,  $right),      #remove
        cache-distance($shortened-left, $shortened-right)  #replace
    );
}    

my @test-pairs = (
    [ < kitten sitting >],
    [ < Monday Friday > ],
    [ < Sunday Saturday > ],
    [ < January February > ],
    [ < November December > ],
    [ < constitutionally anticonstitutional > ],
    [ < LMIjkHFSAE dmqkdjfERZG >],
);
for @test-pairs -> @test {
    my ($str1, $str2) = @test;
    print edit-distance($str1, $str2), " $str1 - $str2\n";
}

Note that I have implemented the cache management as a separate lexically-scoped subroutine, cache-distance, because there were four recursive calls to edit-distance in the body of the edit-distance and I did not want to implement the cache management code four times.

This script displays the following output:

$ time raku edit-distance_cache.raku
3 kitten - sitting
3 Monday - Friday
3 Sunday - Saturday
4 January - February
3 November - December
6 constitutionally - anticonstitutional
11 LMIjkHFSAE - dmqkdjfERZG

real    0m0,398s
user    0m0,015s
sys     0m0,031s

We see another significant performance improvement, probably because it is more efficient to tailor a cache for a specific problem, compared to a general solution such as using the Memoize module (and perhaps also because it takes some time to compile the module, not quite sure).

Edit Distance in Perl

Memoized Version (Perl)

First, we implement a “naïve” edit-distance recursive function and use the core Perl Memoize module (written by Mark-Jason Dominus) to automatically perform the caching of already computed distances:

use strict;
use warnings;
use feature 'say';
use Memoize;

sub min {
    my $rv = shift;
    for my $tmp (@_) {
        $rv = $tmp if $tmp < $rv;
    }
    return $rv;
}

sub edit_distance {
    my ($left, $right) = @_;

    # If one of the substrings is empty, return the length of the other
    return length $right unless $left;
    return length $left  unless $right;

    my $shortened_left  = substr $left,  0, -1;
    my $shortened_right = substr $right, 0, -1;
    # In the last chars are the same, we ignore them
    # and call edit_distance on shortened strings
    return edit_distance ($shortened_left, $shortened_right) if substr($left, -1) eq substr($right, -1);

    # Else find the minimum between the three operations
    return 1 + min(
        edit_distance($left, $shortened_right), #insert
        edit_distance($shortened_left, $right), #remove
        edit_distance($shortened_left, $shortened_right) #replace
    );
}

memoize("edit_distance");

my @test_pairs = (
    [ qw<kitten sitting>],
    [ qw<Monday Friday> ],
    [ qw<Sunday Saturday> ],
    [ qw<January February> ],
    [ qw<November December > ],
    [ qw<constitutionally anticonstitutional > ],
    # [ qw<LMIjkHFSAE dmqkdjfERZG>],
);
for my $ar (@test_pairs) {
    my ($str1, $str2) = @$ar;
    say edit_distance($str1,$str2), " $str1 - $str2 ";
}

This program displays the following output and time measures:

$ time perl edit-distance.pl
3 kitten - sitting
3 Monday - Friday
3 Sunday - Saturday
4 January - February
3 November - December
6 constitutionally - anticonstitutional
11 LMIjkHFSAE - dmqkdjfERZG

real    0m0,103s
user    0m0,015s
sys     0m0,046s

Creating a Decorator in Perl

As I said before, I don’t like to use ready-made modules in programming challenges (which the reason why I also implemented myself the min subroutine). We could manually implement some form of wrapper around the naïve edit_distance subroutine, as we more or less did in Raku, to manage the cache. However, I thought it would be more fun to implement a decorator in order to modify the behavior of the edit_distance subroutine.

Originally, a decorator was a OO-programming design pattern making it possible to assign new properties or responsibilities to an object, without modifying that object’s class. The idea has later been expanded to other programming models. For example, PEP 318 in Python introduced decorators as functions that are designed to change the behavior of other functions without modifying the code of these other functions. This may be useful, for example, to modify the behavior of a legacy or complex function that you don’t want to change, or to modify in some specific case the behavior of a module that you otherwise don’t want to change because other programs may be using that module. It is this extended meaning of decorators that we mean here. So, let’s assume we don’t want to modify the code of our edit_distance subroutine (perhaps it’s complicated and I spent so much time debugging it that I want to leave it alone). Rather than changing the code of that subroutine, we will decorate it to add to it the caching functionality.

Contrary to Python, Perl doesn’t have a specific syntax for decorators, but, as we will see, it is relatively easy to use higher-order functions and typeglobs to implement our own decorators.

First, we define a decorate subroutine which takes as input parameter a reference to the subroutine to be cached and returns an anonymous subroutine that checks the cache and returns the value in the cache if it exists, and else calls the subref it has received as a parameter:

sub decorate {
    my $coderef = shift;  # the argument is a ref to edit_distance
    my %cache;
    return sub {
        my ($l, $r) = @_;
        $cache{"$l;$r"} = $coderef->(@_) unless exists $cache{"$l;$r"};
        return $cache{"$l;$r"};
    }
}

Note that we define the %cache in the decorate subroutine. The anonymous subroutine thus acts as a closure and keeps its access to the cache.

Then, we replace the original edit_distance subroutine with the anonymous subroutine returned by decorate in the main symbol table:

{
    # local scope for the deactivation of the redefine warning
    no warnings 'redefine';
    # we replace the edit-distance subroutine by its 
    # decorated version in the main symbol table
    *main::edit_distance = decorate(\&edit_distance);
}

Here, *main::edit_distance is a typeglob representing the entry of the original edit_distance subroutine in the symbol table. Before that, we deactivate the redefine warning (to avoid a warning upon the subroutine definition, and we do that in a code block to limit the scope of the deactivation to this code line.

Now, whenever the code will call the edit_distance subroutine, it is the anonymous subroutine produced by the decorate subroutine that will be called instead. Thus, the edit_distance subroutine seen by the rest of the program is now memoized (the edit distances are cached), although we did not change anything to the code defining it. That subroutine is called recursively four times in its own code, but we don’t need to charge all these subroutine calls.

The overall program now looks like this:

use strict;
use warnings;
use feature qw/say/;

sub min {
    my $rv = shift;
    for my $tmp (@_) {
        $rv = $tmp if $tmp < $rv;
    }
    return $rv;
}

sub edit_distance {
    my ($left, $right) = @_;

    # If one of the substrings is empty, return the length of the other
    return length $right unless $left;
    return length $left  unless $right;

    my $shortened_left  = substr $left,  0, -1;
    my $shortened_right = substr $right, 0, -1;
    # In the last chars are the same, we ignore them
    # and call edit_distance on shortened strings
    return edit_distance ($shortened_left, $shortened_right) if substr($left, -1) eq substr($right, -1);

    # Else find the minimum between the three operations
    return 1 + min(
        edit_distance($left, $shortened_right), #insert
        edit_distance($shortened_left, $right), #remove
        edit_distance($shortened_left, $shortened_right) #replace
    );
}

# The decorator returns the edit_distance subroutine wrapped in 
# code lines performing the caching of values
sub decorate {
    my $coderef = shift;
    my %cache;
    return sub {
        my ($l, $r) = @_;
        $cache{"$l;$r"} = $coderef->(@_) unless exists $cache{"$l;$r"};
        return $cache{"$l;$r"};
    }
}

{
    # local scope for the deactivation of the redefine warning
    no warnings 'redefine';
    # we replace the edit-distance subrouytine by its 
    # decorated version in the main symbol table
    *main::edit_distance = decorate(\&edit_distance);
}

my @test_pairs = (
    [ qw<kitten sitting> ],
    [ qw<Monday Friday> ],
    [ qw<Sunday Saturday> ],
    [ qw<January February> ],
    [ qw<November December > ],
    [ qw<constitutionally anticonstitutional > ],
    [ qw<LMIjkHFSAE dmqkdjfERZG>],
);
for my $ar (@test_pairs) {
    my ($str1, $str2) = @$ar;
    say edit_distance($str1,$str2), " $str1 - $str2 ";
}

This script produces the following output and execution times:

$ time perl  edit-distance_decorator.pl
3 kitten - sitting
3 Monday - Friday
3 Sunday - Saturday
4 January - February
3 November - December
6 constitutionally - anticonstitutional
11 LMIjkHFSAE - dmqkdjfERZG

real    0m0,064s
user    0m0,000s
sys     0m0,046s

Note that our manually decorated subroutine is slightly faster that the original memoized version.

Edit Distance in Python

I do not know whether there is something equivalent to the Memoize module in Python, so I will manage the cache manually (in the cached_distance subroutine):

cache = dict()

def cached_distance(left, right):
    key = left + ';' + right
    if key not in cache:
        cache[key] = edit_distance(left, right)  
    return cache[key]

def edit_distance(left, right):
    lr = len(right)
    ll = len(left)
    if not left: return lr
    if not right: return ll
    shortened_l = left[0:ll-1]
    shortened_r = right[0:lr-1]
    if left[ll-1] == right[lr-1]:
        return cached_distance(shortened_l, shortened_r)

    return 1 + min( cached_distance(left, shortened_r),     # Insert 
                    cached_distance(shortened_l, right),    # Remove 
                    cached_distance(shortened_l, shortened_r)  # Replace 
                  )     

tests = ( [ "kitten", "sitting" ], [ "Monday", "Friday" ], 
          [ "Sunday", "Saturday" ], [ "January", "February" ],
          [ "November", "December" ],
          [ "constitutionally", "anticonstitutional" ],
        )

for test in tests:
    print (test[0], test[1], edit_distance(test[0], test[1]  ))

This produces the following output:

$ time python3 edit-distance.py
kitten sitting 3
Monday Friday 3
Sunday Saturday 3
January February 4
November December 3
constitutionally anticonstitutional 6

real    0m0,114s
user    0m0,000s
sys     0m0,062s

Wrapping up

The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, January 31, 2021. And, please, also spread the word about the Perl Weekly Challenge if you can.