April 2019 Archives

Perl Weekly Challenge # 5: Anagrams

By laurent_r on April 28, 2019 12:54 PM

These are some answers to the Week 5 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Challenge #1: Anagrams of a Word

Write a program which prints out all anagrams for a given word. For more information about Anagram, please check this Wikipedia page.

I'm not sure I fully understand the question, but my understanding is that we are looking for existing words which are anagrams of each other. For example, "pots", "spot", "tops", and "post" are anagrams of each other because they have exactly the same letters rearranged in a different order.

Just as for last week second challenge, I will use a words.txt file containing 113,809 lower-case English words usually accepted for crossword puzzles and other word games. The words.txt file can be found on my Github repository. The original list was contributed to the public domain by Internet activist Grady Ward in the context of the Moby Project. This word list is also mirrored at Project Gutenberg.

For the purpose of testing the programs below, the words.txt file is located in my current directory.

Perl 5 Solution: the `is_anagram` Subroutine

We first create an is_anagram subroutine that takes two words as parameters and return a true value if the words are anagrams of each other and false otherwise.

use strict;
use warnings;
use feature 'say';
sub is_anagram {
    my ($word1, $word2) = @_;
    return 0 if length $word1 != length $word2;
    my $letters1 = join "", sort split "", $word1;
    my $letters2 = join "", sort split "", $word2;
    return $letters1 eq $letters2
}
for ( [qw / ban bane/], [qw /post post/], [qw / post spot /], [qw /post spot/], [qw / pots spot/], [qw /pots taps/] ) {
    say "$$_[0] $$_[1]:\t", is_anagram($$_[0], $$_[1]) ? "True" : "False";
}

The is-anagram subroutine first returns a false value if the input words don't have the same length. Words of different lengths cannot be anagrams and it is believed that checking the length of the input words is significantly faster than the subsequent processing of their individual letters. Therefore, it should speed up processing when we will use this subroutine repeatedly for a large number of word combinations. When words have the same length, then we just split the words into individual letters, sort the letters, and produce new "normalized" or "canonical form" strings with the input letters; if the new normalized strings are equal, then the words were anagrams of each other.

This will display the following output:

ban bane:       False
post post:      True
post spot:      True
post spot:      True
pots spot:      True
pots taps:      False

Of course, we consider here "post" to be an anagram of itself. We could make a different special rule when the words are the same, if needed.

Finding All Anagrams of a Word From a Word List

Another possible interpretation of the challenge is that we are given a word and a word list, and should output the words that are anagrams of the input word. We can use the same is_anagram subroutine as follows:

sub is_anagram { ... }
my $word = "post";
for (qw /past post spot tops taps pots top/) {
    say "$word $_\t", is_anagram($word, $_) ? "True" : "False";
}

We're now getting the following output:

post past       False
post post       True
post spot       True
post tops       True
post taps       False
post pots       True
post top        False

Let's see now whether we can find more anagrams in our word list (still using the same is_anagram subroutine as before):

sub is_anagram { ... }
my $word = "post";
while (<>) {
    chomp;
    say if is_anagram $word, $_;
}

Here, the words.txt file is passed as a parameter to the script.

And, yes, we find two more anagrams that we had not thought about before in our word list ("opts" and "stop"):

opts
post
pots
spot
stop
tops

Perl 6 Solution: the `is-anagram` Subroutine

Normalizing the Words

Even though it might end up being slightly less useful than in P5, let's start with a similar is-anagram subroutine:

sub is-anagram (Str $word1, Str $word2) {
    return False if $word1.chars != $word2.chars;
    return $word1.comb.sort eq $word2.comb.sort;
}
for <ban bane post stop pots stop post pots pots taps> -> $w1, $w2 {
    say "$w1 $w2:\t", is-anagram $w1, $w2;
}

This displays the following output:

$ perl6 is-anagram.p6
ban bane:       False
post stop:      True
pots stop:      True
post pots:      True
pots taps:      False

Note that this works correctly because the eq operator in the is-anagram subroutine coerces its arguments into strings. Also note that the code is significantly shorter and simpler in Perl 6; I can understand that some people don't like the changes introduced in P6 compared to P5 (in some cases, I initially had the same reaction), but this illustrates that there are good reasons for these syntax changes.

Using a Bag

Another approach is to use one of a variety of Perl 6 data structure types called Set, Bag and Mix. They are immutable ordered collections of unique and weighed items.

You might construct a Set as follows:

> my $s = set <banana apple orange orange banana pear apple>;
set(apple banana orange pear)

As you can see, duplicates have been removed. Sets only tell us whether at least one item of a given name has been encountered.

A bag, by contrast, also keeps track of how many of each item have been seen:

> my $b = bag <banana apple orange orange banana pear apple orange>;
bag(banana(2), orange(3), pear, apple(2))
> say $b{'banana'}
2

Mixes are similar to bags, except that the elements' weights don't have to be integers.

One interesting thing about these collections is that they can use many set operators commonly used in mathematics, such as the (elem) set membership operator or (<) subset operator (you can also use the Unicode math set theory symbols in the P6 code, but I'm not sure how they would render on this blog post):

> say "Found it!" if 'apple' (elem) $s; 
Found it!
> say "It is a subset" if qw/orange banana/ (<) $s
It is a subset
> say "Found several oranges" if  qw/orange orange/ (<) $b
Found several oranges

We can now try the following alternate subroutine using bags:

sub is-anagram (Str $word1, Str $word2) {
    return $word1.comb.Bag === $word2.comb.Bag;
}
for <ban bane post stop pots stop post pots pots taps> -> $w1, $w2 {
    say "$w1 $w2:\t", is-anagram $w1, $w2;
}

The === value identity operator used between two bags returns True it the bags are identical. This displays the same output as before.

Creating an Anagram Operator

Just a bit of fun: rather than creating an is-anagram subroutine, we could create the ana operator:

sub infix:<ana> (Str $word1, Str $word2) {
    return $word1.comb.Bag === $word2.comb.Bag;
}
for <ban bane post stop pots stop post pots pots taps> -> $w1, $w2 {
    say "$w1 $w2:\t", $w1 ana $w2;
}

This prints the same result. Note, however that this feels significantly slower. My guess is that it is mainly compilation time. Adding the following line at the end of the previous script:

say now - INIT now;

shows that the run time is less than 0.02 seconds:

perl6 is-anagram.p6
ban bane:       False
post stop:      True
pots stop:      True
post pots:      True
pots taps:      False
0.0156261

I might undertake a serious benchmark one day, but it really seems that the perceived response time really has to do with compile time.

Just as with P5, the possible understanding of the challenge is that we are given a word and a word list, and should output the words that are anagrams of the input word. We can use the same is-anagram subroutine as follows and use directly the words.txt file:

my $post-bag = "post".comb.Bag;
sub is-anagram (Str $word) {
    return $word.comb.Bag === $post-bag;
}
for "words.txt".IO.lines -> $line {
    say $line if is-anagram $line; 
}

This program displays the same anagrams as our equivalent P5 program:

perl6 is-anagram.p6
opts
post
pots
spot
stop
tops

I must admit that, at this point, the P6 version is running markedly slower than the P5 version. Having said that, I should add that I haven't upgraded my Rakudo version for more than a year, maybe it is better now. More on this later.

Challenge #2: Highest Number of Anagrams

Write a program to find the sequence of characters that has the most anagrams.

Again, I'm not completely sure about the requirement, but I'll assume that the point is to go through all items of a word list and find those that have the highest number of anagrams.

Perl 5 Solution

The one thing that we don't want to do here is to compare each word with each other, for example in two nested loops, as this would take ages with a large input list (we would need almost 13 billion comparisons with our word list). Rather, we want to read each word from the file only once and store these words in a hash, with the key being the canonical or normalized form of the word (say a string where the letters of the word have been sorted) and the value an array of the original words. In other words, we will use a hash of arrays (HoA). Once this data structure is populated, we can sort it by the number of items in the array or, probably better, roll out some other way to pick up the entry with the largest number of items.

Since we know we have at least one case with 6 anagrams (post, spot, etc.), we can skip input words with less than three letters. Similarly, in the for loop to find the highest counts of anagrams, we can initialize the $max variable at 5 to skip any entry with a lower count.

use strict;
use warnings;
use feature 'say';

my %words;   # our HoA
my $file_in = "words.txt";
open my $IN, "<", $file_in or die "Ouverture impossible $file_in $!";
while (my $word = <$IN>) {
    next unless $word =~ /\w/;  # skipping empty lines if any
    $word =~ s/\s+$//;          # removing trailing spaces, new lines and carriage returns (if any)
    next if length $word < 3; 
    my $key = join '', sort split //, $word;  # normalizing the word for the hash key
    push @{$words{$key}}, $word;              # storing the word in the HoA
}
close $IN;
my @max_anagrams;
my $max = 5;
for my $key (keys %words) {
    next if @{$words{$key}} < $max;
    if (@{$words{$key}} == $max) {
        push @max_anagrams, $key;
    } else {
        @max_anagrams = ($key); 
        $max = scalar @{$words{$key}};
    }
}
say "$_:\t @{$words{$_}}" for (@max_anagrams);

The script finds two groups of eleven anagrams and runs in less than one second:

$ time perl anagrams.pl
aelrst:  alerts alters artels estral laster ratels salter slater staler stelar talers
aeprs:   apers asper pares parse pears prase presa rapes reaps spare spear

real    0m0.957s
user    0m0.781s
sys     0m0.077s

I must admit that there are in these two lists quite a few words which I had never seen before.

Perl 6 Solution

I wish I could suggest a markedly different solution in Perl 6, for example using bags or other such structures as we did before, but storing the words in a hash as in Perl 5 really seems to be the best and simplest solution. I will still try to depart from the P5 version and use idiomatic P6:

my %words;
for "words.txt".IO.lines -> $line { 
    next unless $line ~~ /\w/;  # skipping empty lines if any
    $line ~~ s/\s+$//;          # removing trailing spaces if any
    next if $line.chars < 3;
    my $key = $line.comb.sort.join('');
    push %words{$key}, $line;
}
my @max-anagrams;
my $max = 5;
for %words.keys -> $key {
    given %words{$key}.elems {
        when $_ < $max  { next }
        when $_ == $max { @max-anagrams.push($key) }
        default         { @max-anagrams = $key,; $max = $_}
    }
}
say "$_:\t %words{$_}" for (@max-anagrams);

The script finds the same groups of eleven anagrams as the P5 script:

aelrst:  alerts alters artels estral laster ratels salter slater staler stelar talers
aeprs:   apers asper pares parse pears prase presa rapes reaps spare spear

The P6 script is significantly more concise (by about one third), in part because it avoids some boiler plate code, and in part because the P6 syntax is more expressive. The P6 version is also, in my humble opinion, easier to read, in the sense that it avoids the somewhat obscure syntax of P5 references.

When I first wrote the above code yesterday, I had sort of forgotten that there is a max builtin routine that is designed to return the maximum value from a list. We could remove the second for loop and simplify the code above quite significantly:

my %words;  
for "words.txt".IO.lines -> $line { 
    next unless $line ~~ /\w/;  # skipping empty lines if any
    $line ~~ s/\s+$//;          # removing trailing spaces if any
    next if $line.chars < 3;
    my $key = $line.comb.sort.join('');
    push %words{$key}, $line;
}
my $max = max map { .elems }, values %words; 
say "$_:\t %words{$_}" if %words{$_}.elems == $max for %words.keys;

Finding the largest count of anagrams (the $max variable defined in the penultimate code line above) could also be written with a method invocation syntax and the use of the unary >> hyperoperator as:

my $max = %words.values>>.elems.max;

Either way, the P6 script is now far more concise and, with just 10 code lines, about 2.7 times shorter than the P5 code. Note, however, that the code of this new version might be slightly less efficient, because we are now traversing the %words hash twice instead of only once. As we will see below, this is somewhat immaterial because performance bottlenecks are elsewhere (namely, in the first for loop of our original version).

Performance Issue with Perl 6

The P6 script above is more concise than its P5 counterpart, but it runs in about 19.3 seconds, which is about 20 times slower than the P5 equivalent. Let's face it: that's a real problem when dealing with a relatively large amount of data (although a file of 113-k words really isn't that large). In my daily work activity with Perl 5, I'm processing regularly files that have several GB or even sometimes tens of GB of data; I simply couldn't afford to do it with P6 at this point.

I tried to investigate a bit the reasons for such slow processing. It appears that the program is spending more than 97.5% of the time in the first for loop (of the original version) which reads the words.txt file and populates the %words hash (18.8 sec. out of a total of 19.3 sec.). Commenting out the two code lines with regexes (which are not really needed with my clean input file) brings the time spent for the first for loop to 13.0 sec. So, quite obviously, the regex engine's speed (or, rather, lack thereof) is a significant part of the problem, but there is more.

What else in that for loop is so slow? I suspected it might use the regex engine in the IO.lines method that provides individual lines to the program. But, further tests showed that this is not the case. The real slow lines in that loop are those two:

my $key = $line.comb.sort.join('');   #  6.5 seconds
push %words{$key}, $line;             #  6.3 seconds

Then, it occurred to me that (as mentioned earlier) I have a rather old version of Perl 6/Rakudo on the box where I ran these tests:

perl6 -v
This is Rakudo Star version 2017.10 built on MoarVM version 2017.10
implementing Perl 6.c.

Maybe a newer version of Rakudo will bring better performance, as a lot of performance enhancement work has been done recently. I tried to download a more recent version, but, unfortunately, the Rakudo download site is temporarily unavailable these days. So I decided at least to try on another PC where I have a slightly more recent version (2018.04.1), although this other PC is, I believe, less powerful and slower. The performance seems to be significantly better:

Full for loop, including regexes: 13.9 seconds;
Full for loop, without the regexes: 9.6 seconds;
Full for loop, without the regexes and without the push: 5.3 seconds.

That's about 30% faster, so it's better, but we are still very far behind Perl 5. I wish I could test with a much more recent version, but that's not possible for the time being. I might come back to this subject when the download server becomes available again.

Wrapping up

The next week Perl Weekly Challenge is due to start very soon. If you're interested in participating in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 6 p.m. BST (British summer time) on next Sunday, May 5. And, please, also spread the word about the Perl Weekly Challenge if you can.

1 comment

Perl Weekly Challenge: Week 4

By laurent_r on April 21, 2019 5:28 PM

These are some answers to the Week 4 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

This post will be relatively short, because I don't have too much time this week.

Challenge #1: Pi Digits

Write a script to output the same number of PI digits as the size of your script. Say, if your script size is 10, it should print 3.141592653.

A Perl 5 Solution

Here, I decided to make it really short. This a Perl 5 one-liner:

perl -E '
say 4*atan2 1,1
'
3.14159265358979

I consider that the script is what is between the single quotes, so that's 15 characters (not counting the newlines, because they are only there to show that the code has the same number of characters as the number of digits of the output).

Some might argue that this is cheating a bit but, hey, that satisfies the request. I'll give a more extensive script in the Perl 6 section, and I think it could easily be adapted to P5, using for example the https://metacpan.org/pod/bigrat module.

A Perl 6 Solution

Of course, we can do more or less the same in Perl 6, with just some variations for fun:

perl6 -e '
print pi, "\n";
'
3.14159265358979

But let's try to be more serious and adapt the size of our output to the size of the script, which means to actually calculate some digits of pi.

I've tried several formulae known for centuries (François Viète, John Wallis, Isaac Newton, Gottfried Wilhhem Leibniz, etc.), but will show only one of them to illustrate the problem.

John Wallis's infinite product can be reformated as:

pi / 2 = (2 * 2 * 4 * 4 * 6 * 6 * 8 * 8 ... ) / (1 * 3 * 3 * 5 * 5 * 7 * 7 * 9 ... )

We can build two lazy infinite lists, one for the numerator and one for the denominator and use the reduction operator to calculate Wallis's infinite product:

my @numerators =  2, 2, -> $a, $b {| ($a + 2, $b + 2) } ... *;
my @denominators = 1, 3, 3, -> $a, $b {| ($a + 2, $b + 2) } ... *;
my $pi = 2 * ([*] @numerators[0..2000]) / ([*] @denominators[0..2000]);

The value obtained for $pi is:

3.14237736509388

Only the first three digits are correct with a product of two thousands terms! Quite obviously, those century-old formulas converge much too slowly for our purpose. We need something more efficient.

The Indian mathematician Srinivasa Ramanujan is known as the author of a number of innovative new formulae for calculating digits of pi during the first decades of the twentieth century, but the one I looked at is not so easy to implement, notably because it involves the square root of two, so we would need to start by calculating the digits of that number.

In 2006, Franco-Canadian mathematician Simon Plouffe used the so-called PSLQ integer relation algorithm to generate several new formulae for pi. One of them was described back in 1995 and is known as one of the spigot algorithms.

We can write the following plouffe subroutine:

sub plouffe (Int $k) {
    my $result = (1 / 16 ** $k) * (  (4 / (8 * $k + 1)) - (2 / (8 * $k + 4)) - (1 / (8 * $k + 5)) - (1 / (8 * $k + 6) )  );
}

to calculate the individual terms of the infinite sum and then compute pi as follows:

my $pi = [\+]  (plouffe $_ for 0..20);

That does not work properly, however, as the plouffe terms get converted from rationals to floats (well, really from Rat to Num) when the input value reaches 11 or more.

> say (plouffe $_).WHAT for 0..15;
(Rat)
(Rat)
(Rat)
(Rat)
(Rat)
(Rat)
(Rat)
(Rat)
(Rat)
(Rat)
(Rat)
(Num)
(Num)
(Num)
(Num)
(Num)

so that we are losing accuracy and the result:

3.141592653589793129614170564041344859

is correct only up to the 16th digit.

So, let's try to use the FatRat type:

sub plouffe (Int $k) {
    my $result = 1.FatRat *  (1 / 16 ** $k) * (  (4 / (8 * $k + 1)) - (2 / (8 * $k + 4)) - (1 / (8 * $k + 5)) - (1 / (8 * $k + 6) )  );
}

It is a bit better, but we are again falling back to Num when the subroutine input value reaches 17 or above:

> say (plouffe $_).WHAT for 0..20;
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(FatRat)
(Num)
(Num)
(Num)
(Num)
(Num)

For some reason, coercing the input value to a FatRat:

sub plouffe (FatRat $k) {
    my $result =  (1 / 16 ** $k) * (  (4 / (8 * $k + 1)) - (2 / (8 * $k + 4)) - (1 / (8 * $k + 5)) - (1 / (8 * $k + 6) )  );
}
say (plouffe $_.FatRat).WHAT for 0..20;

doesn't work either.

While still trying to understand why we are falling from FatRat to Num, I posted a question on perl6-users, and I quickly received very useful tips from Fernando Santagata and Brian Duggan. Many thanks to them.

The following syntax suggested by Brian works properly:

sub plouffe (Int $k) {
    my $result = (1.FatRat / 16 ** $k) * (  (4 / (8 * $k + 1)) - (2 / (8 * $k + 4)) - (1 / (8 * $k + 5)) - (1 / (8 * $k + 6) )  );
}
# printing 200 digits of pi
my $pi = [+] (plouffe $_ for  0..200);

Now, $pi is populated with about 750 digits, two thirds of which are wrong, but the important point is that they are all correct up to the 249th digit. In general, we get an average of about 1.2 correct digits per term in the sum. So, with the above input values, we're on the safe side of things if we print out the first two hundred digits:

> say substr $pi, 0, 201;
3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489549303819

It is now fairly easy to output the same number of PI digits as the size of the script:

sub plouffe (Int $k) {
    my $result = (1.FatRat / 16 ** $k) * (  (4 / (8 * $k + 1)) - (2 / (8 * $k + 4)) - (1 / (8 * $k + 5)) - (1 / (8 * $k + 6) )  );
}
my $script-size = $*PROGRAM-NAME.IO.s;
my $pi = [+] (plouffe $_ for  0..$script-size);
say substr $pi, 0, $script-size + 1;

The script has 290 bytes and the script prints the first 290 digits of pi:

$ perl6 pi.p6
3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489549303819644288109756659334461284756482337867831652712019091456485669234603486104543266482133936072

The script runs in slightly less than 1.5 sec.

Challenge #2: Letters and Word List

You are given a file containing a list of words (case insensitive 1 word per line) and a list of letters. Print each word from the file than can be made using only letters from the list. You can use each letter only once (though there can be duplicates and you can use each of them once), you don’t have to use all the letters.

A Perl 5 One-Liner

I was running out of time and also a bit too lazy to include the list of letters in the input file. Therefore, the list of letters is hard-coded in the one-liner below. Otherwise, I have in the current directory a words.txt file containing 113,809 lower-case words usually accepted for word games, crossword puzzles, and so on. The words.txt file can be found here.

The basic idea is to sort the letter list and build a string with that, and then to sort the letters of each word, and finally to check whether the string containing the sorted letters of the word can be found in the string of the sorted input letters.

This is the one-liner:

perl -E 'my $letters = join "", sort qw/a e i t r s o u a /; while (<>) { chomp; my $norm = join "", sort split //, $_; say $_ if index($letters, $norm) >=0;}' words.txt
aa
ae
ariose
or
ors
orts
osier
outraise
riots
rots
roust
routs
rust
ruts
sautoire
sori
sort
sortie
stour
stourie
suitor
tiros
tories
tors
torsi
torus
tours
trios
triose
trois
ut
uts

This is the same script reformatted to make it slightly clearer:

perl -E 'my $letters = join "", sort qw/a e i t r s o u a /; 
    while (<>) { 
        chomp; 
        my $norm = join "", sort split //, $_;
        say $_ if index($letters, $norm) >=0;
    } ' words.txt

A Perl 6 Solution

We can use a one-liner similar to the P5 solution:

perl6 -e 'my $letters = join "", sort qw/a e i t r s o u a /; for "words.txt".IO.lines -> $line { my $norm = $line.comb.sort.join(""); say $line if defined index($letters, $norm);}'

The output is the same as for the P5 one-liner.

The following script fully complies with the requirement. I have created a new words2.txt file where the first line is a list of letters: aeiortgsdf (and the rest of the file the same list of words). The script looks like this:

my @lines = 'words2.txt'.IO.lines;
my $first-line = shift @lines;
my $letters = $first-line.comb.sort.join("");
for @lines -> $line { 
    my $norm = $line.comb.sort.join(""); 
    say $line if defined index($letters, $norm);
}

And this is the output:

$ perl6 words.p6
ad
da
de
deaf
ef
fade
fadge
fed
fidge
fig
firedog
firedogs
fogie
giro
giros
or
ors
orts
riots
rots
sori
sort
tiros
tors
torsi
trigos
trios
trois

Wrapping up

The next week Perl Weekly Challenge is due to start very soon. If you're interested in participating in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 6 p.m. BST (British summer time) on next Sunday, April, 28. And, please, also spread the word about the Perl Weekly Challenge if you can.

1 comment

Perl Weekly Challenge: Week 3

By laurent_r on April 14, 2019 5:39 PM

These are some answers to the Week 3 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Challenge #1: 5-Smooth Numbers

Create a script to generate 5-smooth numbers, whose prime divisors are less or equal to 5. They are also called Hamming/Regular/Ugly numbers. For more information, please check this wikipedia page.

Regular or 5-smooth numbers (or Hamming numbers) are numbers whose prime divisors are only 2, 3, and 5, so that they evenly divide some powers of 30.

A Perl 5 Solution

Generating just some 5-smooth numbers is a trivial problem. For example, if you want 6 such numbers, you only need to generate the first six powers of 2 (or the first six powers of 3, or six powers of 5), as in this Perl one-liner:

$ perl -E 'say 2 ** $_ for 1..6;'
2
4
8
16
32
64

This is really too simple, so my guess is that, perhaps, what is wanted is maybe something like: generate a sequence of all 5-smooth numbers smaller than a given upper bound (say 100). Such a sequence is sometimes called a Hamming sequence. Or maybe that's not really the requirement, but let's do it for the fun of it.

We could do it with a brute-force approach: check all integers between 1 and 100, perform a prime factor decomposition of each of them and check whether any of the prime factors is larger than 5. This would be rather inefficient, though, with a lot of useless computations. An alternative would be to generate a list of primes between 1 and the upper bound and to check for each number in the range whether it can be evenly divided by any of the primes larger than 5. In either case, we need to build a list of prime numbers.

Building a List of Prime Numbers

There are several fast CPAN modules for prime numbers calculation (e.g. https://metacpan.org/pod/Math::Prime::Util1, https://metacpan.org/pod/Math::Prime::XS, https://metacpan.org/pod/Math::Prime::FastSieve, etc.), but this being a coding challenge, using other libraries might be frowned upon. Let's build such a list of primes between 1 and 100 in pure Perl.

#!/usr/bin/perl
use strict;
use warnings;
use feature "say";
use constant largest_num => 100;

sub find_primes {
    my $num = 5;
    my @primes = (2, 3, 5);
    while (1) {
        $num += 2;     # check only odd numbers
        last if $num > largest_num;
        my $limit = int $num ** 0.5;
        my $num_is_prime = 1;
        for my $prime (@primes) {
            last if $prime > $limit;
            if ($num % $prime == 0) {
                # $num evenly divided by $prime, $num is not prime and exit the for loop
                $num_is_prime = 0;
                last;
            }
        }
        push @primes, $num if $num_is_prime; #  Found a new prime, add it to the array of primes
    }
    return @primes;
}
my @prime_numbers = find_primes;  
print "@prime_numbers \n";

This display the following list:

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

If we need a larger list of prime numbers, we can just change the value of the largest_num constant.

The find_primes subroutine uses the basic naive algorithm, with just a few performance improvements. The most basic algorithm to list prime numbers is to go through the sequence of integers (the "prime candidates") and, for each of them, try to divide it by all integers smaller than it, from the smallest (2) to the largest. The search can be stopped as soon as we find a number that evenly divides the number being checked for primality. These are some possible performance improvements:

The first improvement is to verify primality only for odd numbers, because even numbers can be divided evenly by 2 and are therefore not prime (with the exception of 2 itself, which is even and prime and may therefore need to be treated as a special case).
The second improvement is that we can also use odd numbers for the divisors to be tried (since we are checking only odd numbers, a divisor cannot be even). These two changes reduce by a factor of close to 4 the number of even divisions to be performed (at least for a number that turns out to be prime).
A further improvement is that we can check divisors up to the square root of the prime candidate, because, since we have checked all smaller possible divisors.
Finally, the last improvement implemented above is that we construct the list of prime numbers as we go (in the @primes array). Rather than trying to evenly divide each prime candidates by each odd number below the limit (square root of the candidate), we only try even division by the prime numbers found this far (in the @primes array).

We can't do much more to improve the algorithm (well, it is possible to find some additional refinements, but the new performance gains will be quite small and relatively insignificant compared to those already achieved). It is possible however to use totally different algorithms that will be far more efficient, especially for large prime candidates, such as the Miller-Rabin algorithm that will be mentioned below in the Perl 6 section.

Building the Hamming Sequence

Now if is easy to go through all the integers between 1 and 100 and find out if they can be divided evenly by any of the primes larger than 5:

my @prime_numbers = grep $_ > 5, find_primes;
my @regulars;
for my $num (1 .. 100) {
    my $is_regular = 1;
    for my $prime (@prime_numbers) {
        last if $prime > $num;
        if ( $num % $prime == 0) {
            $is_regular = 0;
            last;
        }
    }
    push @regulars, $num if $is_regular;    
}
print "@regulars \n";

Thus the Hamming sequence starts as follows:

1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 40 45 48 50 54 60 64 72 75 80 81 90 96 100

Well, after all, it seems that all this may be a little be too complicated for the beginner's challenge.

Another Approach: Producing Directly the Products of the Powers of 2, 3, and 5

Let's consider another approach: can we construct directly a list of numbers less than 100 that are products of powers of 2, 3, and 5?

The largest power of 2 will be 6 (2 ** 7 is 128), the largest power of 3 will be 4 and the largest power of 5 will be 2. We will get some numbers larger than 100, but we can filter them out afterwards. This might give something like this:

use strict; 
use warnings;
use constant limit => 100;
my @raw_hamming;
for my $pow2 (0..6) {
    for my $pow3 (0..4) {
        for my $pow5 (0..2) {
            push @raw_hamming, 2 ** $pow2 * 3 ** $pow3 * 5 ** $pow5;
        }
    }
}
my @hamming_sequence = sort { $a <=> $b } grep $_ <= limit, @raw_hamming;
print "@hamming_sequence \n";

And we get the same Hamming sequence as before:

1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 40 45 48 50 54 60 64 72 75 80 81 90 96 100

In the code above, I've hard-coded the ranges for simplicity and convenience, but the range upper bounds can be computed as follows: for the powers of 2, the range upper bound is the binary logarithm of the limit. Similarly, for the powers of 3, the upper range is the logarithm to base 3 of the limit, and likewise for 5. And since Perl 5 only has natural logarithms, remember, for example, that the binary logarithm of 1000 is equal to log 1000 / log 2 and that, similarly, the logarithm to base 3 of 1000 if log 1000 / log 3. The next code snippet shows the full calculation.

This works properly and the code is much simpler than before and probably significantly more efficient, but we're still doing quite a lot of useless calculations. And this does not scale very well, because the number of useless calculations presumably increases faster than the number of useful calculations when the upper limit grows higher. Let's try to cut down useless calculations. We can stop the various for loops as soon as an intermediary or final result becomes too large. This will also make it possible to remove hard-coded limits.

use strict;
use warnings;
use constant limit => 100;
my $log_limit = log limit;
my @unsorted_hamming;
my ($max2, $max3, $max5) = (int $log_limit/log 2, int $log_limit/log 3, int $log_limit/log 5);
for my $pow2 (0..$max2) {
    my $result_2 = 2 ** $pow2;
    last if $result_2 > limit;
    for my $pow3 (0..$max3) {
        my $result_2_3 = $result_2 * 3 ** $pow3;
        last if $result_2_3 > limit;
        for my $pow5 (0..$max5) {
            my $result_2_3_5 = $result_2_3 * 5 ** $pow5;
            last if $result_2_3_5 > limit;
            push @unsorted_hamming, $result_2_3_5;
        }
    }
}
my @hamming_sequence = sort { $a <=> $b } @unsorted_hamming;
print "@hamming_sequence \n";

And we get the same Hamming sequence as before:

1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 40 45 48 50 54 60 64 72 75 80 81 90 96 100

The code is now a bit more complicated, but it does no longer perform any useless calculation and is now very efficient. Changing the limit constant to 10 million, I computed all the 5-smooth numbers below 10 million (there are 768 of them) in about .2 second on my 8-year old laptop:

$ perl hamming.pl
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 40 45 48 50 54 60 64 72 
(...)
9447840 9565938 9600000 9720000 9765625 9830400 9841500 9953280 10000000

real    0m0.215s
user    0m0.000s
sys     0m0.077s

I was disappointed, however, that the optimized version does not run faster than the initial one, despite that fact that it supposedly does less work. I guess the lesson is clear: don't try to (micro-) optimize something that runs fast enough anyway.

Challenge #1 in Perl6

If we want to use the first approach above, Perl 6 has some features that are worth mentioning.

First, there is a built-in is-prime subroutine, which implements the very fast Miller-Rabin algorithm for figuring out whether an integer is prime. The is-prime subroutine returns False if this integer is not a prime, and it returns True if the integer is a known prime or if it is likely to be a prime based on the probabilistic Miller-Rabin test. In other words, the Miller-Rabin test is probabilistic and it is possible (though very unlikely) that is-prime will return True for a number that is not prime. In fact, the probability of occurrence of such an event is so low that it is said to be much less likely to happen than having a cosmic ray hitting your CPU at the wrong moment and disrupting its function to the point of giving you the wrong answer.

So building the list of primes between 1 and 100 is just one line of code, shown here in the REPL:

> my @primes = grep {.is-prime}, 1..100
[2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97]

We don't even need to specify how many primes we want in our list: we can build a lazy infinite list of prime numbers:

> my $primes := grep {.is-prime}, $list;
(...)
> say $primes[4];   # Fifth prime number
11
> say $primes[999]; # Thousandth prime number
7919

Here, $primes is an infinite list of prime numbers. Quite obviously, the computer did not calculate and populate an infinite list of primes. It is a lazy list, which means that the program now knows how to calculate any element of the list, but it will actually do so only when required. This is great because we don't need to know in advance how many primes we really need: we just prepare a lazy infinite list, and the program will compute only the primes that are actually needed by the program.

Building the Hamming Sequence

We can translate in Perl 6 our original Perl 5 program to display a Hamming sequence:

my @prime_numbers = grep {.is-prime}, 5^..Inf;    # we need only primes strictly larger than 5
my @regulars;
for (1 .. 100) -> $num {
    my $is_regular = True;
    for @prime_numbers -> $prime {
        last if $prime > $num;
        if ( $num %% $prime ) {
            $is_regular = False;
            last;
        }
    }
    push @regulars, $num if $is_regular;    
}
say @regulars;

This will print the same Hamming sequence as before:

[1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 40 45 48 50 54 60 64 72 75 80 81 90 96 100]

Producing Directly the Products of the Powers of 2, 3, and 5

Rather than translating into Perl 6 our Perl 5 script, we will use the Perl 6 cross (X) metaoperator to generate all the products. I knew intuitively that this could certainly be done, but I must admit it took a little bit of thinking to figure out a good way to implement it.

The cross operator operates on two or more lists and generates a Cartesian product of all elements. Here is an example in the REPL:

> say <a b c> X <1 2 3>;
((a 1) (a 2) (a 3) (b 1) (b 2) (b 3) (c 1) (c 2) (c 3))

Used as a metaoperator, X will apply the associated operator to all the generated tuples. For, example, we can use it with the concatenation operator (X~) to generate strings from the tuples:

> say <a b c> X~ <1 2> X~ <y z>;
(a1y a1z a2y a2z b1y b1z b2y b2z c1y c1z c2y c2z)

We can use the cross metaoperator together with the multiplication operator (X*) to generate the products of the various powers of 2, 3 and 5:

my %powers;
for 2, 3, 5 -> $n {%powers{$n} = (1, $n, $n**2 ... *);} 
my @hamming_sequence = sort grep { $_ <= 100}, 
    (%powers{2}[0..6] X* %powers{3}[0..4] X* %powers{5}[0..2]);

First, we use the sequence (...) operator to generate infinite lists of the powers of 2, 3 and 5 (stored in the %powers hash), and, then, we use X* to generate all the products, and finally apply a grep to keep only the 5-smooth numbers smaller than 100 and sort the result.

We obtain the same Hamming sequence as before.

Challenge #2: Pascal's Triangle

Create a script that generates Pascal Triangle. Accept number of rows from the command line. The Pascal Triangle should have at least 3 rows. For more information about Pascal Triangle, check this wikipedia page.

The most typical way to construct Pascal's triangle is to deduct one line from the previous one. For example, the 3rd line is 1 2 1. We mentally add a 0 before and after the list and construct the 4th line by adding two by two the numbers of the third line:

0 1 2 1 0
 1 3 3 1

Next, we do it again with the fourth line just produced:

0 1 3 3 1 0
 1 4 6 4 1

You don't really need to mentally add the leading and trailing zeroes: you can just say that the coefficients are either the sum of the two coefficients above, and 1 when there is only one coefficient above. But adding a 0 before and after the list will be useful to avoid Use of uninitialized value warnings in some of the solutions below (although it could be done in several other ways).

Perl 5

Two Nested Loops

We can use simply two nested for loops like this:

use strict;
use warnings;
use feature "say";
my $nb_rows = shift;
my @row = (1);
for (1 .. $nb_rows) {
    say "@row";
    my @temp = (0, @row, 0);
    @row = ();
    for my $index (0 .. $#temp -1) {
        push @row, $temp[$index] + $temp[$index + 1];
    }
}

With an input parameter of 10 rows, this displays the following:

$ perl pasc.pl 10
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1

This is not very complicated, but, as it is often the case with such loops, it is a little bit difficult to get the management of indices right. Also, I frankly don't like too much having to use a @temp array.

A Recursive Approach

Since each line is derived from the previous one, it seems it might be interesting to try a recursive approach:

use strict;
use warnings;
use feature "say";
sub pascal {
    my @row;
    my $line_count = shift;
    return unless $line_count;
    for my $index (0 .. $#_ - 1) {
        push @row, $_[$index] + $_[$index + 1];
    }
    say "@row";
    pascal ($line_count - 1, 0, @row, 0);
}
my $nb_rows = shift;
my @line = (1);
say "@line";
pascal ($nb_rows - 1, 0, @line, 0);

This works fine and produces the same output as before, but this is not really better.

Functional Programming Approach

Let's try to see whether we can do better with a functional programming oriented approach.

use strict;
use warnings;
use feature "say";
my $nb_rows = shift;
my @line = (1);
for my $row (1 .. $nb_rows) {
    say "@line";
    @line = (1, (map $line[$_] + $line[$_ + 1], 0 .. $row - 2), 1);
}

The code is now much shorter and, in my humble opinion, also clearer.

Perl 6

We can first translate into Perl 6 the last P5 solution (functional programming):

sub pascal ($nb-rows) {
    my @line = 1,;
    for 1 .. $nb-rows -> $row {
        @line.join(" ").say;
        @line = flat 1, (map {@line[$_] + @line[$_ + 1]}, 0 .. $row - 2), 1;
    }
}
sub MAIN (Int $rows where * > 0) {
    pascal $rows
}

Aside from a few small syntactic changes, the only significant difference with P5 is that we've used the MAIN subroutine signature to validate the input parameter, which obviously has to be a positive integer.

Lets try a recursive version:

sub pascal ($nb-rows, @line is copy) {
    return unless $nb-rows;
    @line.join(" ").say;
    @line = flat 1, (map {@line[$_] + @line[$_ + 1]}, 0 .. @line.elems - 2), 1;
    pascal $nb-rows - 1, @line;
}
sub MAIN (Int $rows where * > 0) {
    pascal $rows, (1,);
}

This recursive version is slightly more concise (one line less), but the gain is small.

Using the sequence operator

Since each line is generated from the previous one, we can use the ... sequence operator with a generator to produce a list of lines of Pascal's triangle:

sub pascal ($nb-rows) {
    my @lines = 1, -> $line { 
        flat 1, (map {$line[$_] + $line[$_ + 1]}, 0 .. $line.elems - 2), 1 
    } ... +$nb-rows;  
}
sub MAIN (Int $rows where * > 0) {
    my @triangle = pascal $rows;
    .join(" ").say for @triangle;
}

Note that we need to numify $nb-rows using the + prefix operator for the sequence to work properly (in some cases, the sequence operator doesn't seem to work correctly with an IntStr, which I think is a Rakudo bug).

Using the zip operator

We can do slightly better, however, using the Z zip metaoperator operator. To start exploring this, let's first consider any line of Pascal's triangle (for example 1 2 1) and try to build the next one. For now, we do it in the REPL:

> @a = 1, 2, 1;
[1 2 1]
> say (0, @a).flat Z (@a, 0).flat;
((0 1) (1 2) (2 1) (1 0))
> say 0, |@a Z |@a, 0;
((0 1) (1 2) (2 1) (1 0))
> @a = 0, |@a Z+ |@a, 0;
[1 3 3 1]
> @a = 0, |@a Z+ |@a, 0;
[1 4 6 4 1]

As you can see, combining [ 0, 1, 2, 1] and [ 1, 2, 1, 0] with the zip operator produces 4 pairs (((0 1) (1 2) (2 1) (1 0))) whose sums are the coefficients of the next line of Pascal's triangle. Note that we need to flatten the operands of the zip operator with the flat method invocation or with the | operator. When using the zip metaoperator with the addition operator we obtain the next line ([1 3 3 1]) of Pascal's triangle. And since we are storing the result in @a, doing it again will produce the following line, and so on.

We can rewrite the resursive version as follows:

sub pascal ($nb-rows, @line is copy) {
    return unless $nb-rows;
    @line.join(" ").say;
    @line =  0, |@line Z+ |@line, 0;
    pascal $nb-rows - 1, @line;
}
sub MAIN (Int $rows where * > 0) {
    pascal $rows, (1,);
}

Now, combining this with the previous improvement, we can use the ... sequence operator to produce a list of lines of Pascal's triangle.

> for 1, -> $line { [0, |$line Z+ |$line, 0] } ... 10 { .join(" ").say };
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1

This is the new version (no longer recursive) of the pascal subroutine:

sub pascal ($nb-rows) {
    for 1, -> $line { [0, |$line Z+ |$line, 0] } ... $nb-rows { 
        .join(" ").say 
    }
}
sub MAIN (Int $rows where * > 0) {
    pascal +$rows;
}

The pascal subroutine is now so short that we no longer need to store its code in a separate subroutine.

sub MAIN (Int $rows where * > 0) {
    .join(" ").say for 1, -> $line { [0, |$line Z+ |$line, 0] } ... +$rows;
}

Finally, we could use an infinite (lazy) sequence and print the range that we need:

sub MAIN (Int $rows where * > 0) {
    my @lines = 1, -> $line { [0, |$line Z+ |$line, 0] } ... *;
    .join(" ").say for @lines[^$rows]
}

Wrapping up

The next week Perl Weekly Challenge is due to start very soon. If you're interested in participating in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 6 p.m. BST (British summer time) on next Sunday, April, 21. And, please, also spread the word about the Perl Weekly Challenge if you can.

1 comment

Perl Weekly Challenge: Week 2

By laurent_r on April 7, 2019 5:44 PM

These are some answers to the Week 2 of the Perl Weekly Challenge organized by the hugely prolific CPAN author (and, besides, very nice chap) Mohammad S. Anwar.

Challenge #1: Removing Leading 0's

Write a script or one-liner to remove leading zeros from positive numbers.

A Perl 5 One-Liner

This uses a simple regular expression to remove zeros from the start of the string. Here, we're just using bash for piping four lines with input numbers to a very simple 7-character Perl 5 one-liner.

$ echo '0456
> 0007865
> 8976
> 0000123456' | perl -pe 's/^0+//'
456
7865
8976
123456

Nothing complicated. The s/^0+// regular expression substitution looks for one or more zeroes at the beginning of the string and replaces them with nothing.

Perl 6 Solution under the REPL

I could basically use the same one-liner in Perl 6, but decided instead to simply use the prefix + operator to numify the strings containing the numbers into actual numbers, thereby removing leading 0's (the number is in fact converted back to a string by the say built-in, but that's immaterial). So the P6 solution under the REPL (Read Evaluate Print Loop) is dead simple:

> say +$_ for  qw /0456 0007865 8976 0000123456/;
456
7865
8976
123456

Challenge #2: Convert Numbers into Base-35 Representations

Write a script that can convert integers to and from a base35 representation, using the characters 0-9 and A-Y.

A Perl 5 Conversion Subroutine

To perform such a conversion, we need to apply a series of integer division (also known as Euclidean division) and modulo.

Take any number, say 342 . What does the number 342 mean? Well in our number representation system, it is 3 * 100 + 4 * 10 + 2 * 1. Suppose we want to convert 342 into a base 10 representation, which is of course a trivial problem since 342 is already represented in base 10. We start by dividing 342 by the target base, 10, and we get a result of 34 and a remainder of 2. Next, we divide the previous result 34 by 10, and get 3 and a remainder of 4. Finally, we divide the previous result 3 by 10 and get a result of 0 and a remainder of 2. Notice that the series of remainders is 2, 4 and 3; when flipped around, this series of remainders gives 342, i.e. the original number converted to base 10.

As a further simple example, suppose we want to convert 11 into a binary representation. We start by dividing 11 by 2 (with integer division): the result is 5 and the remainder 1. Next, we divide 5 by 2; the result is 2 and the remainder 1. Next, we divide 2 by 2, the result is 1 and the remainder 0. Finally, we divide 1 by 2, the result is now 0 and the remainder 1. Since we've reached 0, the process is now completed. We now know that :

11 = (1 * 2 ** 3 ) +  (0 * 2 ** 2) + (1 * 2 ** 1 ) +  (1 * 2 ** 0 ).

If we flip around the series of remainders (1, 1, 0, and 1), we get 1011, which is the binary representation of 11.

Hopefully, these examples make the algorithm of the convert_base subroutine below clear.

The Perl 5 convert_base subroutine below should be able to convert positive integers to any base between 2 and 36:

use strict;
use warnings;
use feature "say";
use constant lookup => ('0'..'9','A'..'Z');

sub convert_base {
    my ($num, $base) = @_;
    my $result = "";
    do {
        $result .= (lookup)[$num % $base];
        $num = int ($num/$base);
    } while $num > 0;
    $result = reverse $result;
}
for my $number (0..45, qw/1757 533 658467/) {
    say "$number\t:\t", convert_base $number, 35;
}

Note that I'm using here a do {...} while loop, a syntactic construct that I rather rarely use. My initial coding attempt had a simple while loop, but that meant that the number 0 would not be properly converted (I would get an empty string). I felt that using a do ... while loop to fix the problem was somewhat nicer than making 0 a special case (for example simply returning 0 if the first input parameter is 0).

Of course, in real production code, we would need to check the validity of the input parameters (arity, type and range) of the convert_base subroutine. That's left as an exercise for the reader.

Result:

$ perl  convert.pl
0       :       0
1       :       1
2       :       2
3       :       3
    ... some output lines omitted for brevity ...
28      :       S
29      :       T
30      :       U
31      :       V
32      :       W
33      :       X
34      :       Y
35      :       10
36      :       11
37      :       12
38      :       13
39      :       14
40      :       15
41      :       16
42      :       17
43      :       18
44      :       19
45      :       1A
1757    :       1F7
533     :       F8
658467  :       FCIC

Note that I chose here to manually implement the full algorithm to perform the conversion because this is a coding challenge, after all (it is an interesting exercise in its own right and, since I had not done any base-conversion algorithm for a number of years, it took me a few minutes to figure out how to do it), but there is at least half a dozen CPAN modules that can do the conversion such as Math::Int2Base, Math::BaseCalc, etc. I haven't really checked how they're implemented, but I would suppose that they do the right input parameter checks.

I've just noticed at the eleventh hour that I missed part of the original requirement: write a script that can convert integers to and from a base35 representation: I actually missed the and from requirement in the solution I originally submitted for the challenge a few days ago. That a fairly easy-to-solve problem:

use strict;
use warnings;
use feature "say";
my $c = 0;
my %lookup = map {$_ => $c++} ("0".."9","A".."Z");

sub convert_back {
    my ($num, $base) = @_;
    my $result = 0;
    for my $i (split //, uc $num) {
        $result = $base * $result + $lookup{$i};
    }
    return $result;
}
say convert_back $_, 35 for qw/ 10 1A 1F7 F8 FCIC/;

The results are consistent with what we obtained with the convert-base subroutine:

Perl 6 Number Conversion

We could easily roll out the same algorithm in Perl 6 with just a few changes. Although, as we'll see below, this is useless work, let's do it just for the sake of showing how it might look like in Perl 6:

sub convert-base (Int $num is copy where * >= 0, Int $base where 1 < * < 37) {
    constant @lookup = flat('0'..'9','A'..'Z');
    my $result = "";
    repeat {
        $result ~= @lookup[$num % $base];
        $num = floor $num/$base;
    } while $num > 0;
    return $result.flip;
}
say "$_\t", convert-base($_, 35) for flat(0..40);

There are a few minor syntactic changes compared to the P5 version, but the most important difference is that the convert-base subroutine signature does all the input parameter validation that is needed for robust code: we need two arguments, a non-negative integer for the number to be converted, and an integer between 2 and 36 for the base. The subroutine will fail with an explicit type check error if we pass a negative integer or a rational number for any of the two arguments.

This is not what I did, however, in the answer I submitted for the challenge, because it is a bit silly: Perl 6 has a built-in base method for such base conversions. This is my submitted one-line solution in Perl 6 under the REPL:

> say "$_\t", $_.base(35) for flat(0..45, 1757, 533, 658467);
0       0
1       1
2       2
3       3
    ... some output lines omitted for brevity ...
28      S
29      T
30      U
31      V
32      W
33      X
34      Y
35      10
36      11
37      12
38      13
39      14
40      15
41      16
42      17
43      18
44      19
45      1A
1757    1F7
533     F8
658467  FCIC

As mentioned previously, I missed the and from requirement in my submitted solution. In Perl 6, there is a parse-base built-in that does the reciprocal conversion of parse:

> say "$_\t", $_.parse-base(35) for qw/10 20 FCIC/
10      35
20      70
FCIC    658467

Note that, in the various code examples above, I implemented manually a number of test cases. It may be argued that it would be better to use the P6 Test module (and, similarly, one of the numerous P5 test modules such as the Test core module or, possibly better, Test::More for the P5 implementation). Well, yes, but I felt too lazy to calculate by hand beforehand the 35-base values of such numerous test cases, and there is a relatively high chance of making mistakes in such manual computations. In this specific case, I'm happy enough to check the general consistency of the results (for example that 35 base 35 is 10) and to verify that my P5 algorithm and the P6 built-in base function yield the same results.

While We Are Speaking About Testing...

This is a very short and incomplete example of how you could use the Test module of Perl 6 to verify that the built-in base function works as expected:

use v6;
use Test;
plan 8;

is 4.base(2), "100", "4 base 2";
is 15.base(16), 'F', "15 base 15";
is $_.base($_), "10", "$_ base $_ is 10" for 2, 8, 10, 16, 35; 
is 70.base(35), "20", "70 base 35";

Presumably, the base built-in subroutine of Perl 6 has much more thorough tests than the above. Our aim here is just to very briefly illustrate how this works. The first line ensures that we are using Perl 6. The second line, with the plan keyword says that we are going to run 8 tests (the third test line is testing 5 cases and counts as 5 tests). And each of the test line starting with the is keyword does a string comparison between the result of the i.base(k) statement and the expected result. A string comparison is the right thing to do here because we are not comparing numbers but strings representing the base k of some numbers. The last (and optional) argument to the is subroutine is just a comment, more precisely a short description of the test that can help make tests interpretation a bit easier when any of the tests fails.

Running this test program produces the following output:

1..8
ok 1 - 4 base 2
ok 2 - 15 base 15
ok 3 - 2 base 2 is 10
ok 4 - 8 base 8 is 10
ok 5 - 10 base 10 is 10
ok 6 - 16 base 16 is 10
ok 7 - 35 base 35 is 10
ok 8 - 70 base 35

If one of the tests had failed, the result might have had something like this:

...
# Failed test '70 base 35'
# at test_base.pl line 8
# expected: '21'
#      got: '20'
# Looks like you failed 1 test of 8

If you want to know more about testing in Perl 6, please look at the https://docs.perl6.org/language/testing tutorial, the https://docs.perl6.org/type/Test module documentation, or the Debugging section of Chapter 14 of my Perl 6 book (https://github.com/LaurentRosenfeld/thinkperl6/raw/master/PDF/thinkperl6.pdf). Hopefully, one of the next Perl Weekly Challenges will provide the opportunity to cover Perl 6 testing in a more detailed fashion.

Wrapping up

The next week Perl Weekly Challenge is due to start very soon. If you're interested in participating in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 6 p.m. BST (British summer time) on next Sunday, April, 14. And, please, also spread the word about the Perl Weekly Challenge if you can.

11 comments

Perl Weekly Challenge, First Week

By laurent_r on April 3, 2019 6:56 PM

I am glad that Mohammad Anwar started the Perl Weekly Challenge. Since it seems that the entries will not be published by Mohammad, this gives me the opportunity to finally publish my first post here, about 8 months after having registered.

Week 1, Challenge # 1: Letter Substitutions

Substitute every ‘e’ with upper-case 'E' in the string “Perl Weekly Challenge” and count every occurrence of ‘e’.

For this challenge, I proposed only a Perl 5 solution, in the form of a Perl one-liner:

$ perl -E 'my $c = shift; my $num = $c =~ tr/e/E/; say $c; say "Number of replacements: $num";'  'Perl 6 Weekly Challenge'
PErl 6 WEEkly ChallEngE
Number of replacements: 5

Nothing special about it, except that the tr/// operator returns the number of substitutions it has performed, so there is no need to count separately the 'e'.

I did not submit any Perl 6 entry for that (I wasn't sure I could submit two entries), but we could use almost the same code, this time under the Perl 6 REPL:

> my $string = 'Perl Weekly Challenge';
Perl Weekly Challenge
> my $distance = $string ~~ tr/e/E/;
StrDistance.new(before => "Perl Weekly Challenge", after => "PErl WEEkly ChallEngE")
> say $string;
PErl WEEkly ChallEngE
> say +$distance;
5

In Perl 6, the tr/// operator does not return a substitution count, but a StrDistance object that measures the distance between original value and the resultant string., All we need to change (compared to P5) is to use the prefix + operator to numify the string distance object into a substitution count.

Week 1, Challenge # 2: Fizz Buzz

Program a one-liner that solves the Fizz Buzz challenge for every integer from 1 through 20. But if the integer is divisible with 3 or 5 or both, the integer should be replaced with fizz, buzz or fizz buzz respectively.

Perl 5 One Liner:

$ perl -E 'say $_ % 15 ? $_ % 3 ? $_ % 5 ? $_ : "buzz" : "fizz" : "fizz buzz" for (1..20); '
1
2
fizz
4
buzz
fizz
7
8
fizz
buzz
11
fizz
13
14
fizz buzz
16
17
fizz
19
buzz

Note that, since I wanted to avoid comparing the result of the modulo operator to 0, the ternary operators are nested, which is not the most usual way to use several ternary operators. This is the code of the oneliner reformatted for better comprehension:

say $_ % 15 ? 
    $_ % 3  ? 
    $_ % 5  ? 
              $_ : 
              "buzz" : 
              "fizz" : 
              "fizz buzz" 
    for (1..20);

And the same in Perl 6 (we could use the given ... when construct, but I decided it was more fun to do it similar to the P5 version using the ternary operator. Note that this one-liner is on Windows (so the single and double quotes are swapped), because I din't have access to my Linux box where P6 is installed.)

C:\Users\Laurent>perl6 -e "for 1..20 -> $c { say $c %% 15 ?? 'fizz buzz' !! $c %% 3 ?? 'fizz' !! $c %% 5 ?? 'buzz' !! $c;}"
1
2
fizz
4
buzz
fizz
7
8
fizz
buzz
11
fizz
13
14
fizz buzz
16
17
fizz
19
buzz

Since I'm using here the Perl 6 %% divisibility operator, the logic flow is simpler than for the P5 version (the ternary operators are chained, rather than nested).

Just a small additional note: Mohammad really asked for title-case "Fizz, Buzz and Fizz Buzz," but I did not consider those upper-case letters to be part of the specification.

The second week challenge is already started, and I have already submitted my entries by mail, but since I don't want to spoil anyone, I'll post my entries only after the submission expiration date (Sunday, April 7, 2019). If you're interested in participating in this challenge, make sure you answer before 6 p.m. BST (British summer time) on that date.

1 comment

Main Index | Archives | May 2019 »

About laurent_r

I am the author of the "Think Perl 6" book (O'Reilly, 2017) and I blog about the Perl 5 and Raku programming languages.

More info »

laurent_r

April 2019 Archives

Perl Weekly Challenge # 5: Anagrams

Challenge #1: Anagrams of a Word

Perl 5 Solution: the is_anagram Subroutine

Finding All Anagrams of a Word From a Word List

Perl 6 Solution: the is-anagram Subroutine

Normalizing the Words

Using a Bag

Creating an Anagram Operator

Challenge #2: Highest Number of Anagrams

Perl 5 Solution

Perl 6 Solution

Performance Issue with Perl 6

Wrapping up

Perl Weekly Challenge: Week 4

Challenge #1: Pi Digits

A Perl 5 Solution

A Perl 6 Solution

Challenge #2: Letters and Word List

A Perl 5 One-Liner

A Perl 6 Solution

Wrapping up

Perl Weekly Challenge: Week 3

Challenge #1: 5-Smooth Numbers

A Perl 5 Solution

Building a List of Prime Numbers

Building the Hamming Sequence

Another Approach: Producing Directly the Products of the Powers of 2, 3, and 5

Challenge #1 in Perl6

Building the Hamming Sequence

Producing Directly the Products of the Powers of 2, 3, and 5

Challenge #2: Pascal's Triangle

Perl 5

Two Nested Loops

A Recursive Approach

Functional Programming Approach

Perl 6

Using the sequence operator

Using the zip operator

Wrapping up

Perl Weekly Challenge: Week 2

Challenge #1: Removing Leading 0's

A Perl 5 One-Liner

Perl 6 Solution under the REPL

Challenge #2: Convert Numbers into Base-35 Representations

A Perl 5 Conversion Subroutine

Perl 6 Number Conversion

While We Are Speaking About Testing...

Wrapping up

Perl Weekly Challenge, First Week

About laurent_r

Search this blog

Perl 5 Solution: the `is_anagram` Subroutine

Perl 6 Solution: the `is-anagram` Subroutine