Perl Weekly Challenge # 15: Strong and Weak Primes and Vigenère Encryption

These are some answers to the Week 15 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in several days from now (July 7, 2019). This blog post offers some solutions to this challenge, please don't read on if you intend to complete the challenge on your own.

Challenge # 1: Strong and Weak Prime Numbers

Write a script to generate first 10 strong and weak prime numbers.

For example, the nth prime number is represented by p(n).

  p(1) = 2
  p(2) = 3
  p(3) = 5
  p(4) = 7
  p(5) = 11

  Strong Prime number p(n) when p(n) > [ p(n-1) + p(n+1) ] / 2
  Weak   Prime number p(n) when p(n) < [ p(n-1) + p(n+1) ] / 2

A strong prime is a prime number that is greater than the arithmetic mean of the nearest prime above and below (in other words, it's closer to the following than to the preceding prime). A weak prime is a prime number that is less than the arithmetic mean of the nearest prime above and below. Obviously, a prime number cannot both strong and weak, but some prime numbers, such as 5 or 53 (we'll see more of them later), are neither strong, nor weak (they're called balanced primes): 5 is equal to the arithmetic mean of 3 and 7. Therefore, the fact that a prime is not strong doesn't mean that it is weak.

For this challenge, I see some advantages to Perl 6 compared to Perl 5, so I'll start with Perl 6.

Strong and Weak Prime Numbers in Perl 6

We don't know in advance how many prime numbers we'll need to check to find 10 strong and 10 weak primes. This is a typical situation where using infinite lazy lists is very practical.

In the first code example below, we first build a lazy infinite list of prime numbers, and then use grep to filter the strong (and weak) primes, so as to construct lazy infinite lists of strong and weak primes, and finally print out the first 10 numbers of each such list. This is fairly straight forward:

use v6;

my @p = grep { .is-prime }, 1..*;   #Lazy infinite list of primes
my @strong = map { @p[$_] }, 
    grep { @p[$_] > (@p[$_ - 1] + @p[$_ + 1]) / 2 }, 1..*;
my @weak = map { @p[$_] }, 
    grep { @p[$_] < (@p[$_ - 1] + @p[$_ + 1]) / 2 }, 1..*;
say "Strong primes: @strong[0..9]";
say "Weak primes: @weak[0..9]";

This script displays the following output:

$ perl6 strong_primes.p6
Strong primes: 11 17 29 37 41 59 67 71 79 97
Weak primes: 3 7 13 19 23 31 43 47 61 73

We don't really need to build the intermediate @strong and @weak lazy infinite lists, but can print out the results directly:

use v6;

my @p = grep { .is-prime }, 1..*;   # Lazy infinite list of primes
say "Strong primes: ", (map { @p[$_] }, 
    grep { @p[$_] > (@p[$_ - 1] + @p[$_ + 1]) / 2 }, 1..*)[0..9];
say "Weak primes: ", (map { @p[$_] }, 
    grep { @p[$_] < (@p[$_ - 1] + @p[$_ + 1]) / 2 }, 1..*)[0..9];

This prints out the same lists as before:

perl6 strong_primes.p6
Strong primes: (11 17 29 37 41 59 67 71 79 97)
Weak primes: (3 7 13 19 23 31 43 47 61 73)

We're now down to three code lines instead of five (except that I'll probably have to format each of the two last lines over two lines to fit cleanly on this blog post).

Categorizing or Classifying Primes

One slight problem with the implementation above is that, once we have generated our list of primes, we need to go through it twice with the map ... grep chained statements, one for the strong primes and once for the weak primes; and we'd need to visit the prime list a third time for finding balanced primes. Although the script runs very fast, it would be better if we could do the categorizing in one go. Perl 6 has two built-in routines to do that, categorize and classify. Let's use the first one:

use v6;

my @p = grep { .is-prime }, 1..*;   # Lazy infinite list of primes
sub mapper(UInt $i) {
    @p[$i] > (@p[$i - 1] + @p[$i + 1])/2 ?? 'Strong' !!
    @p[$i] < (@p[$i - 1] + @p[$i + 1])/2 ?? 'Weak'   !!
    'Balanced';
}
my %categories = categorize &mapper, 1..120;
for sort keys %categories -> $key {
    say "$key primes:  ", map {@p[$_]}, %categories{$key}[0..9];
}

Running this program produces the following output:

$ perl6 strong_primes.p6
Balanced primes:  (5 53 157 173 211 257 263 373 563 593)
Strong primes:  (11 17 29 37 41 59 67 71 79 97)
Weak primes:  (3 7 13 19 23 31 43 47 61 73)

Here, we define a mapper subroutine to find out whether a given prime is strong, weak or balanced. Then, we pass to categorize two arguments: the subroutine and a list of subsequent integers (the indices of the @p prime number list) starting with 1 (the first prime cannot be weak or strong or balanced, since it has no predecessor) and store the result in the %categories hash, which is in fact a hash of arrays with three keys (one for each type of primes) and values being the index in the @p prime array of primes belonging to the corresponding type.

For example, with an input range of 1..30, the %categories hash has the following contents:

{ 
    Balanced => [2 15], 
    Strong => [4 6 9 11 12 16 18 19 21 24 25 27 30], 
    Weak => [1 3 5 7 8 10 13 14 17 20 22 23 26 28 29]
}

Remember that the numbers in the three lists above are not the primes, but the indices of the primes.

Then, the for loop extracts 10 numbers from each key of hash (with a full input range of 1..120).

This categorize built-in is very useful and practical for cases where you want to split some input data into different categories, but it isn't well adapted to our case in point, because it does not work with lazy lists. And since balanced primes are much less common than strong and weak primes, I was forced to use a relatively large range of 1..120 to make sure that I would get 10 balanced primes. For this specific problem, the classify built-in subroutine works essentially as categorize and also reports the Cannot classify a lazy list error message when trying to use it on a lazy infinite list. The difference between categorize and classify is that the latter returns a scalar whereas the former can return a list; so, in our example, it might have been slightly better to use classify rather than categorize, but the difference between the two built-ins is insignificant in our case.

I might come back to this issue in a later blog post (update: see this new post).

Strong and Weak Prime Numbers in Perl 5

In my blog post about Perl Weekly Challenge # 8 related to perfect numbers and Mersenne's numbers, and in a couple of other places before, I've shown a set of two somewhat complex subroutines (is_prime and find_primes) to generate relatively efficiently large primes with trial division. This time, we don't need large primes and can use a simpler (and less efficient, but sufficient) algorithm to check primality. This is done in the is_prime subroutine below.

As noted in the introduction above, a strong prime is closer to the following prime than to the preceding prime. So, rather than computing the arithmetic mean of the nearest prime above and below (as we did in our P6 implementation), we will this time use this alternate definition and compare the differences between a given prime and its preceding and succeeding one:

  Strong Prime p(n) when  [ p(n) - p(n-1) ] > [ p(n+1) - p(n) ]
  Weak   Prime p(n) when  [ p(n) - p(n-1) ] < [ p(n+1) - p(n) ]

Since Perl 5 does not support infinite lists as Perl 6, we need to specify some hard-coded ranges for our list or primes. Given that we've solved the problem in P6, it isn't difficult to figure out quite precisely the ranges that we need. If we had not done it before in P6, we would have had to choose somewhat larger ranges to be on the safe side of things.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

sub is_prime{
    my $num = shift;
    for my $i (2 .. $num ** .5) {
        return 0 if $num % $i == 0;
    }
    return 1;
}
my @p = grep is_prime($_), 2..105;
my @strong = map $p[$_], 
    grep { $p[$_] - $p[$_-1] > $p[$_+1] - $p[$_] } 1..25;
my @weak = map $p[$_], 
    grep { $p[$_] - $p[$_-1] < $p[$_+1] - $p[$_] } 1..25;
say "Strong: @strong[0..9]";
say "Weak: @weak[0..9]";

This script displays the same primes as in P6:

$ perl strong_primes.pl
Strong: 11 17 29 37 41 59 67 71 79 97
Weak: 3 7 13 19 23 31 43 47 61 73

Challenge # 2: Vigenère Encryption

Write a script to implement Vigenère cipher. The script should be able encode and decode. Checkout [wiki page](https://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher] for more information.

The Vigenère cipher is actually a misnomer: in the nineteenth century, it has been mis-attributed to French diplomat and cryptographer Blaise de Vigenère, who published the method in 1586, and this is how it acquired its present name. But the method had been described more than three decades earlier (in 1553) by Italian cryptanalyst Giovan Battista Bellaso. It essentially resisted all attempts to break it until 1863, three centuries later.

To understand the Vigenère cipher, we can first consider what is known as the Caesar cipher, in which each letter of the alphabet is shifted along some number of places. For example, in a Caesar cipher of shift 3, A would become D, B would become E, Y would become B and so on. So, for instance, "cheer" rotated by 7 places is "jolly" and "melon" rotated by -10 (or + 16) is "cubed". In the movie A Space Odyssey, the ship's computer is called HAL, which is IBM rotated by -1. One famous such cipher is ROT13, which is a Caesar cipher with rotation 13. Since 13 is half the number of letters in our alphabet, applying rotation 13 twice returns the original message, so that the same procedure can be used for both encoding and decoding in rotation 13. Rotation 13 has been used very commonly on the Internet to hide potentially offensive jokes or to weakly hide the solution to a puzzle.

A Caesar cipher is very easy to break through letter frequency analysis.

In Edgar Allan Poe’s short story The Gold Bug, one of the characters, William Legrand, uses letter frequencies to crack a cipher. He explains:

Now, in English, the letter which most frequently occurs is e. Afterwards, the succession runs thus: a o i d h n r s t u y c f g l m w b k p q x z. E however predominates so remarkably that an individual sentence of any length is rarely seen, in which it is not the prevailing character.

Edgar Poe's character is slightly wrong on part of the succession of letters: for example, he grossly underestimated the frequency of letter t, which is the second most common letter in English. But what he says about letter E is correct.

So, if you want to decipher a message encoded with a Caesar cipher in English, one way is to find out the most common letter in the encoded text, and that most common letter is likely to be an E. From there, you can figure out by which value each letter has shifted and decipher the whole message. If you were unlucky, just give a try with the second most common letter, and then the third. You're very likely to succeed very quickly. Another possibility is brute force attack by trying all 26 possible values by which the letter are shifted. This is easy by hand, and very fast with a computer. A Caesar cipher is a very weak encryption system.

The idea of the Vigenère cipher is to shift the letters of the message by a different number of places. If your encryption code is 1452, you rotate the first letter by one place, the second one by 4 places, the third by 5 places, the fourth by 2 places; if you have more letters to encode in your message, then your start again with the beginning of the code, and so on. For example, if you want to encode the word "peace," you get:

p + 1 => q
e + 4 => i
a + 5 => f
c + 2 => e
e + 1 => f
Encoded message: qifef.

In brief, a Vigenère cipher is using a series of interwoven Caesar ciphers. With such a system, frequency analysis becomes extremely difficult because, as we can see in the example above, the letter E is encoded into I in the first instance, and into F in the second instance. In fact, if the encryption key is a series of truly random bytes and is at least as long as the message to be encoded (and is used only once), the code is essentially unbreakable. In practice, a Vigenère cipher is not using a number as encryption key, but generally a password or a pass-phrase: the letters of the password are converted to a series of numbers according to their rank in the alphabet and those numbers are used as above to rotate the letters of the message to be encoded. Since the encryption code is probably no longer truly random, it becomes theoretically possible to break the code, but this is still very difficult, and that's the reason the Vigenère cipher has been considered unbreakable for about three centuries.

Vigenère Cipher in Perl 6

For this challenge, we will use the built-in functions ord, which converts a character to a numeric code (Unicode code point), and chr which converts such numeric code back to a characters. Letters of the alphabet are encoded in alphabetic order, so that, for example:

say ord('c') - ord('a'); 2

because 'c' is the second letter after 'a'.

Originally, I kept letters within the a..z range (folding the input message to lowercase), because the numeric codes for uppercase letters are different, in order to keep as close as possible to the original Vigenère cipher. But the original cipher was limited to this range only because of the way encoding was done manually at the time. With a computer, there is no reason to limit ourselves to such range. So, the script below use the full range of an octet (0..255), i.e. the full extended ASCII range. This way we can also encode spaces, punctuation symbols, etc. Of course, this implies that the partner uses the script (or, at least, same algorithm).

In this script, the bulk of the work is done in the rotate-msg and rotate-one-letter subroutines. The encode and decode subroutines are only calling them with the proper arguments. And the create-code subroutine is used to transform the password into an array of numeric values.

use v6;

subset Letter of Str where .chars == 1;

sub create-code (Str $passwd) {
    # Converts password to a list of numeric codes
    # where 'a' corresponds to a shift of 1, etc.
    return $passwd.comb(1).map: {.ord - 'a'.ord + 1}
}
sub rotate-one-letter (Letter $letter, Int $shift) {
    # Converts a single letter and deals with cases 
    # where applying the shift would get out of range
    constant $max = 255;
    my $shifted = $letter.ord + $shift;
    $shifted = $shifted > $max ?? $shifted - $max !!
        $shifted < 0 ?? $shifted + $max !!
        $shifted;
    return $shifted.chr;
}
sub rotate-msg (Str $msg, @code) {
    # calls rotate-one-letter for each letter of the input message
    # and passes the right shift value for that letter
    my $i = 0;
    my $result = "";
    for $msg.comb(1) -> $letter {
        my $shift = @code[$i];
        $result ~= rotate-one-letter $letter, $shift;
        $i++;
        $i = 0 if $i >= @code.elems;
    }
    return $result;
}
sub encode (Str $message, @key) {
    rotate-msg $message, @key;
}
sub decode (Str $message, @key) {
    my @back-key = map {- $_}, @key;
    rotate-msg $message, @back-key;
}
multi MAIN (Str $message, Str $password) {
    my @code = create-code $password;
    my $ciphertext = encode $message, @code;
    say "Encoded cyphertext: $ciphertext";
    say "Roundtrip to decoded message: {decode $ciphertext, @code}";
}
multi MAIN ("test") {
    use Test; # Minimal tests for providing an example
    plan 6;
    my $code = join "", create-code("abcde");
    is $code, 12345, "Testing create-code";
    my @c = create-code "password";
    for <foo bar hello world> -> $word {
        is decode( encode($word, @c), @c), $word, 
            "Round trip for $word";
    }
    my $msg = "One small step for man, one giant leap for mankind!";
    my $ciphertext = encode $msg, @c;
    is decode($ciphertext, @c), $msg, 
        "Message with spaces and punctuation";
}

In the script above, we have two MAIN multi subroutines. When the single argument is "test", the script runs a series of basic tests (which would probably have to be expanded in a real life project); when the arguments are two strings (a message to be encoded and a password), the script runs with the input arguments.

This is an example run with the "test" argument:

$ perl6  vigenere.p6 test
1..6
ok 1 - Testing create-code
ok 2 - Round trip for foo
ok 3 - Round trip for bar
ok 4 - Round trip for hello
ok 5 - Round trip for world
ok 6 - Message with spaces and punctuation

and with two arguments:

$ perl6  vigenere.p6 AlphaBeta password
Encoded cyphertext: Qmƒ{xQwxq
Roundtrip to decoded message: AlphaBeta

Vigenère Cipher in Perl 5

Translating this P6 script into Perl 5 is fairly easy. Just as in P6, we will use the built-in functions ord, which converts a character to a numeric code (Unicode code point), and chr which converts such numeric code back to a characters.

In this script, the bulk of the work is done in the rotate_msg and rotate_one_letter subroutines. The encode and decode subroutines are only calling them with the proper arguments. And the create_code subroutine is used to transform the password into an array of numeric values.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;


sub create_code {
    # Converts password to a list of numeric codes
    # where 'a' corresponds to a shift of 1, etc.
    my $passwd = shift;
    return map { ord($_) - ord('a') + 1 } split //, $passwd;
}
sub rotate_one_letter {
    # Converts a single letter and deals with cases where 
    # applying the shift would get the result out of range
    my ($letter, $shift) = @_;
    my $max = 255;
    my $shifted = $shift + ord $letter;
    $shifted = $shifted > $max ? $shifted - $max :
        $shifted < 0 ? $shifted + $max :
        $shifted;
    return chr $shifted;
}
sub rotate_msg {
    # calls rotate-one-letter for each letter of the input message
    # and passes the right shift value for that letter
    my ($msg, @code) = @_;
    my $i = 0;
    my $result = "";
    for my $letter (split //, $msg) {
        my $shift = $code[$i];
        $result .= rotate_one_letter $letter, $shift;
        $i++;
        $i = 0 if $i >= @code;
    }
    return $result;
}
sub encode {
    my ($message, @key) = @_;
    rotate_msg $message, @key;
}
sub decode  {
    my ($message, @key) = @_;
    my @back_key = map {- $_} @key;
    rotate_msg $message, @back_key;
}
sub run_tests {
    use Test::More; # Minimal tests for providing an example
    plan tests => 6; # needed on a separate code line to avoid 
                     # useless output when not running the tests
    my $code = join "", create_code("abcde");
    is $code, 12345, "Testing create_code";
    my @c = create_code "password";
    for my $word ( qw/foo bar hello world/) {
        is decode( encode($word, @c), @c), $word, 
            "Round trip for $word";
    }
    my $msg = "One small step for man, one giant leap for mankind!";
    my $ciphertext = encode $msg, @c;
    is decode($ciphertext, @c), $msg, 
        "Message with spaces and punctuation";
}


if (@ARGV == 1 and $ARGV[0] eq "test") {
    run_tests;
} elsif ( @ARGV == 2) {
    my ($message, $password) = @ARGV;
    my @code = create_code $password;
    my $ciphertext = encode $message, @code;
    say "Encoded cyphertext: $ciphertext";
    say "Roundtrip to decoded message: ", decode $ciphertext, @code;
} else {
    say "Wrong arguments";
}

Asides from the small syntax adjustments between P6 and P5, the main difference is that core Perl 5 doesn't have a MAIN subroutine and does not have multiple dispatch for subroutines. So, I just mimicked the P6 version by checking the arguments passed to the script and calling the run_test subroutine when needed.

This gives the same output as the P6 script:

$ perl vigenere.pl test
1..6
ok 1 - Testing create_code
ok 2 - Round trip for foo
ok 3 - Round trip for bar
ok 4 - Round trip for hello
ok 5 - Round trip for world
ok 6 - Message with spaces and punctuation

Laurent@LAPTOP-LHI8GLRC ~
$ perl vigenere.pl TangoCharlieJuliet password
Encoded cyphertext: dbz†Rze‚m|xa„~muu
Roundtrip to decoded message: TangoCharlieJuliet

Wrapping up

The next week Perl Weekly Challenge is due to start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, July 14. And, please, also spread the word about the Perl Weekly Challenge if you can.

Leave a comment

About laurent_r

user-pic I am the author of the "Think Perl 6" book (O'Reilly, 2017) and I blog about the Perl 5 and Raku programming languages.