Perl Weekly Challenge 166: Hexadecimal Words and K-Directory Diff

These are some answers to the Week 166 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Task 1: Hexadecimal Words

As an old systems programmer, whenever I needed to come up with a 32-bit number, I would reach for the tired old examples like 0xDeadBeef and 0xC0dedBad. I want more!

Write a program that will read from a dictionary and find 2- to 8-letter words that can be “spelled” in hexadecimal, with the addition of the following letter substitutions:

  • o ⟶ 0 (e.g., 0xf00d = “food”)
  • l ⟶ 1
  • i ⟶ 1
  • s ⟶ 5
  • t ⟶ 7

You can use your own dictionary or you can simply open ../../../data/dictionary.txt (relative to your script’s location in our GitHub repository) to access the dictionary of common words from Week #161.

Optional Extras (for an 0xAddedFee, of course! * Limit the number of “special” letter substitutions in any one result to keep that result at least somewhat comprehensible. (0x51105010 is an actual example from my sample solution you may wish to avoid!)*

  • Find phrases of words that total 8 characters in length (e.g., 0xFee1Face), rather than just individual words.*

I will not try to fulfill the second optional extra, as I do not really understand what “phrases of words” is supposed to mean. If the idea is to find a list of 8-character words, it is really too easy; if the idea is to manufacture meaningful (and/or grammatically correct) phrases, then it is en entirely different venture.

Hexadecimal Words in Raku

Initial Requirement

The initial task is quite easy: we just skip any word less than 2 characters or more than eitght character, and any word with letters not in the a..f range and not in the olist list of characters. We then use the TR/// non-destructive transliteration operator, which returns the modified string.

for "words.txt".IO.lines -> $line {
    next unless 2 <= $line.chars <= 8;
    next if $line ~~ /<-[a..f olist]>/;
    say "$_ -> 0x", TR/olist/01157/ with $line;
}

This script displays the following output:

$ raku ./hexawords.raku aa -> 0xaa aal -> 0xaa1 aalii -> 0xaa111 aaliis -> 0xaa1115 aals -> 0xaa15 aas -> 0xaa5 aba -> 0xaba abaca -> 0xabaca abacas -> 0xabaca5 abaci -> 0xabac1 abaft -> 0xabaf7 … (Lines omitted for brevity) decibels -> 0xdec1be15 decide -> 0xdec1de decided -> 0xdec1ded decides -> 0xdec1de5 decile -> 0xdec11e … (Lines omitted for brevity) tsade -> 0x75ade tsades -> 0x75ade5 tsadi -> 0x75ad1 tsadis -> 0x75ad15 tsetse -> 0x75e75e tsetses -> 0x75e75e5

Optional extra: Limiting the Number of substitutions

This time, we use the tr/// in-place transliteration operator, which conveniently returns the edit distance between the original value and the resultant string (i.e., in this case, the number of substitutions performed). The maximum number of “special” letter substitutions is passed as an argument to the script (with a default value of 4):

sub MAIN (Int $limit = 4) {
    for "words.txt".IO.lines -> $line {
        next unless 2 <= $line.chars <= 8;
        next if $line ~~ /<-[a..f olist]>/;
        my $word = $line;
        my $dist =  +tr/olist/01157/ for $word;
        say "$line -> 0x", $word if $dist <= $limit
    }
}

This script displays the following output:

$ raku ./hexawords.raku 3
aa -> 0xaa
aal -> 0xaa1
aalii -> 0xaa111
aals -> 0xaa15
aas -> 0xaa5
aba -> 0xaba
...
toted -> 0x707ed
tsade -> 0x75ade
tsades -> 0x75ade5
tsadi -> 0x75ad1

$ ./raku hexawords.raku 2 | wc
   1291    3873   23250

$ ./raku hexawords.raku 5 | wc
   3490   10470   68618

Hexadecimal Words in Perl

Initial Requirement

Using the tr//r option, the operator returns the modified string.

use strict;
use warnings;
use feature "say";

my $file_in = "./words.txt";
open my $IN, "<", $file_in or die "unable to open $file_in";
while (my $line = <$IN>) {
    chomp $line;
    next if length $line < 2 or length $line > 8;
    next if $line =~ /[^a-folist]/;
    say "$_ -> 0x", tr/olist/01157/r for $line;
}

This script displays the following output:

$ perl ./hexawords.pl
aa -> 0xaa
aal -> 0xaa1
aalii -> 0xaa111
aaliis -> 0xaa1115
aals -> 0xaa15
aas -> 0xaa5
aba -> 0xaba
abaca -> 0xabaca
abacas -> 0xabaca5
abaci -> 0xabac1
abaft -> 0xabaf7
abas -> 0xaba5
abase -> 0xaba5e
abased -> 0xaba5ed
abases -> 0xaba5e5
abatable -> 0xaba7ab1e
abate -> 0xaba7e
... (Lines omitted for brevity)
totted -> 0x7077ed
tsade -> 0x75ade
tsades -> 0x75ade5
tsadi -> 0x75ad1
tsadis -> 0x75ad15
tsetse -> 0x75e75e
tsetses -> 0x75e75e5

Optional extra: Limiting the Number of substitutions

Here, we don’t use the tr//r option, because we want the operator to return the number of substitutions performed.

use strict;
use warnings;
use feature "say";

my $max = shift;
$max = 4 unless defined $max;
my $file_in = "./words.txt";
open my $IN, "<", $file_in or die "unable to open $file_in";
while (my $line = <$IN>) {
    chomp $line;
    next if length $line < 2 or length $line > 8;
    next if $line =~ /[^a-folist]/;
    my $word = $line;
    next if ($word =~ tr/olist/01157/) > $max;
    say $line, " -> 0x", $word;
}

This script displays the following output with a parameter of 2:

$ perl hexawords.pl 2
aa -> 0xaa
aal -> 0xaa1
aals -> 0xaa15
aas -> 0xaa5
aba -> 0xaba
abaca -> 0xabaca
abacas -> 0xabaca5
abaci -> 0xabac1
abaft -> 0xabaf7
...
to -> 0x70
toad -> 0x70ad
tod -> 0x70d
toe -> 0x70e
toed -> 0x70ed
toff -> 0x70ff
toffee -> 0x70ffee
tsade -> 0x75ade

Task 2: K-Directory Diff

Given a few (three or more) directories (non-recursively), display a side-by-side difference of files that are missing from at least one of the directories. Do not display files that exist in every directory.

Since the task is non-recursive, if you encounter a subdirectory, append a /, but otherwise treat it the same as a regular file.

Example:

Given the following directory structure:

dir_a:
Arial.ttf  Comic_Sans.ttf  Georgia.ttf  Helvetica.ttf  Impact.otf  Verdana.ttf  Old_Fonts/

dir_b:
Arial.ttf  Comic_Sans.ttf  Courier_New.ttf  Helvetica.ttf  Impact.otf  Tahoma.ttf  Verdana.ttf

dir_c:
Arial.ttf  Courier_New.ttf  Helvetica.ttf  Impact.otf  Monaco.ttf  Verdana.ttf

The output should look similar to the following:

dir_a          | dir_b           | dir_c
-------------- | --------------- | ---------------
Comic_Sans.ttf | Comic_Sans.ttf  |
               | Courier_New.ttf | Courier_New.ttf
Georgia.ttf    |                 |
               |                 | Monaco.ttf
Old_Fonts/     |                 |
               | Tahoma.ttf      |

I am very late on this and have very little time left, so I’ll do the work of finding the missing files, but will not try obtain the same display.

K-Directory Diff in Raku

Raku as a Set type with operators such as the intersection operator and the does not belong to operator, which make the solution fairly easy:

my @dirs = map {$_ ~~ /\w+$/}, dir("./rootdir");
my %dircontent;
for @dirs -> $dir {
    %dircontent{$dir} = map {~($_ ~~ /\w+$/)}, dir("./rootdir/$dir");
}
say "Content of the dirs: ", %dircontent;
my $intersection = [∩] values %dircontent;
say "Files common to all directories: ", $intersection.keys;
for @dirs -> $dir {
    say "$dir -> ", grep {$_ ∉ $intersection}, values %dircontent{$dir};
}

This script displays the following output

$ raku ./dir_diff.raku
Content of the dirs: {a => (bar bar_a foo foo_a), b => (bar bar_b foo foo_b), c => (bar bar_c foo foo_c)}
Files common to all directories: (foo bar)
a -> (bar_a foo_a)
b -> (bar_b foo_b)
c -> (bar_c foo_c)

K-Directory Diff in Perl

Perl doesn’t have Sets and associated operators, so we will use a hash to record the number of occurrences of various file names. If a file name has a value equal to the number of directories (3 in the example), then this file is present in all directories and should be dismissed from the output.

use strict;
use warnings;
use feature "say";

my @dirs = glob("./rootdir/*");
my $nb_dirs = scalar @dirs;
my %dircontent;
for my $dir (@dirs) {
    $dircontent{$dir} = [ map {/(\w+$)/} glob "$dir/*" ];
}
say "Contents of the directories: ";
for my $dir (@dirs) {
    say "$dir: ", join " ", @{$dircontent{$dir}}
}

my %files;
for my $dir (@dirs) {
    $files{$_}++ for @{$dircontent{$dir}}
}
say "\nCommon files: ", join " ", grep { $files{$_} == $nb_dirs } keys %files; 
say "\nFiles not common to all directories: ";
for my $dir (@dirs) {
    say "$dir: ", join " ", grep { $files{$_} < $nb_dirs } @{$dircontent{$dir}};
}

This script displays the following output:

$ perl ./dir_diff.pl
Contents of the directories:
./rootdir/a: bar bar_a foo foo_a
./rootdir/b: bar bar_b foo foo_b
./rootdir/c: bar bar_c foo foo_c

Common files: bar foo

Files not common to all directories:
./rootdir/a: bar_a foo_a
./rootdir/b: bar_b foo_b
./rootdir/c: bar_c foo_c

Wrapping up

The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on June 5, 2022. And, please, also spread the word about the Perl Weekly Challenge if you can.

Leave a comment

About laurent_r

user-pic I am the author of the "Think Perl 6" book (O'Reilly, 2017) and I blog about the Perl 5 and Raku programming languages.