Perl Weekly Challenge 90: DNA Sequence and Ethiopian Multiplication

These are some answers to the Week 90 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a few days (December 13, 2020). This blog post offers some solutions to this challenge, please don’t read on if you intend to complete the challenge on your own.

Task 1: DNA Sequence

DNA is a long, chainlike molecule which has two strands twisted into a double helix. The two strands are made up of simpler molecules called nucleotides. Each nucleotide is composed of one of the four nitrogen-containing nucleobases cytosine (C), guanine (G), adenine (A) and thymine (T).

You are given DNA sequence, GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG.

Write a script to print nucleobase count in the given DNA sequence. Also print the complementary sequence where Thymine (T) on one strand is always facing an adenine (A) and vice versa; guanine (G) is always facing a cytosine (C) and vice versa.

To get the complementary sequence use the following mapping:

T => A
A => T
G => C
C => G

DNA Sequence in Raku

For the nucleotide histogram, we can comb the string into individual letters, use a hash to store the letter count and print out the hash pairs. This is a quite typical way of building histograms (but there are simpler solutions in Raku, as we shall see).

For the complementary sequence, we could use a hash to store the nucleotide mapping and use a map to perform the necessary conversion.

use v6;

my $dna = 'GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG';

# count
my %histo;
%histo{$_}++ for $dna.comb;
say "Histogram:";
.say for %histo.pairs;

# Complementary sequence
my %complement = T => 'A', A => 'T', G => 'C', C => 'G';
.say for "Complement:", $dna.comb.map({%complement{$_}}).join: '';

This displays the following output:

$ raku dna.raku
Histogram:
T => 22
A => 14
C => 18
G => 13
Complement:
CATTTGGGGAAAAGTAAATCTGTCTAGCTGAGGAATAGGTAAGAGTCTCTACACAACGACCAGCGGC

This program is relatively concise, but we can do much shorter code without sacrificing legibility. Each of the subtasks can be done in just one line of code.

For the nucleotide histogram, we comb the string into individual letters, feed them into an anonymous bag and print out that bag’s pairs.

For the DNA complement, we can use the TR/// non-destructive transliteration operator:

use v6;

my $dna = 'GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG';

# count
say "Histogram:"; .say for Bag.new($dna.comb).pairs;

# Complementary sequence
say  "Complement:\n", TR/TAGC/ATCG/ with $dna;

The output is almost the same as before:

Histogram:
T => 22
A => 14
G => 13
C => 18
Complement:
CATTTGGGGAAAAGTAAATCTGTCTAGCTGAGGAATAGGTAAGAGTCTCTACACAACGACCAGCGGC

DNA Sequence in Perl

This program is basically a port to Perl of a combination of the Raku programs above.

use strict;
use warnings;
use feature "say";

my $dna = 'GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG';
# count
my %histogram;
$histogram{$_}++ for split '', $dna;
say "$_: $histogram{$_}" for keys %histogram;

# Complementary sequence
say for "Complement:", $dna =~ tr/TAGC/ATCG/r;

This displays the following output:

$ perl dna.pl
C: 18
G: 13
A: 14
T: 22
Complement:
CATTTGGGGAAAAGTAAATCTGTCTAGCTGAGGAATAGGTAAGAGTCTCTACACAACGACCAGCGGC

Task 2: Ethiopian Multiplication

You are given two positive numbers $A and $B.

Write a script to demonstrate Ethiopian Multiplication using the given numbers.

Ethiopian multiplication (also known as ancient Egyptian multiplication, Russian multiplication, or peasant multiplication) is an ancient method for multiplying two integers that does not require the multiplication table, only the ability to multiply and divide by 2, and to add. The basic idea is to apply repeatedly integer (or Euclidean) division by 2 to the first number (discarding any remainder), and to multiply repeatedly the second number by 2, until the first number becomes 1. Then we add the values of the second number for which the corresponding first number is odd. In practical terms, to do this manually, we can set up two columns with the first number on the left and the second on the right. If we want to multiply 19 and 42, we have the following process:

1st    2nd    Action       Sum so far
19      42    kept          42
 9      84    kept         126
 4     168    discarded    126    
 2     336    discarded    126
 1     672    kept         798

On the first line, the first number (17) is odd, so we will use the second number (42) in the final summation. On the second line, the first number (9) is odd again, so we will use the second number (84) in the final summation. On the third and fourth lines, the first number (4 and 2) are even, so the second number is not used in the final sum. Finally, on the last line, the first number is 1, so the process stops there and, since 1 is odd, the second number (672) is used in the final sum. When we add the numbers of the right column where the number on the left column is odd, we have: 42 + 84 + 672 = 798, which is the product of 17 by 42.

Ethiopian Multiplication in Raku

We implement a while loop whose stopping condition is when the first number ($a) becomes equal to 1. At each iteration, we use the Raku built-in div integer division operator to halve (rounding down the result) $a et we multiply the second number ($b) by 2. We also have an accumulator, $result, which accumulate the values of $b for which the corresponding $a is an odd integer (i.e. when $a % 2 is not 0).

use v6;

my ($a, $b) = map {$_.Int}, @*ARGS;
my $result = $a % 2 ?? $b !! 0;
while $a > 1 {
    $a div= 2;
    $b *= 2;
    $result += $b if $a % 2;
}
say $result;

This are a couple of example runs:

$ raku ethiopian-mult.raku 14 12
168

$ raku ethiopian-mult.raku 19 42
798

Laurent@LAPTOP-LHI8GLRC ~
$ raku ethiopian-mult.raku 42 19
798

$ raku ethiopian-mult.raku 300 600
180000

Ethiopian Multiplication in Perl

This is a port to Perl of the Raku program above. Since Perl doesn’t have a div integer operator, we could use the standard division operator and round down the result with the int operator, but since it is a division by two, it is slightly shorter to use the >> right bit shift operator by one bit, which does an integer division by 2.

use strict;
use warnings;
use feature "say";

my ($c, $d) = @ARGV;
my $result = $c % 2 ? $d : 0;
while ($c > 1) {
    $c = $c >> 1; # right shift 1 bit = div by 2
    $d *= 2;
    $result += $d if $c % 2;
}
say $result;

These are a few example runs:

$ perl ethiopian-mult.pl 19 42
798

$ perl ethiopian-mult.pl 14 22
308

$ perl ethiopian-mult.pl 45 59
2655

Perl Weekly Challenge # 90 in Scala

In 2017, I translated into French the book Scala by Example, written by Martin Odersky, the creator of the Scala programming language. At the time, I found the language to be very interesting and fairly similar in spirit to Perl and even more to Raku, since it smoothly combines the object-oriented programming and functional programming paradigms. Although I wrote a few tiny toy programs at the time (or, rather, copied example programs and tried various changes to see what happens), I thought at the time that I probably wanted to learn the language, but never really took the time to do so. Maybe the Perl Weekly Challenge is an opportunity to start learning it. So, I dived into two tutorial books on the Internet and I started to port some of my Raku solutions to Scala over the last few weeks. It is also interesting to see with real examples how Scala compares with Raku and Perl. Caveat: I am a pure beginner in Scala, so my Scala code is certainly neither expressive, nor efficient, and even less idiomatic. It is certainly quite clumsy. Please feel free to sugggest any corrections, improvements, comments, or better practices.

DNA Sequence in Scala

We use the toCharArray to split the DNA string into an array of characters. For each character, we convert the nycleotide into its complement using a match expression. At the same time, for each letter, we increment en histo map entry for that letter.

object Dna extends App {
    import scala.collection.mutable.Map
    val dna = "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG";
    var result = ""
    var histo:Map[Char,Int] = Map('A' -> 0, 'T' -> 0, 'C' -> 0, 'G' -> 0)
    for (char <- dna.toCharArray()) {
        // println(char)
        val charout = char match {
            case 'T' => 'A'
            case 'A' => 'T'
            case 'C' => 'G'
            case 'G' => 'C'
            case _   => char
        }
        result += char
        histo(char) += 1
    }
    println(s"Complement: $result")
    for ((k,v) <- histo) println(s"$k: $v") 
}

Output:

Complement: CATTTGGGGAAAAGTAAATCTGTCTAGCTGAGGAATAGGTAAGAGTCTCTACACAACGACCAGCGGC
A: 14
C: 18
T: 22
G: 13

Ethiopian Multiplication in Scala

This is Scala port of the Raku and Perl programs above.

object Ethiopian extends App {
  mult(15, 24)

  def mult(a: Int, b: Int): Unit = {
    var (i, j) = (a, b)
    var sum = if (i % 2 != 0) j else 0
    while (i > 1) {
      i /= 2;
      j *= 2;
      if (i % 2 != 0) {
        sum += j
      }
    }
    println(s"product of $a and $b is: $sum")

  }
}
// Prints: product of 15 and 24 is: 360

Wrapping up

The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, December 20, 2020. And, please, also spread the word about the Perl Weekly Challenge if you can.

1 Comment

Leave a comment

About laurent_r

user-pic I am the author of the "Think Perl 6" book (O'Reilly, 2017) and I blog about the Perl 5 and Raku programming languages.