Perl Weekly Challenge 90: DNA Sequence and Ethiopian Multiplication
These are some answers to the Week 90 of the Perl Weekly Challenge organized by Mohammad S. Anwar.
Spoiler Alert: This weekly challenge deadline is due in a few days (December 13, 2020). This blog post offers some solutions to this challenge, please don’t read on if you intend to complete the challenge on your own.
Task 1: DNA Sequence
DNA is a long, chainlike molecule which has two strands twisted into a double helix. The two strands are made up of simpler molecules called nucleotides. Each nucleotide is composed of one of the four nitrogen-containing nucleobases cytosine (C), guanine (G), adenine (A) and thymine (T).
You are given DNA sequence, GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG
.
Write a script to print nucleobase count in the given DNA sequence. Also print the complementary sequence where Thymine (T) on one strand is always facing an adenine (A) and vice versa; guanine (G) is always facing a cytosine (C) and vice versa.
To get the complementary sequence use the following mapping:
T => A
A => T
G => C
C => G
DNA Sequence in Raku
For the nucleotide histogram, we can comb the string into individual letters, use a hash to store the letter count and print out the hash pairs. This is a quite typical way of building histograms (but there are simpler solutions in Raku, as we shall see).
For the complementary sequence, we could use a hash to store the nucleotide mapping and use a map
to perform the necessary conversion.
use v6;
my $dna = 'GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG';
# count
my %histo;
%histo{$_}++ for $dna.comb;
say "Histogram:";
.say for %histo.pairs;
# Complementary sequence
my %complement = T => 'A', A => 'T', G => 'C', C => 'G';
.say for "Complement:", $dna.comb.map({%complement{$_}}).join: '';
This displays the following output:
$ raku dna.raku
Histogram:
T => 22
A => 14
C => 18
G => 13
Complement:
CATTTGGGGAAAAGTAAATCTGTCTAGCTGAGGAATAGGTAAGAGTCTCTACACAACGACCAGCGGC
This program is relatively concise, but we can do much shorter code without sacrificing legibility. Each of the subtasks can be done in just one line of code.
For the nucleotide histogram, we comb the string into individual letters, feed them into an anonymous bag and print out that bag’s pairs.
For the DNA complement, we can use the TR/// non-destructive transliteration operator:
use v6;
my $dna = 'GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG';
# count
say "Histogram:"; .say for Bag.new($dna.comb).pairs;
# Complementary sequence
say "Complement:\n", TR/TAGC/ATCG/ with $dna;
The output is almost the same as before:
Histogram:
T => 22
A => 14
G => 13
C => 18
Complement:
CATTTGGGGAAAAGTAAATCTGTCTAGCTGAGGAATAGGTAAGAGTCTCTACACAACGACCAGCGGC
DNA Sequence in Perl
This program is basically a port to Perl of a combination of the Raku programs above.
use strict;
use warnings;
use feature "say";
my $dna = 'GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG';
# count
my %histogram;
$histogram{$_}++ for split '', $dna;
say "$_: $histogram{$_}" for keys %histogram;
# Complementary sequence
say for "Complement:", $dna =~ tr/TAGC/ATCG/r;
This displays the following output:
$ perl dna.pl
C: 18
G: 13
A: 14
T: 22
Complement:
CATTTGGGGAAAAGTAAATCTGTCTAGCTGAGGAATAGGTAAGAGTCTCTACACAACGACCAGCGGC
Task 2: Ethiopian Multiplication
You are given two positive numbers $A
and $B
.
Write a script to demonstrate Ethiopian Multiplication using the given numbers.
Ethiopian multiplication (also known as ancient Egyptian multiplication, Russian multiplication, or peasant multiplication) is an ancient method for multiplying two integers that does not require the multiplication table, only the ability to multiply and divide by 2, and to add. The basic idea is to apply repeatedly integer (or Euclidean) division by 2 to the first number (discarding any remainder), and to multiply repeatedly the second number by 2, until the first number becomes 1. Then we add the values of the second number for which the corresponding first number is odd. In practical terms, to do this manually, we can set up two columns with the first number on the left and the second on the right. If we want to multiply 19 and 42, we have the following process:
1st 2nd Action Sum so far
19 42 kept 42
9 84 kept 126
4 168 discarded 126
2 336 discarded 126
1 672 kept 798
On the first line, the first number (17) is odd, so we will use the second number (42) in the final summation. On the second line, the first number (9) is odd again, so we will use the second number (84) in the final summation. On the third and fourth lines, the first number (4 and 2) are even, so the second number is not used in the final sum. Finally, on the last line, the first number is 1, so the process stops there and, since 1 is odd, the second number (672) is used in the final sum. When we add the numbers of the right column where the number on the left column is odd, we have: 42 + 84 + 672 = 798, which is the product of 17 by 42.
Ethiopian Multiplication in Raku
We implement a while
loop whose stopping condition is when the first number ($a
) becomes equal to 1. At each iteration, we use the Raku built-in div integer division operator to halve (rounding down the result) $a
et we multiply the second number ($b
) by 2. We also have an accumulator, $result
, which accumulate the values of $b
for which the corresponding $a
is an odd integer (i.e. when $a % 2
is not 0).
use v6;
my ($a, $b) = map {$_.Int}, @*ARGS;
my $result = $a % 2 ?? $b !! 0;
while $a > 1 {
$a div= 2;
$b *= 2;
$result += $b if $a % 2;
}
say $result;
This are a couple of example runs:
$ raku ethiopian-mult.raku 14 12
168
$ raku ethiopian-mult.raku 19 42
798
Laurent@LAPTOP-LHI8GLRC ~
$ raku ethiopian-mult.raku 42 19
798
$ raku ethiopian-mult.raku 300 600
180000
Ethiopian Multiplication in Perl
This is a port to Perl of the Raku program above. Since Perl doesn’t have a div
integer operator, we could use the standard division operator and round down the result with the int
operator, but since it is a division by two, it is slightly shorter to use the >>
right bit shift operator by one bit, which does an integer division by 2.
use strict;
use warnings;
use feature "say";
my ($c, $d) = @ARGV;
my $result = $c % 2 ? $d : 0;
while ($c > 1) {
$c = $c >> 1; # right shift 1 bit = div by 2
$d *= 2;
$result += $d if $c % 2;
}
say $result;
These are a few example runs:
$ perl ethiopian-mult.pl 19 42
798
$ perl ethiopian-mult.pl 14 22
308
$ perl ethiopian-mult.pl 45 59
2655
Perl Weekly Challenge # 90 in Scala
In 2017, I translated into French the book Scala by Example, written by Martin Odersky, the creator of the Scala programming language. At the time, I found the language to be very interesting and fairly similar in spirit to Perl and even more to Raku, since it smoothly combines the object-oriented programming and functional programming paradigms. Although I wrote a few tiny toy programs at the time (or, rather, copied example programs and tried various changes to see what happens), I thought at the time that I probably wanted to learn the language, but never really took the time to do so. Maybe the Perl Weekly Challenge is an opportunity to start learning it. So, I dived into two tutorial books on the Internet and I started to port some of my Raku solutions to Scala over the last few weeks. It is also interesting to see with real examples how Scala compares with Raku and Perl. Caveat: I am a pure beginner in Scala, so my Scala code is certainly neither expressive, nor efficient, and even less idiomatic. It is certainly quite clumsy. Please feel free to sugggest any corrections, improvements, comments, or better practices.
DNA Sequence in Scala
We use the toCharArray
to split the DNA string into an array of characters. For each character, we convert the nycleotide into its complement using a match
expression. At the same time, for each letter, we increment en histo
map entry for that letter.
object Dna extends App {
import scala.collection.mutable.Map
val dna = "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG";
var result = ""
var histo:Map[Char,Int] = Map('A' -> 0, 'T' -> 0, 'C' -> 0, 'G' -> 0)
for (char <- dna.toCharArray()) {
// println(char)
val charout = char match {
case 'T' => 'A'
case 'A' => 'T'
case 'C' => 'G'
case 'G' => 'C'
case _ => char
}
result += char
histo(char) += 1
}
println(s"Complement: $result")
for ((k,v) <- histo) println(s"$k: $v")
}
Output:
Complement: CATTTGGGGAAAAGTAAATCTGTCTAGCTGAGGAATAGGTAAGAGTCTCTACACAACGACCAGCGGC
A: 14
C: 18
T: 22
G: 13
Ethiopian Multiplication in Scala
This is Scala port of the Raku and Perl programs above.
object Ethiopian extends App {
mult(15, 24)
def mult(a: Int, b: Int): Unit = {
var (i, j) = (a, b)
var sum = if (i % 2 != 0) j else 0
while (i > 1) {
i /= 2;
j *= 2;
if (i % 2 != 0) {
sum += j
}
}
println(s"product of $a and $b is: $sum")
}
}
// Prints: product of 15 and 24 is: 360
Wrapping up
The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, December 20, 2020. And, please, also spread the word about the Perl Weekly Challenge if you can.
Updated this blog post with the Scala solutions on Dec. 10, 2020, around 20:05 UTC.