Perl Weekly Challenge 227: Roman Maths

These are some answers to the Week 227, Task 2, of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a few days from now (on July 30, 2023 at 23:59). This blog post offers some solutions to this challenge. Please don’t read on if you intend to complete the challenge on your own.

Task 2: Roman Maths

Write a script to handle a 2-term arithmetic operation expressed in Roman numeral.

Example

IV + V     => IX
M - I      => CMXCIX
X / II     => V
XI * VI    => LXVI
VII ** III => CCCXLIII
V - V      => nulla (they knew about zero but didn't have a symbol)
V / II     => non potest (they didn't do fractions)
MMM + M    => non potest (they only went up to 3999)
V - X      => non potest (they didn't do negative numbers)

Of course, there are some Perl modules on the CPAN to convert from and to Roman numerals, but there wouldn't be any challenge if the idea were to use an existing module.

Most people know more or less how Roman numerals work. They use Latin letters to represent numbers:

|---------------------------------------------------|
| Symbol |  I  |  V  |  X  |  L  |  C  |  D  |  M   |
|---------------------------------------------------|
| Value  |  1  |  5  |  10 |  50 | 100 | 500 | 1000 |
|---------------------------------------------------|

In general, Roman numerals use additive notation: for example, MCLXXIII means 1000 + 100 + 50 + 20 + 3 = 1173. Or, at least, this is so when the symbols are written from left to right in decreasing value order.

If, however, a given symbol has a smaller value than a symbol placed on its right, then this is an example of subtractive notation: in that case, the smaller symbol is subtracted from the one its right. For example, IV means 1 subtracted from 5, i.e. 5 - 1 = 4. Similarly, IX and XC respectively mean 10 - 1 = 9 and 100 - 10 = 90. And MCMXLIX corresponds to 1000 + ( 1000 - 100) + (50 - 10) + (10 - 1) = 1949.

The overall problem, though, is that there is no general standard for Roman numerals. Applying the rules above makes it possible to decode more or less unambiguously any Roman numeral coded according to such aforesaid rules, but there may be several different possible ways to encode a number into a Roman numeral. For example, 99 could be encoded as XCIX or IC (or even XCVIIII or possibly LXXXXVIIII). The first transcription (XCIX) seems to be the most frequent one, so this is the one we will chose when encoding to Roman numerals. Still, IC seems to be a valid Roman numeral for 99, so we will try at least to be able to decode it if we find it.

There is no Roman numeral for integers less than 1 and the largest possible Roman numeral is 3,999.

Performing arithmetic operations directly with Roman numerals would be almost a carry-over nightmare. So, we will convert the Roman numerals into Arabic integers, perform standard arithmetic operations on the Arabic numbers, and then convert the result back to Roman numerals.

Roman Maths in Raku

As said before, we first convert to Arabic numerals (subroutine from-roman), perform the arithmetic operation, and then convert the result back to Roman numerals (subroutine to-roman).

In principle, it should be possible to construct a grammar for parsing Roman numerals, but it appeared to me that this would be more complicated than the algorithm explained below and used in the from-roman subroutine of program below.

For converting Roman numerals to Arabic integers, the idea is to read the symbols one by one from left to right and to add the values, keeping track of the previously seen value. If the current value is larger than the previous value, then we were in a case of a subtractive combination at the previous step, and we need to subtract twice the previous value (once because it is a subtractive combination, and once again because we have previously erroneously added it). That's actually quite simple.

For encoding Arabic numerals to Roman numerals, the easiest is to perform integer division with decreasing values corresponding to Roman numerals (i.e. M D C L X V I). For example, suppose we want to encode 2023. We first try to divide by 1,000 (corresponding to M). We get 2, so the start of the string representing the Roman numeral will be MM. Then we continue with the remainder, i.e. 23. We try integer division successively with 500, 100 and 50 and get 0, so don't do anything with the result. Next we try with 10 and get 2, so the temporary result is now MMXX. The remainder is 3, so the result is MMXXIII, which is a correct Roman numeral for 2023.

If we had started with 2019, we would have obtained MMXVIIII, which is a correct (simplistic) Roman numeral for 2019, but not really what we want, since we want to apply the subtractive combination and get MMXIX. We can observe that if our list of decreasing Roman values also includes IX (9), then it will work straight without any need to reprocess the result. So, our list of decreasing values corresponding to Roman numperals needs to be augmented with subtractive cases to M CM D CD C XC L XL X IX V IV I (corresponding to numbers 1000, 900, 500, 100, 90, 50, 40, 10, 9, 5, 4, 1). Using this list instead of the original one shown above removes any need for special processing for subtractive combinations: we just need to keep doing integer divisions with the decreasing values and continue the processing with the remainder. This what the to-roman subroutine below does.

Note that we define a Roman-str subtype (or, really, a subset of the String type) containing a character class with all letters used for Roman numerals. This wasn't strictly necessary, but it enables some form of simple (and possibly incomplete) validation of the parameter passed to from-roman.

The error messages have been slightly shortened to allow better formatting on this blog post.

subset Roman-str of Str where $_ ~~ /^<[IVXLCDMivxlcdm]>+$/;

my %rom-tab = < I 1  V 5  X 10  L 50  C 100  D 500  M 1000 
               IV 4  IX 9  XL 40  XC 90  CD 400  CM 900 >;
my @ordered_romans = reverse sort { %rom-tab{$_} }, keys %rom-tab;

sub from-roman (Roman-str $roman) {
    my $numeric = 0;
    my $prev_letter = "M";
    for $roman.uc.comb -> $letter {
        $numeric -= 2 * %rom-tab{$prev_letter} 
            if %rom-tab{$letter} > %rom-tab{$prev_letter};
        $numeric += %rom-tab{$letter};
        # say "$letter $numeric";
        $prev_letter = $letter;
    }
    return $numeric;
}

sub to-roman (Int $arabic is copy where  { 0 < $_ < 4000 }) {
    my $roman = "";
    for @ordered_romans -> $key {
        my $num = ($arabic / %rom-tab{$key}).Int;
        $roman ~= $key x $num;
        $arabic -= %rom-tab{$key} * $num; 
    }
    return $roman;
}
sub process-input (Str $in) {
    my ($rom1, $op, $rom2) = split /\s+/, $in;
    my $arabic1 = from-roman $rom1;
    my $arabic2 = from-roman $rom2;
    my $result;
    given $op {
        when '+' { $result = $arabic1 + $arabic2 }
        when '-' { $result = $arabic1 - $arabic2 }
        when '*' { $result = $arabic1 * $arabic2 }
        when '/' { $result = $arabic1 / $arabic2 }
        when '**' { $result = $arabic1 ** $arabic2 }
    }
    return "nulla (they didn't have a symbol for 0)" 
        if $result == 0;
    return "non potest (they didn't do fractions)" 
        if $result.round != $result;
    return "non potest (they only went up to 3999)" 
        if $result >= 4000;
    return "non potest (no negative numbers)"
        if $result < 0;
    return to-roman $result.round;
}

for "IV + V", "M - I", "X / II", "XI * VI",
  "VII ** III", "V - V", "V / II", "MMM + M",
  "V - X ", "X - V"  -> $test-expr {
    printf "%-10s => ", $test-expr;
    say process-input $test-expr;
}

This program displays the following output:

$ raku ./roman_numerals.raku
IV + V     => IX
M - I      => CMXCIX
X / II     => V
XI * VI    => LXVI
VII ** III => CCCXLIII
V - V      => nulla (they didn't have a symbol for 0)
V / II     => non potest (they didn't do fractions)
MMM + M    => non potest (they only went up to 3999)
V - X      => non potest (no negative numbers)
X - V      => V

Roman Maths in Perl

This Perl program is essentially the same as the Raku program above (but contrary to usual, this Perl program is not a port to Perl of the Raku program above, as a good part of the Perl program below (the from_roman and to_roman subroutines) was originally written before the Raku program). Please refer to the previous two sections if you need additional explanations.

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

my %rom_tab = (I => 1,  V => 5, X => 10, L => 50, 
               C => 100, D => 500, M => 1000,
               IV => 4, IX => 9,  XL => 40, XC => 90,
               CD => 400,  CM => 900);

sub from_roman {
    my $roman = uc shift;
    my $arabic = 0;
    my $prev_letter = "M";
    for my $letter (split //, $roman) {
        $arabic -= 2 * $rom_tab{$prev_letter} 
            if $rom_tab{$letter} > $rom_tab{$prev_letter};
        $arabic += $rom_tab{$letter};
        $prev_letter = $letter;
    }
    return $arabic;
}

sub to_roman {
    my $arabic = shift;
    warn "$arabic out of bounds" 
        unless $arabic > 0 and $arabic < 4000;
    my $roman = "";
    for my $key (sort { $rom_tab{$b} <=> $rom_tab{$a} }
        keys %rom_tab) {
        my $num = int ($arabic / $rom_tab{$key});
        $roman .= $key x $num;
        $arabic -= $rom_tab{$key} * $num; 
    }
    return $roman;
}
sub process_input {
    my ($rom1, $op, $rom2) = split /\s+/, $_[0];
    my $arabic1 = from_roman $rom1;
    my $arabic2 = from_roman $rom2;
    my $result = $op eq '+'  ?  $arabic1 + $arabic2 :
                 $op eq '-'  ?  $arabic1 - $arabic2 :
                 $op eq '/'  ?  $arabic1 / $arabic2 :
                 $op eq '*'  ?  $arabic1 * $arabic2 :
                 $op eq '**' ?  $arabic1 ** $arabic2:
                 "illegal";
    return "nulla (they didn't have a symbol for 0)" 
        if $result == 0;
    return "non potest (they didn't do fractions)" 
        if int($result) != $result;
    return "non potest (they only went up to 3999)" 
        if $result >= 4000;
    return "non potest (no negative numbers)"
        if $result < 0;
    return to_roman $result;
}

for my $test ("IV + V", "M - I", "X / II", "XI * VI",
              "VII ** III", "V - V", "V / II", "MMM + M",
              "V - X ", "X - V") {
    printf "%-10s => ", $test;
    say process_input $test;
}

This program displays the following output:

$ perl ./roman-numerals.pl
IV + V     => IX
M - I      => CMXCIX
X / II     => V
XI * VI    => LXVI
VII ** III => CCCXLIII
V - V      => nulla (they didn't have a symbol for 0)
V / II     => non potest (they didn't do fractions)
MMM + M    => non potest (they only went up to 3999)
V - X      => non potest (no negative numbers)
X - V      => V

Wrapping up

The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on August 6 , 2023. And, please, also spread the word about the Perl Weekly Challenge if you can.

Leave a comment

About laurent_r

user-pic I am the author of the "Think Perl 6" book (O'Reilly, 2017) and I blog about the Perl 5 and Raku programming languages.