Perl Weekly Challenge 173: Sylvester's Sequence in dc

This blog is an answer to the second task of the Week 173 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Originally, the Perl Weekly Challenge called for solutions in Perl and Raku (also known as Perl 6 at the time). But, very soon, people started to provide solutions in other “guest” languages. See for example my blog post providing solutions to the Sylvester’s Sequence task described below in about 15 different guest languages.

One of the languages I tried is dc, and it turned out to be much more difficult and challenging than I initially thought. I actually spent far more time on it than I would wish to admit, at least 5 to 6 hours (not counting the time to write this blog post). One of the problems is that there is only very limited documentation on this old programming language. So I thought it might be useful to describe in some details how I solved it.

The Task

Write a script to generate first 10 members of Sylvester’s sequence. For more informations, please refer to the wikipedia page.

Output:

2
3
7
43
1807
3263443
10650056950807
113423713055421844361000443
12864938683278671740537145998360961546653259485195807
165506647324519964198468195444439180017513152706377497841851388766535868639572406808911988131737645185443

Each number in the sequence is the product of all previous numbers of the sequence plus 1.

A potential difficulty with this problem is that we’re dealing with very large integers, so that, for some programming languages at least, we may encounter an integer overflow error (or values may be converted to floats, with a resulting loss of precision).

Sylvester’s Sequence in Some Other Languages

The dc language is difficult and poorly documented. Before we get to it, I want to illustrate the algorithm I’ll implement with some other more traditional languages.

Sylvester’s Sequence in Raku

Raku’s integers support arbitrary precision, so we don’t have to worry about dealing with very large integers.

Our first implementation reflects directly the definition: we store the Sylvester’s sequence in the @s array. To get a new number, we simply compute the product of all items of the @s array (using the [*] meta-operator) and add it to the array.

my @s = 2;
while @s.elems < 10 {
    push @s, 1 + [*] @s;
}
.say for @s;

This program displays the following output:

$ raku ./sylvester.raku
2
3
7
43
1807
3263443
10650056950807
113423713055421844361000443
12864938683278671740537145998360961546653259485195807
165506647324519964198468195444439180017513152706377497841851388766535868639572406808911988131737645185443

But recomputing the full product each time is inefficient. At any time through the process, the current Sylvester number is one more than the product of all previous numbers of the sequence. For example, in the output displayed above, we can compute the number of the 4th row (43) by multiplying the number of the third row (7) by it minus 1 (7 - 1) and adding 1 to the product: 7 * (7 - 1) + 1 = 43. Using this recursive definition, we can write this:

my $n = 2;
say $n;
$n = $n * ($n - 1) + 1 and say $n for 1..^10;

This produces the same output:

$ ./raku sylvester3.raku
2
3
7
43
1807
3263443
[ Lines omitted for brevity ]

This second implementation reflects the algorithm that we will use from now on, as it has the advantage of not using arrays to store the sequence numbers.

Sylvester’s Sequence in Perl

Perl doesn’t natively support large integer, but we can use the use BigInt; pragma to convert all numeric literals to Math::BigInt objects, which can store arbitrarily large integers. This Perl program is essentially identical to the second Raku implementation above:

use strict;
use warnings;
use feature "say";
use bigint;

my $n = 2;
say $n;
$n = $n * ($n - 1) + 1 and say $n for 1..9;

This displays the same output as before:

 $ perl sylvester3.pl
 2
 3
 7
 43
 1807
 [ Lines omitted for brevity]
 165506647324519964198468195444439180017513152706377497841851388766535868639572406808911988131737645185443

Sylvester’s Sequence in bc

The bc) utility, for basic calculator, is “an arbitrary-precision calculator language” with syntax similar to the C programming language. It first appeared in Version 6 Unix in 1975. It is still included nowadays in most (if not all) Linux distributions. We chose to use it because of its arbitrary precision feature.

n = 2
print n, "\n"
count = 1
while (count < 10) {
    n = (n - 1) * n + 1
    print n, "\n"
    count += 1
}
quit

This displays the same output as before:

$ BC_LINE_LENGTH=0 bc sylvester.bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
2
3
7
43
1807
[ Lines omitted fr brevity ]
165506647324519964198468195444439180017513152706377497841851388766535868639572406808911988131737645185443

Note that we set the BC_LINE_LENGTH environment variable to 0 before calling bc to prevent bc from cutting the last line into two chunks after 70 characters. Otherwise, it would be displayed as:

16550664732451996419846819544443918001751315270637749784185138876653\
5868639572406808911988131737645185443

We will see that we’ll need to do something similar with dc.

bc was initially written as a font-end to dc. As we will see, dc performs arbitrary-precision computations specified in reverse Polish notation. At the time, bc provided a conventional programming-language interface to the same capability via a simple compiler (a single yacc source file comprising a few hundred lines of code), which converted a C-like syntax into dc notation and piped the results through dc. This is in part the reason I got interested with dc.

Introducing dc

According to Wikipedia), dc (desk calculator) is a cross-platform reverse-Polish calculator which supports arbitrary-precision arithmetic. Written by Lorinda Cherry and Robert Morris at Bell Labs, it is one of the oldest Unix utilities, preceding even the invention of the C programming language. Like other utilities of that vintage, it has a powerful set of features but terse syntax. Traditionally, the bc calculator program (with infix notation) was implemented on top of dc.

dc is the oldest surviving Unix language program. When its home Bell Labs received a PDP-11, dc—written in B—was the first language to run on the new computer, even before an assembler.

It uses reverse Polish notation (RPN) which was also used around the same time by Hewlett-Packard hand calculators. RPN is a postfix notation in which you first specify the operands and then the operator.

$ echo '5 6 + p' | dc
11

As you can see, we first input the two operands (5 and 6), and then the + operator, and finally the p operator to print out the result of the addition. Prefix your number with an underscore if you want to specify a negative number (e.g. _5 for -5)

The spaces are not needed (except between 5 and 6) but improve readability. We could have written it this way:

$ echo '5 6+p' | dc
11

dc can also be used in interactive mode:

$ dc
5 6
+
p
11
q

or:

$ dc
5 6 + p q
11

This can be quite convenient to test chunks of code and we will use that feature.

We can also use the -e command-line option to specify a simple program between single quotes:

$ dc -e '5 6 + p'
11

dc uses a stack to perform its operations. The various commands above can be understood as follows:

$ dc
5   # push 5 to stack
6   # push 6 to stack
f   # display stack (displays 6 and 5). Useful for debugging
6
5
+   # pop two items from stack, add them and push result to stack
p   # print top item of the stack (prints 11)
11
q   # quit

Note that the # sign indicates the beginning of a comment (the rest of the line is ignored).

For full details on the dc syntax, please consult the dc GNU manual. We will describe here only the command that we are likely to use for our program. The best tutorial I have found on dc is the Wikipedia dc page).

Printing Commands

p   Prints the value on the top of the stack, not altering the stack. 
n   Prints the value on the top of the stack, popping it off
f   Prints the entire contents of the stack without altering anything.

Stack Control

c   Clears the stack, rendering it empty
d   duplicate the value on top of the stack
r   Reverses the order of (swaps) the top two values on the stack.

Registers

dc provides at least 256 memory registers, each named by a single character. You can store a number in a register and retrieve it later.

sr  Pops the value off the top of the stack, stores it in register r. 
lr  Copies the value in register r, and pushes it onto the stack.
    This does not alter the contents of r.

Each register also contains its own stack. The current register value is the top of the register’s stack.

Strings

dc can operate on strings as well as on numbers. The only things you can do with strings are print them and execute them as macros (which means that the contents of the string are processed as dc commands). Both registers and the stack can hold strings, and dc always knows whether any given object is a string or a number.

[ch] Makes a string containing "ch" and pushes it on the stack.
x   Pops the value from the top of the stack and executes it as a macro
>r  Pops two values off the stack and compares them assuming they are 
    numbers, executing the contents of register r as a macro if the 
    original top-of-stack is greater
<r  Similar but invokes the macro if the original top-of-stack is less
=r  Similar but invokes the macro if the original top-of-stack is equal

Macros

Macros are then implemented by allowing registers and stack entries to be strings as well as numbers. A string can be printed, but it can also be executed (i.e. processed as a sequence of dc commands). So for instance we can store a macro to add 3 and then multiply by 2 into register m:

[3 + 2 *] sm

and then (using the x command which executes the top of the stack) we can use it like this:

3 lm x p

This displays the following:

$ dc -e '[3 + 2 *] sm 3 lm x p'
12

For better understanding, this is a detailed account of what’s going on:

[   # start of macro definition
  3 # push 3 to stack
  + # pop 2 values off the stack, add them and store result on stack
  2 # push 2 on stack
  * # pop 2 values off the stack, multiply them, store result on stack
]   # end of macro definition
sm  # store the macro just defined in register m
3   # push 3 on stack
lm  # copy value in register m (the macro) onto the stack
x   # run the macro
p   # print the result (top of the stack)

We will look at conditionals and loops later on.

Sylvester’s Sequence in dc

Let’s now try to implement Sylvester’s sequence in dc.

Implementing the Basic Formula

Remember that we want to do something equivalent to this Raku program:

my $n = 2;
say $n;
$n = $n * ($n - 1) + 1 and say $n for 1..^10;

We will do it mostly in interactive mode.

First, we want to give an initial value of 2 to n and print it.

$ dc
2snlnp
2

This is a detailed description of the cryptic 2snlnp command:

2    # push 2 on stack
sn   # pops 2 from stack and store in register n
ln   # copy register n onto stack
p    # print top of stack

We could make it one character shorter by using the d duplicate command:

$ dc -e '2dsnp'
2

Then, we want to implement and test the $n = $n * ($n - 1) + 1 formula:

2snlnp
2
1-   # subtract 1 from stack
ln   # load n on stack
*1+p # compute product, add 1 and print new value
3    
sn   # pop new value and store it in register n
ln   # copy new value in n to stack

So, n was 2 and is now set to 3. This is the expected result, so it looks promising. Let’s run again that series of commands a couple of times:

$ dc
2snlnp
2
1-   # subtract 1 from stack
ln   # load n on stack
*1+p # compute product, add 1 and print new value
3
sn   # pop new value and store it in register n
ln   # copy new value in n to stack
1-   # subtract 1 from stack
ln   # load n on stack
*1+p # compute product, add 1 and print new value
7
sn   # pop new value and store it in register n
ln   # copy new value in n to stack
1-   # subtract 1 from stack
ln   # load n on stack
*1+p # compute product, add 1 and print new value
43
sn   # pop new value and store it in register n
ln   # copy new value in n to stack
1-   # subtract 1 from stack
ln   # load n on stack
*1+p # compute product, add 1 and print new value
1807

We obtain the proper sequence of Sylvester numbers: 2, 3, 7, 43, 1807. But, of course, it is a pain in the neck to repeatedly enter this series of 5 commands. We can store it in a macro (I used register m, m as the first letter in macro, because it makes it easier to remember, but you could store a macro in any other register) and then execute the macro any number of times:

$ dc
2snlnp                 # initialization of n to 2
2
[1- ln *1+p sn ln]sm   # store the macro in register m
lm x                   # run the macro
3
lm x
7
lm x
43
lm x
1807
lm x
3263443
lm x
10650056950807
...

So, the results are correct, we have the basic actions to compute the Sylvester’s sequence. We still have to implement a loop to automatize macro execution a given number of times.

Conditionals and Loops in dc

The =, >, !>, <, !<, != conditionals execute the subsequent macro when the two top values of the stack are equal, larger than, not larger than, etc. For example, in:

$ dc -e '[[smaller than]p] sm 6 5 <m'
smaller than

the macro stored in m runs (and prints “smaller than”) because 5 is smaller than 6. The < pops 5 and then 6 from the stack and runs the macro in register m because the first popped value (5) is smaller than the second popped value.

The above-mentioned Wikipedia page then states: “Looping is then possible by defining a macro which (conditionally) reinvokes itself. A simple factorial of the top of the stack might be implemented as:

[d1-d1<F*]dsFxp

I must admit that I did not understand it when I first read it. And still not when I read it a second time and then a third time.

Let’s get away from the Sylvester sequence for a brief moment and look at a simple countdown in this page in the Bash Hackers Wiki:

dc << EOF
[ li       # put our index i on the stack 
  p        # print it, to see what's going on
  1 -      # we decrement the index by one
  si       # store decremented index (i=i-1)
 0 li >L   # if i > 0 then execute L
] sL       # store our macro with the name L
10 si      # let's give to our index the value 10
lLx        # and start our loop
EOF 

10
9
8
[ lines omitted for brevity]
2
1

OK, now I understand it (and I suppose you will if you read the detailed comments) and can complete the task. Basically, the macro is called a first time, and then calls itself recursively so long as the condition is satisfied.

The Complete Sylvester Program in dc

This is now the full Sylvester’s sequence program in dc:

2sn     # push 2 on the stack, pop 2 off the top of the stack
        # and store it into register n
lnp     # copy the value back on the stack and print it
9sc     # give counter c an initial value of 9
[       # start of macro
  1-    # subtract 1 from stack (value n-1)
  ln    # load n to stack
  *1+p  # compute product n * n-1, add 1 and print
  sn    # pop new value and store it in register n
  ln    # copy new value in  n to stack
  lc    # copy counter to stack
  1-    # decrement counter (subtract 1 from c)
  sc    # store decremented counter in c
  0 lc  # store 0 and counter on stack
  >m    # compare c to 0 and, if c > 0, run recursively macro in m
]       # end of macro
d       # duplicate macro on stack
sm      # store macro in register m
x       # execute first iteration of macro

To run it and display properly the last line, we need to set the DC_LINE_LENGTH to 0 in a way similar to what we had to do with bc.

$  DC_LINE_LENGTH=0 dc sylvester.dc
2
3
7
43
1807
3263443
10650056950807
113423713055421844361000443
12864938683278671740537145998360961546653259485195807
165506647324519964198468195444439180017513152706377497841851388766535868639572406808911988131737645185443

But, of course, formatting the program with spaces and comments as above is way too easy and good only for wimps and cowards. Real programmers will prefer this one-line version ( ;-)):

2snlnp9sc[1-ln*1+psnlnlc1-sc0lc>m]dsmx

which you can run as follows:

$ echo '2snlnp9sc[1-ln*1+psnlnlc1-sc0lc>m]dsmx
       ' |  DC_LINE_LENGTH=0 dc
2
3
7
43
1807
3263443
10650056950807
113423713055421844361000443
12864938683278671740537145998360961546653259485195807
165506647324519964198468195444439180017513152706377497841851388766535868639572406808911988131737645185443

Wrapping up

The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on July 24, 2022. And, please, also spread the word about the Perl Weekly Challenge if you can.

Leave a comment

About laurent_r

user-pic I am the author of the "Think Perl 6" book (O'Reilly, 2017) and I blog about the Perl 5 and Raku programming languages.