August 2019 Archives

With friends like these...

C-o-rr-a-ll-i-n-g d-i-tt-o-e-d l-e-tt-e-r-s

I was going to focus this week on the first task of the 20th Weekly Challenge...but what can I say? The task was to break a string specified on the command-line into runs of identical characters:

    "toolless"        →   t  oo  ll  e  ss
    "subbookkeeper"   →   s  u  bb  oo  kk  ee  p  e  r
    "committee"       →   c  o  mm  i  tt  ee

But that’s utterly trivial in Raku:

    use v6.d;

    sub MAIN (\str) {
        .say for str.comb: /(.) $0*/
    }

And almost as easy in Perl:

    use v5.30;

    my $str = $ARGV[0]
        // die "Usage:\n $0 <str>\n";

    say $& while $str =~ /(.) \1*/gx;

In both cases, the regex simply matches any character ((.)) and then rematches exactly
the same character ($0 or \1) zero-or-more times (*). Both match operations (str.comb
and $str =~) produce a list of the matched strings, each of which we then output
(.say for... or say $& while...).
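For example, assuming the Raku version is saved as runs.raku (the filename is purely illustrative), a quick run looks like this:

    $ raku runs.raku toolless
    t
    oo
    ll
    e
    ss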

As there’s not much more to say in either case, I instead turned my attention to the second task: to locate and print the first pair of amicable numbers.


The friend of my friend is my enemy

Amicable numbers are pairs of integers, each of which has a set of proper divisors
(i.e. every smaller number by which it is evenly divisible) that happen to add up to
the other number.

For example, the number 1184 is divisible by 1, 2, 4, 8, 16, 32, 37, 74, 148, 296, and 592; and the sum of 1+2+4+8+16+32+37+74+148+296+592 is 1210. Meanwhile, the number 1210 is divisible by 1, 2, 5, 10, 11, 22, 55, 110, 121, 242, and 605; and the sum of 1+2+5+10+11+22+55+110+121+242+605 is...you guessed it...1184.
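We can verify that arithmetic with a naïve one-liner that sums the proper divisors by trial division (a throwaway sketch; we'll build a much better divisors function shortly):

    sub 𝑠-naive (\N) { sum (1..^N).grep(N %% *) }

    say 𝑠-naive(1184);    # 1210
    say 𝑠-naive(1210);    # 1184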

Such pairs of numbers are uncommon. There are only five among the first 10 thousand integers, only thirteen below 100 thousand, only forty-one under 1 million. And the further we go, the scarcer they become: there are only 7554 such pairs less than 1 trillion. Asymptotically, their average density amongst the positive integers converges on zero.

There is no known universal formula for finding amicable numbers, though the 9th century Islamic scholar ثابت بن قره (Thābit ibn Qurra) did discover a partial formula, which Euler (of course!) subsequently improved upon 900 years later.
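Thābit's rule is simple enough to state and search directly (a sketch of my own, not part of the challenge): for any n > 1, if p = 3·2ⁿ⁻¹−1, q = 3·2ⁿ−1, and r = 9·2²ⁿ⁻¹−1 are all prime, then (2ⁿ·p·q, 2ⁿ·r) is an amicable pair:

    # Thābit ibn Qurra's rule (illustrative)...
    for 2..20 -> \n {
        my \p = 3 * 2**(n-1)   - 1;
        my \q = 3 * 2**n       - 1;
        my \r = 9 * 2**(2*n-1) - 1;

        # The rule only fires when all three are prime...
        say (2**n * p * q, 2**n * r)
            if p.is-prime && q.is-prime && r.is-prime;
    }

Within that entire search range the rule finds only (220 284), (17296 18416), and (9363584 9437056)...which neatly demonstrates just how rare these pairs are.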

So they’re rare, and they’re unpredictable...but they’re not especially hard to find.

In number theory, the function giving the sum of all proper divisors of N
(known as the “restricted divisor function”) is denoted by 𝑠(N):

    sub 𝑠 (\N) { sum divisors(N) :proper }

That trailing :proper is an “adverbial modifier” being applied to the call to divisors(N),
telling the function to return only proper divisors (i.e. to exclude N itself from the list).
And, yeah, that’s a Unicode italic 𝑠 we’re using as the function name. Because we can.

Once we have the restricted divisor function defined, we could simply iterate through each integer i from 1 to infinity, finding 𝑠(i), then checking to see if the sum-of-divisors of that number (i.e. 𝑠(𝑠(i)) ) is identical to the original number. If we only need to find the first pair of amicable numbers, that’s just:

    for 1..∞ -> \number {
        my \friend = 𝑠(number);

        say (number, friend) and exit
            if number != friend && 𝑠(friend) == number;
    }

Which outputs:

    (220 284)

But why stop at one result, when it’s no harder to find all the amicable numbers:

    for 1..∞ -> \number {
        my \friend = 𝑠(number);

        say (number, friend)
            if number < friend && 𝑠(friend) == number;
    }

Note that, because the amicable relationship between numbers is (by definition) symmetrical,
we changed the number != friend test to number < friend;
otherwise the loop would print each pair twice:

    (220 284)
    (284 220)
    (1184 1210)
    (1210 1184)
    (2620 2924)
    (2924 2620)
    (et cetera)
    (cetera et)

The missing built-in

That would be the end of this story, except for one small problem: somewhat surprisingly,
Raku doesn’t have the divisors builtin we need to implement the 𝑠 function. So we’re going to have to build one ourselves. In fact, we’re going to build quite a few of them...

The divisors of a whole number are all the integers by which it can be divided leaving no remainder. This includes the number itself, and the integer 1. The “proper divisors” of a number are all of its divisors except itself. The “non-trivial divisors” of a number are all of its divisors except itself or 1. That is:

    say divisors(12);                # (1 2 3 4 6 12)
    say divisors(12) :proper;        # (1 2 3 4 6)
    say divisors(12) :non-trivial;   #   (2 3 4 6)

The second and third alternatives above, with the funky adverbial modifiers, are really just syntactic honey for a normal call to divisors but with an additional named argument:

    say divisors(12);                # (1 2 3 4 6 12)
    say divisors(12, :proper);       # (1 2 3 4 6)
    say divisors(12, :non-trivial);  #   (2 3 4 6)

In Raku it’s easiest to implement those kinds of “adverbed” functions using multiple dispatch, where each special case has a unique required named argument:

    multi divisors (\N, :$proper!)      { divisors(N).grep(1..^N)  }
    multi divisors (\N, :$non-trivial!) { divisors(N).grep(1^..^N) }

Within the body of each of these special cases of divisors, we just call the regular variant of the function (i.e. divisors(N)) and then grep out the unwanted endpoint(s).
The ..^ operator generates a range that excludes its own upper limit,
while the ^..^ operator generates a range that excludes both its bounds.
(Yes, there’s also a ^.. variant to exclude just the lower bound).
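A quick demonstration of the three flavours (illustrative):

    say (1..^5).list;    # (1 2 3 4)    – upper bound excluded
    say (1^..^5).list;   # (2 3 4)      – both bounds excluded
    say (1^..5).list;    # (2 3 4 5)    – lower bound excluded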

So, when the :proper option is specified, we filter the full list returned by divisors(N)
to omit the number itself (.grep(1..^N)). Likewise, we exclude both extremal values when the :non-trivial option is included (.grep(1^..^N)).

But what about the original unfiltered list of divisors?
How do we get that in the first place?

The naïve way to generate the full list of divisors of a number N, known as “trial division”,
is simply to walk through all the numbers from 1 to N, keeping all those that divide N
with no remainder...which is easy to test, as Raku has the %% is-divisible-by operator:

    multi divisors (\N) {
        # Track all divisors found so far...
        my \divisors = [];

        # For every potential divisor...
        for 1..N -> \i {
            # Skip if it's not an actual divisor...
            next unless N %% i;

            # Otherwise, add it to the list...
            divisors.push: i;
        }

        # Deliver the results...
        return divisors;
    }

Except that we’re not cave dwellers and we don't need to rub sticks together like that, nor do number theory by counting on our toes. We can get the same result far more elegantly:

    multi divisors (\N) { (1..N).grep(N %% *) }

Here we simply filter the list of potential divisors (1..N), keeping only those that evenly divide N (.grep(N %% *)). The N %% * test is a shorthand for creating a Code object that takes one argument (represented by the *) and returns N %% that argument. In other words, it creates a one-argument function by pre-binding the first operand of the infix %% operator to N. If that’s a little too syntactically mellifluous for you, we could also have written it as an explicit pre-binding of the %% operator:

    (1..N).grep( &infix:<%%>.assuming(N) )

...or as a lambda:

    (1..N).grep( -> \i { N %% i } )

...or as an anonymous subroutine:

    (1..N).grep( sub (\i) { N %% i } )

...or as a named subroutine:

    my sub divisors-of-N (\i) { N %% i }

    (1..N).grep( &divisors-of-N )

Raku aims to let us express ourselves in whichever notation we find most convenient, comfortable, and comprehensible.


Getting to the root of the problem

It’s hard to imagine a simpler solution to the problem of finding divisors than:

    multi divisors (\N) {  (1..N).grep(N %% *)  }

But it’s also hard to imagine a less efficient one. For example, in order to find the eight divisors of the number 2001, we have to check all 2001 potential divisors, which is 99.6% wasted effort. Even for a number like 2100—which has thirty-six divisors—we’re still throwing away over 98% of the 1..N sequence. And the bigger the number,
the smaller its relative number of divisors, and the longer it takes to find them.
There must be a better way.

And, of course, there is. The simplest improvement we can make was first published back in 1202 by Fibonacci in his magnum opus Liber Abbaci. We start by observing that the divisors of a number always come in complementary pairs; pairs that multiply together to produce the number itself. For example, the divisors of 99 are:

      1    3    9
     99   33   11

...while the divisors of 100 are:

      1    2    4    5   10
    100   50   25   20   10

...and the divisors of 101 are:

      1
    101

Notice that, in each case, the top row of divisors always contains “small” integers no greater than the square-root of the original number. And the bottom row consists entirely of N divided by the corresponding top-row divisor. So we could find half the divisors by searching the range 1..sqrt N (in just O(√N) steps), and then find the other half by dividing N by each element of that list (also in just O(√N) steps). In Raku that looks like this:

    multi divisors (\N) {
        my \small-divisors = (1..sqrt N).grep(N %% *);
        my \big-divisors   = N «div« small-divisors;

        return unique flat small-divisors, big-divisors;
    }

The div operator is integer division, and putting the double angles around it makes it
a vector operator that divides N by each element of the list of small-divisors.
The flat is needed because the two list objects in small-divisors and
big-divisors are not automatically “flattened” into a single list in Raku.
The unique is needed because if N is a perfect square, we would otherwise get two copies of its square-root (as in the above example of 10/10 among the divisor-pairs of 100).
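For example (note that the result arrives unsorted, because the big divisors are simply appended after the small ones; add a .sort if that matters):

    say divisors(100);         # (1 2 4 5 10 100 50 25 20)
    say divisors(100).sort;    # (1 2 4 5 10 20 25 50 100)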


Thinking big

It’s great that we were able to improve our O(N) algorithm to O(√N) so easily,
but even that only gets us so far. The performance of the divisors function up to divisors(10⁹) is entirely acceptable at under 0.1 seconds,
but starts to fall off rapidly after that point:

[Graph showing exponential increase in computation time for divisors by trial division, with the elbow of the graph around N = 100 trillion.]

If we want our function to be usable for very large numbers, we need a better algorithm.
And, happily, the world of cryptography (which is obsessed with factoring numbers)
provides plenty of alternative techniques, ranging from the merely very complex
to the positively eldritch.

One of the easier approaches to understand (and code!) is Pollard’s 𝜌 algorithm, which I explained briefly as part of a Perl Conference keynote a few years ago. And which Stephen Schulze subsequently made available as the prime-factors function in a Raku module named Prime::Factor.

I don’t plan to explain the 𝜌 algorithm here, or even discuss Stephen’s excellent implementation of it...though it’s definitely worth exploring the module’s code, especially the sublime shortcut that uses $n gcd 6541380665835015 to instantly detect if the number has a prime factor less than 44.

Suffice it to say that the module finds all the prime factors of very large numbers
very quickly. For example, whereas our previous implementation of divisors would take
around five seconds to find the divisors of 1 trillion, the prime-factors function finds
the prime factors of that number in less than 0.002 seconds.

The only problem is: the prime factors of a number aren’t the same as its divisors.
The divisors of 1 trillion are all the numbers by which it is evenly divisible. Namely:

1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50, 64, 80, 100,
125, 128, 160, 200, 250, 256, 320, 400, 500, 512, 625,
640, 800, 1000, 1024, 1250, 1280, 1600, 2000, 2048, 2500,
[...118 more integers here...]
10000000000, 12500000000, 15625000000, 20000000000,
25000000000, 31250000000, 40000000000, 50000000000,
62500000000, 100000000000, 125000000000, 200000000000,
250000000000, 500000000000, 1000000000000

In contrast, the prime factors of a number are the unique set of (usually repeated) primes which can be multiplied together to reconstitute the original number. For the number
1 trillion, that unique set of primes is:

    2,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5,5,5

...because:

    2×2×2×2×2×2×2×2×2×2×2×2×5×5×5×5×5×5×5×5×5×5×5×5 → 1000000000000

To find amicable pairs we need divisors, not prime factors. Fortunately, it’s not too hard to extract one from the other. Multiplying the complete list of prime factors produces the original number, but if we select various subsets of the prime factors instead:

                    2×2×2×2×5×5×5    → 2000
          2×2×2×2×2×2×2×2            → 256
                          2×5×5×5×5  → 1250

...then we get some of the actual divisors. And if we select the power set of the prime factors (i.e. every possible subset), then we get every possible divisor.

So all we need to do is to take the complete list of prime factors produced by
prime-factors, generate every possible combination of the elements of that list,
multiply the elements of each combination together, and keep only the unique results.
Which, in Raku, is just:

    use Prime::Factor;

    multi divisors (\N) {
        prime-factors(N).combinations».reduce( &[×] ).unique;
    }

The .combinations method produces a list of lists, where each inner list is one possible combination of some unique subset of the original list of prime factors. Something like:

    (2), (5), (2,2), (2,5), (2,2,2), (2,2,5), (2,5,5), ...

The ».reduce method call is a vector form of the “fold” operation, which inserts
the specified operator between every element of the list of lists on which it’s called.
In this case, we’re inserting infix multiplication via the &infix:<×> operator
...which we can abbreviate to: &[×]

So we get something like:

    (2), (5), (2×2), (2×5), (2×2×2), (2×2×5), (2×5×5), ...

Then we just cull any duplicate results with a final call to .unique.


As simple as possible...but no simpler!

And then we test our shiny new prime-factor based algorithm. And weep to discover that it is catastrophically slower than our original trial division approach:

[Graph showing performance of the prime-factors approach vs trial division. The new approach has a similar exponential increase in computation time, but with the elbow of the graph even earlier, at around N = 100 trillion.]

The problem here is that the use of the .combinations method is leading to a combinatorial explosion in some cases. We found the complete set of divisors by taking all possible subsets of the prime factors:

    2×2×2×2×2×2×2×2×2×2×2×2×5×5×5×5×5×5×5×5×5×5×5×5 → 1000000000000
                    2×2×2×2×5×5×5                   → 2000
          2×2×2×2×2×2×2×2                           → 256
                          2×5×5×5×5                 → 1250

But that also means that we took subsets like this:

    2×2×2                                           → 8
      2×2×2                                         → 8
                      2×2×2                         → 8
    2          ×2        ×2                         → 8

In fact, we took 220 distinct 2×2×2 subsets. Not to mention 495 2×2×2×2 subsets,
792 2×2×2×2×2 subsets, and so on. In total, the 24 prime factors of 1 trillion produce
a power set of 2²⁴ distinct subsets, which we then whittle down to just 168 distinct divisors.
In other words, .combinations has to build and return a list of those 16777216 subsets,
each of which ».reduce then has to process, after which .unique immediately throws away 99.999% of them. Clearly, we need a much better way of combining the factors into divisors.
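We can put numbers on that waste directly (an illustrative count):

    say (^12).combinations(3).elems;   # 220 ways to choose three of the twelve 2s
    say (^12).combinations(4).elems;   # 495 ways to choose four of them
    say 2 ** 24;                       # 16777216 subsets in the full power set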

And, happily, there is one. We can rewrite the multiplication:

    2×2×2×2×2×2×2×2×2×2×2×2×5×5×5×5×5×5×5×5×5×5×5×5 → 1000000000000

...to a much more compact:

    2¹² × 5¹² → 1000000000000

We then observe that we can get all the unique subsets simply by varying the exponents of the two primes, from zero up to the maximum allowed value (12 in each case):

    2⁰×5⁰ → 1      2¹×5⁰ → 2       2²×5⁰ → 4       2³×5⁰ → 8    ⋯
    2⁰×5¹ → 5      2¹×5¹ → 10      2²×5¹ → 20      2³×5¹ → 40   ⋯
    2⁰×5² → 25     2¹×5² → 50      2²×5² → 100     2³×5² → 200  ⋯
    2⁰×5³ → 125    2¹×5³ → 250     2²×5³ → 500     2³×5³ → 1000 ⋯
    2⁰×5⁴ → 625    2¹×5⁴ → 1250    2²×5⁴ → 2500    2³×5⁴ → 5000 ⋯
    ⋮              ⋮                ⋮               ⋮            ⋱

In general, if a number has prime factors pₗᴵ × pₘᴶ × pₙᴷ, then its complete set of divisors is given by pₗ^(0..I) × pₘ^(0..J) × pₙ^(0..K).

Which means we can find them like so:

    multi divisors (\N) {
        # Find and count prime factors of N (as before)...
        my \factors = bag prime-factors(N);

        # Short-cut if N is prime...
        return (1,N) if factors.total == 1;

        # Extract list of unique prime factors...
        my \pₗpₘpₙ = factors.keys xx ∞;

        # Build all unique combinations of exponents...
        my \ᴵᴶᴷ = [X] (0 .. .value for factors);

        # Each divisor is pₗᴵ × pₘᴶ × pₙᴷ...
        return ([×] .list for pₗpₘpₙ «**« ᴵᴶᴷ);
    }

We get the list of prime factors as in the previous version (prime-factors(N)), but now we put them straight into a Bag data structure (bag prime-factors(N)). A “bag” is an integer-weighted set: a special kind of hash in which the keys are the original elements of the list and the values are the counts of how many times each distinct value appears (i.e. its “weight” in the list).

For example, the prime factors of 9876543210 are (2, 3, 3, 5, 17, 17, 379721).
If we put that list into a bag, we get the equivalent of:

    { 2=>1, 3=>2, 5=>1, 17=>2, 379721=>1 }

So converting the list of prime factors to a bag gives us an easy and efficient way of determining the unique primes involved, and the powers to which each prime must be raised.
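For example, the bag’s standard accessors hand us both pieces of information directly (a REPL sketch):

    my \factors = bag 2, 3, 3, 5, 17, 17, 379721;

    say factors.keys.sort;    # (2 3 5 17 379721)  – the unique primes
    say factors{17};          # 2                  – the power of 17
    say factors.total;        # 7                  – prime factors in all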

However, if there is only one prime key in the resulting bag, and its corresponding count is 1,
then the original number must itself have been that prime (raised to the power of 1).
In which case, we know the divisors can only be that original number and 1,
so we can immediately return them:

    return (1,N) if factors.total == 1;

The .total method simply sums up all the integer weights in the bag.
If the total is 1, there can have been only one element, with the weight 1.

Otherwise, the one or more keys of the bag (factors.keys) are the list of prime factors of the original number (pₗ, pₘ, pₙ, ...), which we extract and store in an appropriately Unicode-named variable: pₗpₘpₙ. Note that we need multiple identical copies of these prime-factor lists: one for every possible combination of exponents. As we don’t know (yet) how many such combinations there will be, to ensure we’ll have enough we simply make the list infinitely long: factors.keys xx ∞. In our example, that would produce a list of factor lists like this:

    ((2,3,5,17,379721), (2,3,5,17,379721), (2,3,5,17,379721), ...)

To get the list of exponent sets, we need every combination of possible exponents (I,J,K,...), from zero up to the maximum count for each prime. That is, for our example:

    { 2=>1, 3=>2, 5=>1, 17=>2, 379721=>1 }

we need:

    ( (0,0,0,0,0), (0,0,0,0,1), (0,0,0,1,0), (0,0,0,1,1), (0,0,0,2,0), (0,0,0,2,1),
      (0,0,1,0,0), (0,0,1,0,1), (0,0,1,1,0), (0,0,1,1,1), (0,0,1,2,0), (0,0,1,2,1),
      (0,1,0,0,0), (0,1,0,0,1), (0,1,0,1,0), (0,1,0,1,1), (0,1,0,2,0), (0,1,0,2,1),
      (0,1,1,0,0), (0,1,1,0,1), (0,1,1,1,0), (0,1,1,1,1), (0,1,1,2,0), (0,1,1,2,1),
      (0,2,0,0,0), (0,2,0,0,1), (0,2,0,1,0), (0,2,0,1,1), (0,2,0,2,0), (0,2,0,2,1),
      (0,2,1,0,0), (0,2,1,0,1), (0,2,1,1,0), (0,2,1,1,1), (0,2,1,2,0), (0,2,1,2,1),
      (1,0,0,0,0), (1,0,0,0,1), (1,0,0,1,0), (1,0,0,1,1), (1,0,0,2,0), (1,0,0,2,1),
      (1,0,1,0,0), (1,0,1,0,1), (1,0,1,1,0), (1,0,1,1,1), (1,0,1,2,0), (1,0,1,2,1),
      (1,1,0,0,0), (1,1,0,0,1), (1,1,0,1,0), (1,1,0,1,1), (1,1,0,2,0), (1,1,0,2,1),
      (1,1,1,0,0), (1,1,1,0,1), (1,1,1,1,0), (1,1,1,1,1), (1,1,1,2,0), (1,1,1,2,1),
      (1,2,0,0,0), (1,2,0,0,1), (1,2,0,1,0), (1,2,0,1,1), (1,2,0,2,0), (1,2,0,2,1),
      (1,2,1,0,0), (1,2,1,0,1), (1,2,1,1,0), (1,2,1,1,1), (1,2,1,2,0), (1,2,1,2,1)
    )

Or, to express it more concisely, we need the cross-product (i.e. the X operator)
of the valid ranges of each exponent:
of the valid ranges of each exponent:

    #  2       3        5        17     379721
    (0..1) X (0..2) X (0..1) X (0..2) X (0..1)

The maximal exponents are just the values from the bag of prime factors (factors.values), so we can get a list of the required exponent ranges by converting each “prime count” value to a 0..count range: (0 .. .value for factors)

Note that, in Raku, a loop within parentheses produces a list of the final values
of each iteration of that loop. Or you can think of this construct as a list comprehension,
as in Python: [range(0, value+1) for value in factors.values()] (but less prosy)
or in Haskell: [ [0..value] | value <- elems factors ] (but with less line noise).

Then we just take the resulting list of ranges and compute the n-ary cross-product
by reducing the list over the X operator: [X](0 .. .value for factors)
and store the resulting list of I,J,K exponent lists in a suitably named variable: ᴵᴶᴷ
(Yes, superscript letters are perfectly valid Unicode alphabetics, so we can certainly
use them as an identifier.)
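On a smaller example, that reduction over X produces (a sketch):

    my \ranges = (0..1), (0..2);

    say [X] ranges;    # ((0 0) (0 1) (0 2) (1 0) (1 1) (1 2))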

At this point almost all the hard work is done. We have a list of the prime factors (pₗpₘpₙ),
and a list of the unique combinations of exponents that will produce distinct divisors (ᴵᴶᴷ),
so all we need to do now is raise each set of numbers in the first list to the various sets
of exponents in the second list using a vector exponentiation operator (pₗpₘpₙ «**« ᴵᴶᴷ)
and then multiply the list of values produced by each exponentiation ([×] .list for …)
in another list comprehension, to produce the list of divisors.

And that’s it. It’s five lines instead of one:

    multi divisors (\N) {
        my \factors = bag prime-factors(N);
        return (1,N) if factors.total == 1;

        my \pₗpₘpₙ = factors.keys xx ∞;
        my \ᴵᴶᴷ    = [X] (0 .. .value for factors);

        return ([×] .list for pₗpₘpₙ «**« ᴵᴶᴷ);
    }

...but with no combinatorial explosives lurking inside them. Instead of building O(2ᴺ) subsets of the factors directly, we build O(N) subsets of their respective exponents.

And then we test our shinier newer divisors implementation. And weep tears
...of relief when we find that it scales ridiculously better than the previous one.
And also vastly better than the original trial division solution:

[Graph showing exponential computation-time behaviour of the original two divisors functions, and linear performance of the new algorithm all the way to 10¹⁰⁰.]

Mission accomplished!


The best of both worlds

Except that, if we zoom in on the start of the graph:

[Graph showing the trial division algorithm outperforming the prime-factors approach on numbers less than 10 thousand.]

...we see that our new algorithm’s performance is only eventually better.
Due to the relatively high computational overheads of the Pollard’s 𝜌 algorithm
at its heart, and to the need to build, exponentiate, and multiply together the power set
of prime factors, the performance of this version of divisors is marginally worse
than simple trial division...at least on numbers less than N=10000.

Ideally, we could somehow employ both algorithms: use trial division for the “small” numbers, and prime factoring for everything bigger. And that too is trivially easy in Raku.
No, not by muddling them together in some kind of Frankenstein function:

    multi divisors (\N) {
        if N < 10⁴ {
            my \small-divisors = (1..sqrt N).grep(N %% *);
            my \big-divisors   = N «div« small-divisors;
            return unique flat small-divisors, big-divisors;
        }
        else {
            my \factors = bag prime-factors(N);
            return (1,N) if factors.total == 1;

            my \pₗpₘpₙ = factors.keys xx ∞;
            my \ᴵᴶᴷ    = [X] (0 .. .value for factors);
            return ([×] .list for pₗpₘpₙ «**« ᴵᴶᴷ);
        }
    }

Instead, we just implement both approaches independently in separate multis,
as we did previously, then modify their signatures to tell the compiler
the range of N values to which they should each be applied:

    constant SMALL = 1   ..^ 10⁴;
    constant BIG   = 10⁴ ..  ∞;

    multi divisors (\N where BIG) {
        my \factors = bag prime-factors(N);
        return (1,N) if factors.total == 1;

        my \pₗpₘpₙ = factors.keys xx ∞;
        my \ᴵᴶᴷ    = [X] (0 .. .value for factors);

        return ([×] .list for pₗpₘpₙ «**« ᴵᴶᴷ);
    }

    multi divisors (\N where SMALL) {
        my \small-divisors = (1..sqrt N).grep(N %% *);
        my \big-divisors   = N «div« small-divisors;

        return unique flat small-divisors, big-divisors;
    }

The actual improvement in this particular case is only slight; perhaps too slight to be worth the bother of maintaining two variants of the same function. But the principle being demonstrated here is important. The Raku multiple dispatch mechanism makes it very easy to inject special-case optimizations into an existing function...without making the function’s original source code any more complex, any slower, or any less maintainable.


Meanwhile, in a parallel universe...

Now that we have an efficient way to find the proper divisors of any number, we can start locating amicable pairs using the code shown earlier:

    for 1..∞ -> \number {
        my \friend = 𝑠(number);

        say (number, friend)
            if number < friend && 𝑠(friend) == number;
    }

When we do, we find that the first few pairs are printed out very quickly but, after that, things start to slow down noticeably. So we might start looking for yet another way to accelerate the search.

We might, for example, notice that each iteration of the for loop is entirely independent of any other. No outside information is required to test a particular amicable pair, and no persistent state need be passed from iteration to iteration. And that, we would quickly realize, means that this is a perfect opportunity to introduce a little concurrency.

In many languages, converting our simple linear for loop into some kind of concurrent search would require a shambling mound of extra code: to schedule, create, orchestrate, manage, coordinate, synchronize, and terminate a collection of threads or thread objects.

In Raku, though, it just means we need to add a single five-letter modifier
to our existing for loop:

    hyper for 1..∞ -> \number {
        my \friend = 𝑠(number);

        say (number, friend)
            if number < friend && 𝑠(friend) == number;
    }

The hyper prefix tells the compiler that this particular for loop does not need to iterate sequentially; that each of its iterations can be executed with whatever degree of concurrency the compiler deems appropriate (by default, in four parallel threads, though there are extra parameters that allow you to tune the degree of concurrency to match the capacities of your hardware).

The hyper prefix is really just a shorthand for adding a call to the .hyper method to the list being iterated. That method converts the iterator of the object to one that can iterate concurrently. So we could also write our concurrent loop like this:

    for (1..∞).hyper -> \number {
        my \friend = 𝑠(number);

        say (number, friend)
            if number < friend && 𝑠(friend) == number;
    }
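And if the default degree of concurrency doesn’t suit your hardware, the .hyper method accepts the standard :degree and :batch tuning parameters (the particular values here are merely illustrative):

    for (1..∞).hyper(:degree(8), :batch(1024)) -> \number {
        my \friend = 𝑠(number);

        say (number, friend)
            if number < friend && 𝑠(friend) == number;
    }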

Note that, whichever way we write this parallel for loop, with multiple iterations happening in parallel, the results are no longer guaranteed to be printed out in strictly increasing order. In practice, however, the low density of amicable pairs amongst the integers makes this extremely likely anyway.

When we convert the previous for loop to a hyper for, the performance of the loop more than doubles. For example, the regular loop can find every amicable pair up to 1 million in a little over an hour; the hyper loop does the same in under 25 minutes.


To infinity and beyond

Finally, having constructed and optimized all the components of our finder of lost amities,
we can begin our search in earnest. Not just for the first amicable pair, but for the first amicable pair over one thousand, over one million, over one billion, over one trillion,
et cetera:

    # Convert 1 → "10⁰", 10 → "10¹", 100 → "10²", 1000 → "10³", ...
    sub order (\N where /^ 10* $/) {
        10 ~ N.chars.pred.trans: '0123456789' => '⁰¹²³⁴⁵⁶⁷⁸⁹'
    }

    # For every power of 1000...
    for 1, 10³, 10⁶ ... ∞ -> \min {
        # Concurrently find the first amicable pair in that range...
        for (min..∞).hyper -> \number {
            my \friend = 𝑠(number);
            next if number >= friend || 𝑠(friend) != number;

            # Report it and go on to the next power of 1000...
            say "First amicable pair over &order(min):",
                "\t({number}, {friend})";
            last;
        }
    }

Which reveals:

    First amicable pair over 10⁰:  (220, 284)
    First amicable pair over 10³:  (1184, 1210)
    First amicable pair over 10⁶:  (1077890, 1099390)
    First amicable pair over 10⁹:  (1000233608, 1001668568)
    First amicable pair over 10¹²: (1000302285872, 1000452085744)
    et cetera

Well, reveals them...eventually!

Damian

Greed is good, balance is better, beauty is best.

Avidis, avidus natura parum est (For the greedy, nature is too little)

One of my first forays into Perl programming, 20 years ago now, was a tool that takes a piece of plaintext, analyzes its structure, and formats it neatly for a given line width. It’s a moderately sophisticated line wrapping application that I use daily to tidy up email correspondence, software documentation, and blog entries.

So the second task of the 19th Weekly Challenge—to implement a “greedy”
line-wrapping algorithm—is in many ways an old friend to me.

Greedy line wrapping simply takes each word in the input text and adds it to the
current line of output unless doing so would cause the output line to exceed the required
maximal line width, in which case it breaks the line at that point and continues filling
the second line, et cetera. So a 45-column greedily wrapped paragraph looks like this:

      It is a truth universally acknowledged, that
      a single man in possession of a good fortune
      must be in want of a wife. However little
      known the feelings or views of such a man may
      be on his first entering a neighbourhood,
      this truth is so well fixed in the minds of
      the surrounding families, that he is
      considered the rightful property of some one
      or other of their daughters.

The resulting text is somewhat unbalanced and raggéd on the right margin, but it’s within the required width and tolerably readable. And the algorithm is so simple that it’s possible to implement it in a single Raku statement:

    sub MAIN (:$width = 80) {
        $*IN.slurp.words
            .join(' ')
            .comb(/ . ** {1..$width}  )>  [' ' | $] /)
            .join("\n")
            .say
    }

We take the STDIN input stream ($*IN), slurp up the entire input (.slurp), and break it
into words (.words). Then we rejoin those words with a single space between each
(.join(' ')), and break the text into lines no longer than $width characters
(. ** {1..$width}), providing each line also ends on a word boundary:
before a space or end-of-string ()> [' ' | $]). Finally, we rejoin those lines with newlines (.join("\n")) and print them (.say).

That’s a reasonable one-liner solution to the specified challenge, but we can do better.
For a start, there’s a hidden edge-case we’re not handling yet. Namely, what happens if you’re a scholarly Welsh ex-miner with health issues?

      Look you, I shall have to be terminating my interdisciplinary
      investigation of consanguineous antidisestablishmentarianism
      in Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch.
      For I've just been electrophotomicrographically diagnosed with
      pseudopneumonoultramicroscopicsilicovolcanoconiosis, isn't it?

Our one-statement solution fails miserably when reformatting this input, being unable to correctly break the excessively long names of the town or the disease. As each of them has more than 45 characters, the regex has to skip over and omit as many leading characters as
necessary from the very long words, until it again finds 45 trailing characters followed by
a space. So we get:

      Look you, I shall have to be terminating my
      interdisciplinary investigation of
      consanguineous antidisestablishmentarianism
      in
      yngyllgogerychwyrndrobwllllantysiliogogogoch.
      For I've just been
      electrophotomicrographically diagnosed with
      neumonoultramicroscopicsilicovolcanoconiosis,
      isn't it?

Apart from the decapitated words, we also get an absurdly unbalanced right margin
when the algorithm is forced to shift sesquipedalia like “consanguineous” and “electrophotomicrographically” to the next line.

Of course, it’s not difficult to fix both those problems. We just give the regex a fall-back option: if it can’t break a line at a word boundary because of an excessively long word,
we allow it to break that word internally, provided the break isn’t too close to either end
of the word (say, at least five characters in: $minbreak).

We also constrain it to break regular lines at no less than 80% of the specified width
(i.e. $minwidth) to avert those textual crevasses in the right-hand margin:

    sub MAIN (:$width) {
        say greedy-wrap( $*IN.slurp, :$width );
    }

    sub greedy-wrap( $text,
                     :$width    = 80,
                     :$minwidth = floor(0.8 * $width),
                     :$minbreak = 5,
    ) {
        $text.words.join(' ')
             .comb(/ . ** {1..$width} $
                   | . ** {$minwidth..$width} )> ' '
                   | . ** {$minbreak..$width}
                     <before \S ** {$minbreak}>
                   /)
             .join("\n")
    }

In this version the .comb regex specifies that we must fill at least 80% of the requested width with words (. ** {$minwidth..$width}), except on the final line
(. ** {1..$width}$), and otherwise we’re allowed to take any number of characters,
provided we take at least five (. ** {$minbreak..$width}), and provided we leave
at least five visible characters at the start of the next line as well
(<before \S ** {$minbreak}>).

This version produces a much more uniform wrapping:

      Look you, I shall have to be terminating my
      interdisciplinary investigation of consangui
      neous antidisestablishmentarianism in
      Llanfairpwllgwyngyllgogerychwyrndrobwllllanty
      siliogogogoch. For I've just been electrophot
      omicrographically diagnosed with pseudopneumo
      noultramicroscopicsilicovolcanoconiosis,
      isn't it?

Except that the longer words are now unceremoniously chopped off, without even
the common courtesy of an interpolated copula. So we need an extra step in the
pipeline to add hyphens where they’re needed:

    sub greedy-wrap( $text,
                     :$width    = 80,
                     :$minwidth = floor(0.8 * $width),
                     :$minbreak = 5
    ) {
        $text.words.join(' ')
             .match(/ . ** {1..$width} $
                    | . ** {$minwidth..$width} )> ' '
                    | . ** {$minbreak..$width-1}
                      <broken=before \S ** {$minbreak}>
                    /, :global)
             .map({ $^word.<broken> ?? "$^word-" !! $^word })
             .join("\n")
    }

In this version we use a global .match instead of a .comb to break the text into lines, because we need to break long words one character short of the maximal width
(. ** {$minbreak..$width-1}), then mark those lines as having been broken
(<broken=before \S ** {$minbreak}>), and then add a hyphen to those lines
($^word.<broken> ?? "$^word-" !! $^word).

Which produces:

      Look you, I shall have to be terminating my
      interdisciplinary investigation of consangui-
      neous antidisestablishmentarianism in
      Llanfairpwllgwyngyllgogerychwyrndrobwllllant-
      ysiliogogogoch. For I've just been electroph-
      otomicrographically diagnosed with pseudopne-
      umonoultramicroscopicsilicovolcanoconiosis,
      isn't it?


Howdy, TeX

Even with the improvements we made, the greedy line-wrapping algorithm often produces ugly unbalanced paragraphs. For example:

      No one would have believed, in the last years
      of the nineteenth century, that human affairs
      were being watched from the timeless worlds
      of space. No one could have dreamed that we
      were being scrutinised as someone with a
      microscope studies creatures that swarm and
      multiply in a drop of water. And yet, across
      the gulf of space, minds immeasurably
      superior to ours regarded this Earth with
      envious eyes, and slowly, and surely, they
      drew their plans against us...

In 1981, Donald Knuth and Michael Plass published an algorithm for breaking text into lines, implemented as part of the TeX typesetting system. The algorithm considers every possible point in the text at which a line-break could be inserted and then finds the subset of those points that produces the most evenly balanced overall result.

This, of course, is far more complex and more expensive than the first-in-best-dressed approach of the greedy algorithm. In fact, as it has to consider building a line starting at every one of the N words, and running to every one of the following words, it is clearly going to require O(N²) space and time to compute, compared to the greedy algorithm’s thrifty O(N). On a typical paragraph like the examples above, the TeX algorithm runs about 60 times slower.

But as most paragraphs are short (50 to 100 words), an O(N²) cost is often acceptable.
So here’s a simple version of that approach, in Raku:

    sub TeX-wrap ($text, :$width = 80, :$minbreak = 5 ) {
        # Extract individual words, hyphenating if necessary...
        my @words = $text.words.map: {
            my @breaks = .comb: $width-$minbreak;
            @breaks[0..*-2] »~=» '-';
            |@breaks;
        };

        # Compute handy text statistics...
        my @word-len   = @words».chars;
        my $word-count = @words.elems;

        # These track EOL gaps, plus cost and position of breaks...
        my @EOL-gap    = [0 xx $word-count+1] xx $word-count+1;
        my @line-cost  = [0 xx $word-count+1] xx $word-count+1;
        my @total-cost = 0 xx $word-count+1;
        my @break-pos  = 0 xx $word-count+1;

        # Build table of EOL gaps for lines from word i to word j...
        for 1..$word-count -> $i {
            @EOL-gap[$i][$i] = $width - @word-len[$i-1];
            for $i+1 .. $word-count -> $j {
                @EOL-gap[$i][$j]
                    = @EOL-gap[$i][$j-1] - @word-len[$j-1] - 1;
            }
        }

        # Work out the cost of a line built from word i to word j...
        for 1..$word-count -> $i {
            for $i..$word-count -> $j {
                # Overlength lines are infinitely expensive...
                if @EOL-gap[$i][$j] < 0 {
                    @line-cost[$i][$j] = Inf;
                }

                # A short final line costs nothing...
                elsif $j == $word-count && @EOL-gap[$i][$j] >= 0 {
                    @line-cost[$i][$j] = 0;
                }

                # Cost of other lines is sum-of-squares of EOL gaps...
                else {
                    @line-cost[$i][$j] = @EOL-gap[$i][$j]²;
                }
            }
        }

        # Walk through cost table, finding the least-cost path...
        @total-cost[0] = 0;
        for 1..$word-count -> $j {
            @total-cost[$j] = Inf;
            for 1..$j -> $i {
                # Do words i to j (as a line) reduce total cost???
                my $line-ij-cost = @total-cost[$i-1]
                                 + @line-cost[$i][$j];

                if $line-ij-cost < @total-cost[$j] {
                    @total-cost[$j] = $line-ij-cost;
                    @break-pos[$j]  = $i;
                }
            }
        }

        # Extract minimal-cost lines backwards from final line...
        return join "\n", reverse gather loop {
            state $end-word = $word-count;
            my $start-word = @break-pos[$end-word] - 1;
            take @words[$start-word..$end-word-1].join(' ');
            $end-word = $start-word or last;
        }
    }

It’s slower and far more complex than the greedy algorithm but, as with so many other aspects of life, you get what you pay for...because it also produces much better
line-wrappings, like these:

      No one would have believed, in the last years
      of the nineteenth century, that human affairs
      were being watched from the timeless worlds
      of space. No one could have dreamed that
      we were being scrutinised as someone with
      a microscope studies creatures that swarm
      and multiply in a drop of water. And yet,
      across the gulf of space, minds immeasurably
      superior to ours regarded this Earth with
      envious eyes, and slowly, and surely, they
      drew their plans against us...

      It is a truth universally acknowledged, that
      a single man in possession of a good fortune
      must be in want of a wife. However little
      known the feelings or views of such a man
      may be on his first entering a neighbourhood,
      this truth is so well fixed in the minds
      of the surrounding families, that he is
      considered the rightful property of some one
      or other of their daughters.

      Look you, I shall have to be terminating
      my interdisciplinary investigation of
      consanguineous antidisestablishmentarianism
      in Llanfairpwllgwyngyllgogerychwyrndrobwlll-
      lantysiliogogogoch. For I've just been
      electrophotomicrographically diagnosed with
      pseudopneumonoultramicroscopicsilicovolc-
      anoconiosis, isn't it?


Slow is smooth; smooth is fast

You get what you pay for, but there’s no reason to overpay for those benefits.
The Knuth/Plass algorithm is widely used, and hence has been the subject of extensive optimization efforts. Versions have now been devised that run in linear time and space, though the intrinsic complexity always has to go somewhere, and it generally winds up
in the code itself...as sheer incomprehensibility.

But not all of the optimized solutions are brain-meltingly complicated. For example, there’s an elegant O(N * width) algorithm that implicitly converts the text into a directed graph, in which each node is a word and the weight of each edge is the cost of breaking a line at that word. The optimal break points can then be found in linear time by computing the shortest path through the graph.

In Raku, that looks like this:

    sub shortest-wrap ($text, :$width = 80, :$minbreak = 5) {
        # Extract and hyphenate individual words (as for TeX)...
        my @words = $text.words.map: {
            my @breaks = .comb: $width-$minbreak;
            @breaks[0..*-2] »~=» '-';
            |@breaks;
        };
        my $word-count = @words.elems;

        # Compute index positions from start of text to each word...
        my @word-offset = [\+] 0, |@words».chars;

        # These track minimum cost, and optimal break positions...
        my @minimum   = flat 0, Inf xx $word-count;
        my @break-pos = 0 xx $word-count+1;

        # Walk through text tracking minimum cost...
        for 0..$word-count -> $i {
            for $i+1..$word-count -> $j {
                # Compute line width for line from word i to word j...
                my $line-ij-width
                    = @word-offset[$j] - @word-offset[$i] + $j - $i - 1;

                # No need to track cost for lines wider than maximum...
                last if $line-ij-width > $width;

                # Cost of line increases with square of EOL gap...
                my $cost = @minimum[$i] + ($width - $line-ij-width)²;

                # Track least cost and optimal break position...
                if $cost < @minimum[$j] {
                    @minimum[$j]   = $cost;
                    @break-pos[$j] = $i;
                }
            }
        }

        # Extract minimal-cost lines backwards (as for TeX)...
        return join "\n", reverse gather loop {
            state $end-word = $word-count;
            my $start-word = @break-pos[$end-word];
            take @words[$start-word..$end-word-1].join(' ');
            $end-word = $start-word or last;
        }
    }

This approach sometimes optimizes line-breaks slightly differently from the TeX algorithm, but always with the same overall “balanced” appearance. For example:

      No one would have believed, in the last
      years of the nineteenth century, that human
      affairs were being watched from the timeless
      worlds of space. No one could have dreamed
      that we were being scrutinised as someone
      with a microscope studies creatures that
      swarm and multiply in a drop of water.
      And yet, across the gulf of space, minds
      immeasurably superior to ours regarded this
      Earth with envious eyes, and slowly, and
      surely, they drew their plans against us...

The major difference between these two “best-fit” algorithms is that the shortest-path approach tries to balance all the lines it builds, including the final one, so it tends
to produce a “squarer” wrapping with shorter lines generally, but a longer last line.

It also runs five times faster than the TeX approach (but still ten times slower than
the greedy algorithm).


Punishing widows and orphans

There’s a subtle problem with all three approaches we’ve looked at so far: they each optimize for only one thing. Greedy wrapping optimizes for maximal line-widths,
whereas TeX wrapping and shortest-path wrapping both optimize for maximal line balance
(i.e. minimal raggédness).

But, as desirable as those characteristics are, there are other
typographical properties we might also want to see in our wrapped text.
Because there are numerous other ways for a piece of text to be ugly:

      Now is the winter of our discontent made
      glorious summer by this sun of York; and
      all the clouds that lour'd upon our
      house in the deep bosom of the ocean
      buried. Now are our brows bound with
      victorious wreaths; our bruised arms
      hung up for monuments; our stern
      alarums changed to merry meetings, our
      dreadful marches to delightful
      measures.

Apart from the disconcerting unevenness of the lines, this wrapping is also mildly irritating because it repeatedly breaks a line at a grammatically infelicitous point, leaving single words (such as “and”, “buried”, “our”, and “measures”) visually isolated from the rest of their
phrase.

Isolated words at the end of a line are known as widows, and at the start of a line as orphans.
Cut off by a line break from their proper context, they make the resulting text look awkward and badly formatted, particularly if (as here) a widow also constitutes the entire last line of a paragraph.

It’s usually possible to avoid creating widows and orphans, by breaking the text one word earlier or later:

      Now is the winter of our discontent made
      glorious summer by this sun of York; and all
      the clouds that lour'd upon our house in
      the deep bosom of the ocean buried. Now are
      our brows bound with victorious wreaths;
      our bruised arms hung up for monuments;
      our stern alarums changed to merry meetings,
      our dreadful marches to delightful measures.

...but to achieve this effect, our line-wrapping algorithm would have to be aware
not just of the width and balance of the lines it creates, but also of the content
of the text, and the aesthetic consequences of where it chooses to break each line.
In practical terms, this means it needs a more sophisticated cost function to optimize.

The cost function that the greedy algorithm attempts to minimize is just the sum of the lengths of the gaps at the end of each line:

    sub cost (@lines, $width) {
        sum ($width «-« @lines».chars)
    }

In contrast, the TeX and shortest-path algorithms attempt to reduce the variation in
end-of-line gap lengths, by minimizing the sum-of-squares:

    sub cost (@lines, $width) {
        sum ($width «-« @lines».chars)»²
    }

But we can easily minimize other properties of a series of wrapped lines, by implementing and applying more complex cost functions. For example, let’s redesign the greedy algorithm (our fastest alternative) to improve its overall line balance, and at the same time to reduce the number of widows and orphans it leaves in the wrapped text.

The cost function we’ll use looks like this:

    sub cost (@lines, $width) {
          ($width «-« @lines.head(*-1)».chars)»³».abs.sum
        * @lines³
        * (1 + 10 * ( @lines.grep(ORPHANS)
                    + @lines.grep(WIDOWS)
                    )
          )³;
    }

The cost it computes for a given set of lines is derived by quantifying and then multiplying together three desirable characteristics of a wrapped paragraph:

  • the uniformity of the wrapped lines, measured as the sum-of-cubes of the
    end-of-line gaps for every line except the last:
    ($width «-« @lines.head(*-1)».chars)»³».abs.sum

  • the compactness of the resulting paragraph, measured as the cube of the total
    number of lines: @lines³

  • the number of widows and orphans created, measured as the cube of ten times
    the total number of isolated words found:
    (1 + 10 * ( @lines.grep(ORPHANS) + @lines.grep(WIDOWS) ) )³

The cost function uses cubes instead of squares to more quickly ramp up the penalty incurred for introducing multiple unwanted features, compared to the zero cost of ideal lines.
The factor of ten applied to widows and orphans reflects a particularly robust aesthetic
objection to them (tweak this number to suit your personal level of typographical zeal).

Orphans and widows are detected as follows:

    sub ORPHANS {/ ^^  \S+  <[.!?,;:]>  [\s | $$] /}
    sub WIDOWS  {/ <[.!?,;:]>  \s+  \S+  $$       /}

An orphan is a single word at the start of a line (^^ \S+) followed by any phrase-ending punctuation character (<[.!?,;:]>), followed by a space or the end of the line
([\s | $$]). A widow is a single word immediately after a punctuation character
(<[.!?,;:]> \s+ \S+), which is also at the end of the line ($$).
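We can check both patterns against lines from the earlier sample (illustrative):

    say ?('buried. Now are our brows bound with'     ~~ ORPHANS());   # True
    say ?('glorious summer by this sun of York; and' ~~ WIDOWS());    # True
    say ?('victorious wreaths; our bruised arms'     ~~ ORPHANS());   # False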

With this more sophisticated cost function we can now optimize for both structural
properties and aesthetic ones. We could also extend the function to penalize other
unwanted artefacts, such as phrases fractured after their introductory preposition,
split infinitives, or articles dangling at the end of a line:

    sub ESTRANGED { / \s [for|with|by|from|as|to|a|the] $$ / }

    sub cost (@lines, $width) {
          ($width «-« @lines.head(*-1)».chars)»³».abs.sum
        * @lines³
        * (1 + 10 * ( @lines.grep(ORPHANS)
                    + @lines.grep(WIDOWS)
                    + @lines.grep(ESTRANGED)
                    )
          )³;
    }

In order to optimize a line-wrapping using a complex cost function like this, we need a way to generate alternative wrappings...which we can then assess, compare, and select from.
But the greedy wrapping approach (and, indeed, the TeX algorithm and shortest-path technique as well) always generates only a single wrapping. How do we get more?

An easy and quick way to generate those additional wrappings is to use the greedy
approach, but to vary the width to which it wraps. For example, if we wrap the same text
to 45 columns, then to successively shorter widths, like so:

    for 45...40 -> $width {
        my $wrapping = greedy-wrap($text, :$width);
        my $cost     = cost($wrapping.lines, $width);

        say "[$width columns --> cost: $cost]";
        say "$wrapping\n";
    }

...we get:

      [45 columns --> cost: 40768]
      Far back in the mists of ancient time, in the
      great and glorious days of the former
      Galactic Empire, life was wild, rich and
      largely tax free.

      [44 columns --> cost: 10051712]
      Far back in the mists of ancient time, in
      the great and glorious days of the former
      Galactic Empire, life was wild, rich and
      largely tax free.

      [43 columns --> cost: 3662912]
      Far back in the mists of ancient time, in
      the great and glorious days of the former
      Galactic Empire, life was wild, rich and
      largely tax free.

      [42 columns --> cost: 851840]
      Far back in the mists of ancient time, in
      the great and glorious days of the former
      Galactic Empire, life was wild, rich and
      largely tax free.

      [41 columns --> cost: 85184]
      Far back in the mists of ancient time, in
      the great and glorious days of the former
      Galactic Empire, life was wild, rich and
      largely tax free.

      [40 columns --> cost: 2752]
      Far back in the mists of ancient time,
      in the great and glorious days of the
      former Galactic Empire, life was wild,
      rich and largely tax free.

The 40-column wrapping clearly produces the most balanced and least orphaned or widowed text, and this is reflected in its minimal cost value. Of course, we’re no longer making use of the entire available width, but a 10% reduction in line length seems an acceptable price to pay for such a substantial increase in visual appeal.

More interestingly, the 40-column alternative produced in this way also looks better than the wrapping created by the far more complex TeX algorithm (which unfortunately orphans the “in” at the end of the first line):

      Far back in the mists of ancient time, in
      the great and glorious days of the former
      Galactic Empire, life was wild, rich and
      largely tax free.

The iterated greedy solution is also better than the shortest-path approach, which widows “time”, orphans “life”, and wraps the lines a full 20% short of the requested 45 columns:

      Far back in the mists of ancient
      time, in the great and glorious days
      of the former Galactic Empire, life
      was wild, rich and largely tax free.

Moreover, despite now being technically O(N²)—as the O(N) greedy-wrap function must now be called N/10 times—the iterated greedy technique is still 25% faster than the TeX algorithm and nearly 75% as fast as the shortest-path approach.

But we can do even better than that. Note that, as we reduced the wrapping width from 45 to 40, the narrower margin only sometimes changed the wrapping that was produced (in this case, only at 45, 44, and 40 columns). So we were actually doing twice as much work as was strictly necessary to find the optimal width.

It turns out that, if the width of the longest line in the previous wrapping is equal to
or shorter than the next candidate width, then it’s always a waste of effort to try
that next candidate width...because it must necessarily produce exactly the same
wrapping again.

So we could improve our search loop by tracking how wide each wrapping actually is and only trying subsequent candidate widths if they are shorter than that. And, if we also track the best wrapping to date (i.e. the one with the least cost) as we search, then we’ll have a complete iterated greedy wrapping algorithm:

    sub iterative-wrap ($text, :$width = 80) {
        # Track the best wrapping we find...
        my $best-wrapping;

        # Allow any width down to 90% of that specified...
        for $width...floor(0.9 * $width) -> $next-width {
            # Only try widths that can produce new wrappings...
            state $prev-max-width = Inf;
            next if $next-width > $prev-max-width;

            # Build the wrapping and evaluate it...
            my $wrapping = greedy-wrap($text, :width($next-width));
            my $cost     = cost($wrapping.lines, $next-width);

            # Keep the wrapping only if it's the best so far...
            state $lowest-cost = Inf;
            if $cost < $lowest-cost {
                $best-wrapping = $wrapping;
                $lowest-cost   = $cost;
            }

            # Try one character narrower next time...
            $prev-max-width = $wrapping.lines».chars.max - 1;
        }

        # Send back the prettiest one we found...
        return $best-wrapping;
    }

With the optimization of skipping unproductive widths, this solution is now 2.5 times faster than the TeX algorithm and 25% faster than the shortest-path approach.

As a final step, we could rewrite the above code in a cleaner, shorter, more “native” Raku style, which will probably make it more maintainable as well:

    sub iterative-wrap ($text, :$width = 80) {
        # Return the least-cost candidate wrapping...
        return min :by{.cost}, gather loop {
            # Start at specified width; stop at 90% thereof...
            state $next-width = $width;
            last if $next-width < floor(0.9 * $width);

            # Create and evaluate another candidate...
            my $wrapping = greedy-wrap($text, :width($next-width));
            my $cost     = cost($wrapping.lines, $next-width);

            # Gather it, annotating it with its score...
            role Cost { has $.cost }
            take $wrapping but Cost($cost);

            # Try one character narrower next time...
            $next-width = $wrapping.lines».chars.max - 1;
        }
    }

In this version, we generate each candidate wrapping within an unconditional loop,
starting at the specified width (state $next-width = $width) and finishing at
90% of that width (last if $next-width < floor(0.9 * $width)).

We create each wrapping greedily and evaluate it exactly as before, but then
we simply accumulate the wrapping, annotating it with its own cost
(take $wrapping but Cost($cost)).

The Cost role gives us an easy way to add the cost information to the string containing
the wrapping, without messing up the string itself. A role is a collection of methods and attributes that can be added to an existing class as a component. Other languages have similar constructs, but refer to them as “interfaces” or “traits” or “protocol extensions” or “mixins”.

In this case we simply add the extra cost-tracking functionality to the wrapping string by using the infix but operator...which transforms the left operand into a new kind of object derived from the Str class of the left operand, but (ahem!) with additional behaviours specified by the role that is the right operand.
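Here’s a tiny standalone demonstration of but with a one-attribute role (the names are invented for the example):

    role Tagged { has $.tag }

    my $greeting = 'hello' but Tagged('casual');

    say $greeting;        # hello    (still behaves as an ordinary Str...)
    say $greeting.tag;    # casual   (...but now carries its annotation too)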

So our gather loop collects a sequence of wrapping strings, each of which now has
an extra .cost method that reports its cost, and which then allows us to apply
the built-in min function to select and return the best wrapping produced by the loop
(return min :by{.cost}, gather loop {...}).

The code of our new iterative-wrap subroutine is seven times longer
and seven times slower than the original greedy-wrap implementation.
But it also produces results that are at least seven times prettier.
And that’s a trade-off well worth making.

Damian
