TWC 189: Saving your Degree by Great Character!

In which we achieve Single Pass and Single Expression, respectively.

Next door to the Haunted Mansion.

Note: Both of my Perl solutions use v5.36, because it auto-enables strict, warnings, and signatures.

The Raku solutions are registered as mind-altering substances.

Produced in a facility that may have contained peanuts, tree nuts, or Halloween.


TWC Task #1 - Greater Character

Fixing the Task

This task has a mis-match between the description and the examples. "...smallest character in the given array lexicographically greater than the target character" implies an undefined answer if nothing in the array is gt the target. However, the 5th example expects the target to be returned in that null case. I chose to code to match the examples, because it was a more interesting problem.

Perl

Final answer:

use v5.36;
use List::Util qw<minstr>;

sub task1 ( $aref, $target ) {
    my @gt = grep { $_ gt $target } @{$aref};

    return @gt ? minstr(@gt) : $target;
}

List::Util::minstr has the useful property of returning undef from an empty list, which allows this shorter version that I had wanted in Raku:

    return minstr( grep { $_ gt $target } @{$aref} )
            // $target;

I decided against that single-statement version, because:
1. it requires knowledge that minstr returns undef when given an empty list, and
2. compared to the two-statement form, // deemphasizes the ill-defined part of the task.

I also considered increasing the emphasis by splitting the solution into two subs along the fault-line of the contention:

sub smallest_gt_target ($a, $t) { minstr grep { $_ gt $t } @{$a} }
sub task1              ($a, $t) { smallest_gt_target($a, $t) // $t }

What do you think, would that increase in emphasis be worth the loss in D.R.Y.ness?

Raku

Final answer:

sub task1 ( @a, Str $target --> Str ) {

    return .elems ?? .min !! $target given @a.grep( * gt $target );

I could have used .so instead of .elems, but weighed .so as less well-known.

Similar:

return $_ eq 'Inf' ?? $target !! $_ given @a.grep(* gt $target).min;

I rejected it because:

  • You must know that .min returns Inf when called on an empty list.
  • I have to use the literal $_ two times, instead of zero times (.elems, .min) in the final answer.
  • Converting a potential Inf to string in order to compare with 'Inf' via eq felt icky for a Numeric.
  • (Related) I mis-remembered that Inf does not == itself; I was confusing its behavior with NaN, which indeed does not equal itself.

Single-pass too-clever alternative:

sub task1 ( @a, Str $target --> Str ) {

    return @a.min({ $_ !gt $target, $_ }) max $target;

BWAHAHAHAHA!

How does that work?

  • The methods .min, .max, .minmax, and .sort all default to using cmp as the comparator.
  • When comparing two Lists, cmp is run on the first element (.[0]) of each List, and if they are Order::Same, proceeds to cmp the second elements, and so on until the tie is broken or the lists are exhausted.
  • False is "less" than True; (True,False).sort would return (Bool::False, Bool::True).
  • The !gt operator is the same as le, but used in its negative form here to keep the "greater than" from the original wording and make the "greater than target" values "sort" (really the more efficient min) to the top, so we can only get a minimum value le target if no value is gt target.
  • If @a contained anything gt target, .min returns it, and it always wins in the fight against max $target, so the result is the intended value.
  • If @a contained nothing gt target, .min returns the lowest value in the whole array, and it must always lose in the fight against max $target, so the result is the specified target.

Even though I am writing Raku documentation compare&contrast sort versus min/max/minmax, and I wrote this code myself, I find this code hard to grok after being away for 5 minutes. It is particularly egregious "clever code", because it swallows the original problem requirements into its cleverness. You can't read that (uncommented) code and infer the spec. So, we don't really do such things in real life. Really!


TWC Task #2 - Array Degree

Fixing the examples

This task will involve keeping track of element counts and slice lengths and start indexes and end indexes; all of these are non-negative integers that the developer will be scrutinizing and staring at, past the midnight hour. And yet, even though the problem is exactly the same with arrays filled with any-old-thing, the task specifies non-negative integers for the array contents, and indeed, every example contains nothing but non-negative integers. Non-negative integers everywhere! It is enough to make a person... non-positive.

So, while I have kept the original examples intact, I commented them out during development. Just below them, you will see exact equivalent letter transpositions:
1 => A, 2 => B, etc. Thus, I kept my sanity. (But seem to have increased my verbosity?)

Key insight:

The positions of the element(s) contributing to the degree are all that matter.

If @a is 1000 elements, each a single capital letter, and the array's Degree is 99 (due to having 99 B, 99 F, 99 Y, and the other letters 98-or-less each), then our final slice of @a must include all the B's or all the F's or all the Y's. Otherwise, we have reduced the degree. Also, whichever of (B|F|Y) we choose, the slice must start and end with that letter. Otherwise, the slice is not as small as it could be.

Raku has the Bag type, which can concisely give us the array Degree with @a.Bag.values.max, but it will turn out that knowledge of the array Degree will not be useful enough to be worth calculating early!

For each element that exists in the array, we only need to know:

  • Count of occurrences
  • First position seen
  • Last position seen
  • Span (last_position - first_position + 1)

Once we have that information, of all the elements with maximum count (Degree), find which has the minimum span, and maybe use first position as a tie-breaker. For that element, make a Range of the first and last positions, and return a slice of @a[ $first .. $last ].

Raku (built-in) and Perl (module) have concise methods to "group" the array indexes by the elements found there. You will see code that does, and does not use them.

Perl

Raku's min/max do take &by parameters, but do not return a list when there are ties.

Raku's minpairs/maxpairs do return a list when there are ties, but do not take &by parameters.

The Perl module List::UtilsBy does both, and is a great fit for this problem.

Final answer:

#!/usr/bin/env perl
use v5.36;
use experimental     qw<for_list builtin>;
use builtin          qw<indexed>;
use List::UtilsBy    qw<max_by min_by>;

sub task2 ( @a ) {
    my %h;
    for my ($k, $v) (indexed @a) {
        my $href = ( $h{$v} //= {} );

        $href->{KEY  }   = $v;
        $href->{COUNT}  +=  1;
        $href->{FIRST} //= $k;
        $href->{LAST }   = $k;
    }
    $_->{SPAN} = $_->{LAST} - $_->{FIRST} + 1 for values %h;

    my $best = min_by { $_->{FIRST} }
               min_by { $_->{SPAN } }
               max_by { $_->{COUNT} }
               values %h;

   return [ @a[ $best->{FIRST} .. $best->{LAST} ] ];
}

A few tricks used here:

  • Perl v5.36's for_list and indexed are combined to have the same effect as Raku's for @a.kv -> $k,$v {...}.
  • my $href = ( $h{$v} //= {} ); just grabs a reference when the sub-hash already exists in %h. Otherwise it creates the sub-hash in %h just before taking the reference. This is a well-known idiom here, but I should have split it into two lines for expository code.
  • The sub-hashes are each a Record, achieved in Perl by only using a fixed set of keys. (Perl's hashes serve the purpose of other languages' Dictionaries,Sets,Bags,Mixes,Records just by how we use them. Perl's arrays similarly act as Stacks or Queues by only coding push&pop or push&shift.)
  • COUNT is auto-vivified and interpreted as zero for +=, so adding 1 gives us a running total of the times the element has been seen.
  • FIRST will only be undef the first time each element is seen, so //= only populates it once.
  • LAST is simply assigned to, so it keeps getting over-written; when we exit the loop, it will hold the last index where that element was seen.
  • SPAN would be awkward to calculate (and have to be recalculated on each loop), so a postfix for can populate it concisely and efficiently with the size of the slice reaching from first index where the element was seen to the last such index.
  • The array element is kept in the KEY slot, just to help with debugging. It gets overwritten on each loop, but always with the same value, so no problem. Now that I write that, I see that it could also use //= for efficiency.
  • max_by and min_by receive lists, and are called in list context until the last min_by, so ties are passed through. The last one cannot be a tie, because only one element could have any particular FIRST.
  • The chain of max_by and min_by act as tie-breakers; if only one element makes up the array's Degree, then it will be the only hashref emitted by max_by, and the two min_by will just pass it through without needing to do other work. If more than one element create the same maximum Degree, then the lowest SPAN is found, and so on.

I also wrote a version that collapsed (most of) the building of %h into a single line using List::Categorize:

use List::Categorize qw<categorize>;
sub task2 ( @a ) {
    my %h = categorize { $a[$_] } keys @a;

    my $best = min_by { $_->[0]                } # First
               min_by { $_->[-1] - $_->[0] + 1 } # Span
               max_by { 0+@{$_}                } # Count
               values %h;

    my ($head, $tail) = @{$best}[0,-1];
    return [ @a[ $head .. $tail ] ];
}

This may be less efficient than the earlier version; categorize (like .classify later in the Raku code) will keep all of the locations where each element is found, which is much more info than we strictly need. More important to me is that hand-rolling %h allows me to NAME the slots in each href. The %h that categorize populates requires comments to clarify the min_by and max_by code blocks.

Raku

Final answer:

sub task2 ( @a ) {
    my $best =
        @a.pairs
        .classify( {.value}, :as{.key} )
        .map({ .value })
        .max({ .elems, -(.tail - .head + 1), -.head });

    return @a[ $best.head .. $best.tail ];
}

How does that work? Well, let's look at almost the same code, with intermediate variables for each line, like the Raku REPL could demonstrate:

my @a = <A B C B D A B D A>;
       # 0 1 2 3 4 5 6 7 8 are the indexes.


my @b = @a.pairs;
    [0 => A  1 => B  2 => C  3 => B
     4 => D  5 => A  6 => B  7 => D  8 => A]


my @c = @b.classify( {.value}, :as{.key} );
    [A => [0 5 8] D => [4 7] C => [2] B => [1 3 6]]


my @d = @c.map({ .value });
    [[0 5 8] [4 7] [2] [1 3 6]]

(Without the map of .value, we would have to precede many methods with .value in the lines below. Surprisingly, we do not need the .key at all any more!)

Here we will diverge from the actual code, so you can see the values that .max will operate on in the real code.

my @e = @d.map: {
    my $r = .head .. .tail;
    (.elems, -$r.elems, -.head, $r);
};


say @e.sort.reverse; # For illustration only; we will really
                     # use `max` instead of `sort.reverse`.
                     # Note that Span and First will be negated.

   Count = Number of times it occured
   |
   |  Span = size of the range from first to last occurance
   |  |
   |  |  First position found
   |  |  |
   |  |  |  Range of first..last positions found
   |  |  |  |
   v  v  v  v
( (3 -6 -1 1..6)   # 'B' 3 times, span=6, first in pos 1, over 1..6
  (3 -9  0 0..8)   # 'A' 3 times, span=9, first in pos 0, over 0..8
  (2 -4 -4 4..7)   # 'D' twice, is less than the degree of A or B
  (1 -1 -2 2..2) ) # 'C' once , is less than the degree of A or B


my $best = @e.max;
    (3 -6 -1 1..6)


return @a[ $best[3].list ]; # Expand the Range 1..6
                            # to allow the slicing.

We are relying on the max/min/cmp action on List that we used in Task#1.

The black-cat-level sneakiness here is that we can get the min of numbers by scanning for the max of their negated values.

So, we don't really need that big .map to build the List of sorting fields; we can specify them inline as the &by argument to .max:

.max({ .elems, -(.tail - .head + 1), -.head });

Now $best is the full list of indexes of the winning element, and we need the full range from the first place that element was found, to the last, even including other intervening elements (or it wouldn't be a "slice").

return @a[ $best.head .. $best.tail ];

By the way, these three produce the same result Pairs. Which do you like better?

@a.keys  .classify( { @a[$_] }              )
@a.pairs .classify( { .value }, :as{ .key } )
@a       .classify( { $_     }, :as{ $++  } )

I'm fixing a hole where the rain gets in
And stops my mind from wandering
Where it will go
(La la la la la)
-- Fixing a Hole, by The Beatles
(as covered by George Burns, after being re-arranged by the Bee Gees. BWAHAHAHAHAHA!)

Leave a comment

About Bruce Gray

user-pic "Util" on IRC and PerlMonks. Frequent speaker on Perl and Raku, but infrequent blogger.