## TWC 205: Exclusive Third Or First

In which we ponder the highest bit, and find a much faster Max_XOR.

# TWC Task #1 - Third Highest

## Task:

Given an array of integers, return the Third Highest if found, otherwise return the maximum.

## Observations:

The last example shows that duplicate values are to be suppressed;

`task1(5,4,4,3)`

should return`3`

.The very efficient single-pass algorithm for third-highest-maximum is very

*inefficient*, w.r.t. programmer time, so I just sorted.

## Perl

```
use v5.36;
use List::Util qw<uniq>;
sub task1 ( $ns ) {
my ( $x1, $x2, $x3 ) = sort { $b <=> $a } uniq @{$ns};
return $x3 // $x1;
}
```

## Raku

```
sub task1 ( @ns ) {
(.elems == 3 ?? .head !! .tail) given @ns.sort.squish.tail(3);
}
```

`.squish`

is more efficient than`.unique`

, and can be used as an exact replacement when the list is already sorted.`given`

aliases the RHS expression into`$_`

, allowing the concise (`elems`

,`head`

,`tail`

) method calls.

# TWC Task #2 - Maximum XOR

## Task:

Given an array of integers, find the highest value obtained by XORing any two distinct elements.

## Perl

```
use v5.36;
use ntheory qw<forcomb>;
use List::Util qq<max>;
sub task2 ( @ns ) {
my $r = 0;
forcomb {
$r = max( $r, $ns[$_[0]] ^ $ns[$_[1]] )
} @ns, 2;
return $r;
}
```

## Raku

```
sub task2 { @_.unique.combinations(2).map({ [+^] .list }).max }
```

`[+^]`

is the "reduction" form of the `+^`

XOR operator; it XORs *all* the things on its right, which in this case are both of the elements in the two-element list passed to `map`

.

That simple `.combinations(2)`

works fine, with it's `N*(N-1)/2`

performance, until the input array gets large.

With a bit of analysis, we can go *much* faster!

```
sub task2_fast ( @ns ) {
sub hi-bit ( UInt $n ) { $n.log2.floor }
my %grouped = @ns.unique.classify(&hi-bit);
my ($top_group, @lesser_groups) = %grouped.sort(-*.key)
.map(*.value);
if !@lesser_groups {
my $removes_hi-bit = 1 +< hi-bit($top_group[0]);
return task2_fast( $top_group.list »-» $removes_hi-bit );
}
else {
return [max] @lesser_groups.map: {
($top_group.list X+^ .list).max;
}
}
}
```

### What we know:

XOR on single bits:

- 0^0 == 0
- 0^1 == 1
- 1^0 == 1
- 1^1 == 0

XOR is commutative (order does not matter): a^b == b^a

- So, we can use
`.combinations(2)`

instead of`@ns Xop @ns`

Cartesian product.

- So, we can use
a^a==0, which is the minimum possible result, and is only achievable with a^a.

`x^y == 0`

implies`x==y`

- So,
`.unique`

is safe (and helpful); duplicate elements XOR to zero. - Ack! Corner case not handled: @ns.elems > 1, but @ns.unique.elems == 1.

Raku has 3 XOR ops:

`xor`

short-circuiting boolean XOR, low precedence`^^`

short-circuiting boolean XOR, high precedence`+^`

*numeric**bitwise*XOR (which is what this task needs)

The highest binary bit set in an integer can be found in

*many*ways in C and assembler. Traditionally, the rightmost bit ("ones" position) is`0`

, next to the left ("twos" position) is`1`

, next ("fours" position) is`2`

, and so on.- In Perl,
`floor(log($N)/log(2))`

works, where`floor`

is from the`POSIX`

module, or from the new`Perl v5.36`

"builtin". In Raku, this is easily done with

`.log2.floor`

. e.g. the highest bit in 3 (0b11) is`1`

, and the highest bit in 3 (0b100) is`2`

:`$ raku -e 'for sort classify *.log2.floor, ^1024 { say join "\t", .key, .value.minmax.gist, .value.minmax.bounds.fmt("0b%b", "..") }' Hi-Bit RangeInDecimal RangeInBinary -Inf 0..0 0b0..0b0 0 1..1 0b1..0b1 1 2..3 0b10..0b11 2 4..7 0b100..0b111 3 8..15 0b1000..0b1111 4 16..31 0b10000..0b11111 5 32..63 0b100000..0b111111 6 64..127 0b1000000..0b1111111 7 128..255 0b10000000..0b11111111 8 256..511 0b100000000..0b111111111 9 512..1023 0b1000000000..0b1111111111`

- In Perl,
Even if all we know about two integers is the highest bit set in each, we still know something about their XOR. Given

`A > B`

:- If
`A`

and`B`

have*different*highest bits,`A`

has the higher of the two highest bits, and`A XOR B`

will have that highest bit from A. Further,`(A XOR B)`

will always be greater than`B`

. - Otherwise,
`A`

and`B`

have the*same*highest bit, so`A XOR B`

will clear that bit. Further,`(A XOR B)`

will always be less than`A`

.

- If

We will see concrete examples of these "different highest bits" and "same highest bits" later. For now, we posit that some benefit will be gained if we group (via `.classify`

) the input array by the highest bit set.

```
sub hi-bit ( UInt $n ) { $n.log2.floor }
my %grouped = @ns.unique.classify(&hi-bit);
```

Now if we only had a way to re-combine all these groups, that guaranteed we did not miss *any* combinations of the original `@ns`

elements...

### Useful technique from combinatorics:

(or, The Whole Is Always Equal To The Sum Of Its Parts, But Only When Using The Correct Definition Of "Sum")

Classifying (sub-grouping) all of some list into groups, and then taking the cross of each `.combinations(2)`

of those groups, plus the `.combinations(2)`

*within* each group, will exactly equal a generic `.combinations(2)`

of the initial list. Every two-element pairing will still occur exactly once. e.g:

```
$ raku -e '.say for (1..30).combinations(2)' | wc -l
435 # 30*29/2, as expected
$ raku -e '
my @groups = (1..30).classify(* % 10).sort.map(*.value); # 10 subgroups, by last-digit for fun
for @groups.combinations(2) -> ($g1, $g2) { # All 2-combinations of groups,
.say for $g1.list X $g2.list; # one group crossed with another group
}
for @groups -> $g1 { # Each group,
.say for $g1.combinations(2); # 2-combinations of only itself
};
' | wc -l
435 # same count as the simple `.combinations(2)`, just sliced in a (potentially) more useful fashion.
```

### Once grouped by highest set bit, we can deduce:

- Same-bits become zero, and all in a group have the same high bit.
- XOR of any X within a group with any Y within that same group always yield a cancellation of the high bit. The result will be in a lower group.
- XOR of any X within a group with any Y within a lower group can never cause a cancellation of the X high bit. The result will be in the same group as X.
- Therefore, the only time you have to cross (or
`.combinations(2)`

) within a group, is if that group is the*only*group. - Also, since no combination of any X or Y from any non-top groups can combine with XOR to
*generate*the high bit, no groups need to be crossed with each other using`X+^`

; only the top group needs to be crossed with every other group. - Whenever there is only one group, you still have a shortcut, because the max-XOR that they can generate is the same as the XOR of that group with their high-bit turned off; one or more rounds of recursion will get us to a multi-group solution, guaranteed by the initial
`.unique`

.

### Extended example - "same highest bits"

Consider `@ns = 876, 920, 930;`

. All are in "group 9" (512..1023).

```
876 == 0b1101101100, after bit-9 removed: 0b101101100 == 364
920 == 0b1110011000, after bit-9 removed: 0b110011000 == 408
930 == 0b1110100010, after bit-9 removed: 0b110100010 == 418
876 +^ 920 == 364 +^ 408 == 244
876 +^ 930 == 364 +^ 418 == 206
920 +^ 930 == 408 +^ 418 == 58
```

So, `max_XOR(876,920,930)`

will have the same result as `max_XOR(364,408,418)`

, because all the XOR-ed combinations of the former equal all the XOR-ed combinations of the latter. So, we recurse with the reduced numbers. Again, they are all in the same group, this time "group 8".

```
364 == 0b101101100, after bit-8 removed: 0b01101100 == 108
408 == 0b110011000, after bit-8 removed: 0b10011000 == 152
418 == 0b110100010, after bit-8 removed: 0b10100010 == 162
```

So, `max_XOR(364,408,418)`

will have the same result as `max_XOR(108,152,162)`

, so we recurse again.

`(108,152,162)`

are *not* all in one group, so they have "different highest bits", which I needed a bigger example to show.

### Extended example - "different highest bits"

Consider `@ns = flat 1003, 1001, 1..511;`

When grouped by hi-bit, (1003,1001) are in "group 9", because their highest bit is `9`

: (0b1111101011, 0b1111101001). In this example, I call this group `G9`

.
All the rest are in groups between `0`

and `8`

. In real code, they would be in separate groups, but for this example, I lump them all together, calling the super-group `G0_8`

.
How can we combine these?

```
G0_8 X+^ G0_8 always has bit 9 unset, because none of them have bit 9 set.
G9 X+^ G9 always has bit 9 unset, because the XOR cleared it.
G9 X+^ G0_8 always has bit 9 set, since the G9 element has bit 9, and nothing in G0_8 has bit 9 to counteract it.
```

More generally, for *any* case of groups with "different highest bits", we *only* need to check the top group against all the lower groups.

In this particular case, the simple `.combinations(2)`

would require `513*512/2 = 131328`

XORs, while this grouping only needs `2*511 = 1022`

XORs. `131328/1022 == 128.500978`

. Even given the extra work of grouping, that will be quite a speed-up!

Worst case would be if the top group was full, *and* all lower groups were full. Even then, we cut the number of XORs in half.

### Not done:

I expect that a further speed-up is possible, by noting that for any individual `N`

in the top group, if a group exists whose hi-bit matches the first "zero" bit in `N`

, no group lower than that needs searching. E.g. for 943 (group 9) (0b1110101111), the zeros are in bits `6`

and `4`

. All elements of groups 8,7,6 must be XOR, but *if* group 6 had any elements, groups lower than 6 may be skipped. Otherwise, groups 5,4 must be XOR, but *if* group 4 had any elements, groups lower than 4 may be skipped. I did not have time to code this supposition into my solution above, nor did I verify my own logic. It might even result in a net loss of speed, given the work to find the positions of zeros in every `N`

in the top group.

Bud: Well, let’s see, we have on the bags, Who’s on first, What’s on second, I Don’t Know is on third…

Lou: That’s what I want to find out.

-- Abbott & Costello Video

If modules are allowed, then the following gives pretty much optimal performance, slightly slower than O(N); heapifing an array is O(N), removing the max of the heap is O(Log2(N)). The heapify algorithm is pretty interesting, using the siftDown() algorithm the intitial pass is a surprising O(N)!

See: https://en.wikipedia.org/wiki/Binary_heap#Building_a_heap

One thing I found confusing in your writeup, your perl solution takes a reference as the argument for the perl code, but then in your example you call it with a list. I chose to implement the list API.