PWC 057: Task #1, Invert Tree & Task #2, Shortest Unique Prefix

PWC Task #1, Invert Tree

jaredor submission for Task #1

The problem is in two parts, flipping the tree and pretty-printing it.

The flipping part is pretty easy, but since I'm a huge fan of Higher Order Perl I thought I should at least try to make it sort of like the tree walking code I remembered reading, where you give the tree-walker the function you want to operate on each node. (That word, "remembered" should be a hint that I haven't read the book in years and you should really go read the master.) I wrote both a depth-first and a breadth-first binary tree walker. For the purposes of flipping the whole tree, either one would have sufficed, but it is handy to have the option when you are experimenting.

The second (and optional) part of the problem was pretty-printing the binary tree. I think the restriction of the input binary tree to be a full one was for the benefit of the fools attempting the bonus, but me being a fool, I ignored the helpful restriction and tried writing a generic binary tree pretty-printer. I wound up with a binary tree pretty-ish-printer.

Input

The input is from the command line, with the binary tree being a list of numbers. I took the code that I used for PWC 056, Task 2, Path Sum. Since this is supposed to work on a full binary tree, the official list should be 2^N - 1 elements long. A quick and dirty way to do this on a command line would be ./ch-1.pl $(seq 7) for a full 3-level binary tree.

Invert Tree

All you need is a tree walker and an action to perform at each node, which in this case is to flip the left and right nodes. As long as you perform the flip before or after visiting both branches of the node, you will be okay.

Pretty Print Tree

Ugh. I have more free time these days ("Hello, Perl Weekly Challenge!") but this took up more of that than I wanted to because I felt I had to write a pretty-printer that handled arbitrary binary trees. There are warts on the output in terms of how nice things look, but nothing wrong with it. This code is not going to win any awards for succinctness or clarity, but here is the general idea:

  • Recursively build the tree "bottom up."
  • Keep the parent node between the two children nodes, accounting for the length of the data that will be printed for the child nodes.
  • If a node has a left branch, pre-pend "/" to the data element and, likewise, post-pend "\" if it has a right branch.
  • If there is one child node, offset the parent one space to ensure that the "diagonal nature" of left and right branching is always evident. (And also means that there is no overlap of node data when projected onto the next row of its branch data. This simplifies the already complicated line drawing logic at the expense of making the print of the tree wider than what it could minimally be.)
At this point, you have text strings of the rows of data, with the "/" and "\" characters indicating which branches each node has. So now introduce the "lines" that connect a node to its branches.
  • Take the "/" & "\" characters of row N and project them onto the numeric characters of row N+1.
  • For ease of regexp turn every numeric character into one character. I chose "@".
  • Note that the "/" & "\" characters on row N+1 were replaced by spaces.
  • There can only be two situations where "/" & "\" interact with a "@" character with zero or more spaces intervening. The two regexp that capture this are m"@ \s* [/]"xms and m"[\\] \s* @"xms.
  • In either expression that is matched, we want to
    • turn the space, if any, next to the "@" into a dot, ".", and
    • turn the other spaces into dashes, "-", and
    • leave the "/" or "\" character alone.
  • and make sure all @ characters are turned into spaces.

I'm pretty good at sed-level regexp construction, but just to get something out I did multiple passes over the line instead of figuring out one monster substitution.

And that's all I'm going to write about that. :-)

PWC Task #2, Shortest Unique Prefix

jaredor submission for Task #2

This is the kind of problem I love throwing perl at. I automatically knew I was going to use recursion using hashes and hitting my base cases by consuming the data I was manipulating. I couldn't help myself and wound up writing the answer down right away ... and then screwed myself for the rest of the challenge thinking this was Task #1 and the Invert Binary was Task #2. You laugh, but I even copied my ch-1.pl answer over on top of my ch-2.pl answer when I was loading up my git branch. Good thing I use version control.

Input

Just run the script on the command line with the list of words as arguments. You will get back a 1-to-1 list of abbreviations, just like in the problem description.

Details

The idea is that, given a list of words, you hash them by their first character. The hash uses the first character as the key and then you push the rest of the word onto an anonymous array the hash-key points to. For each key in the hash you've made, recurse on the ones that have anonymous arrays of size bigger than 1, the input to the recursion being the list that the anonymous array holds, which you get by the delete operation on the hash. What you return is a hash: The hash of singleton elements you didn't delete, plus the hashes from all the non-singleton elements you did delete and then recurse on--but before adding in these hashes from the recursions, add the key they were based on in your hash to the front of every key of the returned hash..

Wow, that last sentence was a doozy and maybe not even correct, which is why I program in perl and not in English. Maybe an example will help:

  1. I put three words into this mix as my input array of words: dog cat calf
  2. I hash them by their first letter: d => [ 'og' ], c => [ 'at', 'alf' ]
  3. The value of the 'd' key is an anonymous array of size 1, so we keep that for our return hash.
  4. The value of the 'c' key is an anonymous array of size > 1, so we delete it and recurse on the array elements; thus we wait and get back this hash: at => [ '' ], al => [ 'f' ]
  5. we add this has to our hash of singletons--but before we do, we add the key of the our deleted hash element to the front of these keys: cat => [ '' ], cal => [ 'f' ]
  6. at this point we have gone through all the keys of the hash we created, so now we return our output hash of singleton anonymous arrays: d => [ "og" ], cat => [ '' ], cal => [ 'f' ]

The recursion has to stop: Your list of words is getting smaller as well as each word being sent deeper into the recursion is getting smaller itself. The key to this code brevity is perl behavior that I love: When deleting something (here with delete and substr, but also done with pop and shift) what is returned from that operation is the thing being deleted. I can consume data with confidence, knowing that I will hit bottom and not have to worry about off-by-one indexing errors.

After this, it is almost anti-climactic, but we have to do a further transformation of our hash for easy of application. We have to go from

d => 'og' to dog => 'd'

Same amount of information content, but different packaging. This new hash becomes a lookup table and allows us to provide the desired output directly by using the input as keys.

When I got to this step, it was clear to me that I only needed to parse uniq @ARGV for my lookup table. And my next thought was that I must do this. Do you see why?

Leave a comment

About Jared Martin

user-pic I blog about Perl.