Eertree: Palindromic Tree

The Weekly Challenge 145

The Task 2 of the Weekly Challenge #145 asked us to build a Palindromic Tree. It also linked to a blog post explaining the "Eertree" data structure.

Maybe it was just me, but I found the blog post confusing. Fortunately, it further linked to a scientific paper that introduced the structure around 2014.

I spent several evenings of the Christmas holidays wrapping my head around the description of the algorithm to efficiently build the graph. Most of it is described in the proof of Proposition 2, but some parts are rather laconic. I wasn’t able to implement the creation of a suffix link from P where |P| > 1. The article says:

…just continue traversing suffix-palindromes of T starting with the suffix link of Q.

My implementation passed all the test cases from the Challenge, but I was able to find counter examples (e.g. the string “xabcxc”) for which the implementation didn’t work correctly. I clearly had no idea what “continue traversing” meant, even though I had implemented the traversing up to the moment correctly.

In the end, I noticed the blog post had a reply (kind of hidden in the UI) which (when clicked to reveal the full text) showed a link to a Python implementation. From that I was able to understand the logic and finish my implementation.

Rosetta Code

Once finished, I checked fellow challengers’ solutions. To my surprise, someone just copied the solution from Rosetta Code:

for $n (1 .. length($str)) {
   for $m (1 .. length($str)) {
      $strrev = "";
      $strpal = substr($str, $n-1, $m);
      if ($strpal ne "") {
         for $p (reverse 1 .. length($strpal)) {
            $strrev .= substr($strpal, $p-1, 1);
         }
         ($strpal eq $strrev) and push @pal, $strpal;
      }
   }
}
 
print join ' ', grep {not $seen{$_}++} @pal, "\n";

My solution had about 100 lines. I’d spent hours working on it, to only discover this short and easy implementation!

“Wait,” thought I to myself, “this is cheating. The Rosetta Code solution just lists all the palindromes, it doesn’t really construct the structure. And the structure was interesting because it was efficient. Maybe my solution is at least faster when it’s not shorter.”

And that’s actually true. For longer strings (more than 12 characters) my solution starts beating the Rosetta Code one, and for 25 characters it’s already 3.5× faster.

You can find my solution on GitHub. As there’s no Eertree module on CPAN (at least the search doesn’t return anything) I might end up uploading it there, but it needs some polishing before it’s ready.

Update: String::Eertree.

Leave a comment

About E. Choroba

user-pic I blog about Perl.