Perl weekly challenge 96

This week we had contrasting challenges.

Challenge 1 - Reverse Words

Take a string of words {with arbitrary white space around the words} and reverse the order of the words in the string and removing any redundant white space.

This is a classic example of a 1-liner....


   join q( ), reverse grep {$_} split m{\s+}, $_[0];

Challenge 2 - Edit Distance

I will provide 2 solutions here... one a less optimal solution which at the same time gives us a nice way of rendering the alignment - and then an more effic…

Perl weekly challenge 95

Palindromic numbers

You are given a number $N. Write a script to figure out if the given number is Palindrome. Print 1 if true otherwise 0.

There is an easy solution to this - to use "reverse" in string context to reverse the number and comparing the two strings:


sub is_palindrome_rev {
  return ( $_[0] eq reverse $_[0]) ? 1 : 0;
}

But this just seems a touch too easy - so let's see if we can find an alternative solution. Something that will potentially work in any base - not just base 10…

Perl weekly challenge 94

The two challenges this week were a nice introduction to the new year.

Challenge 1 - Group words into groups of anagrams.

This is a nice hash or "arrayref"s question - a Perl staple. For each group we need to generate a key, and put every anagram into this bin.

The simplest key is just to sort the letters into alphabetical order:

join q(), sort split m{}

This means the meat of the method can be written as a one liner.

sub group_anagrams {
  my $anagrams = {};
  push @{ $anagrams-…

Perl weekly challenge 93

These are some answers to the Week 93 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a few days (January 3, 2021). This blog post offers some solutions to this challenge, please don’t read on if you intend to complete the challenge on your own.

I'm not a great blogger - but I will try and explain my solutions to the Perl weekly challenge each week. I always try and look for interesting solutions to the problems at hand.

Part 1

Not su…

Perl weekly challenge 92

These are some answers to the Week 92 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a few days (December 27, 2020). This blog post offers some solutions to this challenge, please don’t read on if you intend to complete the challenge on your own.

I'm not a great blogger - but I will try and explain my solutions to the Perl weekly challenge each week. I always try and look for interesting solutions to the problems at hand.

Part 1

I started with what I thought was the best approach - keeping a hash of the letters in both words and if they don't match the required pattern then to return 0 otherwise return 1. But this looked rather messy. There must be a simpler, more elegant solution.

I thought how would I compare the words, and "normalisation" sprang to mind:

Replacing the first letter you see with "a", the second with "b", etc. That means if the pattern of repeated letters are the same (isomorphic) you would get a similar pattern in this normalised words... Now how to make this work...

Well that is relatively simple:

  • Have a counter that starts with 'a' - Perl then treats this as a text counter and so ++ generates a sequence 'a', 'b', 'c', 'd', ....
  • We create a hash of these letters, key being the letter in the word, value being the "normalised letter". If the letter is a repeat, it will already be a key in the hash so we return that.
  • This leads to a very simple way of normalising a string

my ($x,%m) = 'a'; $normalised_string = join q(), map { $m{$_}||=$x++ } split m{}, $_;

We can embed this in a map to remove the need to write the code twice in the function, by mapping over the parameters in the function call. To get the normalised strings $a & $b see below.

It is then just a case of checking $a eq $b, which evaluates to 1 or "" and "or"ing with 0, makes sure the function returns 0 in the false case.

The function then becomes:

sub iso {
  my ($a,$b) = map {
     my ($x,%m)='a'; join q(), map { $m{$_}||=$x++ } split m{}, $_ } @_;
  return $a eq $b || 0;
}

I decided to write another script that would work out the lists of all "isomorphic" words. This uses the normal form to group the words - and then print out those which have more than 1 word in the list.

I started with a list of 39944 words. 34342 are these are "isomorphic" of which 11,633 were words made entirely up of different letters.

Part 2

This was a much trickier proposition - but something I had come across before - with tiling variations, exons and BLAST hits on the human genome, and computing length of overlapping service from a database {with some really nasty MySQL code!}

This is simpler though as we know the regions we start with are all non-overlapping and in order.

  • If the new region doesn't overlap any of the regions we just push it into the list and return. Whether that be at the front in the middle or at the end.
  • We can test for the first two cases in the loop $new->{end} < $element->{start}, and then return the list of elements before $new, $new, $element and the rest of the list.
  • The last case is the end of the loop, and we just return the list with $new tacked on the end.
  • The problem comes when $new overlaps one of the regions. Firstly we work out the region containing both of the regions. Then we have to repeatedly see if this overlaps the next region and if it does repeat the process again. We then push this region onto the list.


sub int_insert {
  my( $new, @list ) = @_;
  my @new_list;
  while(my $e = shift @list) {
    return [ @new_list, $new, $e, @list ] if $e->[0] > $new->[1];
          ## The start of the next element is after the end of the new element -
          ## so we can safely push the new element and the rest of the list (and return it)
    if( $e->[1] < $new->[0] ) { ## Next element is to the left of the new 
      push @new_list,$e;        ## element so push and continue
      next;
    }
    $new->[0] = $e->[0] if $e->[0] < $new->[0];
    $new->[1] = $e->[1] if $e->[1] > $new->[1];           ## Get start/end of first overlap...
    while( @list && $new->[1] >= $list[0][0] ) {          ## this also overlaps the next list element
      $new->[1] = $list[0][1] if $list[0][1] > $new->[1]; ## Update the end of the region if rqd
      shift @list;                                        ## Remove element from list;
    }
    return [ @new_list, $new, @list ];  ## The rest of the list will be after the "new" element now
  }                                     ## So we can safely push and return it....
  return [ @new_list, $new ];           ## The new element must be after the list so we just
                                        ## return it on the end of the list...
}

The extension to this code to work out the overlap if you have two (or more) ascend non-overlapping sequences is the one that is needed to work out the overlapping transcript or mRNA/cDNA

The code is therefore slightly more complex:

The parameters are the list of arrays of intervals


sub int_merge {
  my ($result,@rest) = @_; ## List of arrays of ascending non-overlapping regions
  my @a = @{$result}; # First list.
  foreach (@rest) {
    my @b = @{$_};
    my @new;
    while( @a && @b ) { ## Stop when we get to one of the lists of intervals. 
      if( $a[0][1] < $b[0][0] ) { ## The first element of @a ends before the first element of @b 
        push @new, shift @a;
        next;
      }
      if( $b[0][1] < $a[0][0] ) { ## The first element of @b ends before the first element of @a 
        push @new, shift @b;
        next;
      }
      ## We have an overlap;
      ## The new region starts as the overlap of the first elements of both lists
      my $new_region = shift @a;
      $new_region->[0] = $b[0][0] if $new_region->[0] > $b[0][0];
      $new_region->[1] = $b[0][1] if $new_region->[1] < $b[0][1];
      shift @b; ## New region is the overlap of the two regions
      while( @a || @b ) { ## Now we look through the two lists for elements that 
                          ## overlap this new region....
                          ## Note we only stop if both of the next to elements
                          ## don't overlap the region OR we get to the end of
                          ## both lists
        if( @a && $a[0][0] <= $new_region->[1] ) { ## Element of first list overlaps - 
          $new_region->[1] = $a[0][1] if $new_region->[1] < $a[0][1]; ## extend region if required
          shift @a;                                ## and remove element
          next;
        }
        if( @b && $b[0][0] <= $new_region->[1] ) { ## Element of first list overlaps - 
          $new_region->[1] = $b[0][1] if $new_region->[1] < $b[0][1]; ## extend region if required
          shift @b;                                ## and remove element
          next;
        }
        last; ## Neither element overlaps so we finish this block
      }
      push @new, $new_region; # Add new region to list
    }
    push @new, @a, @b; ## We will have some of @a or @b left so push both on the answer
    @a = @new; ## Copy the new list of regions to @a - and repeat if required
  }
  return \@a;
}

About James Curtis-Smith

user-pic Perl developer for nearly 30 years now, mainly in maintenance scripts and web pages, using mod_perl.