Perl Weekly Challenge # 17: Ackermann Function and Parsing URLs

These are some answers to the Week 17 of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in several days from now (July 21, 2019). This blog post offers some solutions to this challenge, please don't read on if you intend to complete the challenge on your own.

Challenge # 1: Ackermann Function

The common Ackermann function, named after Wilhelm Ackermann, is defined recursively as follows:

``````A(m, n) = n + 1                  if m = 0
A(m, n) = A(m - 1, 1)            if m > 0 and n = 0
A(m, n) = A(m - 1, A(m, n - 1))  if m > 0 and n > 0
``````

The function grows very rapidly even for relatively small input values, so you may not want to try to compute it for input values such as 5 and 6. You probably don't even want to try it with the first input parameter (`m`) larger than 3. It also does a very large number of recursive calls, so that it will tend to be very slow.

And I don't think it is possible to cache the results or to memoize the function and obtain really significant performance enhancements, because, as far as I can say, the `ack` function is relatively rarely called more than twice or three times with the same arguments (except for some very small values). I tried anyway to memoize the P5 `ack` function, just in case my analysis was wrong, and did not get a large improvement (for some input values, the memoized version even ran slower). So, using the `Memoize` module may make the function slighly faster, but it will not really make it possible to call it witl larger arguments, because it would still require a truly huge number of computations.

Ackermann Function in Perl 6

This is a very simple implementation of the Ackermann function in Perl 6:

``````sub ack (Int \$m, Int \$n) {
return \$n + 1 if \$m == 0;
return ack \$m - 1, 1 if \$n == 0;
return ack \$m - 1, ack \$m, \$n-1;
}
say ack 3, 4;
``````

This prints 125.

Note that we don't need parentheses for the `ack` subroutine calls, not even in the case of the double call on the third return statement: Perl 6 knows that `ack` requires two arguments and successfully manage to parse the calls correctly. That being said, you might prefer for clarity to add some parentheses on that last return statement (but remember there shouldn't be any space between the subroutine call and the opening parenthesis of the arguments):

``````    return ack(\$m - 1, ack(\$m, \$n-1) );
``````

OK, job done, it seems. But we can play with it a bit more and have some fun with some distinctive features of Perl 6.

Using Sigilless Variables

First, Perl 6 supports sigilless variables (I personally tend to prefer variables with sigils, but sigilless variables will be useful here), so that we can define our function almost exactly in the terms of the challenge requirement:

``````sub A (Int \m, Int \n) {
return n + 1 if m == 0;
return A(m - 1, 1) if n == 0; # m > 0 if we get here
return A(m - 1, A(m, n - 1)); # m and n > 0 if we get here
}
say A 3, 4;
``````

Note that the value of a sigilless variable (which is in fact an alias to the value assign to it) cannot be changed, but that's OK since we never modify either `m` or `n`, but only create new local versions of them on each recursive call.

Using Multi Subroutines

Perl 6 allows for writing several routines with the same name but different signatures. They are introduced with the `multi` keyword. We can use this feature to write three versions of the `ack` function, one for each of the conditionals:

``````use v6;
subset Positive of Int where * >= 0;

multi ack (0, Positive \n) { n + 1 }
multi ack (Positive \m, 0) { ack(m - 1, 1) }
multi ack (Positive \m, Positive \n) {
ack(m - 1, ack(m, n - 1))
}
sub MAIN (Positive \m, Positive \n) {
say ack +m, +n;
}
``````

We now need to pass the values of `m` and `n` to the program:

``````\$ perl6 ackermann2.pl6 3 4
125
``````

I originally wanted to declare three multi `MAIN` subroutines (and that's feasible), but that turned out to be impractical in this case because I did not know where to put the print statement to display the result, which has to occur only after the cascade of recursive calls is completed.

Ackermann Function in Perl 5

Translating our initial P6 implementation into Perl 5 is very easy:

``````#!/usr/bin/perl
use strict;
use warnings;
no warnings 'recursion';
use feature qw/say/;

sub ack {
my (\$m, \$n) = @_;
return \$n + 1 if \$m == 0;
return ack(\$m - 1, 1) if \$n == 0;
return ack(\$m - 1, ack(\$m, \$n-1));
}
say ack 3, 4;
``````

This prints again 125.

By the way, please note that the Perl 5 `ack` subroutine just above works perfectly fine in Perl 6. Although it is far more idiomatic to use signatures for managing subroutine parameters, it is still possible in Perl 6 to use the `@_` array for retrieving the parameters passed to a subroutine. The P5 and P6 languages aren't so different, after all.

Also note that we had to suppress the recursion warning, because the recursion stack exceeds 100 calls with the arguments used in the example.

Challenge # 2: Parsing URLs

Create a script to parse URL and print the components of URL. According to Wiki page, the URL syntax is as below:

``````scheme:[//[userinfo@]host[:port]]path[?query][#fragment]
``````

``````scheme:   jdbc:mysql
host:     localhost
port:     3306
path:     /pwc
query:    profile=true
fragment: h1
``````

Parsing URLs in Perl 6

The right way to parse a URL is of course to write a grammar, and this is great fun in Perl 6.

I can't explain in detail how this works (as this would take many pages), but let me try to give an idea. First, we define a grammar, with a number of rules or tokens. When we use a grammar with the `parse` method on a string, the `parse` method calls by default the `TOP` rule (or token) and tries to match this rule:

``````rule TOP {
<scheme> '//' <authority>? <path> [ '?' <query> ]? <fragment>?
}
``````

This rules looks for a `<scheme>` component, followed by `//`, followed by an optional `<authority>` component, followed by a `<path>`component, etc. A `<scheme>` component is defined by the `scheme` token:

``````token scheme { \w+ [ ':' \w+]? ':'}
``````

which is composed of a group of alphanumerical characters, followed by an optional colon and another group of alphanumerical characters, and ending with a colon.

Similarly, the optional `<authority>` component, if it exists, is defined by the `authority` token:

``````token authority { [<userinfo> '@']? <host> [':' <port>]? }
``````

An `authority` component is itself composed of an optional `<userinfo>`and `@`, followed by a `<host>`, followed by an optional colon and a `<port>`. As you probably guessed by now, these three sub-components will be defined in other tokens, and so on.

In the end, if the string is successfully parsed, the result is stored into the `\$/` match object. We can then lookup for the various components with a hash-like syntax on the match object. So for example, the `<scheme>` component can be found in `\$/<scheme>` or, in the code below, in `\$match<scheme>`, since I assigned the parse result (the match object) to the `\$match` variable.

Now the full code of the program:

``````use v6;
# use Grammar::Tracer;

grammar URL {
rule TOP {
<scheme> '//' <authority>? <path> [ '?' <query> ]?
<fragment>?
}
token scheme { \w+ [ ':' \w+]? ':'}
token authority { [<userinfo> '@']? <host> [':' <port>]? }
token userinfo { <user> [':' <password> ]?}
token user { \w+ }
token password { <-[ @ ]>+ }
token host {
|| <hostname>
|| <ipv4>
|| <ipv6>
}
token hostname { \w+ [ '.' \w+ ]* }
token ipv4 { <octet> ['.' <octet> ] ** 3 }
token octet { (\d ** 1..3) <?{0 <= \$0 <= 255 }>}
token ipv6 { '[' <group> ** 8 % \: ']' }
token group { <xdigit> ** 4 }
token port { \d+ }
token path { '/' <segment>? [ '/' <segment> ]* }
token segment { \w+ [ '::' \w+ ]? }
token query {  \w+ '=' <[\w\s]>+ }
token fragment { '#' <frag_id> }
token frag_id { \w+ }
}
sub display (Str \$label, Str \$value) {
printf "    %-15s:\t %-20s\n", \$label, \$value;
}

'https://en.wikipedia.org/wiki/URL',
'https://perlcon.eu/my',
'https://www.perlmonks.org/?node=Seekers of Perl Wisdom',
'https://metacpan.org/pod/Test::More'
) -> \$url-string {
my \$match = URL.parse(\$url-string);
if \$match {
say "Matched \$url-string:";
display "scheme", ~\$match<scheme>;
display "userinfo", ~\$match<authority><userinfo>
if defined \$match<authority><userinfo>;
display "host", ~\$match<authority><host>
if defined \$match<authority><host>;
display "port", ~\$match<authority><port>
if defined \$match<authority><port>;
display "path", ~\$match<path>;
display "query", ~\$match<query>
if defined \$match<query>;
display "fragment", ~\$match<fragment>
if defined \$match<fragment>;
} else {
say "Not matched \$url-string";
}
}
``````

And this is the output with our five test URLs:

``````perl6  parse_url.p6
scheme         :         jdbc:mysql:
host           :         localhost
port           :         3306
path           :         /pwc
query          :         profile=true
fragment       :         #h1
Matched https://en.wikipedia.org/wiki/URL:
scheme         :         https:
host           :         en.wikipedia.org
path           :         /wiki/URL
Matched https://perlcon.eu/my:
scheme         :         https:
host           :         perlcon.eu
path           :         /my
Matched https://www.perlmonks.org/?node=Seekers of Perl Wisdom:
scheme         :         https:
host           :         www.perlmonks.org
path           :         /
query          :         node=Seekers of Perl Wisdom
Matched https://metacpan.org/pod/Test::More:
scheme         :         https:
host           :         metacpan.org
path           :         /pod/Test::More
``````

Undoubtedly, many more tests would be needed, as URLs can take many forms and we tested only a few cases.

Parsing URLs in Perl 5

We don't have grammars in Perl 5, so we will use regexes. But regexes are far less powerful than grammars, so we will have to be much less ambitious: we will not really try to validate full URLs, but only try to extract the components.

``````#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;

'https://en.wikipedia.org/wiki/URL',
'https://perlcon.eu/my',
'https://www.perlmonks.org/?node=Seekers of Perl Wisdom') {

\$url =~ m{
^                       # start of string
(\w+ (?: : \w+)?)       # scheme, captured in \$1
://                     # literal ://
(?:(\w+:\w+)@)?         # optional user info captured in \$2
(\w+ (?: \.\w+)*)       # host, captured in \$3
(?: : (\d+) )?          # optional port captured in \$4
(/(?:\w+ (?:/\w+)*)?)   # path, captured in  \$5
(?: \? (\w+=[\s\w]+))?  # optional query in \$6
(?: \# (\w+))?          # optional fragment in \$7
\$                       # end of string
}x;

say "Matched \$url:";
say "   scheme: \$1";
say "   userinfo: \$2" if defined \$2;
say "   host: \$3";
say "   port: \$4" if defined \$4;
say "   path: \$5";
say "   query: \$6" if defined \$6;
say "   fragment: \$7" if defined \$7;
}
``````

Okay, this large regex may be a bit difficult to read and it would be quite hard to update it if any thing needs to be changed, but it works: this program displays the following output:

``````Matched jdbc:mysql://user:password@localhost:3306/pwc?profile=true#h1:
scheme: jdbc:mysql
host: localhost
port: 3306
path: /pwc
query: profile=true
fragment: h1
Matched https://en.wikipedia.org/wiki/URL:
scheme: https
host: en.wikipedia.org
path: /wiki/URL
Matched https://perlcon.eu/my:
scheme: https
host: perlcon.eu
path: /my
Matched https://www.perlmonks.org/?node=Seekers of Perl Wisdom:
scheme: https
host: www.perlmonks.org
path: /
query: node=Seekers of Perl Wisdom
``````

Wrapping up

The next week Perl Weekly Challenge is due to start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, July 28. And, please, also spread the word about the Perl Weekly Challenge if you can.