Perl Weekly Challenge 017: Ackermann Function and URL Parsing
Ackermann Function
A(m, n) = n + 1 if m = 0 A(m, n) = A(m - 1, 1) if m > 0 and n = 0 A(m, n) = A(m - 1, A(m, n - 1)) if m > 0 and n > 0
I know that Perl6 supports multisubs, but when I see a function definition of this kind, I always think Erlang, where you get pattern matching and multisubs by default. Here’s how it looks:
-module(ackermann).
-export([ackermann/2]).
ackermann(0, N) ->
N + 1;
ackermann(M, 0) ->
ackermann(M - 1, 1);
ackermann(M, N) ->
ackermann(M - 1, ackermann(M, N - 1)).
Perl solution is a bit less straightforward.
sub ackermann {
my ($m, $n) = @_;
return $n + 1 if 0 == $m;
return ackermann($m - 1, 1) if 0 == $n;
return ackermann($m - 1, ackermann($m, $n - 1))
}
It’s not so easy, though. ackermann(3, 4)
gives several warnings of the following kind:
Deep recursion on subroutine "main::ackermann" at ...
We know that the Ackermann function recurses deeply, so we can safely ignore the warning. Let’s just place
no warnings 'recursion';
after the first return
line. But that’s just half of the story. While recursing many times, it also repeats the same calculations many times. For the 3, 4 case mentioned above, it calculates ackermann(1, 1)
(and 3 more) 114 times! It makes it very slow, ackermann(3, 9)
takes about 3 seconds on my machine.
I’ve already shown a solution to this problem: Memoize (see Perl Weekly Challenge 013). Adding
use Memoize;
memoize('ackermann');
before the subroutine declaration speeds the computation to about 0.057 secs. ackermann(4, 1)
accelerates from 16 mins 15 secs to 0.775 secs. But running ackermann(4, 2)
with memoize ends in the system swapping after 1 min 30 secs. Ackermann function is far from practical, at least for greater m’s.
URL Parsing
My first thought was to reach for a module. Someone has already read the RFC’s and many people have tested the solution. I tried URI::URL I’d already had good experience with. But alas, the example URL doesn’t conform to the definition (not even the one given in Wikipedia) because “subschemes” aren’t standard.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
use URI::URL;
for my $url (
'http://choroba:s6cr6t@www.perl.org:80/index.asp?x=12#id',
'https://127.0.0.1/',
'ftp://[1:2:3:4:5:6:dead:BEEF]',
'jdbc:mysql://user:password@localhost:3306/pwc?profile=true#h1',
) {
my $url_o = 'URI::URL'->new($url);
for my $component (qw( scheme userinfo host port path query fragment )) {
if ($url_o->can($component)) {
my $value = $url_o->$component;
say "$component:\t$value" if defined $value;
}
}
}
So, I had to parse the URL myself. When I hear “parse”, I think Marpa::R2. It makes it possible to declare a grammar and use it to parse strings. I didn’t follow all the details, so I simplified the username, hostname, etc. to be just strings while in reality their format would have been probably more restrictive. It shouldn’t be so hard to update the grammar with the correct constraints, though.
#! /usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
use Marpa::R2;
my $dsl = << '__DSL__';
:default ::= action => [name,values]
lexeme default = latm => 1
URL ::= Scheme (':')
MaybeAuthority MaybePath MaybeQuery MaybeFragment
action => build
Scheme ::= SchemeName
SchemeName ::= letter SchemeBody action => concat
SchemeBody ::= scheme_char SchemeBody action => concat
| scheme_char action => ::first
MaybeAuthority ::= action => ::undef
MaybeAuthority ::= ('//') MaybeUserInfo Host MaybePort action => host
MaybeUserInfo ::= action => ::undef
MaybeUserInfo ::= UserInfo ('@') action => ::first
UserInfo ::= UserName MaybePassword action => userinfo
UserName ::= String
MaybePassword ::= action => ::undef
MaybePassword ::= (':') Password action => ::first
Password ::= String
Host ::= Hostname
| ('[') IPv6 (']')
IPv6 ::= Hex ':' Hex ':' Hex ':' Hex ':'
Hex ':' Hex ':' Hex ':' Hex action => concat
Hostname ::= String action => ::first
MaybePath ::= Path action => path
Path ::=
Path ::= PathString action => ::first
PathString ::= '/' String action => concat
MaybeQuery ::= action => ::undef
MaybeQuery ::= ('?') Query action => query
Query ::= QString
MaybeFragment ::= action => ::undef
MaybeFragment ::= ('#') Fragment action => fragment
Fragment ::= String
MaybePort ::= action => ::undef
MaybePort ::= (':') Port action => port
Port ::= Num
String ::= action => empty
String ::= char String action => concat
| char action => ::first
QString ::= anychar QString action => concat
| anychar action => ::first
Num ::= digit Num action => concat
| digit action => ::first
Hex ::= hex Hex action => concat
| hex action => ::first
anychar ~ [\S]
letter ~ [a-zA-Z]
scheme_char ~ [a-zA-Z+\-.]
char ~ [\w.]
digit ~ [0-9]
hex ~ [0-9a-fA-F]
__DSL__
sub none {}
sub empty { "" }
sub host { assign(host => $_[0], $_[2]) }
sub port { assign(port => @_) }
sub query { assign(query => @_) }
sub fragment { assign(fragment => @_) }
sub path { $_[0]{path} = $_[1] // "" }
sub concat { join "", @_[ 1 .. $#_ ] }
sub userinfo { $_[0]{username} = $_[1][1], $_[0]{password} = $_[2][1] }
sub build { $_[0]{scheme} = $_[1][1]; $_[0] }
sub assign { $_[1]{ $_[0] } = $_[2][1] }
my $grammar = 'Marpa::R2::Scanless::G'->new({source => \$dsl});
for my $url (
'http://choroba:s6cr6t@www.perl.org:80/index.asp?x=12#id',
'https://127.0.0.1/',
'ftp://[1:2:3:4:5:6:dead:BEEF]',
'jdbc:mysql://user:password@localhost:3306/pwc?profile=true#h1',
) {
say "URL: $url";
my $recce = 'Marpa::R2::Scanless::R'->new({
grammar => $grammar,
semantics_package => 'main'});
$recce->read(\$url);
my $struct = ${ $recce->value };
for my $key (sort keys %$struct) {
say "$key:\t$struct->{$key}";
}
say "";
}
You can see that the grammar consists of two kinds of rules. The ones using ::=
are the rules of the context free grammar, called G1 or structural by Marpa. The ~
is used in L0 or lexical rules. If several rules share the LHS non-terminal, you can use |
instead of the ::=
to denote alternatives (not possible for empty rules, though). It’s also possible to use ||
which means an alternative with a lower priority.
Each rule can be assigned an action (there’s a default action defined at the top) which can do anything you like with the matching RHS phrases. In our example, we build a simple hash that collects the various parts of the URL.
I constructed the above grammar on the basis of the Wikipedia page. It parses the first three URLs correctly:
URL: http://choroba:s6cr6t@www.perl.org:80/index.asp?x=12#id fragment: id host: www.perl.org password: s6cr6t path: /index.asp port: 80 query: x=12 scheme: http username: choroba URL: https://127.0.0.1/ host: 127.0.0.1 path: / scheme: https URL: ftp://[1:2:3:4:5:6:dead:BEEF] host: 1:2:3:4:5:6:dead:BEEF path: scheme: ftp
but fails with the jdbs:mysql
example:
URL: jdbc:mysql://user:password@localhost:3306/pwc?profile=true#h1 Error in SLIF parse: No lexeme found at line 1, column 6 * String before error: jdbc: * The error was at line 1, column 6, and at character 0x006d 'm', ... * here: mysql://user:password@localhost:3306/pwc?profile=t Marpa::R2 exception at ...
Fortunately, it’s easy to add subschemes to it: just extend the grammar with
Scheme ::= SchemeName action => ::array
| SchemeName SubScheme action => ::array
SubScheme ::= (':') SchemeName action => ::first
and change the build
action to
sub build {
$_[0]{scheme} = $_[1][0];
$_[0]{subscheme} = $_[1][1] if defined $_[1][1];
$_[0]
}
And the output is as expected.
URL: jdbc:mysql://user:password@localhost:3306/pwc?profile=true#h1 fragment: h1 host: localhost password: password path: /pwc port: 3306 query: profile=true scheme: jdbc subscheme: mysql username: user
Sorry for the confusion about subschemes, I have updated the task couple of days ago after couple of members reported about it.
You might check out URI::Nested to deal with subschemes.