Improve My Perl 6!

By Ovid on December 24, 2009 11:14 AM

I was playing around with the Slope One collaborative filtering algorithm. Collaborative filtering is a way of trying to guess what users may like based on past choices. There are roughly two approaches. One is the "neighbor-to-neighbor" approach. In this approach, we try to find people who have expressed similar preferences to your own and we use preferences that they've expressed but you haven't to guess what you might like in the future. This has a few problems. One, it tends to be computationally expensive. Two, you might have identical preferences to many people but if you've not expressed any preferences or the ones expressed have no overlap, you lose.

The other main approach is "item-to-item". Here we go through every user and figure out what they've expressed preferences for and correlate all of their preferences. Thus, instead of "people similar to you like Buck Rogers and thus you'll like Buck Rogers too", we have "people who like brain-dead sci-fi like Buck Rogers and thus since you like brain-dead sci-fi ...". "Item-to-item" is less computationally intensive and can also do a better job of protecting user privacy since the identity of users is not relevant. The "Slope One" algorithm is a very simple but powerful example of this concept and I came across an implementation in Python. Naturally I thought "I want to see that in Perl 6". There's nothing really amazing about my Perl 6 code and I fear I'll be writing "baby Perl 6" for a while, so if you see any "natural" improvements, I'd love to hear about them.

class SlopeOne { has %!diffs; has %!freqs; method update (%userdata) { for %userdata.kv -> $user, $ratings { for $ratings.kv -> $item1, $rating1 { %!freqs{$item1} //= {}; %!diffs{$item1} //= {}; for $ratings.kv -> $item2, $rating2 { %!freqs{$item1}{$item2} //= 0; %!diffs{$item1}{$item2} //= 0; %!freqs{$item1}{$item2} += 1; %!diffs{$item1}{$item2} += ( $rating1 - $rating2 ); } } } for %!diffs.kv -> $item1, $ratings { for $ratings.keys -> $item2 { $ratings{$item2} /= %!freqs{$item1}{$item2}; } } } method predict(%userprefs) { my ( %preds, %freqs ); for %userprefs.kv -> $item, $rating { for %!diffs.kv -> $diffitem, $diffratings { my $freq = %!freqs{$diffitem}{$item}; %preds{$diffitem} //= 0; %freqs{$diffitem} //= 0; %preds{$diffitem} += $freq * ( $diffratings{$item} + $rating ); %freqs{$diffitem} += $freq; } } return gather for %preds.keys { if ! %userprefs{$_} && %freqs{$_} { take $_ => ( %preds{$_} / %freqs{$_} ); } } } } my SlopeOne $s .= new; my %userdata = alice => { squid => 1.0, cuttlefish => 0.5, octopus => 0.2 }, bob => { squid => 1.0, octopus => 0.5, nautilus => 0.2 }, carole => { squid => 0.2, octopus => 1.0, cuttlefish => 0.4, nautilus => 0.4 }, dave => { cuttlefish => 0.9, octopus => 0.4, nautilus => 0.5 }; $s.update(%userdata); say $s.predict( { squid => 0.4 } ).perl;

That prints out something almost identical to the Python version:

["cuttlefish" => 0.25, "nautilus" => 0.1, "octopus" => 0.233333333333333]

This post cheerfully written from Calais, France, where I'm spending Christmas with my fiancée and her mother.

6 comments

Tagged as:

perl 6, rakudo

6 Comments

Ovid | December 24, 2009 12:59 PM | Reply

One tiny change I made was to have the update return this:

my %predictions = gather for %preds.kv -> $item, $pred {
    if not %userprefs{$item} and %freqs{$item} {
        take $item => $pred / %freqs{$item};
    }   
};  
return %predictions;

And that allows me to do this:

say $s.predict( { squid => 0.4 } ).fmt("Preference for %s is %0.2f");

But I couldn't figure out how to do that without the intermediate %predictions variable. I'm unsure if this is a limitation in Perl 6 or myself :)

moritz | December 24, 2009 1:45 PM | Reply

You can force coercion to a hash with

return %( gather for %preds.kv ...)

But I'm not sure if you like that bettter.

Also a perfect Perl 6 implementation wouldn't need %!freqs{$item1}{$item2} //= 0; and similar lines. I'm pretty sure you can leave it out with current Rakudo if you use prefix ++:

++%!freqs{$item1}{$item2}

Ovid replied to comment from moritz | December 24, 2009 5:14 PM | Reply

@moritz: I like that I can force coercion to a hash, but I had assumed (incorrectly?) that I could return a hashref by wrapping the gather/take in "{}". Is this invalid?

Aristotle | December 24, 2009 5:16 PM | Reply

Unfortunately Rakudo does not yet support enough Perl 6 to actually write this idiomatically. A real Perl 6 version would declare the fields like this:

has %!diffs{Pair};
has %!freqs{Pair};

And the first loop in update would look something like this:

my @rating = %userdata.values».pairs;
for @rating X @rating -> ( $item1, $rating1 ), ( $item2, $rating2 ) {
    %!freqs{$item1 => $item2} ++;
    %!diffs{$item1 => $item2} += ( $rating1 - $rating2 );
}

Nice, huh?

Ovid | December 24, 2009 5:48 PM | Reply

@Aristotle: Thank you, and yes, that is nice. Looking forward to Perl 6 handling that.

Aristotle | December 25, 2009 2:58 AM | Reply

Woops, that’s broken. I confused myself while trying to follow what’s going on – all those similarly-named variables bound in different levels of immediately nested loops were kinda dizzying to follow. The outer loop is necessary:

for %userdata.values -> %user_ratings {
    my @rating = %user_ratings.pairs;
    for @rating X @rating -> ( $item1, $rating1 ), ( $item2, $rating2 ) {
        %!freqs{$item1 => $item2}++;
        %!diffs{$item1 => $item2} += ( $rating1 - $rating2 );
    }
}

Also, masak says indexing hashes on pairs may not work the way I thought. But my reading of the spec seems to indicate that declaring the hashes as something like my %!freqs{Str, Str} is supposed to work. The changes to the loop code would be minimal – I think the hash access would then be written %!freqs{$item1; $item2}. That just so happens to be shorter and less noisy too.

Name

Email Address

URL

Remember personal info?

Comments (You may use HTML tags for style)

About Ovid

Freelance Perl/Testing/Agile consultant and trainer. See http://www.allaroundtheworld.fr/ for our services. If you have a problem with Perl, we will solve it for you. And don't forget to buy my book! http://www.amazon.com/Beginning-Perl-Curtis-Poe/dp/1118013840/

More info »

Ovid

Improve My Perl 6!

Tagged as:

6 Comments

Leave a comment

About Ovid

Search this blog