Concurrency and Channels in Perl 6
I've been working on a Bayesian spam filter, but it keeps running out of memory, so I moved to something else for a while. The new concurrency stuff looks really interesting, but I don't understand it well yet. As a project, I came up with the idea of a password cracker, which would check a crypt-style hash against a word list. (This probably isn't a CPU-intensive enough task to be worth threading, but it was simple.) Here's the code, with details below:
#!/usr/bin/env perl6
use v6;
use Crypt::Libcrypt;

sub MAIN( $encrypted, $wordfile, $units=5 ){
    my $salt = $encrypted.substr: 0, 2;
    my $stream = Channel.new;
    start {                                           # 1
        for $wordfile.IO.slurp.words -> $w {
            $stream.send($w);
        }
        $stream.close;
    };
    my $match;
    await do for ^$units -> $u {                      # 2
        say "Starting unit $u";
        start {
            loop {
                if $stream.closed.not and             # 3
                   $stream.receive -> $w {
                    say "Trying $w in unit $u";
                    if crypt( $w, $salt ) eq $encrypted {
                        $match = $w;
                        $stream.close;
                    }
                    sleep 1;
                } else {
                    last;
                }
            }
        };
    };
    say "Found it!: $match" if $match;
}
I call it with: ./crackpass.p6 abPRdpgdfUoM. wordlist 5
That encrypted string will match the word "george" which is in the "wordlist" file.
1) Starts a thread which opens the word file and starts sending the words to the Channel $stream.
2) Starts 5 threads (or however many are specified on the command line). Each one loops, getting the next word from $stream and checking whether it matches the encrypted string.
3) Checks to see if $stream is closed, then gets a word from it if it's not. There's a race condition here, because if I remove the sleep line, sometimes I get "Cannot receive a message on a closed channel". I think what's happening is one thread gets to the first half of the if statement, sees that the $stream isn't closed, then goes to the second half, but in the meantime another thread takes the last word from $stream and lets it be closed, so the first thread's $stream.receive errors. There must be a way to handle that, but I haven't figured it out yet. I know I could wrap it in a try/catch, but that seems like a kludge. Maybe I should be using earliest or something like that.
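One way to sidestep the check-then-receive race might be to let the channel itself end the loop, instead of testing .closed and then calling .receive as two separate steps. The sketch below is only an illustration under some assumptions: the word list is a hard-coded array (@words) and the crypt comparison is replaced by a plain string comparison against a made-up $target, so that it runs without Crypt::Libcrypt or a word file. It relies on Channel.list, which hands each value to exactly one consumer and stops iterating once the channel has been closed and emptied.

```raku
use v6;

# Assumed stand-ins so the sketch is self-contained:
# @words replaces the word file, $target replaces the crypt() check.
my @words  = <alice bob george zelda>;
my $target = 'george';

my $stream = Channel.new;

# Producer: send every word, then close to signal "no more input".
start {
    $stream.send($_) for @words;
    $stream.close;
}

my $match;

await do for ^3 -> $u {
    start {
        # Iterating .list blocks until a value arrives and ends on its
        # own once the channel is closed and drained, so there is no
        # separate .closed test that another thread could invalidate.
        for $stream.list -> $w {
            last if $match.defined;      # another unit already found it
            $match = $w if $w eq $target;
        }
    }
}

say "Found it!: $match" if $match;
```

Here only the producer closes the channel, and the workers stop early by checking $match rather than by closing $stream themselves. The unsynchronised read and write of $match is tolerable in this sketch only because a single word can match; anything more involved would probably want a Lock or a Promise to carry the result.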
Except for that race condition, which doesn't always happen, it does work. I have no idea how much actual concurrency is going on on my system (FreeBSD on dual-core amd64), but it's fun anyway. I'm trying to think of something more interesting to do with these tools, that would really show off what they can do -- once I understand it myself.