80% Hacks

I'm still blogging five days a week, but obviously not here. That's largely because my new daughter is forcing me to choose where I spend my time, and I can't blog too much about what I do lest I reveal trade secrets. So, just to keep my hand in, here's an ugly little "80% hack" that lets me find bugs like mad in OO code. I should really combine this with my warnings::unused hack and start building up a tool to find issues in legacy code.

First, an "80% Hack" is based on the Pareto Principle, which states that 80% of the results stem from 20% of the effort. So I often write what I call 80% hacks: quick and dirty tools that get things done.

The idea is simple. In legacy OO code where we're not using Moose, we have a nasty tendency to reach inside a blessed hashref. However, as classes start getting old and crufty, particularly in legacy code which is earning the company a ton of money, it's easy for someone to either misspell a hash key or refer to keys which are no longer used. What I've done is look for keys which are used once and only once in a file, since a key that appears in just one place is either never set, never read, or misspelled. I also assume the accesses look like this:

$self->{ foo }
$_[0]  ->  { "bar" } # yeah, we need arbitrary whitespace
shift->{'something'} # and quotes
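A minimal sketch of how a pattern along these lines picks the key out of each of the three forms above. It sticks to core Perl, so the quoted-string alternative ($quoted) is a hand-rolled assumption and not something taken from Regexp::Common, and @samples is just the three lines above with the comments dropped.

```perl
use strict;
use warnings;

# Assumption: a simple stand-in for a quoted-string matcher, core Perl only.
# It handles single- or double-quoted strings with no embedded quotes or escapes.
my $quoted = qr/"[^"]*"|'[^']*'/;

my $key_found = qr/
    (?: \$self | \$_\[0\] | shift )   # the invocant
    \s* ->
    \s* \{
    \s* ( $quoted | \w+ )             # the hash key, quoted or bare
    \s* \}
/x;

# The three access forms shown above.
my @samples = (
    '$self->{ foo }',
    '$_[0]  ->  { "bar" }',
    q{shift->{'something'}},
);

my @keys;
for my $line (@samples) {
    while ( $line =~ /$key_found/g ) {
        ( my $key = $1 ) =~ s/^["']|["']$//g;    # strip surrounding quotes
        push @keys, $key;
    }
}

print join( ', ', @keys ), "\n";    # prints: foo, bar, something
```

The script that follows uses $RE{quoted} from Regexp::Common for the quoted case instead.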

Yes, the script below could be improved tremendously, but 80% hacks are personal hacks which I simply don't pour a lot of time and effort into. Besides, they're fun.

#!/usr/bin/env perl

use strict;
use warnings;
use autodie ':all';
use Regexp::Common;

my $module = shift or die "usage: $0 pm_file\n";

#my $module = '/home/cpoe/git_tree/main/test_slot';

my $key_found = qr/
    (?: \$self | \$_\[0\] | shift )  # $self or $_[0] or shift
    \s* ->                           # ->
    \s* \{                           # opening brace, escaped so it is always literal
    \s* ($RE{quoted}|\w+)            # $hash_key, quoted or a bareword
    \s* \}                           # closing brace
/x;

open my $fh, '<', $module;

my %count_for;
while (<$fh>) {
    while (/$key_found/g) {
        my $key = $1;
        $key =~ s/^["']|['"]$//g;    # try and strip the quotes

        no warnings 'uninitialized';
        $count_for{$key}{count}++;
        $count_for{$key}{line} = $.;    # line of the most recent sighting
    }
}

foreach my $key ( sort keys %count_for ) {
    # a key seen more than once is presumably both set and read, so skip it
    next if $count_for{$key}{count} > 1;
    print "Possibly unused key '$key' at line $count_for{$key}{line}\n";
}

I run that with a .pm file as an argument and I get a report like:

Possibly unused key '_key1' at line 1338
Possibly unused key '_key2' at line 5325
...
Possibly unused key '_keyX' at line 4031

It's amazing how many bugs I've found with this.
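To make the kind of bug concrete, here is a small sketch. The Account class, its balance key and the misspelled balence are all invented for illustration and not taken from any real codebase. It shows why a once-only key deserves a look: Perl raises no error at the typo itself, and the accessor just quietly returns undef.

```perl
use strict;
use warnings;

# Hypothetical legacy class: a plain blessed hashref, no Moose.
package Account;

sub new {
    my ( $class, %args ) = @_;
    my $self = bless {}, $class;
    $self->{balance} = $args{balance};
    return $self;
}

sub deposit {
    my ( $self, $amount ) = @_;
    $self->{balance} += $amount;
    return $self;
}

# Typo: "balence" is never set anywhere, so this always returns undef.
sub balance { $_[0]->{balence} }

package main;

my $account = Account->new( balance => 100 );
$account->deposit(50);
my $balance = $account->balance;

# The misspelled key does not die or warn on its own; we only notice
# because the value that comes back is undef.
print defined $balance ? "balance: $balance\n" : "balance is undef\n";
```

Because balence appears exactly once in the $self->{...} and $_[0]->{...} forms while balance appears twice, a once-only report like the one above would list balence and skip balance.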

Leïla and Lilly-Rose. Lilly-Rose is 3 weeks old in this photo.

I can't blog as much as I used to, but they make it all worth it.

2 Comments

I ran this on a part of the work codebase, and it didn't find bugs so much as old, dead, now-useless, leftover bits of code. Which should of course also be got rid of, but it's less urgent than outright bugs.


About Ovid

Have Perl; Will Travel. Freelance Perl/Testing/Agile consultant. Photo by http://www.circle23.com/. Warning: that site is not safe for work. The photographer is a good friend of mine, though, and it's appropriate to credit his work.