Forgive me, for I have sinned

I uploaded a new module to CPAN, because none of the existing options were right to me, and then I subsequently found a good few more modules.

While working on my review of modules for getting dependency information, I came across a few modules which take the path to a module and parse the source to extract information. In writing my synopsis style scripts, it would be much easier to pass a module name. So I started hacking on one of the modules (Module::Used). I needed a function to take a module name and return the path for the first module found in @INC. Previously I've written such a thing, using File::Util::SL to get the right directory path separator. Time to look on CPAN.

First I searched metacpan for 'module path', but didn't immediately find anything. After a bit more digging I found Module::Filename:

    use Module::Filename;
    my $mf = Module::Filename->new;
    print "path = ", $mf->filename('HTTP::Client'), "\n";

The OO interface seemed unnecessary, and given the aforementioned review, I'm always looking at dependencies. Turns out this script ends up pulling in 38 modules (as reported by Devel::Dependencies).

At this point I finished up Module::Path and uploaded it to PAUSE. Here's the equivalent script:

    use Module::Path qw(module_path);
    print "path = ", module_path('HTTP::Client'), "\n";

Which pulls in 1 (one) dependency, Exporter.

I finished my change to Module::Used, sent the patch off to Elliot, and went back to the review. Additional modules kept turning up, so I did a thorough trawl through relevant namespaces. And in doing so found not only more dependency related modules, but more modules that can map a module name to local path. Oops.

So, to atone for my sin, now I've got to write a review of the currently 8 modules which can do this mapping. And to try and make my module the best it can be, try a public code review, as Module::Path is mainly documentation.

The requirements for this module, beyond implementing the mapping were:

  • Be portable with respect to directory path separators.
  • Don't be surprised to find a reference in @INC (see the require doc for why).

I thought about using File::Util for the SL function, but that would have added 15 runtime dependencies. So I had a look at the code for SL() and came up with the following to get the right separator:

    my $SEPARATOR  = '/';
    
    BEGIN {
        if ($^O =~ /^(MSWin|dos|os2)/i) {
            $SEPARATOR = '\\';
        } elsif ($^O =~ /^MacOS/i) {
            $SEPARATOR = ':';
        }
    }

Update 1: Tom Molesworth pointed out the flaw in the above. The my initialisation will run after BEGIN, clobbering what might have been set there. So now (Module::Path 0.02) we have:

    my $SEPARATOR;
    BEGIN {
        if ($^O =~ /^(MSWin|dos|os2)/i) {
            $SEPARATOR = '\\';
        } elsif ($^O =~ /^MacOS/i) {
            $SEPARATOR = ':';
        } else {
            $SEPARATOR = '/';
        }
    }

Update 2: CPAN testers reported a failure on Strawberry Perl. I hopped onto #win32 on irc.perl.org, where kmx helped me track down the problem: on Windows the directory path separator is forward slash, not backslash. So now (Module::Path 0.03) we have:

    my $SEPARATOR;
    BEGIN {
        if ($^O =~ /^(dos|os2)/i) {
            $SEPARATOR = '\\';
        } elsif ($^O =~ /^MacOS/i) {
            $SEPARATOR = ':';
        } else {
            $SEPARATOR = '/';
        }
    }

Have I missed anything with that?

And then the function itself is:

    sub module_path
    {
        my $module = shift;
        my $relpath;
        my $fullpath;
    
        ($relpath = $module.'.pm') =~ s/::/$SEPARATOR/g;

        foreach my $dir (@INC) {
            # see 'perldoc -f require' on why you might find
            # a reference in @INC
            next if ref($dir);
    
            $fullpath = $dir.$SEPARATOR.$relpath;
            return $fullpath if -f $fullpath;
        }

        return undef;
    }

So very simple, but that's what I was after. I guess I could check wantarray, and if true, check every directory in @INC, rather than stopping at the first. Any other suggestions for improving this?

9 Comments

Sounds like another excellent review in the series - thanks for these, I'm finding them very useful.

Not sure how that SL replacement would work, though: that assignment will happen after the BEGIN block so any special-cased value you put in would be overwritten?

Maybe this would do the trick:


my $SEPARATOR;
BEGIN {
if ($^O =~ /^(MSWin|dos|os2)/i) {
$SEPARATOR = '\\';
} elsif ($^O =~ /^MacOS/i) {
$SEPARATOR = ':';
} else {
$SEPARATOR = '/';
}
}

As for module_path, would checking %INC be useful for cases where PAR::Packer or similar scripts have been used?

If all the modules are in a single script file, I think module_path would return undef for everything (or worse, a path to a module hanging around in @INC that would never be loaded), so just wondering if perhaps it's worth doing something like checking %INC to see if we have already loaded the module in question, and if so, use that path rather than trawling through @INC.

I don't use PAR::Packer much myself so there may be a better way of handling that, and perhaps it's out of scope for what this module should be doing anyway.

MacOS is dead, there's no need to consider it. For that matter, there's no need to use anything other than File::Spec, except I think (not sure where I got that) that you need to use File::Spec::Unix on VMS for some reason.

No wait, that VMS caveat was only for %INC, not for anything else, I think.

Looks like you forgot to push Module::Path 0.03 to GitHub...

Leave a comment

About Neil Bowers

user-pic Perl hacker since 1992.