Forgive me, for I have sinned
I uploaded a new module to CPAN, because none of the existing options were right to me, and then I subsequently found a good few more modules.
While working on my review of modules for getting dependency information, I came across a few modules which take the path to a module and parse the source to extract information. In writing my synopsis style scripts, it would be much easier to pass a module name. So I started hacking on one of the modules (Module::Used). I needed a function to take a module name and return the path for the first module found in @INC. Previously I've written such a thing, using File::Util::SL to get the right directory path separator. Time to look on CPAN.
First I searched metacpan for 'module path', but didn't immediately find anything. After a bit more digging I found Module::Filename:
use Module::Filename; my $mf = Module::Filename->new; print "path = ", $mf->filename('HTTP::Client'), "\n";
The OO interface seemed unnecessary, and given the aforementioned review, I'm always looking at dependencies. Turns out this script ends up pulling in 38 modules (as reported by Devel::Dependencies).
At this point I finished up Module::Path and uploaded it to PAUSE. Here's the equivalent script:
use Module::Path qw(module_path); print "path = ", module_path('HTTP::Client'), "\n";
Which pulls in 1 (one) dependency, Exporter.
I finished my change to Module::Used, sent the patch off to Elliot, and went back to the review. Additional modules kept turning up, so I did a thorough trawl through relevant namespaces. And in doing so found not only more dependency related modules, but more modules that can map a module name to local path. Oops.
So, to atone for my sin, now I've got to write a review of the currently 8 modules which can do this mapping. And to try and make my module the best it can be, try a public code review, as Module::Path is mainly documentation.
The requirements for this module, beyond implementing the mapping were:
- Be portable with respect to directory path separators.
- Don't be surprised to find a reference in @INC (see the require doc for why).
I thought about using File::Util for the SL function, but that would have added 15 runtime dependencies. So I had a look at the code for SL() and came up with the following to get the right separator:
my $SEPARATOR = '/'; BEGIN { if ($^O =~ /^(MSWin|dos|os2)/i) { $SEPARATOR = '\\'; } elsif ($^O =~ /^MacOS/i) { $SEPARATOR = ':'; } }
Update 1: Tom Molesworth pointed out the flaw in the above. The my initialisation will run after BEGIN, clobbering what might have been set there. So now (Module::Path 0.02) we have:
my $SEPARATOR; BEGIN { if ($^O =~ /^(MSWin|dos|os2)/i) { $SEPARATOR = '\\'; } elsif ($^O =~ /^MacOS/i) { $SEPARATOR = ':'; } else { $SEPARATOR = '/'; } }
Update 2: CPAN testers reported a failure on Strawberry Perl. I hopped onto #win32 on irc.perl.org, where kmx helped me track down the problem: on Windows the directory path separator is forward slash, not backslash. So now (Module::Path 0.03) we have:
my $SEPARATOR; BEGIN { if ($^O =~ /^(dos|os2)/i) { $SEPARATOR = '\\'; } elsif ($^O =~ /^MacOS/i) { $SEPARATOR = ':'; } else { $SEPARATOR = '/'; } }
Have I missed anything with that?
And then the function itself is:
sub module_path { my $module = shift; my $relpath; my $fullpath; ($relpath = $module.'.pm') =~ s/::/$SEPARATOR/g; foreach my $dir (@INC) { # see 'perldoc -f require' on why you might find # a reference in @INC next if ref($dir); $fullpath = $dir.$SEPARATOR.$relpath; return $fullpath if -f $fullpath; } return undef; }
So very simple, but that's what I was after. I guess I could check wantarray, and if true, check every directory in @INC, rather than stopping at the first. Any other suggestions for improving this?
Sounds like another excellent review in the series - thanks for these, I'm finding them very useful.
Not sure how that SL replacement would work, though: that assignment will happen after the BEGIN block so any special-cased value you put in would be overwritten?
Maybe this would do the trick:
As for module_path, would checking %INC be useful for cases where PAR::Packer or similar scripts have been used?
Doh, thanks for the fix Tom. Not sure what you meant with the %INC/PAR::Packer comment — can you expand on that please?
If all the modules are in a single script file, I think module_path would return undef for everything (or worse, a path to a module hanging around in @INC that would never be loaded), so just wondering if perhaps it's worth doing something like checking %INC to see if we have already loaded the module in question, and if so, use that path rather than trawling through @INC.
I don't use PAR::Packer much myself so there may be a better way of handling that, and perhaps it's out of scope for what this module should be doing anyway.
I guess I could add a line before the loop:
return $INC{$relpath} if exists($INC{$relpath});
But I'm in two minds: it saves a few cycles, but at the cost of increased complexity in the code. If classes are defined in the script, they won't appear in %INC, so this change doesn't ever change the result, just the time to reach it.
MacOS is dead, there's no need to consider it. For that matter, there's no need to use anything other than File::Spec, except I think (not sure where I got that) that you need to use File::Spec::Unix on VMS for some reason.
No wait, that VMS caveat was only for %INC, not for anything else, I think.
Thanks for the suggestion Leon. I just benchmarked two different versions of Module::Path — the current one, and a version based on File::Spec. The latter is 5 times slower (because File::Spec is doing all the canonical path stuff) and has 6 dependencies instead of the current 1. Plus File::Spec provides a whole bunch of other functions which get pulled in.
If this a module that was doing more general processing of directory / file paths, I'd use File::Spec. But the directories in @INC have to work for Perl (by definition), so I think the approach presented above is probably good enough for this module's purpose.
Looks like you forgot to push Module::Path 0.03 to GitHub...
Thanks Olivier — done!