Aspects of the blog were about how a Perl programmer may sometimes have to keep in mind the ramifications of the "phase" in which things are done, the somewhat blurred distinction between the compile and run phases (since you can do just about anything at compile time that you can do at run time), and finally a hack for tracing use or require statements.
In all three of the other alternatives for how to run a subroutine in demo code, the code is actually compiled beforehand. With a string eval, the code exists only in string form until it is actually run.
(One way you can verify this is to install Devel::Trepan::Disassemble, run the code, and enter the debugger's "disassemble" command, or use B::Concise directly.)
In the case of the glob, the function is not seen via can() but it is around in main's symbol table. In the case of lexical subs and a closure, the code is also still around but not visible globally. With the closure, the subroutine variable can still be passed around, which extends its visibility in a way that wasn't intended. I'm not sure if that's also true with lexical subs.
In my mind, the intent was to have a temporary function that just exists (and again perhaps this was too limiting on my part by assuming that it wouldn't exist in compiled form) only when that branch is taken. Why? Well, in the back of my mind I am hoping that the code is thrown away at compile time (whether for Perl, Python or Ruby). In fact I don't think that happens in any of these, but another way to write this, which may make it more obvious to a compiler that this is dead code, is:
if (__FILE__ eq $0) {
....
}
- - - -
By the way, in debugging the various examples, I see that a misfeature of both perl5db and Devel::Trepan (and most likely every other Perl debugger) is that caller() reports different results when it is evaluated inside the debugger. I'll probably fix that down the line in Devel::Trepan by inserting a custom caller routine that corrects for this discrepancy.
The same is true of your Ruby and Python examples.
In my mind, the intent was to have a temporary function that just exists […] only when that branch is taken.
How is that supposed to work? Would the compiler somehow leave that bit of the code unparsed? How would it identify where the not-to-parse code ends (i.e. in this case, where the closing curly is) without parsing?
The other form of the expression is:
if $0 == __FILE__
and given that compilation occurs just before running without a break in between, the compiler can have the value of $0 around at compile time, as it most definitely does for __FILE__. (I have to walk this back a little since in Perl one can assign to $0.)
Yes, that one is perhaps a little far-fetched, as is caller(), unless one imagines looking for that specific idiom during compilation. And I am given to understand that Perl does look for other idioms like this.
But in Python, the idiom is:
if __name__ == '__main__':
and this is more apparent as dead code, since that is a compile-time expression.
CPython processes Python code and saves the result in a .pyc file, but I don't think it does much more than tokenize the file. So I guess if something like this were to be done, there would be a dead-code elimination phase before running.
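One way to check what the compiler actually keeps is to compile a small source string and inspect the resulting bytecode with the standard dis module. The sketch below assumes CPython 3; the demo function and both source strings are made up for illustration. It compares the __name__ idiom with a literal "if False:" guard.

```python
import dis

# Hypothetical module source using the usual demo-code idiom.
GUARDED_BY_NAME = """
def demo():
    print("demo code")

if __name__ == '__main__':
    demo()
"""

# Same module, but guarded by a literal constant instead.
GUARDED_BY_CONSTANT = """
def demo():
    print("demo code")

if False:
    demo()
"""


def loads_name(source, name):
    """Return True if the module-level bytecode has a LOAD_NAME for `name`."""
    code = compile(source, "<example>", "exec")
    return any(
        ins.opname == "LOAD_NAME" and ins.argval == name
        for ins in dis.get_instructions(code)
    )


if __name__ == "__main__":
    print("name guard keeps the call:    ", loads_name(GUARDED_BY_NAME, "demo"))
    print("constant guard keeps the call:", loads_name(GUARDED_BY_CONSTANT, "demo"))
```

On recent CPython versions I would expect the call under "if False:" to vanish from the module bytecode, while the call under the __name__ test stays, because __name__ is an ordinary variable that is looked up at run time.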
The other thing I had in mind was having something like a preprocessor to look for those demo-code idioms and strip them, much in the same way that Google has a tool for "optimizing" JavaScript code.
My debugger has over 70 files and most if not all of them have demo code. So that's a lot of savings.
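For the Python idiom, such a preprocessor could be sketched with the standard ast module. This is only a rough illustration, not an existing tool: strip_demo_code and is_main_guard are hypothetical names, only the exact form __name__ == '__main__' at top level is recognized, and ast.unparse (Python 3.9+) discards comments and original formatting.

```python
import ast


def is_main_guard(node):
    """True for a top-level `if __name__ == '__main__':` with no else branch."""
    if not isinstance(node, ast.If) or node.orelse:
        return False
    test = node.test
    return (
        isinstance(test, ast.Compare)
        and isinstance(test.left, ast.Name)
        and test.left.id == "__name__"
        and len(test.ops) == 1
        and isinstance(test.ops[0], ast.Eq)
        and len(test.comparators) == 1
        and isinstance(test.comparators[0], ast.Constant)
        and test.comparators[0].value == "__main__"
    )


def strip_demo_code(source):
    """Return `source` with top-level main-guard blocks removed."""
    tree = ast.parse(source)
    tree.body = [node for node in tree.body if not is_main_guard(node)]
    return ast.unparse(tree)


# Made-up module used only to exercise the sketch.
SAMPLE = '''
def add(x, y):
    return x + y

if __name__ == '__main__':
    print(add(1, 2))
'''

if __name__ == "__main__":
    print(strip_demo_code(SAMPLE))
```

Matching on the syntax tree rather than on text avoids having to guess where the guarded block ends, which is the difficulty raised earlier in the thread.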
Again, I'm sorry for being a bit vague initially and in the blog post about all of this.
Re: git-annex: I did evaluate git-annex a while back when looking for options to back up media files, but it seemed big and complex (it adds another layer of complexity), so I didn't use it. Perhaps it's time to take another look.
part version could be better.
my ($sub_set_a, $sub_set_b, $sub_set_c, $not_used) = part {
    # && rather than "and": "and" binds more loosely than ?: and would split the chain
    $_ <= $a ? 0
    : $_ > $a && $_ <= $b ? 1
    : $_ > $b ? 2
    : 3
} @set;
(not tested)
$not_used is unnecessary; you can just write it like this:
my (undef,$sub_set_a) = part { $_ <= $a } @set;
my (undef,$sub_set_b) = part { $_ > $a and $_<=$b } @set;
my (undef,$sub_set_c) = part { $_ > $b } @set;
More importantly though, if $a < $b is a precondition here (but only then!), then this is a misunderstanding of the part function and you can reduce these three passes to a single one:
my ($sub_set_a, $sub_set_b, $sub_set_c) = part {
    $_ <= $a ? 0 :
    $_ <= $b ? 1 :
    2
} @set;
If $a < $b doesn’t always hold, then the subsets can overlap, which cannot be produced from a single part pass – so your version is broken and the original is necessary. If it does always hold, then your version is unnecessarily complex.
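The thread's code is Perl, but the argument can be restated in a few lines of Python purely as an illustration. The function names and sample values below are made up; three_passes mirrors the three independent filters and single_pass mirrors the single-pass version.

```python
def three_passes(items, a, b):
    """Three independent filters, as in the three-pass version."""
    return (
        [x for x in items if x <= a],
        [x for x in items if a < x <= b],
        [x for x in items if x > b],
    )


def single_pass(items, a, b):
    """One pass that puts each item into exactly one bucket."""
    buckets = ([], [], [])
    for x in items:
        index = 0 if x <= a else 1 if x <= b else 2
        buckets[index].append(x)
    return buckets


if __name__ == "__main__":
    items = list(range(1, 8))
    # a < b: both approaches agree.
    print(three_passes(items, 2, 5), single_pass(items, 2, 5))
    # a > b: the first and third filters overlap, so the results differ.
    print(three_passes(items, 5, 3), single_pass(items, 5, 3))
```

With a = 5 and b = 3, the values 4 and 5 land in both the first and the third filtered list, which a single pass cannot reproduce because it assigns each item to exactly one bucket.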