February 2012 Archives

On defined(@array) and defined(%hash)

Perl is somewhat broken as language as it autovivifies symbol values when accessing them.

Clarification because this post has technical errors:

The following is from a naive understanding of the hypothetical defined operator as it is known from other computer languages or the C preprocessor. Perl's defined was invented to check for the undef value, but is often and falsely used to check for definedness of a symbol.

My understanding coming from a CS background was that defined should check for the existence of the symbol type slot, without creating the symbol and slot. "This this symbol exists?" This is wrong. To check for the symbol being defined, use exists in the symbol hash, the "stash".

There were horrible errors in the wild which can see not so on CPAN more via google codesearch, where defined wanted to check symbol existence but instead created the symbol. Typical defined(@array) errors have been found in the linux kernel, in openssl, everywhere. So the deprecation warning now with 5.15.7 is a very good thing, it opened my eyes.

====================================

defined is supposed to be non-destructive operator to not autovivify the accessed element.

Originally autovivification should just help accessing hash chains by creating intermediate hash values, e.g. $hash->{parent}->{value} would create the parent key and not fail. The safe version would be defined $hash->{parent} and $hash->{parent}->{value}. You can blame Chip for that design. But it did not stay with hash keys only it went on to symbols.

Also perl went an interesting way and invented exists: exists $hash->{parent} and $hash->{parent}->{value};

This was the next spec problem as defined is supposed to return false if the key exists but the value is undef.

So defined(*sub) got broken also. It will always return true, you have to use defined(*{"sub"}).

The next spec bigger error here just happened with 5.16 as the language police decided that defined(@array) and defined(%hash) is illegal, and will warn.

defined(@array) checks if the symbol *array exists without creating it, and then checks if the AV slot of the symbol is defined, is not empty. It does not check if the @array value itself is empty.

Ok now with 5.16 p5p overruled that if (@array) is semantically the same as if defined(@array), because if does not auto-vivify. How does one now when if does not auto-vivify? Only defined is the keyword which does not autovivify for sure.

Same for defined(%hash) or defined(&sub) or defined($scalar). Just that defined(%hash) and defined(@array) is now forbidden.

Or should defined(&sub) now create the *sub symbol just by looking at it? That would be next design error which I can foresee. defined(&sub) or defined(*{"sub"}) => always true

Note that defined(&sub) or defined(*sub) will return true already as defined(*sub) autovivifies.

Also note that checking for the typeslot is done by checking exists of *symbol{SCALAR}, *symbol{ARRAY}, *symbol{CODE} and so on.

BTW: I just fixed a similar issue in the perl-compiler, where checking a symbol - a scalar in this case - autovifivied it. Fixed by this code

The impact of this change: codesearch for defined

CLARIFICATION

I was completely wrong with my defined'ness assumption as other authors also. They have to fix now their code, because it is invalid.

Yesterday after I wrote this blog post, I talked on the #p5p IRC channel for about 5 hours to defend my wrong position, and got persuaded that I was completely off. Thanks. I analyzed my and other code and found multiple errors in my wrong assumption what defined does and defined does not do.

How did I come to this odd assumption? The docs perldoc -f defined are pretty clear on this.

I mainly do maintenance programming, and in most of my modules defined was used t check for symbol existance, in the symbol table hash. defined $symtab($name} which makes only sense if the hash key would return undef in some cases but it does not. The right code would be exists $symtab($name}. In my case it made no difference because there was never a undef as hash value, but in other cases it was just a hard to detect bug.

defined(@array) and defined(%hash) have been deprecated in the docs since 5.6, but only now enforced with 5.16. It never did anything useful, it just worked by accident.

mst came with a good example when it was broken. If you create an array, and then delete all array elements with delete, the array will be empty, but defined will return true.

$ perl -e'$a[0]=undef; print defined(@a); delete $a[0]; print defined(@a); print "empty" unless @a'

=> 11empty

Thanks #p5p!

Well, and if you are worried to ask questions or complain about 'stupid' decisions on the perl5-porters mailing list where such decisions are made:

Do not worry! The perl community has the thickest skin ever. You might hear some intimidating technical slang you will not understand and which will turn you off. Do not care, the explanation should be simple to understand. Do complain and you might get enlightened.

Sorry for the trouble.

ExtUtils::MakeMaker make release

I often wonder why people praise Dist::Zilla for ease of use. Recently I heard this argument: 'It was never easier to make a release. You cannot do that with EUMM'.

So I here is my little make release snippet from one of my Makefile.PL.

There is more in it. make README, make gcov and make gprof for XS extensions.

sub depend {
  "
README : \$(VERSION_FROM)
    pod2text \$(VERSION_FROM) > README

release : dist
    git commit -a -m\"release \$(VERSION)\"
    git tag \$(VERSION)
    cpan-upload \$(DISTVNAME).tar\$(SUFFIX)
    git push
    git push --tags

gcov : \$(BASEEXT).c.gcov \$(BASEEXT).gcov cover_db/\$(BASEEXT)-xs.html

\$(BASEEXT).c.gcov \$(BASEEXT).xs.gcov : \$(BASEEXT).xs
    \$(MAKE) CCFLAGS=\"\$(CCFLAGS) -fprofile-arcs -ftest-coverage\" \
      LDDLFLAGS=\"\$(LDDLFLAGS) -fprofile-arcs -ftest-coverage\"
    gcov \$(BASEEXT).c \$(BASEEXT).xs

cover_db/\$(BASEEXT)-xs.html : \$(BASEEXT).xs.gcov
    PERL5OPT=-MDevel::Cover make test
    -$^X -S gcov2perl \$(BASEEXT).c.gcov \$(BASEEXT).xs.gcov
    $^X -S cover

gprof :
    \$(MAKE) CCFLAGS=\"\$(CCFLAGS) -pg\" LDDLFLAGS=\"\$(LDDLFLAGS) \
      -pg\"
"
}

The unexpected case of -Mblib

I'm in constant worry of unnecessary bloat because our perls have to run fast. "Bloat" means added dependencies, loading new files at startup, needing more time.

With the perl compiler I can analyze code and dependencies at compile-time and strip unneeded packages.
My recent concerns have been:

1. Carp being used in the DynaLoader AUTOLOAD fallback, when a XS is not loaded give a full stacktrace.
2. Carp checking for B being loaded and checking CALLER_OVERRIDE_CHECK_OK. This outsmarts the compilers which pulls in B at…

About Reini Urban

user-pic Working at cPanel on B::C (the perl-compiler), p2, types, parrot, B::Generate, cygwin perl and more guts (LLVM, jit, optimizations).