Updated parrot and rakudo releases for cygwin

I've updated the cygwin packages for parrot, rakudo and rakudo-star.

There were a couple of test failures and changes.

See the patches and build specs here:
http://code.google.com/p/cygwin-rurban/source/list
and bugs here:
https://rt.perl.org/rt3/Ticket/Display.html?id=112740
https://rt.perl.org/rt3/Ticket/Display.html?id=112742
https://rt.perl.org/rt3/Ticket/Display.html?id=112744
https://rt.perl.org/rt3/Ticket/Display.html?id=112746

rakudo-star make test is broken. You'd need to install it first, and then you…

Big module sizes

I measured the size of some big modules, with B::Stats and B::C, statically compiled.

Interestingly, Module::Build is much harder to compile (you'd need at least 6GB RAM)
and ends up much bigger than Moose, which compiles and compresses really well.

Modules

Module::Build

perl -e'use Module::Build; print q(k)'

ops68…

My definition of "stable"

I always catch myself saying, "No, not yet stable enough." I cannot release it, even though it passes all tests. You can interpret that as: the tests suck. Not enough coverage, bad testcases, ...

Well, that is always the case. You can never have enough tests. The problem is that in my case, the compiler, testing costs a lot of time. A LOT of time! I usually spend a week on the final release testing, but more often it lasts several weeks, because one round of test results influences the decisions about TODO, SKIP and mandatory PASSing tests, and then I'll redo the tests. On all versions, on all platforms. You could rely on cpantesters to do that for you, but it is better to test the most common combinations on your own. That's why I use perlall with a few hundred perls.

But with passing tests you always have your definition of TODO tests. A todo test is always a sign of instability. "Sometimes it works, but not always". Or "It used to work, but it is not so important if it fails". Or "It used to fail, but somehow it looks like I fixed it now. But I'm not so sure".

But over the last years I came to a completely different definition of "stable". I call my app stable,

  1. if the testsuite passes, AND

  2. if small innocent changes to the source create expected results.

That means after a passing testsuite I always play around with the code a bit, making minor improvements or testing new features, and only if the results come out as expected will I call it stable enough. Only then can I trust my code.

I was often bitten by the "Action at a distance" anti-pattern. Very often minor changes caused something completely unrelated to fail. E.g. loading another module suddenly broke something which had always worked, for no apparent reason.

E.g. a concrete perl example: PL_regex_padav is only relevant for threaded perls, holding the REGEXP bodies of stored qr// SVs. 5.8.1, 5.10 and then 5.16 changed the internal implementation of the PL_regex_padav offsets. 5.16 failed in the C compiler, although the same fix, which looks sane, fixed the problems in the Bytecode compiler. The Bytecode compiler is much simpler than the C compiler, and big implementation changes always cause synchronous changes in both: if you fix something in the Bytecode compiler, you have to do the analogous fix in C. But in C suddenly everything fell apart. The good thing is that this happens only in threaded perls > 5.15, so the errors are expected and isolated. The fix is just not right yet.

Fixing compiled C code is always easy: debug into it with gdb, one session native and one parallel session compiled, find a proper breakpoint and then compare the state. The reason why C failed could be related to something completely different. In C the PL_body_arenas were empty, and when they were initialized as a side effect of the added fix (analogous to the Bytecode fix), the whole PL_regex_padav array fell apart. gdb hardware watchpoints to check who is writing to it did not work.

Okay, you could say the fix was just not good enough. It works in Bytecode by accident but not in the general case. But in my experience this does not sound right. Something else, not yet understood, is going on. So I'm calling it unstable.

The same applies if one fix for a 5.10-5.14 non-threaded case causes changes in threaded code for no apparent reason.

Executive summary: With big complicated apps, even after a passing testsuite and passing QA, either let QA play with it for some time or, better, play with it yourself and see how it behaves. One additional week is always well spent.

Compiler progress with 5.16

The latest B::C release on CPAN, 1.42, is stable for almost all perls up to 5.14, but so far has not worked well enough for the upcoming 5.16 release.

I couldn't even pinpoint a specific perl change which caused the problems. I know that hashes need a different initialization now. Empty hashes need to declare

HvTOTALKEYS(hv) = 0

after creation, and readonly hashes must be set readonly after they were created.

DynaLoader and %INC handling is much stricter now with 5.16.

And then there is the PMOP pmoffset IV hack. We have now first-class REGEXP SVs, but not really. With threads we still store the regexp body in PL_regex_padav, not as normal body in an arena, and the parser stores the latest pmoffset (I think) behind the PV in PL_regex_pad[0]. Thing is, the compiler does not use head and body arenas yet, it uses static arrays. But when the first body arena for some dynamic SV is initialized, the PL_regex_padav is reset.

All these changes forced me to rewrite the tricky parts of the compiler, the recursive walker to detect used packages and objects.

In my desperation I added checkers to detect the name of objects for method calls, to check bless for used object names, to associate blessed and new scalars with a method call, and to check bareword require for dynamically included packages. The ISA search is now recursive and tries a lot of candidates to find unknown objects for method calls, with proper AUTOLOAD and UNIVERSAL fallback.

This increased the compile time dramatically, but it must be correct and should include all possible used packages. Or if not, at least the %INC hash must be correct to allow dynamically added packages to be loaded properly at run-time. Possibly with DynaLoader.

Still, 5.16 did not pass the most tricky tests and production code, which all worked fine with 5.14 back in November.

Lately I fought deep recursion troubles when compiling some recursive functions in tricky package inclusion scenarios, mainly Moose and recursive Pod::Simple functions. Either the compiler missed some functions, stored some packages only halfway, like Encode (UTF8 did not work), or the walker ran into deep recursion. So I went the old-school way and removed the overly recursive functions from the walker.

First I disabled recursing into op->first, because the B function walkoptree already steps into op->first.

Second I disabled walking into newly found packages immediately and just mark them as new. The main walker now loops over all packages again and again until the list of newly found packages in a loop is empty. This adds costly compiler passes, and certainly increases the size of the produced binaries, but it is better to load such code compiled up front than to defer it to later.
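
The mark-and-loop scheme can be sketched in plain C. This is only a toy model under stated assumptions: the package table `uses` and the function name `collect_packages` are made up for illustration, and the real walker in B::C inspects optrees and stashes rather than a static table.

```c
#include <assert.h>
#include <stdbool.h>

#define NPKG 5

enum pkg_state { UNSEEN, NEW, WALKED };

/* Hypothetical dependency table: uses[i][j] is true when walking
 * package i finds a reference to package j.
 * 0 = main, 1 = A, 2 = B, 3 = C, 4 = a package nobody uses. */
static const bool uses[NPKG][NPKG] = {
    {false, true,  false, false, false},  /* main uses A            */
    {false, false, true,  false, false},  /* A uses B               */
    {false, true,  false, true,  false},  /* B uses A (cycle) and C */
    {false, false, false, false, false},  /* C uses nothing         */
    {false, false, false, false, false},  /* unused                 */
};

/* Fills included[] with every package reachable from `start` and
 * returns how many were included. Newly found packages are only
 * marked NEW; they are walked in a later pass, never recursively. */
int collect_packages(int start, bool included[NPKG])
{
    enum pkg_state state[NPKG] = { UNSEEN };
    int count = 0;

    assert(start >= 0 && start < NPKG);
    state[start] = NEW;

    for (;;) {
        bool todo[NPKG];
        int ntodo = 0;

        /* Snapshot the packages found in the previous pass. */
        for (int i = 0; i < NPKG; i++) {
            todo[i] = (state[i] == NEW);
            ntodo += todo[i];
        }
        if (ntodo == 0)
            break;                      /* nothing new: fixpoint reached */

        for (int i = 0; i < NPKG; i++) {
            if (!todo[i])
                continue;
            state[i] = WALKED;
            count++;
            for (int j = 0; j < NPKG; j++)
                if (uses[i][j] && state[j] == UNSEEN)
                    state[j] = NEW;     /* mark only, walk next pass */
        }
    }

    for (int i = 0; i < NPKG; i++)
        included[i] = (state[i] == WALKED);
    return count;
}
```

Because each package is walked at most once and nothing recurses, the cycle between A and B cannot blow the stack; the price is one extra pass per dependency level.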

https://github.com/rurban/perl-compiler/compare/recurse

address-sanitizer round 2

For the upcoming 5.16 I decided to check our code again with address-sanitizer, Google's open-source memory checker.

In the first round address-sanitizer was still a bit immature, and I had to use a blacklist for false positives. With the current versions all the false positives have been fixed, and clang 3.1 has address-sanitizer included.

address-sanitizer is a memory checker similar to mudflap, but superior to valgrind, coverity and others. It catches more error types, esp. invalid accesses to globals and to stack addresses, use after free and use after return. It does so by using shadow memory maps for all pointers and instrumenting the accesses. This is fast, but needs more memory. ASan (short for address-sanitizer) hashes the pointer maps, so it needs much less memory than a full old-style fence checker, which uses insane amounts of memory. valgrind can be used to catch memory leaks (e.g. if you are writing daemons) but should not be used to catch pointer errors.
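
To make the shadow-memory idea concrete, here is a toy model in C. It assumes the scheme from the AddressSanitizer design notes: one shadow byte describes 8 application bytes, 0 means all 8 are addressable, k in 1..7 means only the first k are, and a negative value means poisoned. The arena and all `toy_*` names are invented for this sketch; the real tool maps whole process addresses and inserts the check at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE   256                        /* toy application memory */
#define SHADOW_SCALE 3                          /* 8 app bytes per shadow byte */
#define SHADOW_SIZE  (ARENA_SIZE >> SHADOW_SCALE)

static int8_t shadow[SHADOW_SIZE];

/* Mark the whole arena as unaddressable (one big redzone). */
void toy_poison_all(void)
{
    memset(shadow, -1, sizeof shadow);
}

/* Make [off, off+len) addressable; off must be 8-byte aligned. */
void toy_unpoison(size_t off, size_t len)
{
    assert(off % 8 == 0 && off + len <= ARENA_SIZE);
    size_t s = off >> SHADOW_SCALE;
    for (; len >= 8; len -= 8)
        shadow[s++] = 0;                        /* all 8 bytes valid */
    if (len)
        shadow[s] = (int8_t)len;                /* only first len bytes valid */
}

/* The check an instrumented load or store would perform.
 * size is 1..8 and the access must not cross an 8-byte granule. */
bool toy_access_ok(size_t off, size_t size)
{
    assert(size >= 1 && (off & 7) + size <= 8 && off + size <= ARENA_SIZE);
    int8_t k = shadow[off >> SHADOW_SCALE];
    if (k == 0)
        return true;                            /* whole granule addressable */
    if (k < 0)
        return false;                           /* redzone or freed memory */
    return (int)((off & 7) + size - 1) < k;     /* partially addressable */
}
```

For example, after `toy_poison_all(); toy_unpoison(16, 13);` an access at offset 28 passes while offset 29, the first byte past the 13-byte object, is rejected.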

It slows code down by a factor of 1.5-2, whereas valgrind is 10-20x slower. It can compile code with full optimizations and without -g. I use asan with -DEBUGGING, similar to my usage of -DDEBUG_LEAKING_SCALARS. Using valgrind hurts; using asan is straightforward and transparent, you are not hurt by it at all. Note that Perl was checked extensively with coverity and valgrind before.

In the first attempt about 20 problems were identified. The very first turned out to be a real bug; the rest looked like false positives, but I was not sure about that.

In the second round I have found 4 core bugs, and I'm now checking CPAN XS code. I'm checking blead with DEBUGGING, threaded and unthreaded.

Core: No invalid stack access, only globals and even one heap bug, undetected by valgrind.

  • #111594: 0ffb95f Socket.xs heap-buffer-overflow with abstract AF_UNIX paths

  • #111586: sdbm.c: fix off-by-one access to global ".dir"

  • #72700: Copy&paste List::Util BOOT bug, reading global past 2 bytes

  • #111610: XS::APItest::clone_with_stack heap-use-after-free on PL_curcop (without patch, I guess it's just a bad test)

Build instructions

See http://code.google.com/p/address-sanitizer/wiki/HowToBuild

cd ~
svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
cd llvm
R=$(svn info | grep Revision: | awk '{print $2}')
# I tested with r152121 and r152199
(cd tools && svn co -r $R \
   http://llvm.org/svn/llvm-project/cfe/trunk clang)
(cd projects && svn co -r $R \
   http://llvm.org/svn/llvm-project/compiler-rt/trunk compiler-rt)
mkdir build

Move away any existing clang, clang++ and llvm-gcc as older versions will not be able to compile llvm.

(cd build && ../configure --enable-optimized && make -j 10)

# Build and test asan run-time library
cd projects/compiler-rt/lib/asan/
make -f Makefile.old get_third_party

This might need to patch the shebang of the python tools from /usr/bin/python2.4 to /usr/bin/python

make -f Makefile.old test -j 10
# Install clang and asan run-time into a separate directory
# ../asan_clang_linux
make -f Makefile.old install

cd <perl-git>

patch perl with my three asan fixes:

  • 111594: 0ffb95f Socket.xs heap-buffer-overflow with abstract AF_UNIX paths
  • 111586: sdbm.c: fix off-by-one access to global ".dir"
  • 72700: Copy&paste List::Util BOOT bug, reading global past 2 bytes

build perl with:

./Configure -de -Dusedevel -DEBUGGING -Doptimize=-g3 \
  -Dcc=~/llvm/projects/compiler-rt/lib/asan_clang_linux/bin/clang \
  -Accflags=-faddress-sanitizer -Aldflags=-faddress-sanitizer \
  -Alddlflags=-faddress-sanitizer

CPAN samples

Interestingly, the first CPAN errors were all in modules maintained by me. B-Flags-0.06 and B-Generate-1.44 are the new fixed versions. Yes, I'm a lisp programmer :), but in my defense, the wrong code was not written by me.

B-Generate

static SV *specialsv_list[6];
...
specialsv_list[6] = (SV*)pWARN_STD;  // asan warned here

Fix:

static SV *specialsv_list[7];

=================================================================

==27781== ERROR: AddressSanitizer global-buffer-overflow on address 0x2b176c37a730 at pc 0x2b176c2833a7 bp 0x7fff58c4dfb0 sp 0x7fff58c4dfa8

WRITE of size 8 at 0x2b176c37a730 thread T0

0x2b176c37a730 is located 0 bytes to the right of global variable 'specialsv_list (Generate.c)' (0x2b176c37a700) of size 48
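
The numbers in the report follow directly from the declaration: six pointers of 8 bytes each make the 48 bytes asan mentions, and index 6 starts at byte offset 48, i.e. "0 bytes to the right" of the array. A minimal sketch of that arithmetic plus a bounds-checked store, using plain `void *` instead of `SV *` so it compiles without the perl headers; `specials`, `slot_offset` and `store_special` are hypothetical names, not part of B-Generate:

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Stand-in for the fixed declaration: 7 slots, valid indices 0..6. */
static void *specials[7];

/* Byte offset at which slot idx of a pointer array starts. */
size_t slot_offset(size_t idx)
{
    return idx * sizeof(void *);
}

/* Store sv at idx only if idx is inside the array.
 * Returns 0 on success, -1 if the index is out of bounds. */
int store_special(size_t idx, void *sv)
{
    assert(sv != NULL);
    if (idx >= ARRAY_LEN(specials))
        return -1;
    specials[idx] = sv;
    return 0;
}
```

With the old size of 6, `slot_offset(6)` equals the size of the whole array, which is exactly the first byte past its end.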

B-Flags

==19039== ERROR: AddressSanitizer heap-buffer-overflow on address 0x2b0995e2d47f at pc 0x2b09965e709f bp 0x7fff150fcfb0 sp 0x7fff150fcfa8

READ of size 1 at 0x2b0995e2d47f thread T0

The fix is here:

-        if (*(SvEND(RETVAL) - 1) == ',') {
+        if (SvCUR(RETVAL) && (*(SvEND(RETVAL) - 1) == ',')) {

RETVAL was an empty string in this case, and checking SvEND - 1 for a zero size string is invalid.
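
The same guard can be shown on an ordinary C string, without the perl API. This is a sketch under the assumption that `strlen` plays the role of `SvCUR` and `buf + len` the role of `SvEND`; the function name `strip_trailing_comma` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove one trailing comma from buf in place and return the new length. */
size_t strip_trailing_comma(char *buf)
{
    assert(buf != NULL);
    size_t len = strlen(buf);           /* plays the role of SvCUR */

    /* Test the length first: for len == 0, buf[len - 1] would read the
     * byte in front of the buffer, the kind of access asan reported. */
    if (len > 0 && buf[len - 1] == ',')
        buf[--len] = '\0';

    return len;
}
```

Thanks to short-circuit evaluation the byte in front of the buffer is never touched for an empty string, which mirrors the `SvCUR(RETVAL) &&` check in the patch.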

I submitted a YAPC::US talk proposal titled "Forget valgrind, use address-sanitizer"

Tuning tips

Compared to other big apps like chromium and mozilla, Perl is a malloc hog. The measured slowdown is 3x, compared to 2x for the others.

Kosta (asan dev.): "According to our measurements 400.perlbench suffers the greatest slowdown from asan compared to all other benchmarks -- 2.5x.

And that is measured with stack unwinding turned off (i.e. heap-related warnings will be reported w/o malloc/free stack traces). With stack traces enabled the 400.perlbench's slowdown is way over 3x.

This is easy to explain -- perl is very malloc intensive, allocates small chunks and mostly reads single bytes. You may want to try env var ASAN_OPTIONS=malloc_context_size=1 to speed up things."