For several months I was chasing crazy potion and p2 testing problems, which looked like compiler or stack alignment problems.
potion, the VM for p2, uses tricky volatile declarations to force the compiler to put all GC-able data onto the stack and not into registers, so that the GC only needs to walk the stack to find all reachable data. There is no need to resort to tricks like the Boehm-Weiser libgc does, which spills registers somewhere to be able to track them. The second trick is to keep the stack properly aligned, at least 16 bytes on darwin, but with SSE or AVX instructions or with double return values (e.g. atof) the alignment must be 32 bytes.
But as it turned out, the problem was related neither to a missing volatile, which would have caused GC troubles (i.e. SIGBUS errors), nor to stack alignment (i.e. random data corruption, esp. in the main interpreter).
The first fix was to disable -fstack-protector. I had added it in order to enable all known hardening flags, but stack-protector, esp. -fstack-protector-all, corrupted my manual assembly layout.
The second and final fix was to add .NOTPARALLEL: test to my Makefile. WTF??
It turned out that all tests work fine if I run them with make -j1 test, but start failing consistently at the same place with -j2 or -j4. The cause is my shell-scripted testsuite within the Makefile. When running my for loop over all tests in parallel, I got concurrent reads from the pipes that extract the expected result from each test via sed:
    for f in test/**/*.pl; do \
      look=`cat $$f | sed "/\#=>/!d; s/.*\#=> //"`; \
      for=`p2 --inspect $$f | sed "s/\n$$//"`; \
      if [ "$$look" != "$$for" ]; then \
        echo; \
        echo "$$f: expected <$$look>, but got <$$for>"; \
        failed=`expr $$failed + 1`; \
      else \
        echo -n .; \
      fi; \
      count=`expr $$count + 1`; \
    done; \
    pass=`expr $$pass + 1`
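To make the extraction step of that recipe easier to follow, here is a minimal standalone sketch of what the first sed call does. The sample file and its contents are hypothetical; only the "#=> " marker convention is taken from the recipe. Outside of a Makefile the `$$` doubling is not needed, and single quotes are used so the backslash before `#` can be dropped.

```shell
# Hypothetical test file: the expected result follows the "#=> " marker.
tmp=$(mktemp)
printf '%s\n' 'my $x = 6 * 7;' 'print $x; #=> 42' > "$tmp"

# 1. /#=>/!d      delete every line that does NOT contain the marker
# 2. s/.*#=> //   on the remaining lines, strip everything up to and
#                 including the marker, leaving only the expected value
expected=$(sed '/#=>/!d; s/.*#=> //' < "$tmp")

echo "expected: <$expected>"
rm -f "$tmp"
```

The recipe then compares this string against the actual output of p2 --inspect for the same file.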
Citing the info page:
"Another problem is that two processes cannot both take input from the same device; so to make sure that only one recipe tries to take input from the terminal at once, make will invalidate the standard input streams of all but one running recipe. This means that attempting to read from standard input will usually be a fatal error (a 'Broken pipe' signal) for most child processes if there are several. It is unpredictable which recipe will have a valid standard input stream (which will come from the terminal, or wherever you redirect the standard input of make). The first recipe run will always get it first, and the first recipe started after that one finishes will get it next, and so on." http://www.gnu.org/software/make/manual/html_node/Parallel.html
Marking the test target with .NOTPARALLEL: test makes the tests run serially, and all is fine now. Since all 258 tests run in ~1sec, I don't see the need yet to add parallel testing support.
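The effect can be reproduced with a throwaway Makefile. This is only a sketch, assuming GNU make is installed; the targets a and b are made up for the demonstration and stand in for anything that test depends on. As far as I know, older GNU make versions ignore the prerequisite on .NOTPARALLEL and serialize the whole run, while newer ones (4.4 and later) only serialize the prerequisites of the listed target. In both cases a should finish before b starts, even with -j4.

```shell
# Throwaway Makefile in a temp dir; "target: ; recipe" avoids the need for tabs.
dir=$(mktemp -d)
cat > "$dir/Makefile" <<'EOF'
.NOTPARALLEL: test
test: a b
a: ; @echo a-start; sleep 1; echo a-end
b: ; @echo b-start; echo b-end
EOF

# Ask for 4 parallel jobs; .NOTPARALLEL should still force a to complete before b.
out=$(make -f "$dir/Makefile" -j4 test)
echo "$out"
rm -rf "$dir"
```

Removing the .NOTPARALLEL line and running the same command again should let b's output appear while a is still sleeping.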