How I made my core testsuite 2x faster

There is more than one way to make perl5 twice as fast, but this is what I did today: I fixed it on one machine.

My MacBook Air consistently gives better results in my hash function benchmarks than my big Linux desktop PC, because it has a newer i7 Haswell, while the Linux box only has an older i5 CPU. Both have fast SSDs and enough RAM.

But when I run the perl5 testsuite, the Linux machine is twice as fast: typically 530s vs 1200s. Which is odd and very annoying.

And then I fixed it with one little change.

$ time ./perl -Ilib -MNet::Domain -e'print Net::Domain::hostname()'

real    1m0.151s
user    0m0.028s
sys     0m0.011s

$ hostname
airc

$ sudo hostname airc.local
airc.local

$ time ./perl -Ilib -MNet::Domain -e'print Net::Domain::hostname()'
airc.local
real    0m0.039s
user    0m0.027s
sys     0m0.008s

You see that in the first run Net::Domain::hostname() didn't return a value: it timed out after a full minute. Great for testsuites and benchmarks.

For MacOS the code in Net::Domain just calls hostname, which is fast, and for darwin it calls sys_hostname, which is also fast. So what else is going on? The culprit is in domainname():

sub domainname {

  return $fqdn
    if (defined $fqdn);

  _hostname();

  # *.local names are special on darwin. If we call gethostbyname below, it
  # may hang while waiting for another, non-existent computer to respond.
  if($^O eq 'darwin' && $host =~ /\.local$/) {
    return $host;
  }
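
You can reproduce the hang outside of Net::Domain (a minimal sketch; 'airc' is my old hostname from the transcript above, and the DNS/mDNS timeout is my reading of the comment in the code):

use Socket;

# On darwin, resolving a bare hostname that does not end in .local can
# block in the resolver until the DNS/mDNS query times out.
my $host = 'airc';                 # the old, non-.local hostname
my $addr = gethostbyname($host);   # the call that hangs
print defined $addr ? inet_ntoa($addr) : 'no address', "\n";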

I cannot beat this magic, so I changed the hostname on my laptop. Problem solved. Aargh.

Elapsed: 637 sec (down from ~1200).

On Dave Mitchell calling B and B::C a failed experiment

While being blocked from p5p I had to read a new outrageous statement by a porter, who is in reality not a porter like me, just some developer who happens to have no idea what he is talking about.

Re: OP_SIGNATURE

He struggles with his new super-op OP_SIGNATURE, which on second thought is a better idea than the old way to assign lexical values from the call stack at the beginning of subroutines. It just cannot take the values from the stack directly; it has to go through an intermediate @_ copy. But that is an implementation detail which can be optimized away. He then goes on like this:

Also, every op/optree related internals change now involves fixups to B and Deparse, so every change is that much more work.

If I could travel back in time and stop Malcolm B. writing B and friends, I would in an instant. Perl now would have been far, far better, and probably a lot more truly extensible than it is now.

The realisation that B (and B::C etc) were a failed experiment was one of the drivers of perl6. It's been an albatross round perl5's neck ever since.

Dear Dave, are you completely out of your mind?

I know that you all don't want to maintain your reflection API and AST. p5p was also not able to maintain B::C and the other two compilers, so I had to step up and fix it for them. Even if p5p is unwilling to fix the outstanding problems they create and is still doing more damage than good, B::C is a huge success story.

B::Generate, optimizer and types are not so easy to fix, because there the damage done by p5p is so outrageous that it can only be fixed in a forked version of perl. I still have to maintain patchsets in my perlall build suite (perlall build --patch=Compiler) to be able to create perls without those roadblocks, and to support Windows, because p5p is not willing to export the needed API functions.

Nick Clark really claimed publicly that changing function bodies at run-time is too dangerous when used with concurrent threads. Let that be the problem of the optimizer, not yours. By further blocking dynamic optimizations, B::Generate is worthless, and type- or profile-based optimizations cannot be done. Do you have any idea why javascript, an even more dynamic and worse language than perl, could be optimized so much? Apparently not.
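
For illustration: at the pure-Perl level such a run-time body swap is trivial, and it is exactly the kind of transformation a type- or profile-based optimizer wants to do at the op level via B::Generate (my sketch in plain Perl, not the B::Generate API):

sub add { $_[0] + $_[1] }

# swap in a specialized body at run time, as a dynamic optimizer would:
{
    no warnings 'redefine';
    *add = sub { use integer; $_[0] + $_[1] };
}

print add(2, 3), "\n";   # 5, now computed by the integer-only body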

To the success

B::C passes the complete core testsuite. B::C-compiled code is faster and smaller than uncompiled perl. Look at the B::CC and rperl benchmarks. We are close to v8, and in the next step we will be there.
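
For reference, a minimal compile looks like this (perlcc ships with the B::C distribution; hello.pl is a stand-in name, and the flags selecting the B::CC backend vary by version):

$ perlcc -o hello hello.pl
$ ./hello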

B::C is successfully used in production by cPanel, which is in fact the largest and most successful company using perl. We just don't shout it out as loud as Booking.com, because we are privately held and we don't need to publish our numbers. Nevertheless cPanel is de facto one of the backbones of the internet, used by ca. 70% of all webhosters worldwide, with compiled perl applications and its distribution based on CentOS.

We have to compile perl to have a low memory footprint for our daemons. They need to be smaller than apache and mysql at least, and they need to run on hosting VMs which are low on memory.

Even with p5p's inability to come up with non-bloated releases, and their inability to come up with any improvements since 5.6, B::C is a huge success.

Dave and Nick, you have been the real albatross around perl5's neck for years. Finally stop doing your destructive work, step back and let the people do the work who have a track record and an idea what they are doing. You both got paid for years to improve perl5, and the results are hopeless. One year to fix eval in regex? My dear.

Still, not a single feature written and discussed by p5p was ever successful, besides the trivial defined-or syntax. The only non-trivial improvement in the last years came from outside and was initially heavily criticized and not understood. ("Why do we need another hash table implementation?") But will this eventually lead to an efficient implementation of classes, roles (mixins), polymorphism and types? Or a better runloop? For sure not. This has been shot down since 2002, and everybody who was able and interested left p5p. Not to mention fixing the easy stuff like smartmatch, switch, given/when or even hash tables. You just gave up.

I have to do my work now behind closed doors.

Stop doing your destructive work, start listening to advice, and maybe even implement a good feature or library. Like OP_MULTIDEREF, which is nice, even if it just papers over the fact that the runloop is too big and too slow.
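
You can watch that compression with the core B::Concise module:

$ perl -MO=Concise,-exec -e '$x = $h->{foo}[0]{bar}'

On 5.22 and later the whole chain shows up as a single multideref op, where older perls emit a chain of rv2hv, helem, rv2av and aelem ops.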

Without B perl5 would have been more successful?

What a ridiculous statement. Inspecting the AST, the optree, how perl compiles its ops? Certainly totally outrageous. Nobody would need that. You even refused to accept documentation for the optree. You are just bitching about your need for B::Deparse and B::Concise test updates. Who needs precise optree representations? It's just an implementation detail. The functionality needs to be tested, not how it looks internally.

But maybe you'll eventually learn what the optree (i.e. the AST) looks like when you are forced to update some B modules. I'll do the rest for you anyway, for the parts you do not understand.

Realize that your work looks worse than PHP.

On OP_SIGNATURE

Since it is not possible to write p5p criticism to the mailing list, I'll have to do it on my blog. @p5p: think over your guidelines. I don't believe that stuff like this should need to be blogged.

DaveM now introduced a new OP_SIGNATURE which assigns run-time args according to the compiled signature.

It basically speeds up these compiled checks:

    sub f ($a, $b = 0, $c = "foo") {};

=>

    sub f {
        die sprintf("Too many arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @_ <= 3;
        die sprintf("Too few arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @_ >= 1;
        my $a = $_[0];
        my $b = @_ >= 2 ? $_[1] : 0;
        my $c = @_ >= 3 ? $_[2] : 'foo';
        ();
    }

into one OP which does the same, similar to the new MULTIDEREF: moving op chains into C. DaveM is now the go-to guy for the big DWIM ops, compressing previous op chains into a single one.

This is far too much for a single op, but previously it was all handled either in ENTERSUB or in user code.

The arity checks should be done in the call op (ENTERSUB), and the local assignments should be separate ops. We still have no syntax support for types which could previously be used in user code (my int $i = $_[0];), and now we even need type hooks. These can also be added after the SIGNATURE op, but make not much sense there, as side effects would appear too early. XS calls do not need the checks, as they do their own, but XS calls are easily detected in ENTERSUB.

The assignment to local lexicals is now buried in this single OP, which makes it impossible to change the currently only supported call-by-value to the faster call-by-reference. I.e. there's still no support to declare call by reference:

    sub myinc (\$a) { $a++ };  my $i = 0;  myinc($i);  print $i;   # => 1

so you still have to use $_[0] directly, which means @_ still needs to be filled with all args. That makes every signature usage twice as slow as a normal call without a signature declaration: the args are copied once for @_ in ENTERSUB, and a second time for the named args in SIGNATURE. So this new OP basically just hides this slowness by design (blame Zefram for this idea) by bypassing the normal ops which assigned the locals.
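
You can measure the double work yourself with the core Benchmark module (a sketch; requires 5.20+ for signatures, sub names are my own, and the exact numbers vary by version):

    use feature 'signatures';
    no warnings 'experimental::signatures';
    use Benchmark qw(cmpthese);

    sub sig_add   ($a, $b = 0) { $a + $b }               # @_ filled AND lexicals assigned
    sub plain_add { my ($a, $b) = @_; $a + ($b // 0) }   # one unpack out of @_

    cmpthese(-2, {
        sig   => sub { sig_add(1, 2) },
        plain => sub { plain_add(1, 2) },
    });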

Any optimizing compiler now needs to replace this new SIGNATURE op and cannot work on the optree. Fine, it cannot be used as is anyway.

Compiled or run-time polymorphism (dispatch on argument types) now needs to replace SIGNATURE and not ENTERSUB. There's not much difference; both are horrible ops to work with. SIGNATURE is probably easier to replace, but replacing ENTERSUB had its advantages, leaving out all the unneeded recursion, @_ and debugger code. So basically you now have to replace both.

Of course there's still no type support, and still no return type declaration syntax, though it seems the post-declaration attribute list can now be used, as :const is now supported, but for anon subs only.
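
For the record, this is what the new :const does on 5.22 (the body runs once when the sub expression is evaluated, and the value is frozen; anon subs only):

    no warnings 'experimental::const_attr';

    my $x = 54;
    my $f = sub : const { $x };   # executed immediately, result captured
    $x = 0;
    print $f->(), "\n";           # still 54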

So real subs can soon look like:

    sub myinc (int \$a) :int { $a++ }

and you can use the faster i_ops for the result, and, since it's a reference, for the lifetime of the caller's variable. Just don't expect that from p5p in the next 5 years. All the other dynamic languages (python, ruby, php, javascript) announce such features officially; I have to implement them in private.

p5p still has no idea what they are doing, but will probably announce it as a great breakthrough, as they did with the Zefram signatures before. Which is somewhat funny: announcing the worst of all existing signature implementations as a positive. People bought that, so it will work now too.

So far the biggest breakthroughs lately were, besides the new fast METHOD ops (THANKS!), going for the :const attribute for subs (THANKS!), so the other syntax possibilities (=> type, or perl6-like returns type or is ro) are now very unlikely to appear. This was a great decision, even if it was made unconsciously, and I can finally go forward.

A little warning to EUMM and shell-script users

I sometimes need to write test scripts in shell, not perl, to be able to test perl scripts without interfering with perl, and also for performance and easier IO.

In order to find out which perl a distro was built with, we need to parse the generated Makefile.

Recent EUMM 7.0x introduced a new feature which broke all my scripts: it started double-quoting PERL and FULLPERL in the generated Makefile. The damage is already done. The only thing you can do is to strip the quotes:

PERL=`grep "^PERL =" Makefile|cut -c8-`
PERL=${PERL:-perl}
PERL=`echo $PERL|sed -e's,^",,; s,"$,,'`

They obviously were afraid of spaces in Windows paths. But only cmd.exe accepts a double-quoted "cmd"; no other shell does. So the obvious fix would be to add the double quotes on Win32 only, and only if a space appears in the name or the path. The same as we have to do with $^X in system calls, where we have to double-quote $^X explicitly in string context. Like with:

my $X = $^X =~ / / ? qq("$^X") : $^X;  system("$X ...");

Initial feedback to the maintainers was not positive; they don't care. EUMM needs to write Makefiles, nothing else. The second reply was: just use sh -c $PERL $args. Yeah. Exactly.

So I fear the toolchain is also starting to rot now, with the newbies taking over. Test::Builder is also in great danger with a newbie maintainer: the initial trials were twice as slow, to be able to support streaming. Given that p5p has similar technical problems, it doesn't look too good for 5.2x being usable soon. I'm still forced to use 5.14.4.

Let's just hope CPAN will not get new maintainers.

My fix: https://github.com/rurban/perl-compiler/commit/16379cf29cbffdf8ffce9d0822af0548cfb65051

The sad story of pseudohash criticism

I just had to endure MJD’s horrible pseudohash explanation at the Pittsburgh Workshop. “A new, never-before-seen talk on Perl’s catastrophic experiment with “pseudohashes”, which wasted everyone’s time for nine years between 1998 and 2007”

https://www.youtube.com/watch?v=-HlGQtAuZuY

Watch it; you can fast-forward through it. I honestly had a higher opinion of Mark-Jason.

So let’s see what’s wrong with the popular and uninformed pseudohash criticism:

The main points are that storing a hash in array slot 0 for run-time lookup is too complicated, that the exists and delete ops now also need to check for arrays / pseudohashes, and that all the pseudohash checks slowed down general hash usage by 15%. Which basically levelled the ~15% advantage of the faster compile-time accelerated array lookup on those pseudohashes.

package Critter;
use fields qw(NAME TYPE);

my Critter $h;  # compile-time optimization: href NAME => aref 1
$h->{NAME};     # ==> $h->[1]

but:

$key = "NAME";  # defer to run-time lookup of href in aref 0
$h->{$key};     # ==> $h->[ $h->[0]->{$key} ]
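
For readers who never saw one: a pseudohash was literally just this plain-perl data structure (a sketch; the field names are from the Critter example above):

# slot 0 holds the field-name => index map, the data starts at slot 1
my $h = [ { NAME => 1, TYPE => 2 }, 'Fido', 'dog' ];

print $h->[ $h->[0]{NAME} ], "\n";   # run-time lookup  => Fido
print $h->[1], "\n";                 # compiled lookup  => Fido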

So by allowing the slow run-time access, you need to preserve the hash semantics of the array. Still, the compiler knows the type of $h, and can compile the access to a href $key lookup in aref 0.

Same problem with exists and delete.

exists $h->{NAME} is compile-time constant foldable to YES or NO.

delete $h->{NAME} needs to store a sentinel in aref 1, as with hashes. This only slows down aref for pseudohashes, but should not slow down href, or aref for normal arrays.
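
The sentinel idea in plain perl (my illustration of the proposal, not what perl5 actually did):

my $DELETED = \my $dummy;   # a unique address that real data cannot collide with
my $h = [ { NAME => 1, TYPE => 2 }, 'Fido', 'dog' ];

# delete $h->{NAME}: mark the slot instead of restructuring the array
$h->[ $h->[0]{NAME} ] = $DELETED;

# exists $h->{NAME}: compare against the sentinel
print "NAME exists\n"
  if defined $h->[1] && !(ref $h->[1] && $h->[1] == $DELETED);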

Of course this was not how it was implemented. In good old perl5 fashion $h was kept as a hash, and all the hash ops were extended to check for pseudohashes at run-time. Yes, at run-time, in the ops.

What should have been done instead was either to reject the pseudohash optimization when a run-time key was parsed, maybe with a warning under use warnings.

Or, if you really don’t want to punish the bad behaviour of mixing computed keys with explicitly requested compile-time keys, compile $h to an array and not to a hash.

As I said before, perl5 is just badly implemented, but still fixable. use fields could still be a hint for the compiler to change those hashes to arrays.

Just don’t expect anything from the current maintainers and the old-timers.

Horrible talk.