And the fastest OO accessor is...
There's a lot of FUD out there about the performance of various OO modules, particularly Mouse. So let's set it straight with some benchmarking.
I've chosen to simulate that bugaboo of OO performance in Perl, the simple accessor: the one you're going to call millions of times and that you'll be sorely tempted to reimplement with a hash, or tear all the argument checks out of, for "performance". To make it a little more realistic, I'm benchmarking both getting and setting, and including a simple argument check.
Each benchmark gets or sets an integer once via an accessor (except in the case of the plain hash). The object is created outside the benchmark to avoid distortion.
Each accessor validates that the argument is an integer.
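Roughly speaking, each timed case boils down to something like this (a sketch rather than the actual script; Benchmark::cmpthese and the class name are just for illustration):

use Benchmark qw(cmpthese);

my $obj = Foo::Moose->new;    # constructed outside the timed code

cmpthese(-3, {
    get => sub { my $x = $obj->bar },
    set => sub { $obj->bar(42) },
});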
I picked the following modules because they are either popular or make performance claims:
- Mouse, both XS and pure Perl
- Moose
- Moo, with and without Sub::Quote
- A plain hash
- A class written by hand (aka "manual")
I also threw in Object::Tiny and Object::Tiny::XS, even though the comparison isn't really fair. They're read-only and do no argument checks. They sacrifice everything for performance. Let's see if it's worth it.
Finally, I did a hash with no argument checks, just to get an upper bound.
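For the hash cases, the benchmark body was roughly this shape (again a sketch; the exact integer check may differ from the real script):

# "hash": direct hash access, with an integer check on set
die "not an integer" unless $val =~ /\A-?\d+\z/;
$hash{bar} = $val;
my $got = $hash{bar};

# "hash, no check": the same set without validation
$hash{bar} = $val;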
I won't clog this post with the full benchmark code, but as an example, here's what the Moose class looks like:
package Foo::Moose;
use Moose;
has bar => (is => 'rw', isa => "Int");
__PACKAGE__->meta->make_immutable;
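For comparison, the hand-written class ("manual") was roughly this shape (a sketch; the real one may differ in details such as the exact integer check):

package Foo::Manual;

sub new { return bless {}, shift }

sub bar {
    my $self = shift;
    if (@_) {
        my $val = shift;
        die "bar must be an integer"
            unless defined $val && $val =~ /\A-?\d+\z/;
        $self->{bar} = $val;
    }
    return $self->{bar};
}

1;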
Results
And the results, from slowest to fastest getter with the cheaters at the bottom.
Name               Get     Set
-------------------------------------
Moo                -11%     -4%
manual               0%      0%
Mouse, no XS         0%    -33%
Moo w/quote_sub      5%    -46%
Moose                8%    -40%
Mouse              145%    468%
manual, no check     0%    153%
Object::Tiny        20%     n/a
Object::Tiny::XS   226%     n/a
hash, no check     279%  1,116%
hash               289%    147%
That's with Perl 5.12.2 on OS X using the latest versions of all those modules as of this writing (Moose 1.24, Mouse 0.91, Moo 0.009007, Object::Tiny 1.08, Object::Tiny::XS 1.01).
The percentages are the % improvement over a hand-written class (manual); negative numbers are slower than manual.
You can get all the data as a CSV.
Conclusions
The go-to module is clearly Mouse with XS. The power of Moose, but faster than everything equivalent, faster than writing it by hand, faster (on setting with equivalent validation) than a raw hash, and no required dependencies. Unless you need the meta capabilities of Moose, there's little reason to use anything else.
If you want a read-only object with no argument checks, use Object::Tiny::XS, hands down.
Object::Tiny, on the other hand, is pretty poky; significantly slower than Mouse with XS. Given its extreme limitations, if you can use XS then there's no point to Object::Tiny.
Moose and pure Perl Mouse stack up about the same and fare well against a hand-written class, though they lose out on setting. You should be getting far more often than setting, so you won't see much performance gain from hand-writing methods.
The big surprise is the newcomer, Moo... but not in a good way. It's "an extremely light-weight, high-performance Moose replacement" but the numbers don't stack up. Absolutely creamed by Mouse with XS, it's the slowest getter of them all with a slight edge in setting.
Another surprise is Moo's quote_sub. This is supposed "to create coderefs that are inlineable, giving us a handy, XS-free speed boost." The Moo docs suggest using them for "isa" type checks, so I did. The numbers don't pan out; it's worse than a regular sub. I'm going to assume it's a bug, and I've filed a report. Curiously, using quote_sub on isa made the getter faster... which makes no sense.
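For reference, the quote_sub variant looked roughly like this (a sketch; the exact check used in the benchmark may differ):

package Foo::Moo::QuoteSub;
use Moo;
use Sub::Quote qw(quote_sub);

has bar => (
    is  => 'rw',
    # Moo's isa takes a coderef that dies on bad input; quote_sub makes it inlineable
    isa => quote_sub(q{ die "not an integer" unless $_[0] =~ /\A-?\d+\z/ }),
);

1;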
The final conclusion is this: performance is no longer an excuse for not using OO. Accessors are what will be called most often and they're what tempts programmers to micro optimize and throw out their abstractions. Mouse and Object::Tiny::XS can blow the doors off what you can write by hand. The performance improvements in Mouse show that abstraction not only makes the code cleaner, but it allows radical optimizations which you can have just by upgrading to a new version.
Bias
Every benchmark has its biases. Here are the ones in this one that I can identify.
Of course the version of Perl, operating system and versions of modules all matter. One particularly large gap will be in Mouse. Its XS optimizations are a fairly recent thing, so older versions of Mouse will not perform well.
The benchmark uses a very simple type check, an integer. I chose that because it's what an awful lot of methods are going to use. Because it's so simple and common, Moose, and especially Mouse, have a clear opportunity to optimize for it. A less common or more complicated type check might have impacted performance more. A string might produce different numbers.
Finally, I really like Mouse. :-)
Update
Corrected my statement about Mouse being faster than a raw hash. That only happens when setting with validation. When I originally did the benchmarks they combined setting and getting.
Added a wider conclusion about the role of performance in choosing an OO system.
You should check App::Benchmark::Accessors - its tests output a benchmark table, so you can see the benchmarks on many Perl versions and OSes. For example: http://www.cpantesters.org/cpan/report/cd1a6aa6-2a9b-11e0-aaf8-39cd950acfcc
Quote from tsee (he can't log in right now - adjectives altered to make sense in context here):
Wrong conclusion. Faster than a raw hash? Only if you plan to do parameter validation. Seriously, if you do, pick another language OR don't worry about this micro-optimization.
You missed out on the fastest implementation of non-argument-checking accessors available. By proxy, you happened to use part of it via Object::Tiny::XS and surprise, for what Object::Tiny::XS does, it is the fastest.
You should take a look at Class::XSAccessor or Class::XSAccessor::Array.
Thank you for these tests! As it happens I've gotten used to the Moose meta object stuff. Perhaps I'll wean myself off it to gain the Mouse speed, someday.
What demerphq said - Class::XSAccessor deserves inclusion.
I would like a mutation of Class::XSAccessor that calls out to Perl for its setting but stays in XS for getting; I wonder if it exists.
quote: "There's a lot of FUD out there about the performance of various OO modules, particularly Mouse. So let's set it straight with some benchmarking."
I thought it was now common wisdom that the overall run-time performance of most applications is generally not limited by the object system, but by external IO, design decisions, etc.
You didn't link to or mention which FUD you are responding to, so I'm not sure exactly what you are straightening out. However, I think your narrow focus on accessors ignores the use cases which are not dependent on accessor speed for their "performance."
Command-line apps and one-shot CGI scripts are more sensitive to "startup cost" for example. How about a follow-up post looking at that? And for completeness one could also investigate the relative time for getter/setter calls against DBI calls, or system calls, or ...
Perl 5.12.2? I am so out of date!
I don't see where you get the idea that Mouse with XS is faster than a "raw hash", whatever that means (hash ref? inside out objects?). With no check on assignments, raw hash appears to be twice as fast as Mouse XS. (279 vs 145, 1116 vs 468)
I updated the blog. The "faster than a raw hash" assertion was not quite accurate; it only holds for setting with validation. It came from when I was benchmarking getting and setting together.
I've also added a wider conclusion: micro-optimization is no longer a valid reason to throw out OO or even parameter validation. I need to do a follow up post focusing on validation.
@zby I wasn't aware of App::Benchmark::Accessors, thanks! It appears not to benchmark validation, which makes the data not very realistic. :-/ Maybe I can get that added in.
@demerphq (but really @tsee): I don't consider non-validating accessors viable for anything bigger than a toy. I put Object::Tiny and raw hashes in there to provide an upper bound so one can weigh how much performance you gain vs how much functionality you lose. I had forgotten about Class::XSAccessor and gave it a try.
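For reference, the Class::XSAccessor class I tried was roughly this (a sketch; the constructor and field names are illustrative):

package Foo::CXSA;
use Class::XSAccessor
    constructor => 'new',
    accessors   => { bar => 'bar' };    # read/write accessor, no validation

1;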
Class::XSAccessor is the fastest non-validating accessor... but not by much. Mouse, without checks to level the playing field, is only 10% slower. Both are far faster than anything else.
The interesting bit was that Mouse with checks was only about 25% slower in the setter. We've always assumed that validation is expensive, and it is... for everything but Mouse. All other setters were 2x to 4x slower with validation, but Mouse only paid that roughly 25% penalty. It looks like you can have your cake and eat it, too.
@chip The nice part about Moose and Mouse is you can use both in the same project. I write classes using Mouse, and then if I need meta stuff I switch it to Moose.
@mark Yes, my focus is deliberately narrow to address the urge to micro-optimize by claiming validation and OO are slow. It's also to illustrate the wild performance gains in Mouse and to shed the assumption that in order to be fast you have to give up features. The existence of modules like Moo, Object::Tiny and Class::XSAccessor shows that assumption is still in full effect, but the reality has changed.
For the record, my personal FUD is more about memory (and possibly CPU (if that makes sense (I don't think it does))) usage.
The largest project I've worked on (which could really benefit from moving to Moose) is deployed on a farm of low-resource machines... single < 1GHz CPU, some machines (seriously (I'm not joking)) 128 MB RAM.
Granted, I think those extremely low-end models have mostly been phased out for machines that are closer to 512 MB or maybe 1 GB of RAM, and the CPUs are probably getting better, but I'm also thrashing huge amounts of data for comparison.
I appreciate this benchmark, but I'd also love to see one based on memory usage (guess I should get on that...)
Compile-time memory usage will change drastically between them; that's easy enough for you to benchmark. And it doesn't make much difference if you're in a server context with persistent processes. I doubt runtime memory usage will change much between the systems; AFAIK they're all still just shoving things into hashes. Inside-out objects might buy you a bit more memory if your objects are small and plentiful, since you're not making a new hash for every object. But presumptions are the root of all optimization evil, so it would be interesting to know, nudge nudge.
But if you're not working in some extreme environment, buy some RAM! You blow an hour micro-optimizing your memory usage and you could have just bought a few DIMMs. Newegg is having a memory sale right now. Hell, I can find better stuff than you've got at my local computer recycling center!