The Price of Cleverness (YAML is not Safe)

Today I wasted a few hours tracking down this delightful bug:

Undefined subroutine &main::main:: called at 
...lib/site_perl/5.12.4/YAML/Mo.pm line 5.

So what does YAML::Mo line 5 look like?

That's right. That's line 5. I've wrapped it to make it easier to read. When you load the latest YAML, you load YAML::Mo and that contains the above monstrosity. And it has a serious bug. Do you see it?

This bug was first reported a couple of months ago but no one could reproduce it. There was speculation that it was related to *{""} stringification, but while it's related, it's sort of a red herring.

I first encountered this bug when I was was using Dancer and was using the excellent Data::DPath (thank's Steffen!) to extract data from a JSON data structure. At first, Dancer was telling me that YAML was not installed, but a pull request later, it was giving me the error message I opened this post with. And what a heck of a problem it was to debug.

After a lot of work stripping down Data::DPath, I managed to get this test case:

use Safe;
BEGIN { Safe->new } # used in Data::DPath
use YAML;

As it turns out, there's a tiny bug in Safe's calculation of shared code and I've filed a bug report for it. Essentially, it does this:

\&main::;

That creates a reference to a sub whose name is the empty string! Actually, no it doesn't. It appears to autovivify the typeglob slot in the same way a forward declaration will, but it provides no sub body.

# a forward declaration of a sub, but no sub body
sub foo;

Now what does strict do? Here's some fun:

[Ovid:~] $ perl -le 'my $x = foo; print $x'
foo
[Ovid:~] $ perl -le 'sub foo { "bar" } my $x = foo; print $x'
bar

If you don't enable strict, Perl will see if a bareword is a subroutine and if it is, it will call it. Otherwise, it will assume that the bareword is a string.

Actually, it's use strict 'subs' that does this:

[Ovid:~] $ perl -le 'use strict 'subs'; my $x = foo; print $x'
Bareword "foo" not allowed while "strict subs" in use at -e line 1.
Execution of -e aborted due to compilation errors.

At the beginning of the YAML::Mo code, we have this:

package YAML::Mo;
my $M = __PACKAGE__ . ::;

Because strict is not enabled, $M will be assigned the value YAML::Mo::. Clever, huh? No, not really. When Safe is loaded, the reference \&main:: in Safe will create a subroutine effectively named "" (the empty string). When you reference a package variable or subroutine, you're allowed to leave off the main:: if it's in package main, so if you have this:

package main;
sub foo { 'bar' }

The rest of your code can call foo() with either main::foo() or ::foo(). In this case, here's what happens:

  • Perl sees a bare :: in __PACKAGE__ . ::;
  • It first tries to see if :: is a sub, but since main is optional, it sees :: as a sub in main::.
  • Because of the Safe bug, there is now an empty string sub in *main::{CODE}, but it has no sub body.
  • So Perl tries to call *main::, resulting in the strange Undefined subroutine &main::main:: error.

Here's the smallest test case to replicate the bug:

[Ovid~] $ perl -e 'BEGIN { \&:: } use Mo'
Undefined subroutine &main::main:: called at 
... lib/site_perl/5.12.4/Mo.pm line 3.

So now, any code that used Mo to generate "inline Mo" code needs to be rereleased to work around this bug because updating Mo is not enough. You must update the code that inlines Mo.

The fundamental problem here isn't actually the lack of using strict (though this problem would not have happened if it was used). It's trying to be too clever. I have often maintained two things:

  • Clever programmers often aren't.
  • Expert programmers try to make their code as clean and simple as possible.

I have no problem with "clever" code, but only as a proof of concept, for an Acme:: module, or when you have no choice. Willy-nilly applying this to YAML was not a good decision (Sorry Ingy. I luv ya, but honestly ...).

So everyone using YAML now has a potential time bomb in their code. If they load Safe first, or any other code that accidentally creates subs in namespaces where "Mo" thinks there should not be subs, they could fall afoul of this.

And for your amusement, here's a bug report whereby someone gets a commit bit for golfing Mo even further.

And this is the bug report for YAML.

Please don't be clever.

18 Comments

Don't be clever; being great is good enough. ;)

I think the quoted code is shitty code, not clever and there is no point putting all that crap into a single line

I thought YAML.pm basically was a proof-of-concept, and the production implementation was YAML::XS. Certainly I've had problems in the past which were fixed by switching; I would never choose to use YAML in production code.

Its for this reason that I use Moose and not the lighter alternatives that are probably all I need. Give me the real stuff please. I know that Mouse/Moo probably can't be this bad, but I think I will stick with Moose and hope that perl5-MOP goes core eventually to make my Moosey modules lighter.

You do realise that in your example, the single-quoted words in your perl -le commands actually aren't quoted at all, because your inlined single quote stops the shell quoting here? So from perl's point of view, the "bar" in your sub foo() is a bareword, too...

Ovid, i'm not sure if you were aware, but:

The code snippet above is generated from this code:

https://github.com/ingydotnet/mo-pm/blob/master/src/Mo.pm

It is condensed down with an auto-golfer that i helped write and there's honestly a lot less cleverness at work there than one might think. :)

I had this error recently in an EmbPerl project. The error disappeared, when I used YAML directly before Data::DPath:

This crashes:

[*
use strict;
use warnings;
no warnings 'redefine';
use vars qw($r %fdat %udat @param);

use Data::DPath 'dpath';
...
*]

This does not:

[*
use strict;
use warnings;
no warnings 'redefine';
use vars qw($r %fdat %udat @param);

use YAML;
# If not used here first, we will get an internal server error.
# More details at http://blogs.perl.org/users/ovid/2012/04/the-price-of-cleverness-yaml-is-not-safe.html
# Error message is: "Error in Perl code: Undefined subroutine &main::main:: called at /Library/Perl/5.10.0/YAML/Mo.pm line 5."

use Data::DPath 'dpath';
*]

@Joel: Nothing to do with Mouse or Moo, which are not implemented with insane golf-code. Mo is, for a reason I'm not sure to understand.

I would have had nothing against that as long as it had remained in the Mo sandbox. But a depency of YAML? Seriously?

And please, @Ben, you cant say "not to use YAML.pm in production", it's YAML, a natural choice when searching the CPAN for ... YAML.

Anyway, thanks a lot @Ovid for the hard-core and fruitful investigation!

Ovid, for what's worth, i agree. I don't think Mo should have been put in there and if it had to be put in, at least as a module, not copy-pasted. However, i'm fairly sure ingy disagrees because it served its purpose in that position by sussing out issues like this.

Mithaldu: so if I follow you, breaking YAML is a good way to debug Mo?

Good one! :)

...

That's not what i meant, implied or wrote.

@asukrieh:

If you want robust and fast YAML processing using the normal Dump/Load API, please consider switching to YAML::XS. It is by far the best Perl module for YAML at this time. It requires that you have a C compiler, since it is written in C.

Second paragraph of YAML.pm's documentation…

I'm OK with agreeing on the fact that YAML.pm is not "robust and fast", but it clearly does not imply that YAML.pm is supposed to be unstable.

I'm sorry, I disagree.

Call it Acme::YAML or YAML::Mo if you like, but with such a generic name... it's clearly a trap, IMHO.

[...] it served its purpose in that position by sussing out issues like this.

Sorry, that's what I understood here.

@oid.philipp-soehnlein.de: Wow, an Embperl user. A dying breed these days...

You should thanks Ingy for his great work, if you are clever try to understand a little bit more and don't judge what you don't know well. That code is generated by Mo::inline, if you have a problem submit a patch on Mo repo ... that's why exists github and open source :P

Leave a comment

About Ovid

user-pic Have Perl; Will Travel. Freelance Perl/Testing/Agile consultant. Photo by http://www.circle23.com/. Warning: that site is not safe for work. The photographer is a good friend of mine, though, and it's appropriate to credit his work.