YAML::PP Grant Report December 2017

Hi there!

Thanks for reading my report. In the last report I included a mini tutorial about string quoting methods in YAML. This time I've written an introduction to YAML Schemas and Tags.

In December I worked about 60 hours on YAML::PP, YAML::XS, libyaml and the Schema article.

See also my previous reports on blogs.perl.org (Aug/Sep, Oct, Nov) and news.perlfoundation.org (Aug/Sep, Oct, Nov).

YAML::PP

I released YAML::PP 0.005_001.

Schema

I implemented the Failsafe, JSON and Core Schemas for the Loader. To understand what I did and what I would like to do in the future, you might want to read my Schema Introduction. I'll wait here.

Welcome back.

As you have read, YAML is quite powerful, but so far we don't really have support for that in Perl. I implemented the three Schemas so that you can choose which one to use when loading.

The cool thing is that this will enable me to also implement the YAML 1.1 Types, so you will be able to load YAML 1.1 files. The difference between 1.1 and 1.2 is mostly the Schemas. Syntactically every YAML 1.1 document should be parseable by a YAML 1.2 parser (I think).

Also, this will be a base for implementing the loading of Perl objects via the !perl/(hash|array|...):ClassName tags.

The difficulty is not so much the implementation of the rules, but providing an API that will work for all kinds of things people might want to do.

Currently the API for loading a specific Schema looks like this:

my $ypp = YAML::PP->new( schema => ['JSON'] ); # default is Core at the moment
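To make the difference between the Schemas more tangible, here is a small sketch that loads the same document with each of them. It assumes a YAML::PP version that accepts the schema names Failsafe, JSON and Core and provides load_string. According to the YAML 1.2 spec, Failsafe should keep everything as strings, JSON should only resolve the lowercase true, and Core should additionally resolve TRUE, the hex integer 0x10 and ~ as null.

```perl
use strict;
use warnings;
use YAML::PP;

# The same plain scalars are resolved differently depending on the Schema
my $yaml = <<'EOM';
- true
- TRUE
- 0x10
- ~
EOM

for my $schema (qw/ Failsafe JSON Core /) {
    my $ypp  = YAML::PP->new( schema => [$schema] );
    my $data = $ypp->load_string($yaml);

    # Show undef explicitly, quote everything else
    printf "%-8s %s\n", $schema,
        join ', ', map { defined $_ ? "'$_'" : 'undef' } @$data;
}
```

How booleans are represented in Perl depends on the YAML::PP version and its configuration, so the exact output may vary.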

In the future you will be able to additionally load timestamps as DateTime objects, for example, via a DateTime plugin, just like you can with DBIx::Class.

In the examples directory you can find a script, external-vars-templates, that implements templating of YAML strings by replacing ${...} with external template variables.

As mentioned above, I'm still figuring out a good API for that.
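To illustrate what replacing ${...} with template variables means, here is a minimal sketch in plain Perl. It is not the script from the examples directory: it works on an already loaded data structure instead of hooking into the Schema, and the function names fill_template and fill_data are made up for this example.

```perl
use strict;
use warnings;

# Replace ${name} placeholders in one string.
# Unknown names are left untouched.
sub fill_template {
    my ($string, $vars) = @_;
    $string =~ s/\$\{(\w+)\}/exists $vars->{$1} ? $vars->{$1} : '${' . $1 . '}'/ge;
    return $string;
}

# Walk a loaded data structure and fill in every string scalar
sub fill_data {
    my ($data, $vars) = @_;
    if (ref $data eq 'HASH') {
        return { map { $_ => fill_data($data->{$_}, $vars) } keys %$data };
    }
    if (ref $data eq 'ARRAY') {
        return [ map { fill_data($_, $vars) } @$data ];
    }
    return defined $data && !ref $data ? fill_template($data, $vars) : $data;
}

my $loaded = { greeting => 'Hello ${name}', list => [ '${unknown}', 23 ] };
my $filled = fill_data($loaded, { name => 'World' });
print "$filled->{greeting}\n";
```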

dump_file

You can now also dump to files directly with dump_file. Also, the legacy interface you know from other YAML processors is now documented, so you can use Load(), LoadFile(), Dump() and DumpFile(). These will always use the defaults. Configuration is only possible with the object-oriented interface.
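A short sketch of both interfaces side by side, assuming a YAML::PP version that exports the four legacy functions on request. The temporary directory and the file names are only there to keep the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw/ tempdir /;
use YAML::PP qw/ Load Dump LoadFile DumpFile /;

my $dir  = tempdir( CLEANUP => 1 );
my $data = { name => 'YAML::PP', tags => [ 'yaml', 'perl' ] };

# Object-oriented interface: this is where configuration is possible
my $ypp = YAML::PP->new;
$ypp->dump_file( "$dir/oo.yaml", $data );
my $from_oo = $ypp->load_file("$dir/oo.yaml");

# Legacy functional interface: always uses the defaults
DumpFile( "$dir/legacy.yaml", $data );
my $from_legacy = LoadFile("$dir/legacy.yaml");

print Dump($from_legacy);
```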

YAML::PP::Writer

The Emitter no longer writes its output directly, but uses YAML::PP::Writer.

Other things

  • Perl boolean false is now loaded as the empty string instead of 0, to match Perl's own behaviour.
  • Error messages have column numbers now, but they are wrong in some cases.

YAML::XS

This month, I learned some more Perl XS.

Regex roundtrip

An issue was reported that regexes were growing when repeatedly loaded and dumped:

match: !!perl/regexp (?i-xsm:OK)
match: !!perl/regexp (?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^ui:OK)))))))))))

The problem is that you also get the same effect without YAML involved:

use v5.14; # enables say and unicode_strings (hence the u flag below)
my $regex = qr{OK};
say $regex;
my $str = "$regex";
$regex = qr{$str};
say $regex;
__END__
(?^u:OK)
(?^u:(?^u:OK))

There is already code to prevent that in YAML::XS, but at some point, the u flag was introduced, and the code wasn't updated for this. No test existed for it, so nobody realized that it broke.

I added a quick fix for this in PR 70.
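The general idea of preventing the growth can be sketched in pure Perl. This is not the code from YAML::XS, just an illustration under the assumption that the input string came from stringifying a qr object: strip the outer (?^flags:...) or (?flags-flags:...) wrapper and compile the inner pattern with the recovered flags. The helper name load_regex is made up.

```perl
use strict;
use warnings;

# Turn a stringified regex back into a qr object without nesting it
sub load_regex {
    my ($str) = @_;
    # Matches both the modern (?^ui:...) and the old (?i-xsm:...) form
    if ($str =~ m{\A \(\? \^? ([a-z]*) (?: - [a-z]* )? : (.*) \) \z}xs) {
        my ($flags, $pattern) = ($1, $2);
        # Only the validated flags ([a-z]*) are interpolated into the code,
        # the pattern itself stays a variable
        my $qr = eval "qr{\$pattern}$flags";
        return $qr if defined $qr;
    }
    # Fallback: compile the string as it is
    return qr{$str};
}

my $qr    = qr{OK}i;
my $first = "$qr";
$qr = load_regex("$qr") for 1 .. 10;

# Both lines should be identical, no matter how many roundtrips
print "$first\n$qr\n";
```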

Loading many regexes

Another issue was reported, similar to one reported three years earlier, but this one I could reproduce. Loading more than about 125 regexes in one document resulted in a "panic: memory wrap" error.

The loading of regexes involves a callback to Perl, so I learned about XS Perl callbacks, which are documented in perldoc perlcall.

Once I had read the docs, the fix in PR 71 turned out to be really simple (see the commit).

Loading Perl objects should be optional

For many years, YAML::XS loaded objects via the !!perl/... tags by default. This can be a security problem when you load YAML from untrusted sources. You just need to find a class (one that is loaded already) that does something destructive in its DESTROY handler.
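The following pure Perl sketch shows why this matters. The class My::TempFile is hypothetical and stands in for any class that is already loaded in your application. Its DESTROY handler only records the path here, where a real class might delete the file.

```perl
use strict;
use warnings;

my @deleted;    # stands in for real side effects such as unlink()

package My::TempFile {    # hypothetical class, already loaded in the application
    sub DESTROY {
        my ($self) = @_;
        # A real class might call unlink( $self->{path} ) here
        push @deleted, $self->{path};
    }
}

# Roughly what a loader does for a node tagged !!perl/hash:My::TempFile
{
    my $untrusted = bless { path => '/some/important/file' }, 'My::TempFile';
}    # the object goes out of scope and DESTROY runs with attacker-chosen data

print "would have deleted: @deleted\n";
```

Nobody has to call a method on the object. Loading the document and throwing the data away is enough.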

Reini Urban created an issue about that a while ago, suggesting implementing a SafeLoader similar to PyYAML.

Also, Dominique Dumont, a Debian maintainer, joined the discussion, pointing out that it would be really important to be able to load YAML safely.

As a result of the discussion, I implemented the $YAML::XS::LoadBlessed option with which you can turn off loading objects. To be backwards compatible, it's still true by default. The name was chosen as YAML::Syck already has an option with the same name.
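A minimal usage sketch, assuming a YAML::XS version that has the option. The class name My::Class is made up and doesn't need to exist. Setting the variable explicitly with local avoids depending on whichever default the installed version uses.

```perl
use strict;
use warnings;
use Scalar::Util qw/ blessed /;
use YAML::XS ();

my $yaml = <<'EOM';
---
obj: !!perl/hash:My::Class
  key: value
EOM

my $data;
{
    # Turn off object loading only for this block
    local $YAML::XS::LoadBlessed = 0;
    $data = YAML::XS::Load($yaml);
}

print blessed( $data->{obj} ) ? "blessed object\n" : "plain data\n";
```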

I implemented this in PR 73 and PR 74.

libyaml

This month, I was also involved with libyaml itself.

Singlequotes in Doublequotes

An issue was reported that libyaml incorrectly allows the following:

---
quoted: "escaped \' singlequote"

The fix for it was probably one of the easiest pull requests I made, but hey, it was C!

I simply had to take a case statement out of a switch: PR 74

libyaml and YAML Test Suite

A while ago, the yaml-test-suite had been integrated into libyaml. First it lived in its own repository, and then it was integrated into the main repository. That was probably the reason why the make process for it was overly complicated.

At some point, Travis CI on macOS started to report failures which weren't really reproducible. The main make was called recursively and it was hard to understand what was going on.

I also realized that it was not testing the libyaml source, but the already installed system libyaml (and failing when it was not installed).

In PR 76, I integrated the test programs into the existing test code, resulting in a much simpler make process. It now tests the libyaml source, and Travis CI is passing again.

Happy New Year!

About tinita

just another perl punk,