YAML::PP Grant Report December 2017
Hi there!
Thanks for reading my report. In the last report I included a mini tutorial about string quoting methods in YAML. This time I've written an introduction to YAML Schemas and Tags.
In December I worked about 60 hours on YAML::PP, YAML::XS, libyaml and the Schema article.
See also my previous reports on blogs.perl.org (Aug/Sep, Oct, Nov) and news.perlfoundation.org (Aug/Sep, Oct, Nov).
YAML::PP
I released YAML::PP 0.005_001.
Schema
I implemented the Failsafe, JSON and Core Schema for the Loader. To understand what I did and what I would like to do in the future, you might want to read my Schema Introduction. I'll wait here.
Welcome back.
As you read, YAML is quite powerful, but so far we don't really have support for that in Perl. I implemented the three Schemas so that you can choose the Schema you want to load.
The cool thing is that this will enable me to also implement the YAML 1.1 Types, so you will be able to load YAML 1.1 files. The difference between 1.1 and 1.2 is mostly the Schemas. Syntactically every YAML 1.1 document should be parseable by a YAML 1.2 parser (I think).
Also, this will be a base for implementing loading perl objects via the !perl/(hash|array|...):ClassName tags.
The difficulty is not so much the implementation of the rules, but providing an API that will work for all kinds of things people might want to do.
Currently the API for loading a specific Schema looks like this:
my $ypp = YAML::PP->new( schema => ['JSON'] ); # default is Core at the moment
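To make the difference between the three Schemas more concrete, here is a minimal sketch that loads the same document once per Schema. It assumes a YAML::PP release that provides load_string() and accepts the schema names Failsafe, JSON and Core in the constructor; the keys tilde and null_word are made up for illustration.

```perl
use strict;
use warnings;
use YAML::PP;

# Two plain scalars that the Schemas resolve differently
my $yaml = <<'EOM';
---
tilde: ~
null_word: null
EOM

my %loaded;
for my $schema (qw/ Failsafe JSON Core /) {
    # Assumption: the schema option takes an array ref with the Schema name
    my $ypp = YAML::PP->new( schema => [$schema] );
    $loaded{$schema} = $ypp->load_string($yaml);
}

for my $schema (qw/ Failsafe JSON Core /) {
    for my $key (qw/ tilde null_word /) {
        my $value = $loaded{$schema}{$key};
        printf "%-8s %-9s => %s\n", $schema, $key,
            defined $value ? "string '$value'" : 'undef';
    }
}
```

With Failsafe every scalar should stay a string, with JSON only the word null should become undef, and with Core the tilde should become undef as well.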
In the future you will be able to additionally load timestamps as DateTime objects, for example, via a DateTime plugin, just as you can with DBIx::Class.
In the examples directory you can find a script that implements templating of YAML strings and replacing ${...} with external template variables.
As mentioned above, I'm still figuring out a good API for that.
dump_file
You can also dump files directly with dump_file now.
Also, the legacy interface you know from other YAML processors is now documented, so you can use Load(), LoadFile(), Dump() and DumpFile().
These will always use the defaults. Configuration is only possible with the object-oriented interface.
YAML::PP::Writer
The Emitter no longer emits directly, but uses YAML::PP::Writer.
Other things
- Perl boolean false is now loaded as the empty string instead of 0, to match the perl behaviour
- Error messages have column numbers now, but they are wrong in some cases.
YAML::XS
This month, I learned some more perl XS.
Regex roundtrip
An issue was reported that regexes were growing when repeatedly loaded and dumped:
match: !!perl/regexp (?i-xsm:OK)
match: !!perl/regexp (?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^u:(?^ui:OK)))))))))))
The problem is that you also get the same effect without YAML involved:
my $regex = qr{OK};
say $regex;
my $str = "$regex";
$regex = qr{$str};
say $regex;
__END__
(?^u:OK)
(?^u:(?^u:OK))
There is already code to prevent that in YAML::XS, but at some point, the u flag was introduced, and the code wasn't updated for this. No test existed for it, so nobody realized that it broke.
I added a quick fix for this in PR 70.
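As an illustration of the underlying problem (not the code from the PR), here is a small sketch in plain Perl. It compares the naive string round trip with one that asks perl for the pattern and the flags separately, using regexp_pattern from the core re module. Rebuilding the regex with a string eval is just one possible choice for this sketch; the flags are checked before they are interpolated into the code.

```perl
use strict;
use warnings;
use re qw(regexp_pattern);

my $regex = qr{OK}i;

# Naive round trip: every pass wraps the previous stringification once more
my $naive = $regex;
for (1 .. 3) {
    my $str = "$naive";
    $naive = qr{$str};
}

# Round trip via (pattern, flags): nothing accumulates
my $stable = $regex;
for (1 .. 3) {
    my ($pattern, $flags) = regexp_pattern($stable);
    # Only lowercase flag letters may reach the eval below
    die "unexpected flags '$flags'" unless $flags =~ /\A[a-z]*\z/;
    # \$pattern stays a variable inside the eval, so only the flags
    # become part of the generated code
    $stable = eval "qr{\$pattern}$flags" or die $@;
}

print "naive:  $naive\n";
print "stable: $stable\n";
```

The naive version gets longer with every pass, while the second version stringifies the same way as the original regex.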
Loading many regexes
An issue was reported (actually similar to one that was previously reported three years ago), but this one I could reproduce. When loading more than about 125 regexes in one document, it resulted in a panic: memory wrap error.
The loading of regexes involves a callback to perl, so I learned about XS perl callbacks which are documented in perldoc perlcall.
The fix in PR 71 turned out to be really simple once I had read the docs (see the commit).
Loading Perl objects should be optional
For many years, YAML::XS loaded objects via the !!perl/... tags by default.
This can be a security problem when you load YAML from untrusted sources.
You just need to find a class (one that is loaded already) that does something destructive in its DESTROY handler.
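To show why that is a problem, here is a self-contained sketch without any YAML module. The class My::TempDir is hypothetical, and instead of deleting anything its destructor only records the path. The bless call stands in for what a loader effectively does when it sees a !!perl/hash:My::TempDir tag.

```perl
use strict;
use warnings;

my @removed;    # records what the destructor "would" have deleted

package My::TempDir {
    # Hypothetical class that cleans up its directory on destruction
    sub DESTROY {
        my ($self) = @_;
        # A real class might remove $self->{path} from disk here
        push @removed, $self->{path};
    }
}

{
    # Attacker-controlled data, blessed into an already loaded class
    my $object = bless { path => '/home/user' }, 'My::TempDir';
}   # $object goes out of scope here and DESTROY runs

print "would have removed: @removed\n";
```

No method is ever called on the object explicitly. Creating it and letting it go out of scope is enough to run the destructor with data chosen by whoever wrote the YAML.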
Reini Urban created an issue about that a while ago, suggesting implementing a SafeLoader similar to PyYAML.
Also, Dominique Dumont, a Debian maintainer, joined the discussion, pointing out that it would be really important to be able to load YAML safely.
As a result of the discussion, I implemented the $YAML::XS::LoadBlessed option with which you can turn off loading objects. To be backwards compatible, it's still true by default. The name was chosen as YAML::Syck already has an option with the same name.
I implemented this in PR 73 and PR 74.
libyaml
This month, I was also involved with libyaml itself.
Singlequotes in Doublequotes
An issue was reported that libyaml incorrectly allows the following:
---
quoted: "escaped \' singlequote"
The fix for it was probably one of the easiest pull requests I made, but hey, it was C!
I simply had to take a case statement out of a switch: PR 74
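The same idea can be sketched in Perl. This is not the libyaml C code, only an illustration: the set of characters allowed after a backslash follows my reading of the YAML 1.2 spec, and the function name invalid_escapes is made up. Removing the single quote from the set of accepted characters corresponds to removing the case from the switch.

```perl
use strict;
use warnings;

# Characters that may follow a backslash in a double quoted scalar
# (assumption: taken from the YAML 1.2 spec; "\n" is the escaped line break)
my %VALID_ESCAPE = map { $_ => 1 } (
    '0', 'a', 'b', 't', "\t", 'n', 'v', 'f', 'r', 'e', ' ',
    '"', '/', '\\', 'N', '_', 'L', 'P', 'x', 'u', 'U', "\n",
);

# Returns all invalid escape sequences found in the raw content
# of a double quoted scalar (without the surrounding quotes)
sub invalid_escapes {
    my ($raw) = @_;
    my @invalid;
    # A backslash always consumes the following character, so an
    # escaped backslash cannot start another escape
    while ($raw =~ /\\(.)/gs) {
        push @invalid, "\\$1" unless $VALID_ESCAPE{$1};
    }
    return @invalid;
}

my @found = invalid_escapes(q{escaped \' singlequote});
print "invalid: @found\n" if @found;
```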
libyaml and YAML Test Suite
A while ago, the yaml-test-suite had been integrated into libyaml. First it lived in its own repository, and then it was integrated into the main repository. That was probably the reason why the make process for it was overly complicated.
At some point, Travis CI on macOS started to report failures which weren't really reproducible. The main make was called recursively and it was hard to understand what was going on.
I also realized that it was not testing the source libyaml, but the already installed system libyaml (and failing when it was not installed).
In PR 76, I integrated the test programs into the existing test code, resulting in a much simpler make process. It now tests the source libyaml, and Travis CI is passing again.