Important Changes in YAML::PP v0.019

During SUSE Hackweek 19, I found time to fix some bugs and make important changes in YAML::PP.

Some of these changes might break code, but I expect this will be rare.

As I see more and more CPAN modules using YAML::PP, I decided to make these changes as soon as possible.

I will explain each change and the reasons behind it.

Loading YAML in scalar context

In scalar context, the various load* functions now return the first document rather than the last.

This is only relevant if your input has more than one document:

--- # Document 1
a: b
c: d
--- # Document 2
e: f
g: h

Usually you would load that in list context:

my @docs = Load($yaml);

But the common code for loading YAML data uses scalar context:

my $data = Load($yaml);

The behaviour of other CPAN modules is different. YAML::Syck::Load also returns the first document, while YAML::XS::Load, YAML::Tiny::Load and YAML::Load return the last.

I believe returning the first document is the more natural behaviour. If you have YAML files with one document, the context does not matter. If you later add more documents to a YAML file, existing code that loads in scalar context will still get the same result.
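A minimal sketch of the difference between the two contexts, assuming YAML::PP v0.019 or later is installed. The variable names are only for illustration:

```perl
use strict;
use warnings;
use YAML::PP qw/ Load /;

# Two documents in one stream, as in the example above
my $yaml = <<'EOM';
--- # Document 1
a: b
c: d
--- # Document 2
e: f
g: h
EOM

# List context: one element per document
my @docs = Load($yaml);

# Scalar context: only the first document (since v0.019)
my $first = Load($yaml);

printf "documents: %d\n", scalar @docs;
printf "keys of the scalar-context result: %s\n", join ', ', sort keys %$first;
```

With v0.019 or later, this should report two documents and the keys a and c. Older versions would report e and g for the scalar-context result.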

Changing the default Schema to Core

The YAML 1.2 Specification recommends three Schemas that a YAML processor should implement: Failsafe, JSON and Core. Core is the recommended default.

In the past, I switched to the JSON Schema by default, because it has only a few special values. However, the official YAML 1.2 JSON Schema is actually different from my implementation. In the official schema, all values that are supposed to be strings must be quoted. Only true, false, null and numbers are written without quotes.

In YAML::PP::Schema::JSON, the quotes are not necessary. But I will probably add an option in the future to require quotes, so that it reflects the official schema.

The YAML world is slowly moving towards YAML 1.2. More and more YAML processors are written that implement YAML 1.2, and most just implement one Schema: Core. So by default, YAML::PP will be compatible with that standard Schema.
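To see how the choice of Schema affects the loaded data, here is a small sketch that loads the same input with each of the three Schemas. The input and the %loaded hash are made up for this illustration:

```perl
use strict;
use warnings;
use YAML::PP;

# Hypothetical input: one value that looks like a hexadecimal number
my $yaml = <<'EOM';
---
hex: 0x10
word: hello
EOM

my %loaded;
for my $name (qw/ Failsafe JSON Core /) {
    # Select the Schema explicitly instead of relying on the default
    my $yp = YAML::PP->new( schema => [$name] );
    $loaded{$name} = $yp->load_string($yaml);
    printf "%-8s hex => %s\n", $name, $loaded{$name}->{hex};
}
```

Failsafe leaves every scalar as a string, and of the three only Core should resolve 0x10 to the integer 16. If your code depends on a particular Schema, passing it explicitly like this keeps the behaviour stable across default changes.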

To get an overview of the different Schemas, and the behaviour of YAML modules, I created a table with regular expressions per Schema.

I also created an HTML page from my test data and compared the load results of YAML::PP and other YAML modules.

This should give you an idea of which incompatibilities to expect if you switch from one of the other modules to YAML::PP. As you can see, none of YAML::Syck, YAML::XS or YAML.pm implements any of the standard schemas.

It would be possible to make YAML::PP behave like one of the other modules for compatibility, but it would be a lot of work. (Let me know if you need this and want to sponsor it ;-)

Behaviour of empty nodes in JSON Schema

Since the official YAML 1.2 JSON Schema does not allow unquoted strings, the empty node is actually forbidden:

---
key: # no value

Because YAML::PP::Schema::JSON allows unquoted strings, I had to decide whether empty nodes resolve to null or to the empty string ''. I decided to make the empty string the default, but to make it configurable:

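# Default: empty nodes resolve to the empty string
my $yp = YAML::PP->new( schema => ['JSON'] );
# Alternative: the same as the default, spelled out
my $yp = YAML::PP->new( schema => [qw/ JSON empty=str /] );
# Alternative: empty nodes resolve to null (undef in Perl)
my $yp = YAML::PP->new( schema => [qw/ JSON empty=null /] );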
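A short sketch of what the two settings do with the empty node from the example above, assuming YAML::PP v0.019 or later. The variable names are only for illustration:

```perl
use strict;
use warnings;
use YAML::PP;

my $yaml = <<'EOM';
---
key: # no value
EOM

# Load the same input once per setting
my $as_str  = YAML::PP->new( schema => [qw/ JSON empty=str /] )->load_string($yaml);
my $as_null = YAML::PP->new( schema => [qw/ JSON empty=null /] )->load_string($yaml);

for my $case ( [ 'empty=str', $as_str ], [ 'empty=null', $as_null ] ) {
    my ($label, $data) = @$case;
    my $value = $data->{key};
    printf "%-10s => %s\n", $label, defined $value ? "string '$value'" : 'undef';
}
```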

Fix some control character escaping and encoding issues

YAML::PP will now just assume all input data are unicode characters and won't do an explicit utf8::upgrade. See Issue 16.

Also, some control characters weren't correctly escaped. See Issue 17.
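In practice this means that you should decode bytes yourself before passing a string to YAML::PP. A minimal sketch, assuming the input arrives as UTF-8 encoded bytes and using the core Encode module:

```perl
use strict;
use warnings;
use Encode qw/ decode /;
use YAML::PP qw/ Load /;

# UTF-8 encoded bytes, e.g. as read from a raw filehandle or a socket
my $bytes = "name: caf\xc3\xa9\n";

# Decode to unicode characters before handing the string to YAML::PP
my $chars = decode('UTF-8', $bytes);
my $data  = Load($chars);

printf "length in characters: %d\n", length $data->{name};
```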

Fix Core Schema resolver for inf

While preparing the HTML page for the different schemas, I noticed that I forgot +.inf, +.Inf, +.INF in the Core Schema.
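A quick way to check the fix, assuming YAML::PP v0.019 or later and a perl that supports infinity as a numeric value. The key names are made up for this sketch:

```perl
use strict;
use warnings;
use YAML::PP;

my $data = YAML::PP->new( schema => ['Core'] )->load_string(<<'EOM');
---
plain: .inf
plus: +.inf
minus: -.inf
EOM

# With the fix, the explicitly signed form resolves to the same float as the plain form
print "+.inf equals .inf\n"          if $data->{plus}  == $data->{plain};
print "-.inf is the negative of it\n" if $data->{minus} == -$data->{plain};
```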

Improve emitter regarding empty lists/hashes

Before, when dumping empty sequences and mappings, the emitter added a newline:

---
empty sequence:
  []
empty mapping:
  {}

The new output will not have that newline:

---
empty sequence: []
empty mapping: {}
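To reproduce the new output yourself, a minimal sketch (assuming YAML::PP v0.019 or later):

```perl
use strict;
use warnings;
use YAML::PP qw/ Dump /;

# A hash with one empty array and one empty hash as values
my $out = Dump({
    'empty sequence' => [],
    'empty mapping'  => {},
});

print $out;
```

Each empty collection should now appear on the same line as its key.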

About tinita

just another perl punk,