More on YAML, syck looks much better

In my last YAML post I said that libsyck is not maintained anymore. I had a look, and this is wrong. Even if _why does not work on it anymore (he came back recently, btw), it is still maintained, and upstream libsyck has made some progress which is not reflected in the perl part, YAML::Syck.

It is a mess, I admit, but easier to fix than the YAML::XS mess. So I took upstream libsyck, which is at 0.70, and merged it with our changes, which are based on 0.61. Our perl-specific changes are a complete mess, so I cleaned them up into a new 0.71 that should be acceptable upstream.

Merged back various changes from upstream (my own WIP version 0.71):

  • const char*, recompiled grammar, ...
  • alloc +1 for the final \0
  • add proper type casts
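
The "+1 for the final \0" item is the usual C pitfall: a buffer that holds len bytes of text needs len + 1 bytes if it is to be used as a C string. A minimal sketch of the pattern follows. It is not code from libsyck, and copy_with_nul is a hypothetical helper name made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT part of libsyck: copy `len` bytes of `src`
 * into a freshly allocated buffer and terminate it with '\0'.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *copy_with_nul(const char *src, size_t len)
{
    assert(src != NULL);

    /* len bytes of payload plus one extra byte for the final '\0'. */
    char *dst = malloc(len + 1);
    if (dst == NULL)
        return NULL;

    memcpy(dst, src, len);
    dst[len] = '\0';    /* without the +1 above this would write out of bounds */
    return dst;
}
```

Calling copy_with_nul("key: value", 3) yields a 4-byte buffer containing "key" plus the terminator.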

Turned various unmergeable hacks into proper flags, which can be set perl-specifically:

  • add emitter->nocomplexkey flag, default=0, 1 for perl
  • rename syckemit2quoted1 to syckemit1quotedesc
  • rename scalar2quote1 to scalar1quoteesc (JSON singlequote as single-quoted with dq-like escapes)

Removed some other unmergeable hacks:

  • syck_base64enc requires an ending \n

YAML::Syck has many advantages over YAML::XS. It supports reading from and writing to file streams, which means it does not need to slurp each file into a buffer and then process that. It can process streamed buffers. libyaml can do that too, but YAML::XS never implemented it. So far I have only added a LoadFile method, but not DumpFile.
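
To make the slurp vs. stream difference concrete, here is a minimal C sketch of stream processing. It is not how libsyck or libyaml actually read their input; count_lines_streamed and the 4096-byte chunk size are assumptions made up for illustration. The point is only that memory use stays constant no matter how large the file is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, NOT a libsyck or libyaml function: count the
 * newline characters in a stream while holding only one small chunk
 * in memory at a time, instead of slurping the whole file. */
size_t count_lines_streamed(FILE *in)
{
    char chunk[4096];   /* fixed-size window, independent of the file size */
    size_t lines = 0;
    size_t got;

    assert(in != NULL);

    /* fread returns 0 at end of file (or on error), which ends the loop. */
    while ((got = fread(chunk, 1, sizeof chunk, in)) > 0) {
        for (size_t i = 0; i < got; i++) {
            if (chunk[i] == '\n')
                lines++;
        }
    }
    return lines;
}
```

A slurping variant would first have to allocate a buffer as large as the whole file before it could look at the first byte.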

YAML::XS doesn't really use the nice architecture libyaml provides. It rather does its own perl-specific callbacks, bypassing many advantages of libyaml.

libsyck is much better written than libyaml, no question about that. It has far fewer bugs and many more options, but it got stuck at YAML 1.1. Does anybody really need YAML 1.2? I haven't checked the changes yet.

My changes (still WIP) are at:

So now I'm pondering whether to convince everybody to ditch YAML and YAML::XS completely in favor of YAML::Syck. Let's see how this will turn out... In fact it's only a tiny patch to CPAN, and I can do that on my own, since CPAN is in core.

My core integration for YAML::XS is at:

What I need now is a good YAML testsuite which merges the validators required by core (CPAN::Meta) with various interop tests, as I did with Cpanel::JSON::XS, esp. roundtrips. Then I need to add the perl module back to syck to give it into sane hands (this might be tricky as it involves testing with ruby, php, python, ...), do benchmarks, and go over the tickets.

What I do know is that processing my cpan prefs with YAML.pm is ~10x slower than with YAML::Syck. The current performance is unacceptable, and so is YAML::XS emitting unindented seq elements for a map child. Maybe I have to fork YAML::XS to a Cpanel::YAML::XS, but most of the fixes need to be done in libyaml itself, and let's see how fixing syck turns out.

2 Comments

What I need now is a good YAML testsuite

People are working on a cross-language testsuite. Not sure what the current state is, this is a rather non-trivial endeavor.

YAML::XS emitting unindented seq elements for a map child ditto

If you mean:

foo:
- 1
- 2
- 3

Then that actually is valid YAML. I was also surprised by that. As is

- foo: 1
  bar: 2

for that matter

If the dude writing on that domain now is _why, I’ll eat a straw broom.

About Reini Urban

Working at cPanel on cperl, B::C (the perl-compiler), parrot, B::Generate, cygwin perl and more guts, keeping the system alive.