My report of the Perl Toolchain Summit 2018 in Oslo

This year, to my surprise, I was again invited to the summit, on short notice.

Again, I was able to visit a city I had never been to before and hack for four days on YAML and other stuff.

Test::YAML/Test::Base

There was a bug in Test::Base (that also affected Test::YAML). It was confusing CPAN::Reporter when doing a plain use Test::YAML or use Test::Base.

Test::Base always expects at least one test, but CPAN::Reporter only wants to check if it can load the module. This was reported by Slaven.

I fixed it to do nothing at all if there are no tests in the code.

YAML.pm problematic Regex

I had already been working on this before the Summit.

One part of YAML.pm has had a problem with a regex for a while now. Partially it seemed to be a bug in perl, since it had been fixed in newer versions.

But it still seemed to be problematic in some cases.

It's the parsing of quoted strings. According to the reported GitHub issues, there was a problem if:

So I finally decided to try and rewrite this part. Instead of trying to parse everything in one regex, my new code would use a while loop which stops at every backslash.

As a result, strings with many escape sequences take twice as long to parse, while strings with very few escape sequences take only half the time. An unterminated string is now detected very quickly instead of hanging forever.
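A loop of this kind could look roughly like the following. This is a minimal sketch, not the actual YAML.pm code: the helper name parse_double_quoted and the small escape table are assumptions made for illustration.

```perl
use strict;
use warnings;

# Assumed subset of escape sequences, for illustration only.
my %ESCAPES = (n => "\n", t => "\t", '"' => '"', '\\' => '\\');

# Hypothetical helper (not from YAML.pm): parse a double-quoted scalar
# and return its unescaped content.
sub parse_double_quoted {
    my ($input) = @_;
    $input =~ m/\G"/gc or die "Expected opening quote\n";
    my $result = '';
    while (1) {
        # A run of ordinary characters: no quote, no backslash.
        if ($input =~ m/\G([^"\\]+)/gc) {
            $result .= $1;
        }
        # The loop stops at every backslash and handles one escape.
        elsif ($input =~ m/\G\\(.)/gcs) {
            exists $ESCAPES{$1} or die "Unknown escape sequence \\$1\n";
            $result .= $ESCAPES{$1};
        }
        # The closing quote ends the scalar.
        elsif ($input =~ m/\G"/gc) {
            return $result;
        }
        # Nothing matched: the input ended without a closing quote.
        else {
            die "Unterminated double-quoted string\n";
        }
    }
}

print parse_double_quoted('"tab\\there"'), "\n";
```

Each pattern is anchored with \G and never rescans earlier input, so an unterminated string fails as soon as the end of the input is reached.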

The test with a perl that has libasan debugging enabled was also much faster, but still slow enough that it should be disabled when running on such a perl.

At the summit I tried to compile perl with libasan enabled and asked around, but unfortunately I couldn't get it working to reproduce the effect Ribasushi reported.

YAML.pm Trailing Comments

YAML.pm has had support for comments, but only for comments on their own line. Various trailing comments did not work as they should:

---
- "string" # comment
- 'string' # comment
- > # comment
  folded block scalar
- | # comment
  literal block scalar
- [ sequence ] # comment
- { x: y } # comment

These all resulted in an error.

The following ones did not, but had unexpected results:

---
- a string with a #comment?
- foo: bar #comment?
--- short text # comment?

The #comment? would actually end up being part of the content.

I had done some work on that a year ago, but it was not very well tested. I decided to start working on this again, fixed all cases and wrote tests for them. At first I hesitated to just fix this bug and made the fix optional, since I thought people might rely on # becoming part of unquoted content, but Ingy decided that it should just be fixed.
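For the unquoted cases, the rule is that a # only starts a comment when whitespace comes before it. A minimal sketch of that rule, assuming a single-line plain scalar and a hypothetical helper strip_trailing_comment (this is not the code from the YAML.pm fix):

```perl
use strict;
use warnings;

# Hypothetical helper (not from YAML.pm): remove a trailing comment
# from a single-line, unquoted scalar.
sub strip_trailing_comment {
    my ($text) = @_;
    # Whitespace followed by '#' starts the comment; a '#' glued to
    # the preceding character stays part of the content.
    $text =~ s/[ \t]+#.*\z//s;
    return $text;
}

print strip_trailing_comment('a string with a #comment?'), "\n";
print strip_trailing_comment('color#ff0000'), "\n";
```

Quoted strings, block scalars and flow collections need their own handling, because there the comment follows a closing quote, an indicator or a bracket rather than plain text.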

I fixed these bugs and others in the pull requests:

YAML.pm $LoadBlessed

YAML::Syck allows disabling loading objects with $LoadBlessed, and now also YAML::XS supports this. Still missing is YAML.pm, and I started to work on that. One esoteric feature is that you can write to any symbol table entry when loading YAML! Consider this harmless looking code:

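use YAML;
use feature 'say';
# a package variable in main, set to a known value
$main::foo = 23;
# loading this document writes to the SCALAR slot of *main::foo
my $data = Load("--- !!perl/glob { PACKAGE: main, NAME: foo, SCALAR: 42 }");
say $main::foo;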

The output is 42!
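What makes this possible is plain Perl glob assignment. The following sketch shows the mechanism without any YAML module; the helper set_package_scalar is hypothetical, and the assumption is that a loader does something similar with the PACKAGE, NAME and SCALAR keys:

```perl
use strict;
use warnings;

our $foo = 23;

# Hypothetical helper: install a value into an arbitrary package
# variable, given only strings for the package and variable name.
sub set_package_scalar {
    my ($package, $name, $value) = @_;
    no strict 'refs';
    # Assigning a scalar reference to a glob replaces its SCALAR slot.
    *{"${package}::${name}"} = \$value;
    return;
}

set_package_scalar('main', 'foo', 42);
print "$foo\n";
```

Since the package and name come from the loaded document, untrusted input could overwrite any package variable in the program this way.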

I will also disable this along with $LoadBlessed (although, strictly speaking, it doesn't have anything to do with blessing objects).

I made a pull request today, but documentation is still missing.

Perl Numbers and JSON/YAML

Since I started YAML::PP, I have been wondering about how to serialize different kinds of numbers.

In Perl, a scalar is not simply a string, an integer or a float: it can have the string flag and the integer flag at the same time. The same goes for string plus float. It can even have all three flags!

Even worse, there is not one single correct solution when deciding if something should be serialized as a string or a number.

In my opinion, the behaviour of the newest JSON::PP and Cpanel::JSON::XS is the best. If the scalar has both an int and a string flag, it will be treated as an int that has been used in a string context somewhere.

If you use a string in a numeric context, though, the variable gets a numeric flag added, so you'll get a number.
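The flags can be inspected with the core B module. A minimal sketch, with a hypothetical helper scalar_flags that only looks at the public string, integer and float flags:

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: name the public string/integer/float flags
# that are currently set on a scalar.
sub scalar_flags {
    # $_[0] is an alias, so the caller's scalar is inspected, not a copy.
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    my @names;
    push @names, 'POK' if $flags & B::SVf_POK();
    push @names, 'IOK' if $flags & B::SVf_IOK();
    push @names, 'NOK' if $flags & B::SVf_NOK();
    return join ',', @names;
}

my $string = "42";
print "before: ", scalar_flags($string), "\n";
my $sum = $string + 0;    # use the string in numeric context
print "after:  ", scalar_flags($string), "\n";
```

The opposite direction, a number used in string context, is deliberately left out here, because which flags end up set in that case depends on the perl version.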

Playing around with the different JSON modules was confusing because they behaved differently, and people on IRC even reported different results for the same module! So it seemed that implementations had changed recently.

To see how the different modules behave in their newest versions, I started to generate an overview:

https://gist.github.com/perlpunk/35a07521b07aeea5a6c23a7d068233e7

I still have to put it into a table that is more readable, though.

When doing this, I decided to add support for Inf and NaN to YAML::PP.

YAML::PP Inf/NaN

This was quite easy to add. YAML supports Infinity and NotANumber with .inf, -.inf and .nan.

Only I made a little mistake and did not check on older perls, and promptly after I uploaded 0.006_001, Slaven reported test failures. It seems the stringification is just a bit different on older perls, which was easy to fix.
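Because the string form of these values is not the same everywhere, it is safer to compare them numerically. A minimal sketch of mapping the three spellings to Perl values, with hypothetical helper names; this is not the YAML::PP implementation, and other spellings are not handled:

```perl
use strict;
use warnings;

my $inf = 9**9**9;        # overflows to infinity
my $nan = $inf - $inf;    # infinity minus infinity is not a number

# Only the three spellings mentioned in the text.
my %SPECIAL = ('.inf' => $inf, '-.inf' => -$inf, '.nan' => $nan);

# Hypothetical helper: return the Perl number for a special float,
# or the text unchanged.
sub load_special_float {
    my ($text) = @_;
    return exists $SPECIAL{$text} ? $SPECIAL{$text} : $text;
}

# NaN is the only value that is not equal to itself.
sub is_nan { my ($n) = @_; return $n != $n }
sub is_inf { my ($n) = @_; return abs($n) == $inf }

for my $text ('.inf', '-.inf', '.nan') {
    my $value = load_special_float($text);
    my $kind  = is_nan($value) ? 'nan' : is_inf($value) ? 'inf' : 'other';
    print "$text is $kind\n";
}
```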

bool.pm draft

Perl doesn't have booleans. In many many cases that's not a problem at all. You can use 1 and 0, so where's the problem? Who needs booleans?

Well, wherever you have a boolean in, say, a JSON document, you might want to keep this when writing it again after loading, because other languages actually do have booleans. Or maybe you want to validate some data in your API against a Schema, maybe via OpenAPI.

All the JSON modules do support booleans with a little trick: they bless a reference to 1 or 0 into the class JSON::PP::Boolean. In the past, most of them used their own class, until finally everyone agreed to just use one class.
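With JSON::PP, which ships with perl, this can be seen in a short round trip. A small sketch; the key names are made up:

```perl
use strict;
use warnings;
use JSON::PP ();

my $json = JSON::PP->new->canonical;
my $data = $json->decode('{"active":true,"deleted":false}');

# The decoded booleans are objects, not plain 1 and 0.
print ref($data->{active}), "\n";

# Overloading makes them behave like ordinary true and false values.
print "active\n"      if $data->{active};
print "not deleted\n" if !$data->{deleted};

# On output the objects are recognised, so the booleans survive.
print $json->encode($data), "\n";
```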

So that's great. But why do I have to load a JSON module just to get booleans? Seemed obvious maybe for the JSON module authors, but when I started implementing YAML I thought it's weird for YAML::PP to load JSON::PP. But I implemented it that way, anyway.

Some day, Joel Berger mentioned on IRC that he didn't like the fact that one had to load JSON::PP just to get boolean functionality, and I agreed and thought, apparently I'm not the only one finding this weird.

So at the Summit, I tried to convince Kenichi, the current maintainer of JSON::PP, that we should have a bool.pm instead. We sat together with Joel a bit and discussed things.

Kenichi's wish is to make it as little disruptive as possible. One solution for that is to create a bool.pm that simply pretends to be a JSON::PP::Boolean, so it will take over its functionality and return JSON::PP::Boolean objects. This way people would be able to start using bool.pm right away. JSON::PP::Boolean will also inherit from bool by simply adding it to @JSON::PP::Boolean::ISA. No more loading of JSON::PP in the background, but all checks for the classname in several JSON (and YAML) modules will still work. At some point, and that can take a while, they can switch over to just checking for isa('bool'). Then, some time later again, bool.pm could become independent of JSON::PP and simply create bool objects. But that will only work if users of this module don't check for the exact classname, but only for the result of isa.
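To make the first step concrete, here is a rough sketch of what such a module could look like. It is only an illustration of the idea, not the actual draft: the function names true and false and the choice of overloads are assumptions.

```perl
use strict;
use warnings;

{
    package bool;
    use overload
        'bool' => sub { ${ $_[0] } },
        '0+'   => sub { ${ $_[0] } },
        '""'   => sub { ${ $_[0] } },
        fallback => 1;

    # JSON::PP::Boolean inherits from bool, so isa('bool') holds while
    # the class name stays the one existing modules check for.
    push @JSON::PP::Boolean::ISA, 'bool';

    my ($one, $zero) = (1, 0);
    my $true  = bless \$one,  'JSON::PP::Boolean';
    my $false = bless \$zero, 'JSON::PP::Boolean';

    # Assumed constructor names, for illustration only.
    sub true  { $true }
    sub false { $false }
}

my $flag = bool::true();
print ref($flag), "\n";
print "isa bool\n" if $flag->isa('bool');
print "false is false\n" if !bool::false();
```

JSON::PP is never loaded here, yet the objects carry the class name that existing modules check for.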

I still have to finish this draft of bool.pm and suggest it on the p5p list. Feedback welcome.

Why didn't we choose boolean.pm? boolean.pm behaves slightly differently from JSON::PP. Both modules work the way they do by overloading string, numeric and boolean context. boolean.pm returns another boolean.pm object if you say

$false = ! $true;

JSON::PP::Boolean just returns the perl "boolean", so people use this to, uhm, deobjectify their booleans.

I would like to add a not or complement method to bool.pm that would return the negated boolean value as an object.
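The difference can be shown with JSON::PP alone. In this sketch the function negate is a hypothetical stand-in for the proposed method; it is not part of JSON::PP or of the bool.pm draft:

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical stand-in for the proposed method: return the opposite
# boolean as an object instead of a plain perl value.
sub negate {
    my ($bool) = @_;
    return $bool ? JSON::PP::false() : JSON::PP::true();
}

my $true = JSON::PP::true();

# The built-in negation gives back a plain perl value.
my $plain = !$true;
print "plain negation: ", (ref($plain) ? "object" : "no object"), "\n";

# The helper keeps the result an object.
my $object = negate($true);
print "negate: ", ref($object), "\n";
```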

Conclusion

It was great to be at the Summit. Sometimes, sitting together and talking about things can get things done much more efficiently.

Since Ingy didn't go this year, I was mostly working alone on the YAML stuff, but without the Summit, I probably wouldn't have decided to fix this old trailing comments bug in YAML.pm. While digging a bit deeper in YAML.pm during this, I found other bugs and fixed them in the last weeks.

While YAML::PP can hopefully become a replacement for YAML.pm at some point, YAML.pm will still be around and in use, so it's worth fixing those bugs.

Thanks to the crew for organizing a hacking environment with lots of space, a good network and great food!

Sponsors for the Perl Toolchain Summit 2018

Thanks very much also to the sponsors of this Summit:

NUUG Foundation, Teknologihuset, Booking.com, cPanel, FastMail, Elastic, ZipRecruiter, MaxMind, MongoDB, SureVoIP, Campus Explorer, Bytemark, Infinity Interactive, OpusVL, Eligo, Perl Services, Oetiker+Partner.

3 Comments

Can I see some pictures of Perl Toolchain Summit 2018?

I want to introduce Perl Toolchain Summit 2018 in my Japanese site. If there are some images, it is fun to read.

The "perl_events" account on Instagram (https://www.instagram.com/perl_events/) has photos from all the Perl events. There's a #pts2018 tag for the latest Perl Toolchain Summit, but I can't figure out how to search Instagram for that user and that tag (Instagram seems completely overrun by spam).

Thank you. I try to see instagram at first.

About tinita

just another perl punk,