Smartmatch in 5.27.7

What happened?

In the latest development release of perl, smartmatch changed quite a bit.

Almost everything you believed about smartmatching is now wrong

No really, everything. All previous rules are gone except a single one: you can smartmatch against any object that overloads smartmatching (the only "objects" that overload it out of the box are qr// regexps).

Matching against a scalar value? Gone. Matching against a list of values? Also gone.

when is no more.

The when keyword is gone, split into two keywords: whereis and whereso; one smartmatches the value against the current subject and the other does a simple boolean check much like if. I'll let you guess which is which. This split is for good reasons (when sometimes does one and sometimes the other, sometimes depending on things like optimizations), but that doesn't make this any more intuitive.

use 5.010/use 5.028 won't guard you from this.

It would have been possible to support both behaviors, because the old behavior is already using In fact one could even enable old and new style when at the same time in a scope without problems. None of this was done though.

My suggestions

New smartmatch should be more useful.

Right now one can't do anything with it without a helper library (like my Smart::Match). That's just silly.

The insanity of old-style matching was that the overloads depended on both operands, which gave rise to hard-to-predict behavior. That doesn't mean one can't define useful behavior that depends only on the right side and follows the most common use cases: in particular, matching scalars stringwise, and making $foo ~~ ["bar", "baz"] mean $foo in ["bar", "baz"].
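To make that concrete, here is a minimal sketch of what such right-side-only rules could look like, written as an ordinary function. The name simple_match and the exact rules are my assumptions for illustration; this is not how perl or Smart::Match implements anything.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl or Smart::Match): how to match is
# decided by looking at the right operand only.
sub simple_match {
    my ($left, $right) = @_;
    return 0 if !defined $left || !defined $right;
    if (ref $right eq 'ARRAY') {
        # "$left in [...]": true if any element matches by the same rules
        for my $element (@{$right}) {
            return 1 if simple_match($left, $element);
        }
        return 0;
    }
    # qr// objects keep matching as regexps
    return $left =~ $right ? 1 : 0 if ref $right eq 'Regexp';
    # Plain scalars are compared stringwise
    return $left eq $right ? 1 : 0;
}

for my $case ([ 'bar', [ 'bar', 'baz' ] ], [ 'quux', [ 'bar', 'baz' ] ], [ 'bar', qr/^ba/ ]) {
    my ($left, $right) = @{$case};
    printf "%s\n", simple_match($left, $right) ? 'match' : 'no match';
}
```

Because the left operand never influences which rule is picked, the outcome can be predicted from the right side alone.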

This should be opt-in

Despite warnings being retroactively added to say the feature is experimental, it has become a widely used feature. This change is breaking an (as yet unknown but significant) number of CPAN modules, and likely much more code in darkpan. Breaking this unless strictly necessary is dumb.

And it isn't necessary. We can easily enable the new behavior only when asked for. That way we can improve smartmatching without breaking a decade's worth of smartmatching code.

We need better words

whereso and whereis are way too confusing. I'm not sure what better names would look like, but these just don't cut it.

Above all, we need a better process

Somehow, p5p made a fundamental breaking change to the language without even trying to involve the wider community. This blogpost shouldn't have been the first time the wider community hears about it. And we need that wider community IMHO because no one on p5p (myself included) has the kind of language design talent that's required to do the sort of thing we did here. I don't know what the solution would look like exactly, but I like this process even less than I like the outcome so far. I must admit I'm somewhat jealous of Python's PEP process, though I'm not sure that would work without a language designer to guide it.

File::Slurp is broken and wrong

If you are using File::Slurp, you should possibly reconsider. Basically, there are three reasons to do so:

It is wrong in a lot of cases.

File::Slurp predates IO layers, and as such doesn't take them into account well. A few years ago, after some complaints, an attempt was made to make it handle encodings. This was nothing short of being wrong.

The best known bug in this area is #83126, which means that :encoding() layers are always interpreted as :utf8. This not only means that UTF-8 encoded text is not validated (which can be a security risk), but also that files in other encodings (such as UTF-16) will be read as UTF-8, which surely will give an incorrect result.

Likewise, it's not handling :crlf correctly; in particular, explicitly asking for :crlf will always disable it, even on Windows.

Basically, it's doing all binmodes wrong except the one you shouldn't be using anyway (:utf8), and you should pretty much always be using a binmode, so there's no way to win really.

The interface is poorly huffmanized.

Huffmanization is the process of making commonly used operations shorter. File::Slurp is failing to huffmanize in the Unicode world of 2015. Text files are usually UTF-8 nowadays, which in File::Slurp would typically be read_file($filename, binmode => ':raw:utf8'). The shortest option, read_file($filename), does something that most people don't really want anymore: latin-1 encoded files with platform-specific line endings.

This is mainly the fault of perl itself (backwards compatibility is a PITA), but a library can work around this to make the programmer's life easier.
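For comparison, this is roughly what the common case (read a whole UTF-8 text file) looks like in core perl with an explicit :encoding(UTF-8) layer instead of :utf8. It is only a sketch: the helper name read_utf8_text is made up for this example, and the temporary file exists only to make it runnable.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper name; uses only core perl.
sub read_utf8_text {
    my ($filename) = @_;
    # Use the real :encoding(UTF-8) layer rather than the lax :utf8 flag
    open my $fh, '<:encoding(UTF-8)', $filename
        or die "Could not open $filename: $!";
    local $/;    # slurp mode: read the whole file in one go
    my $content = <$fh>;
    close $fh or die "Could not close $filename: $!";
    return $content;
}

# Demo: write the UTF-8 bytes for "café\n" to a temporary file, read them back
my ($out, $filename) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "caf\xC3\xA9\n";
close $out or die "Could not close $filename: $!";

my $text = read_utf8_text($filename);
printf "%d characters read\n", length $text;
```

That is a lot of ceremony for the most common operation, which is exactly what a well-huffmanized interface should hide.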

It is poorly maintained

The critical bug mentioned above has been known for about two years, yet the author hasn't even bothered to respond to it, let alone fix it. There hasn't been a release in 4 years despite an increasingly long list of issues. Worse yet, this isn't the first time such a thing has happened; before his last maintenance surge in the spring of 2011 the author was also missing-in-action for years. This negligence is inexcusable for a module that is so commonly depended upon.


Instead of File::Slurp, I recommend you use one of these modules depending on your needs:

If your needs are minimal, I'd recommend my File::Slurper. It provides correct, fast and easy to use slurping and spewing functions.

If your needs are average (which is the case for most people), I'd recommend Path::Tiny. This provides a well-balanced set of functions for dealing with file paths and contents.

If you want to go for maximal overkill, try IO::All. It will do everything you can imagine and more.

2 weeks of perl

It all started with the summer meeting on the 9th of August. I happened to be around there, so popped in. is a refreshingly young perl monger group (I might even have been older than the average age there, that's a first for me). At first I didn't know anyone, other than the guest speaker Mark Keating, but after my presentation I had lots of people approaching me and I had a brilliant evening.

A short week later I flew to Germany, for the Perl Reunification Summit in Perl. Like Schwern I arrived a day earlier than most, so I had a calm start of the meetup. It was mostly a gathering of faces familiar to me, though a significant number I hadn't really spoken to before, especially the Perl 6 guys; -Ofun attracts awesome people. I spent most of the PRS talking to people, and doing a little coding (both related and unrelated). It was a very enlightening meetup.

Lastly, there was YAPC::EU. Despite the sometimes unbearable heat, it was awesome. At some points it seemed a bit less organized than my previous YAPCs, but that may also be me noticing more of what's going on. I spent most of my time in the hallway track, which extended into the pub track, and I spent enough time discussing (and occasionally ranting) that it's a miracle that I still have a voice left. In between I found enough time to attend some talks; interestingly, I attended most of them on the day I gave one myself. After doing threads last year I could only top it with signals this year. It will be a challenge to come up with something crazier; I think I'll have to look in a vastly different direction (I have ideas already). After a full week of conferencing, I was relieved to be going home though.

So in all I met Mark Keating in 3 different places in 2 weeks' time, I'd almost accuse him of stalking me!

What you should know about signal based timeouts

The problem

I think we've all seen code like this example from perlipc:

my $ALARM_EXCEPTION = "alarm clock restart";
eval {
    local $SIG{ALRM} = sub { die $ALARM_EXCEPTION };
    alarm 10;
    flock(FH, 2)  || die "cannot flock: $!";
    alarm 0;
};
alarm 0;
if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die }

Here, signals are used to put a time limit on some action. However, sometimes this doesn't work as wanted. In particular, some C libraries used in XS modules don't honor perl's deferred signaling, so the signal is effectively ignored until the C function has finished, which is unlikely to be what you want.

Therefore, people resort to unsafe signals:

use Sys::SigAction qw( set_sig_handler );
my $ALARM_EXCEPTION = "alarm clock restart";
my $h;
eval {
    $h = set_sig_handler('ALRM', sub { die $ALARM_EXCEPTION }, { });
    alarm 10;
    flock $fh, 2 or die "cannot flock: $!";
    alarm 0;
};
alarm 0;
$SIG{ALRM} = $h;
if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die }

This works as expected, mostly, but there is a serious problem with doing this; serious enough to have an explicit and specific high-severity advisory against it in CERT's secure coding guide (and it also happens to violate most other secure coding advice regarding signaling).

Signal handlers (or at least the real, unsafe ones) have a highly restricted set of operations they can safely perform; doing anything that's not allowed means risking segfaults and data loss. This is why we needed "safe" signaling in the first place. By longjumping out of the unsafe/real signal handler (which is what die does), those restrictions are continued into the rest of the program. That means that anything from that point on can (and at some point probably will) cause segfaults and other bugs.


The way out?

That's the harsh part: sometimes there isn't any easy way out. If a piece of C code doesn't have its own timeout support, there may be no alternative. The real solution is to write blocking/computationally intensive software in such a way that it can handle this more gracefully, for example by using an event loop, but often one has to deal with the tools one has.

So, I'm not saying everyone is wrong for using unsafe signal timeouts, but you should be aware of and accept the risks that come with it.
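When the blocking call does offer a non-blocking mode, signals can be avoided entirely. For the flock example used above, a minimal sketch could poll with LOCK_NB until a deadline passes. The helper name flock_with_timeout and the polling interval are assumptions of mine, and this obviously doesn't help with C code that offers no such mode.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use Time::HiRes qw(time sleep);

# Hypothetical helper: try to take an exclusive lock without using signals.
# Returns 1 if the lock was obtained, 0 if the timeout passed first.
sub flock_with_timeout {
    my ($fh, $timeout) = @_;
    my $deadline = time() + $timeout;
    while (1) {
        return 1 if flock($fh, LOCK_EX | LOCK_NB);
        # Anything other than "someone else holds the lock" is a real error
        die "cannot flock: $!" unless $!{EWOULDBLOCK} || $!{EAGAIN};
        return 0 if time() >= $deadline;
        sleep 0.05;    # arbitrary polling interval
    }
}

# Demo on a temporary file that nobody else has locked
my ($fh, $filename) = tempfile(UNLINK => 1);
print flock_with_timeout($fh, 10) ? "got the lock\n" : "timed out\n";
```

The price is some polling latency, but no code ever runs inside a real signal handler.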

Looking for Ilja Tabachnik

I'm looking for Ilja Tabachnik.

I want to fix his only module on CPAN (POSIX::RT::MQ), but his public email address no longer exists. If I cannot reach him I will ask the PAUSE admins for permission to take over this module.