November 2020 Archives

The mysterious case of the SVt_PVIV

The other day I wanted to send my friend some silly emoji on LINE, so I updated my flaky old Unicode browser to handle the new-fangled Unicode with values above 0x10000, which is around where the emoji start. The thing also features a Perl script which fetches values from Unicode::UCD using the charinfo function. I also updated to Perl 5.32 around the same time. Now the funny thing was that I started getting all kinds of errors about invalid JSON in the browser console. My Perl script was sending something of the form {... "script":Common ...}, generated by my module JSON::Create. That is not valid JSON, because there are no quotes around Common, so obviously my module was faulty.

Investigating the fault led me into the XS (C) code of my module, where the code which writes out JSON values found that the value associated with the script key in the hash reference returned by charinfo had the type SVt_PVIV. PV means "pointer value", which is basically a string, and IV means "integer value". You can probably guess what that is supposed to contain.

My stupid module assumed that the string in an SVt_PVIV was just a representation of the IV part, so it printed the PV without quotes, leading to the bare Common above. But that doesn't seem to be so. Is it some kind of "dual variable"? It turned out that the IV part wasn't even valid, so forcing the module to treat the SVt_PVIV as an IV didn't work either. The solution at the moment is to use something called SvIOK to test whether the IV part is OK, and to treat the value as a string if it is not.
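The same decision can be roughly mimicked from pure Perl, since the core B module exposes a scalar's flags. This is only a sketch of the idea, not the XS code in JSON::Create: json_scalar is a hypothetical helper, it ignores floating point values, and it only escapes quotes and backslashes.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper, not part of JSON::Create: emit a scalar as a
# JSON number only if its integer slot is flagged as valid (IOK),
# otherwise emit it as a quoted string.
sub json_scalar {
    my ($value) = @_;
    my $flags = B::svref_2object(\$value)->FLAGS;
    if ($flags & B::SVf_IOK()) {
        # The IV part is usable, so print it as an integer.
        return sprintf '%d', $value;
    }
    # No valid IV: fall back to the string part, with minimal escaping.
    (my $escaped = "$value") =~ s/(["\\])/\\$1/g;
    return qq{"$escaped"};
}

print json_scalar(42), "\n";
print json_scalar("Common"), "\n";
```

The point of checking the flag rather than the type is that the type only says which slots the scalar has room for, not which of them currently hold a valid value.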

The mysterious part for me is why is the script value an SVt_PVIV in the first place? Answers on a postcard, or comment below if you prefer.

I tried to replicate this bug for testing purposes using Scalar::Util's dualvar, but that creates an SVt_PVNV (floating point/string combo), which my daft module treated differently again.
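For test purposes, one way I know of to get a scalar which is an SVt_PVIV with no valid IV is to assign an integer to it first and then overwrite it with a string. The sketch below inspects the result with the core B module. Whether this is what happens inside Unicode::UCD is a separate question which this does not answer, and the variable name is just for illustration.

```perl
use strict;
use warnings;
use B ();

# Start life as an integer, then overwrite with a string. The scalar
# keeps its integer slot, but the slot is no longer flagged as valid.
my $script = 42;
$script = "Common";

my $sv   = B::svref_2object(\$script);
my $type = ref $sv;    # the B class name mirrors the SV type
my $iok  = $sv->FLAGS & B::SVf_IOK() ? 1 : 0;
my $pok  = $sv->FLAGS & B::SVf_POK() ? 1 : 0;

print "type=$type IOK=$iok POK=$pok\n";
```

On the perls I am aware of this should report the type as B::PVIV, with the string flag on and the integer flag off, which is the combination that tripped up the module.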

JSON::Create now features indentation

In version 0.27 of JSON::Create I added a new indentation feature. This was added basically out of necessity. Originally the purpose of the module was sending short bits of JSON over the internet, but I've been using JSON more and more for processing data too. I've spent quite a long time working on a web site for recognition of Chinese, and the basic data file for the web site is a 168 megabyte JSON file. Not indenting this kind of file makes for "interesting" problems if one accidentally opens it in an editor or on a terminal screen, since millions of characters all on one line tend to confuse even the best-written text reading utilities. So after years of suffering the relief is tremendous, and now I have tab-based indentation in JSON::Create.

Originally I thought that I should make all kinds of customisable indentation possible, but then it occurred to me that basically any fool armed with a regular expression could easily alter the indentation however they want to. I put a simple example in the documentation.
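As a rough illustration of the regular expression approach, the sketch below turns each leading tab into two spaces. It is not the example from the module's documentation, and the input is a hand-written string standing in for tab-indented output rather than anything produced by JSON::Create.

```perl
use strict;
use warnings;

# Hand-written sample standing in for tab-indented JSON output.
my $json = "{\n\t\"a\":[\n\t\t1,\n\t\t2\n\t]\n}\n";

# Replace the run of tabs at the start of each line with two spaces
# per tab. /m makes ^ match at every line start, /e evaluates the
# replacement as code.
(my $spaces = $json) =~ s/^(\t+)/'  ' x length($1)/gme;

print $spaces;
```

Only tabs at the start of a line are touched. Literal tabs are not allowed inside JSON strings anyway, so the string contents are left alone.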

About Ben Bullock

Perl user since about 2006, I have also released some CPAN modules.