XS bits: Overloaded interfaces
When writing Perl, people often create hybrid interfaces that accept either a reference to an array or hash, a string, or a reference to a string. The Perl code to do appropriate conversion behind the scenes is usually trivial. Some even use this to overload their interface to do something entirely unrelated depending on the type passed in. However much one might loathe such interfaces, when replacing Perl code with XS, one usually has to reproduce the properties of the original. That is what this entry is about.
A rather reasonable example of such a hybrid interface is PPI::Document, whose constructor accepts either a string (interpreted as a file name) or a reference to a scalar (interpreted as a reference to a scalar containing the code as a string).
While (different) named arguments would have been clearer for a casual reader of the resulting code, this case of an overloaded interface is a generally reasonable optimization.
The straightforward (and least error-prone) way to provide an overloaded interface is to keep the interface in Perl and just call XSUBs from there using a simpler, more XS-friendly interface. If I were replacing said constructor of PPI::Document, I could do something like the following pseudo-code:
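package PPI::Document;
use strict;
use warnings;
use Carp qw(croak);

sub new {
    my $class  = shift;
    my $source = shift;
    if (not ref($source)) {
        # plain string: treat it as a file name
        return _xs_new_from_file($source);
    }
    elsif (ref($source) eq 'SCALAR') {
        # scalar reference: the referenced scalar holds the code
        return _xs_new_from_string($source);
    }
    else {
        croak("Huh?");
    }
}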
But if the code in question is in a tight loop already or you are a tad crazy, you may want to have this logic in XS as well. This is one way to do it:
void
new(class, source)
    SV* class;
    SV* source;
  INIT:
    SV* inner;
  PPCODE:
    if (!SvROK(source))
      mXPUSHs( _new_document_from_file(class, source) );
    else {
      inner = SvRV(source);
      if (SvTYPE(inner) <= SVt_PVMG)
        mXPUSHs( _new_document_from_string(class, SvRV(source)) );
      else
        croak("Huh?");
    }
I'll pull that apart in detail for an XS beginner in a moment. The key bits for the interface are !SvROK(source), which is true if the source SV is not a reference at all, and SvTYPE(inner) <= SVt_PVMG, which ensures that the dereferenced SV is a scalar (and not an array, etc.).
While the test for an SV being a reference is fairly common and simple, the test for being a scalar reference is slightly more obscure. It grabs the type (an enum) of the dereferenced SV using SvTYPE() and checks whether the type is less than or equal to SVt_PVMG. SVt_PVMG indicates a scalar with magic attached. The reason we're using that for a less-than-or-equal comparison lies in the order of the SV types: all SV types up to and including SVt_PVMG happen to be scalars. Above, you'll find more complicated things such as arrays, hashes, code references, etc. Using this construct, you could easily add cases that test for array or hash references by comparing (equality!) with SVt_PVAV or SVt_PVHV respectively.
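For comparison, here is a minimal pure-Perl sketch of the same scalar/array/hash distinction. It is not the XS code from this post: the function name classify_source and its return values are made up for illustration, and it relies on reftype() from the core Scalar::Util module. Unlike ref(), reftype() looks through blessing, which makes it behave more like the SvTYPE() check on the dereferenced SV.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# Hypothetical helper: classify an argument the way an overloaded
# constructor might before dispatching to a backend.
sub classify_source {
    my ($source) = @_;

    # Not a reference at all (the !SvROK(source) case in XS).
    return 'plain' if not ref $source;

    # reftype() reports the underlying type even for blessed references.
    my $type = reftype($source);
    return 'scalar' if $type eq 'SCALAR';
    return 'array'  if $type eq 'ARRAY';
    return 'hash'   if $type eq 'HASH';
    return 'other';
}

my $code = 'print "hi";';
print classify_source('file.pl'), "\n";               # plain
print classify_source(\$code), "\n";                  # scalar
print classify_source([1, 2]), "\n";                  # array
print classify_source({}), "\n";                      # hash
print classify_source(sub { 1 }), "\n";               # other
print classify_source(bless(\my $s, 'Foo')), "\n";    # scalar
```

The sketch is deliberately stricter than the XS test in one respect: a reference to a reference has reftype 'REF' and lands in 'other' here, whereas its referent would pass the SvTYPE(inner) <= SVt_PVMG comparison.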
Now, the example punts on one bit of the PPI::Document->new() interface: You can call new() without arguments to receive an empty document. Optional parameters to an XSUB aren't particularly complicated once you understand how parameters are actually passed inside perl. But this is for another post.
Here are a few random notes that may or may not help XS beginners understand the code:
- The XSUB is declared void and the actual C code is inside an XS section named PPCODE. This tells the XSUB compiler that we will (if necessary) manage returning values via the argument stack ourselves.
- This is done with the mXPUSHs() macro, which takes an SV (which is assumed to be returned by the _new_document_from_* functions) and pushes it on top of the argument stack. The X indicates that it will extend the size of the stack if necessary. The s suffix indicates that we're returning a pre-manufactured SV. The m prefix means that the macro will mortalize the SV. This is roughly equivalent to marking it as a temporary and is necessary for all elements of the argument stack. I'm going through all bits of this macro because there are a ton of variants in the API which become moderately obvious once you understand the naming conventions.
Okay, I have no idea what's wrong with my HTML this time. The w3c validator thinks there is something wrong: http://validator.w3.org/check?uri=http%3A%2F%2Fblogs.perl.org%2Fusers%2Fsteffen_mueller%2F2010%2F04%2Fxs-bits-overloaded-interfaces.html&charset=%28detect+automatically%29&doctype=Inline&ss=1&group=0&verbose=1&st=1&user-agent=W3C_Validator%2F1.767 but it doesn't *really* tell me where I screwed up. Pointers would be welcome. Also, any information on how I can eschew writing HTML for some markup that allows for at least pretty-printed code with embedded links would be helpful for future posts.
Most errors have to do with ampersands. In valid HTML they should be written as an entity (&amp;). This is a very common error.
Some of the errors are from the templates, which I believe we can't touch (or can we?). Is there a specific reason for wanting a validated page? Even mobile browsers are advanced enough to handle this kind of page and it's XHTML 1.0 Transitional anyway...
Burak: This particular problem seems to result in garbled entries on the blogs.perl.org front-page. I'm just using the validator to identify it.