C::Blocks Advent Day 9
This is the C::Blocks Advent Calendar, in which I release a new treat each day about the C::Blocks library. Yesterday I illustrated C::Blocks::Object::Magic while writing a simple class that had APIs in both Perl and C. Today I dig into one of the keys of yesterday's example: writing a type that can be used with C::Blocks.
When Perl sees code like this:
my Type::Package $some_variable;
it makes a note that the variable name $some_variable has a "type" with information available in Type::Package. I must emphasize that this type information is associated with the variable name itself: nothing special is done to the underlying scalar. If Type::Package sets up a few fields with the fields pragma, then Perl will check the (spelling of the) keys used in hash dereferences of that variable at compile time. C::Blocks uses this type information in an orthogonal way: to produce custom code for marshalling your data between Perl and C.
To get an idea of how this works, consider the short script from yesterday which used the KISS library. If I add use C::Blocks::Filter; before the cblock, then I end up with this script:
use strict;
use warnings;
use C::Blocks;
use KISS;
use C::Blocks::Filter;
my KISS $rng = KISS->new;
print "rng's first value is ", $rng->rand, "\n";
cblock {
    printf("rng's second value is %u\n", KISS::rand($rng));
}
which produces output like the following when run (I have added whitespace for clarity):
$ perl test.pl
##################################################
void op_func(C_BLOCKS_THX_DECL) {
    SV * SV__PERL_SCALAR_rng = (SV*)PAD_SV(1);
    struct KISS__state * _PERL_SCALAR_rng
        = xs_object_magic_get_struct_rv(aTHX_ SV__PERL_SCALAR_rng);
    printf("rng's second value is %u\n", KISS__rand(_PERL_SCALAR_rng));
}
##################################################
rng's first value is 2079675107
rng's second value is 4185567647
The line containing the printf demonstrates things I've discussed previously. KISS::rand becomes KISS__rand and $rng becomes _PERL_SCALAR_rng.
The interesting bit is the thing that comes before. I said on day 6 that C::Blocks detects when you use a sigiled variable in your cblock and injects code to transform that to the SV* (or AV* or HV*) underlying the variable. (It's worth pointing out that C::Blocks does not allow sigiled variables in clex, cshare, or csub blocks because there is no way for it to know which PAD to work with. This only works with cblock blocks.) If the variable is typed, and if the type's package contains c_blocks_init_cleanup, C::Blocks will call that method to get the code to use for the transformation.
The C end of the KISS library is built around a struct pointer. The one method, KISS::rand, expects a pointer to a struct. Furthermore, you could directly read or set the state by accessing the x, y, z, and c members of the struct. It makes sense, then, that the type should produce code that would map $rng to the pointer to the struct. This is what it does. First it gets the SV* for $rng from the current PAD, but instead of putting it in _PERL_SCALAR_rng as it would for untyped variables, it puts it in SV__PERL_SCALAR_rng. The variable _PERL_SCALAR_rng is used for the struct KISS__state pointer, which is unpacked with xs_object_magic_get_struct_rv. By the time we reach the printf line, our Perl $rng has been unpacked, and $rng gets transformed into the pointer to the KISS struct, as expected.
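To make the struct-pointer design concrete, here is a standalone C sketch of what such a library could look like. Only the member names x, y, z, and c come from the text above. The struct name kiss_state, the function names kiss_seed_default and kiss_rand, the seed constants, and the update rule itself (one widely circulated variant of Marsaglia's KISS generator) are my assumptions, not the actual KISS.pm source.

```c
#include <stdint.h>

/* Hypothetical stand-in for the library's state struct. */
typedef struct kiss_state {
    uint32_t x, y, z, c;
} kiss_state;

/* Fill in an arbitrary nonzero starting state (illustrative constants). */
void kiss_seed_default(kiss_state *s) {
    s->x = 123456789u;
    s->y = 362436000u;
    s->z = 521288629u;
    s->c = 7654321u;
}

/* Advance the state and return the next 32-bit value. */
uint32_t kiss_rand(kiss_state *s) {
    uint64_t t;
    s->x = 69069u * s->x + 12345u;      /* linear congruential step */
    s->y ^= s->y << 13;                 /* xorshift step */
    s->y ^= s->y >> 17;
    s->y ^= s->y << 5;
    t = 698769069ULL * s->z + s->c;     /* multiply-with-carry step */
    s->c = (uint32_t)(t >> 32);
    s->z = (uint32_t)t;
    return s->x + s->y + s->z;
}
```

Because callers hold a pointer to the struct, any code that receives it can call kiss_rand(s) or read and write s->x directly, which is the access pattern a typed $rng gives a cblock.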
Let's look again at the c_blocks_init_cleanup code from KISS.pm:
sub c_blocks_init_cleanup {
    my ($package, $C_name, $sigil_type, $pad_offset) = @_;
    my $init_code = "$sigil_type * SV_$C_name = ($sigil_type*)PAD_SV($pad_offset); "
        . "struct KISS::state * $C_name = xs_object_magic_get_struct_rv(aTHX_ SV_$C_name); ";
    return $init_code;
}
The method is called with four arguments: the package ("KISS"), the gently mangled C variable name ("_PERL_SCALAR_rng"), the sigil type ("SV"), and the pad offset (1). It returns a single string with the initialization code, utilizing string interpolation throughout.
Here is the init/cleanup code for double arrays, from C::Blocks::Types:
package C::Blocks::Type::double_array;
sub data_type { 'double' }
sub c_blocks_init_cleanup {
    my ($package, $C_name, $sigil_type, $pad_offset) = @_;
    my $data_type = $package->data_type;
    my $init_code = join(";\n",
        "$sigil_type * SV_$C_name = ($sigil_type*)PAD_SV($pad_offset)",
        "STRLEN length_$C_name",
        "$data_type * $C_name = ($data_type*)SvPVbyte(SV_$C_name, length_$C_name)",
        "length_$C_name /= sizeof($data_type)",
        '',
    );
    return $init_code;
}
Unlike my example from KISS, this method is written in such a way that it can be used by other type packages, such as float_array and char_array. These packages simply inherit from this package and implement an alternative data_type method. To see an example of this, try:
use strict;
use warnings;
use C::Blocks;
use C::Blocks::Types qw(char_array);
my char_array $string = "Hello!";
cblock {
    printf("From C, %s\n", $string);
}
When run with -MC::Blocks::Filter, I get:
##################################################
void op_func(C_BLOCKS_THX_DECL) {SV * SV__PERL_SCALAR_string = (SV*)PAD_SV(1);
STRLEN length__PERL_SCALAR_string;
char * _PERL_SCALAR_string = (char*)SvPVbyte(SV__PERL_SCALAR_string, length__PERL_SCALAR_string);
length__PERL_SCALAR_string /= sizeof(char);
printf("From C, %s\n", _PERL_SCALAR_string);
}
##################################################
From C, Hello!
In this case, quite a bit gets unpacked when using this variable. The original SV* is SV__PERL_SCALAR_string, the character array is _PERL_SCALAR_string, and the length is available as length__PERL_SCALAR_string. These extra variables can be used in our cblock code like this:
use strict;
use warnings;
use C::Blocks;
use C::Blocks::Types qw(char_array);
my char_array $string = "Hello!";
cblock {
    /* The length is a STRLEN (an unsigned size type), so cast it for %d. */
    printf("The string '%s' is %d characters long\n", $string, (int)length_$string);
}
Notice how length_$string gives the length! For most string operations the length is not crucial because the string ends in a null character. This is not the case for the numerical types: the length is a crucial piece of information needed to process the full contents of the array:
use strict;
use warnings;
use C::Blocks;
use C::Blocks::Types qw(double_array);
my double_array $data = pack('d*', 1 .. 10);
cblock {
    double sum = 0;
    for (int i = 0; i < length_$data; i++) {
        sum += $data[i];
    }
    printf("The sum is %f\n", sum);
}
which produces:
The sum is 55.000000
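The arithmetic behind length_$data can be tried outside of Perl. The sketch below is plain C that mimics what the generated initialization code does with a packed buffer: it takes a pointer and a length in bytes, divides the length by sizeof(double) to get an element count, and sums the elements. The function name sum_packed_doubles and its parameters are hypothetical; in a real cblock the pointer and byte length come from SvPVbyte.

```c
#include <stddef.h>
#include <string.h>

/* Sum a buffer of packed native doubles, given its length in BYTES.
 * Hypothetical helper that mirrors the generated code's
 * "length /= sizeof(double)" step. */
double sum_packed_doubles(const void *bytes, size_t byte_length) {
    size_t count = byte_length / sizeof(double);   /* bytes -> elements */
    const unsigned char *p = bytes;
    double sum = 0;
    for (size_t i = 0; i < count; i++) {
        double value;
        /* memcpy avoids assuming the buffer is aligned for double. */
        memcpy(&value, p + i * sizeof(double), sizeof value);
        sum += value;
    }
    return sum;
}
```

Without the division, a loop bounded by the byte length would run eight times too far on platforms with 8-byte doubles, which is why the type's init code rescales the length before the block body runs.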
The idea that length_$variable would resolve to a variable with useful information is an unplanned but very useful side-effect of how the code extractor works. For lack of a better name, I've taken to calling these extra bits of information "prefix macros" because the code extractor only properly resolves them when you add on letters prior to the variable name, not after it.
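Why prefixes work and suffixes do not follows from how a sigiled name is delimited. As a rough illustration, the C function below performs a naive version of the rewrite: wherever it sees $ followed by identifier characters, it emits _PERL_SCALAR_ plus that identifier. This is my own toy reimplementation under a hypothetical name (mangle_scalars), not the extractor's real code, and it ignores arrays, hashes, string literals, and comments.

```c
#include <ctype.h>
#include <stddef.h>

/* Toy rewrite of "$name" to "_PERL_SCALAR_name". Writes at most
 * out_size - 1 characters plus a terminating NUL into out. */
void mangle_scalars(const char *in, char *out, size_t out_size) {
    static const char prefix[] = "_PERL_SCALAR_";
    size_t n = 0;
    if (out_size == 0) return;
    while (*in != '\0' && n + 1 < out_size) {
        int starts_name = in[0] == '$'
            && (isalpha((unsigned char)in[1]) || in[1] == '_');
        if (!starts_name) {
            out[n++] = *in++;           /* ordinary character: copy it */
            continue;
        }
        in++;                           /* drop the sigil */
        for (size_t k = 0; prefix[k] != '\0' && n + 1 < out_size; k++)
            out[n++] = prefix[k];
        /* The name greedily absorbs every trailing identifier character. */
        while ((isalnum((unsigned char)*in) || *in == '_') && n + 1 < out_size)
            out[n++] = *in++;
    }
    out[n] = '\0';
}
```

Given length_$data, this produces length__PERL_SCALAR_data, which matches a variable the init code declared. Given $data_length, it produces _PERL_SCALAR_data_length, a name nothing declared, because the trailing letters are absorbed into the variable name.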
It turns out that the code generator expects either one or two return values from c_blocks_init_cleanup. The first return value is always the initialization code; the optional second return value is any cleanup code. This is useful for basic types, which have to call sv_setiv or similar to ensure that any changes you've made are propagated back to the original SV*. Everything we've seen up to this point has involved pointers to things. Modifying those things would lead to the desired side effects, so no cleanup was necessary.
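The init/cleanup pairing for a by-value type can be pictured with a mock in plain C. Here fake_sv is a hypothetical stand-in for a Perl scalar holding an integer: the "init" step copies its value into a local, the block body modifies the local, and the "cleanup" step writes it back, which is the role sv_setiv plays in real generated code. None of these names come from C::Blocks.

```c
/* Hypothetical stand-in for a Perl scalar that stores an integer. */
typedef struct fake_sv {
    long iv;
} fake_sv;

/* Mimics a generated op function for a by-value integer type. */
void run_block(fake_sv *sv, int apply_cleanup) {
    long counter = sv->iv;      /* init code: copy the value out */

    counter += 5;               /* block body: works on the local copy */

    if (apply_cleanup)
        sv->iv = counter;       /* cleanup code: propagate the change back */
}
```

Calling run_block with apply_cleanup set to 0 leaves the scalar untouched, which is the failure mode the second return value exists to prevent.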
Finally, there is one more trick worth knowing about type handling. Whenever C::Blocks sees a sigiled variable in a cblock it will replace it with the gently mangled name, as we have seen. What if simply using a variable is insufficient? In that case you can resort to using macros. For example, when using C::Blocks::Types::Pointers, you can take the address of a pointer to get something that works (caveats aside). Here is the relevant bit of code from day 7:
my double_LL $head = 0;
my double_LLp $tail_p = 0;
cblock {
    $tail_p = &$head;
}
If $head resolved to a local variable, this would lead to a local address, which would become invalid as soon as we left the block. To get around that, C::Blocks::Types::Pointers actually creates a pointer to the desired pointer type called POINTER_TO_$C_name, pointing to the address of the underlying IV slot in the SV*. It then defines a C macro: #define $C_name (*POINTER_TO_$C_name). This means that whenever you see $head in the cblock, it is ultimately replaced with a pointer de-reference.
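The macro trick is easy to try in isolation. In this standalone C sketch, fake_slot is a hypothetical stand-in for a scalar's integer slot (a union, so the same storage can be viewed as an integer or as a pointer), and node, head, tail_p, and link_tail are illustrative names. The pattern follows the description above: a POINTER_TO_ variable aimed at the slot, plus a macro that dereferences it.

```c
#include <stdint.h>

typedef struct node { double value; struct node *next; } node;

/* Hypothetical stand-in for a scalar's integer slot. */
typedef union fake_slot {
    intptr_t iv;
    node *as_node;       /* view used for head */
    node **as_node_p;    /* view used for tail_p */
} fake_slot;

/* Mimics a block containing: $tail_p = &$head; */
void link_tail(fake_slot *head_slot, fake_slot *tail_p_slot) {
    /* "Init code": point at the slots instead of copying their values. */
    node **POINTER_TO_head = &head_slot->as_node;
    node ***POINTER_TO_tail_p = &tail_p_slot->as_node_p;
#define head (*POINTER_TO_head)
#define tail_p (*POINTER_TO_tail_p)

    tail_p = &head;   /* &(*POINTER_TO_head): the slot's own address */

#undef head
#undef tail_p
}
```

After link_tail returns, the stored address still refers to the slot itself, so writing through it later updates what head sees. Had head been a plain local copy, the stored address would dangle once the function returned.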
Today I explained how to create your own types with C::Blocks. When your library provides both a Perl and C interface, types make it possible to flow back and forth between Perl and C code and have your variables resolve to the "right" thing. This lets you concentrate on writing actionable code instead of extracting your data from a Perl SV*.