C::Blocks Advent Day 8
This is the C::Blocks Advent Calendar, in which I release a new treat each day about the C::Blocks library. Yesterday I showed one way to build a (mildly) complex data structure, including handling pointers and managing memory. Today I will explain how to tightly control access to pointers using classes and C::Blocks::Object::Magic.
My example code yesterday was heavy on C pointers, which will come as no surprise to anyone who has programmed in C. With C::Blocks::Types::Pointers, managing these pointers was painless, even easy. The cblock line $tail_p = &$head is particularly smooth.
However, that line should also set off alarm bells for anyone who has worked with both Perl and C. Carrying around pointer values in Perl scalars is not the real problem (while it's not the safest option, XS programmers have been doing that for decades with the T_PTR typemap). The problem is in taking the address of $head. Where does $tail_p actually point? It points to the address of the IV slot of $head. Things could quickly go downhill if we cause $head to upgrade its internal memory representation, because the upgrade can move that slot to a new location. This is easier to do in Perl than you might realize.
One way to change a variable's internal representation is to use it in a string context, such as printing it. This example shows exactly that:
use strict;
use warnings;
use C::Blocks;
use C::Blocks::Types::Pointers
    void_p  => 'void*',
    void_pp => 'void**';
my void_p $address = 0;
my void_pp $ref_to_address = 0;
cblock {
    $ref_to_address = &$address;
    printf("From C, after assignment, address of address is %p; ref_to_address is %p\n",
        &$address, $ref_to_address);
}
print "From Perl, address is $address\n";
cblock {
    printf("From C again, address of address is %p; ref_to_address is %p\n",
        &$address, $ref_to_address);
}
An example of output for this on my machine is:
$ perl test.pl
From C, after assignment, address of address is 0x12fb8e0; ref_to_address is 0x12fb8e0
From Perl, address is 0
From C again, address of address is 0x12f8090; ref_to_address is 0x12fb8e0
The agreement in the first line shows that ref_to_address has the correct value. By the third line they disagree: the address of address has changed, but ref_to_address still holds the old one. I should reiterate that the problem is not with pointers stored in Perl scalars: these are fine and their values persist correctly. The problem arises when I use C::Blocks::Types::Pointers to manage a pointer to a pointer, and then accidentally upgrade the SV* holding the original pointer. My pointer-to-a-pointer then points to a newly invalid slot in memory that was just returned to the memory pool.
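The failure mode can be reproduced in plain C without Perl at all. The following is a toy model under loud assumptions: the pool, the toy_scalar struct, and every toy_ and pool_ function are hypothetical names invented for illustration, and they are far simpler than Perl's real internals. The sketch only shows why a saved address goes stale once the value it pointed at is moved and its old slot is handed back for reuse.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical and are not Perl's real
 * internals. A "scalar" owns one slot from a small pool, and an "upgrade"
 * moves its value to a different slot. */
#define POOL_SIZE 4

static long pool[POOL_SIZE];
static int  in_use[POOL_SIZE];

typedef struct {
    long *slot;   /* where the scalar's integer value currently lives */
} toy_scalar;

/* Hand out the first free slot in the pool, or NULL if none is left. */
static long *pool_take(void) {
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return &pool[i];
        }
    }
    return NULL;
}

/* Return a slot to the pool so that it can be handed out again. */
static void pool_give_back(long *slot) {
    in_use[slot - pool] = 0;
}

/* Create a scalar holding the given value. */
static void toy_init(toy_scalar *sv, long value) {
    sv->slot = pool_take();
    assert(sv->slot != NULL);
    *sv->slot = value;
}

/* Mimic an upgrade: copy the value into a fresh slot, release the old one. */
static void toy_upgrade(toy_scalar *sv) {
    long *fresh = pool_take();
    assert(fresh != NULL);
    *fresh = *sv->slot;
    pool_give_back(sv->slot);
    sv->slot = fresh;
}

/* A saved pointer is stale once the scalar's value lives somewhere else. */
static int toy_is_stale(const toy_scalar *sv, const long *saved) {
    return saved != sv->slot;
}
```

In this model, saving address.slot, calling toy_upgrade, and then creating a second scalar hands the old slot to that second scalar. The saved pointer now reads and writes someone else's data, which is the toy equivalent of the mismatch in the third line of output above.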
(Note: if I changed the Perl print to a printf, it would not upgrade the underlying scalar. If you find yourself regularly using C::Blocks::Types::Pointers, you should make a standard practice of using printf instead of print when printing pointer values.)
While there are many other ways to store pointers, the most elegant solution I've seen is XS::Object::Magic. I liked it so much that I ported it to C::Blocks as C::Blocks::Object::Magic. This approach uses Perl Magic (literally) to store pointers. Magic is a mechanism for overriding core behaviors (such as assignment) of an individual scalar, array, or hash. It is orthogonal to the object system and does not rely on blessing. Attaching a bit of magic to a Perl variable requires a struct with applicable methods, and an optional pointer to additional information. XS::Object::Magic (and therefore C::Blocks::Object::Magic) stores the pointer by adding magic with no methods (a struct filled with null pointers) and using the pointer slot associated with this null magic to hold the desired pointer. Using this approach, the pointer is only accessible from C code, and pointers can be attached to a scalar, an array, or a hash. The last option is particularly nice since it means I can write a hashref-based object with C data safely tucked away.
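To make the "magic with no methods" idea concrete, here is a minimal sketch in plain C. It is a simplified model, not Perl's actual magic structs: toy_vtbl, toy_magic, toy_value, and the toy_ functions are all hypothetical names, and identifying the magic by the address of its empty hook table is an assumption of this sketch. What it shows is the shape of the trick: a hook table full of null pointers changes no behavior, so the entry exists purely to carry a payload pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified model, not Perl's real structs: every name is hypothetical. */

/* A table of hooks that a piece of magic may override. */
typedef struct {
    int (*on_get)(void *owner);
    int (*on_set)(void *owner);
} toy_vtbl;

/* One piece of magic: a hook table plus an optional payload pointer. */
typedef struct toy_magic {
    const toy_vtbl   *hooks;
    void             *payload;
    struct toy_magic *next;
} toy_magic;

/* Stand-in for a scalar, array, or hash: it only carries a magic chain. */
typedef struct {
    toy_magic *magic;
} toy_value;

/* Hook table with no hooks. Its address identifies "our" magic. */
static const toy_vtbl null_hooks = { NULL, NULL };

/* Attach a C pointer by prepending hook-free magic. Returns 0 on success. */
static int toy_attach(toy_value *v, void *ptr) {
    toy_magic *mg = malloc(sizeof *mg);
    if (mg == NULL) return -1;
    mg->hooks = &null_hooks;
    mg->payload = ptr;
    mg->next = v->magic;
    v->magic = mg;
    return 0;
}

/* Walk the chain to the entry tagged with null_hooks and return its
 * payload, or NULL if nothing has been attached. */
static void *toy_get(const toy_value *v) {
    for (const toy_magic *mg = v->magic; mg != NULL; mg = mg->next) {
        if (mg->hooks == &null_hooks) return mg->payload;
    }
    return NULL;
}

/* Free the chain itself. Payloads stay owned by the caller. */
static void toy_clear(toy_value *v) {
    while (v->magic != NULL) {
        toy_magic *next = v->magic->next;
        free(v->magic);
        v->magic = next;
    }
}
```

Because toy_value stands in for any container, the same attach and get calls would work whether the real variable is a scalar, an array, or a hash, which is the property the hashref-based object below relies on.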
The next three code snippets make up KISS.pm. The module combines a number of concepts I've brought up thus far, so I've broken the code into chunks to illustrate each idea. I start with the C-side implementation:
# KISS.pm
package KISS;
use strict;
use warnings;
use C::Blocks;
use C::Blocks::Types qw(uint);
use C::Blocks::Object::Magic;
# The KISS random number generator C-side implementation
cshare {
    struct KISS::state {
        unsigned int x, y, z, c;
    };
    /* force xs_object_magic_get_struct_rv to be included in this symbol
     * table, so that imports of KISS get this symbol. */
    void * KISS::ignore_me = &xs_object_magic_get_struct_rv;
    unsigned int KISS::rand(struct KISS::state * s) {
        unsigned long long t, a = 698769069ULL;
        s->x = 69069*s->x+12345;
        s->y ^= (s->y<<13); s->y ^= (s->y>>17); s->y ^= (s->y<<5);
        t = a*s->z+s->c; s->c = (t>>32);
        return s->x+s->y+(s->z=t);
    }
}
Because this module uses C::Blocks and contains a cshare block, it will provide C code to the lexical contexts where it is used. I use double-colons in my struct and function names to minimize the likelihood of name clashes with other libraries. Also notice the bit about KISS::ignore_me. I have this line to force C::Blocks to copy the symbol xs_object_magic_get_struct_rv into this symbol table. This ensures that any code that uses this module will be able to call that function. I'll cover more about symbol table tricks like this in a later post.
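For readers who want to poke at the generator outside of Perl, here is a plain-C port of the same arithmetic. The kiss_state, kiss_seed, and kiss_next names and the switch to fixed-width types are choices made for this sketch rather than part of KISS.pm. The seed values are the ones the module's new method uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain-C port of the generator above. The kiss_ names and the use of
 * fixed-width types are choices made for this sketch. */
typedef struct {
    uint32_t x, y, z, c;
} kiss_state;

/* Load the seed values used by the constructor in KISS.pm. */
static void kiss_seed(kiss_state *s) {
    assert(s != NULL);
    s->x = 123456789u;
    s->y = 362436000u;
    s->z = 521288629u;
    s->c = 7654321u;
}

/* Advance the state and return the next 32-bit value. */
static uint32_t kiss_next(kiss_state *s) {
    const uint64_t a = 698769069u;
    uint64_t t;
    assert(s != NULL);
    s->x = 69069u * s->x + 12345u;   /* linear congruential step */
    s->y ^= s->y << 13;              /* xorshift step */
    s->y ^= s->y >> 17;
    s->y ^= s->y << 5;
    t = a * s->z + s->c;             /* multiply-with-carry step */
    s->c = (uint32_t)(t >> 32);      /* high half becomes the new carry */
    s->z = (uint32_t)t;              /* low half becomes the new z */
    return s->x + s->y + s->z;       /* wraps modulo 2^32 */
}
```

Seeded this way, the first call to kiss_next returns 2079675107, which is the first value printed by the test script later in this post.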
# Also make it possible to use KISS as a cblock type
sub c_blocks_init_cleanup {
    my ($package, $C_name, $sigil_type, $pad_offset) = @_;
    my $init_code = "$sigil_type * SV_$C_name = ($sigil_type*)PAD_SV($pad_offset); "
        . "struct KISS::state * $C_name = xs_object_magic_get_struct_rv(aTHX_ SV_$C_name); ";
    return $init_code;
}
Because the package implements a function called c_blocks_init_cleanup, KISS can be used as a type for C::Blocks. This means that I can write my KISS $rng, and this type conversion code will be used. In fact, all code written after this function can use the KISS type, even code in the same module. Obviously there's a lot going on here that is beyond the scope of this treat: I'll cover how to write a type library soon.
# Perl-side constructor. Build an empty hash and attach the
# rng state struct to it.
sub new {
    my $class = shift;
    my $self = bless {}, $class;
    cblock {
        struct KISS::state * state;
        Newx(state, 1, struct KISS::state);
        *state = (struct KISS::state){123456789, 362436000, 521288629, 7654321};
        xs_object_magic_attach_struct(aTHX_ SvRV($self), state);
    }
    return $self;
}
sub DESTROY {
    my KISS $self = shift;
    cblock {
        Safefree($self);
    }
}
# Perl-side method for calling the rng
sub rand {
    my KISS $self = shift;
    my uint $to_return = 0;
    cblock {
        $to_return = KISS::rand($self);
    }
    return $to_return;
}
1;
Finally I get to the Perl code: the new method builds an object including its hidden state struct, DESTROY frees up the allocated memory, and rand gets the next random number. The module uses C::Blocks::Object::Magic and illustrates how to use xs_object_magic_attach_struct to attach the struct to the object. It also uses xs_object_magic_get_struct_rv to get the struct, though you probably missed it because it's buried in the type definition.
And now I can write a script that uses this module:
use strict;
use warnings;
use C::Blocks;
use KISS;
my KISS $rng = KISS->new;
print "rng's first value is ", $rng->rand, "\n";
cblock {
    printf("rng's second value is %u\n", KISS::rand($rng));
}
When run, that script prints:
$ perl test.pl
rng's first value is 2079675107
rng's second value is 4185567647
This is a short script, and on its surface it looks pretty simple: I create a new KISS random number generator and I use it to produce two random numbers. The remarkable aspect of this script is that I use the same random number generator (even the same variable name, $rng) in both Perl and C code. The object underlying $rng fluidly moves between the two contexts because the KISS package provides type information. The Perl-side rand method ultimately invokes KISS::rand, which means that I can generate random numbers with my object in whichever context is more convenient. To accomplish this feat, I wrote a module that provides both a Perl and a C interface to a struct, but even the module was not terribly hard to write.
The short script above does not actually show off the utility of using C::Blocks::Object::Magic. To see that, I need to utilize the fact that the object is a blessed hash:
use strict;
use warnings;
use C::Blocks;
use KISS;
my KISS $rng = KISS->new;
$rng->{name} = 'Gerry';
print "$rng->{name}'s first value is ", $rng->rand, "\n";
cblock {
    printf("rng's second value is %u\n", KISS::rand($rng));
}
Running this produces the following output:
$ perl test.pl
Gerry's first value is 2079675107
rng's second value is 4185567647
If I decide while developing a library that I need certain information to be available from C then I add it to the struct, but if the information only needs to be available from Perl then I can simply store it in the hash. Having worked with Prima, which uses a different scheme to support hashref-based objects with a core C struct underneath, I have found this to be particularly useful for storing data used by custom handlers. Of course, subclassing would be a more systematic way to achieve the same goal, but either way the hashref is indispensable for storing the data.
A critique of this approach is that the module author must write distinct C and Perl methods (usually with one calling the other). This sort of code duplication will be cumbersome for anything but the smallest of projects. The proper solution to this problem is an object system, something that simultaneously builds both C and Perl methods from a common declaration. Such a system is not yet available. However, C::Blocks is up to the task and in the coming days I will provide a number of treats that go through the capabilities needed to write a proper object system.
Today I gave an example of a Perl class that provides both a C and a Perl interface. In particular, I showed how C::Blocks::Object::Magic makes it easy to have a hashref-based object while safely storing a pointer to an underlying struct for the C-visible state. I glossed over how to write a C::Blocks type, an important detail that I will discuss soon. A proper object system for C::Blocks will require some way to implement inheritance in C. How is this accomplished? These details will be some of the forthcoming treats this Advent.