C::Blocks Advent Day 3

This is the C::Blocks Advent Calendar, in which I release a new treat each day about the C::Blocks library. Yesterday I showed how to get information across the boundary between Perl and C. Today I show how to do that for a number of types, including packed arrays, with minimal boilerplate.

Yesterday I illustrated that if you refer to a $sigiled @variable %name in your cblock, C::Blocks will treat those tokens as the underlying SV*, AV*, or HV* referencing the appropriate lexically scoped variable. With scalars, you usually don't want to mess with the variable but simply want to use the data contained in it. Perl's XS handles mapping your Perl data to C and back with typemaps. The equivalent behavior for C::Blocks is provided by type annotations, with C::Blocks::Types providing some of the basics.

Let me jump right into an example. The key to this example is the use of C::Blocks::Types qw(Int) and the declaration of $limit:

use strict;
use warnings;
use C::Blocks;
use C::Blocks::Types qw(Int);

my Int $limit = 100;
cblock {
    for (int i = 0; i < $limit; i++) {
        printf("Ho ho ho! ");
    }
    printf("\n");
}

If you run that, you'll get "Ho ho ho!" printed to your screen 100 times in a row. Merry, eh?

This works because Perl provides built-in support for declarations such as my Int $limit. In this my statement, we are indicating the type of a variable called $limit. (Note that this is distinct from the package into which a reference is blessed; $limit can have a type without even being blessed into a package.) This syntax is also used by the fields pragma, and I noticed it most recently when looking at some code written in rperl. When C::Blocks encounters a variable in your cblock, it checks whether Perl knows the type of the variable, and if so, it gets type conversion information from that type.

C::Blocks::Types provides code for marshalling between Perl scalars and a variety of C types. Supported numeric types include the signed integer types short, Int, long, and IV; the unsigned integer types ushort, uint, ulong, and UV; and the floating point types float, double, and NV. It also provides types for packed arrays of char, int, float, and double, known as char_array, int_array, float_array, and double_array.
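To make "packed array" concrete: the idea is a run of bytes holding C values laid end to end, so the element count is the byte length divided by the element size. Here is a minimal plain-C sketch of that idea. The helper names packed_int_count, packed_int_get, and packed_int_set are my own hypothetical names, and this is not the code that C::Blocks::Types generates; it only illustrates the layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of C::Blocks): how many whole ints
 * fit in a packed buffer that is byte_len bytes long. */
size_t packed_int_count(size_t byte_len) {
    return byte_len / sizeof(int);
}

/* Hypothetical helper: read element i of a packed int buffer.
 * memcpy avoids assuming the byte buffer is aligned for int. */
int packed_int_get(const unsigned char *buf, size_t byte_len, size_t i) {
    int value;
    assert(i < packed_int_count(byte_len)); /* stay inside the buffer */
    memcpy(&value, buf + i * sizeof(int), sizeof(int));
    return value;
}

/* Hypothetical helper: write element i of a packed int buffer. */
void packed_int_set(unsigned char *buf, size_t byte_len, size_t i, int value) {
    assert(i < packed_int_count(byte_len)); /* stay inside the buffer */
    memcpy(buf + i * sizeof(int), &value, sizeof(int));
}
```

The point of the sketch is only that indexing is done in units of elements, not bytes, which is what makes a packed scalar usable as a C array.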

For example, we can use uint and int_array to calculate a specified number of primes, storing the collection as a packed string:

use strict;
use warnings;
use C::Blocks::Types qw(uint int_array);
use C::Blocks;
my uint $N_primes = shift (@ARGV) || 100;
$N_primes = 3 if $N_primes < 3;
my uint $last_prime = 2;

# Declare and allocate the collection of primes
my int_array $primes;
vec($primes, $N_primes-1, 32) = 0;

# Compute the primes
cblock {
    $primes[0] = 2;
    int N_found = 1;
    int potential_prime;
    for (potential_prime = 3; N_found < length_$primes; potential_prime += 2) {
        int max = sqrt(potential_prime);
        for (int i = 0; $primes[i] <= max; i++) {
            if (potential_prime % $primes[i] == 0) goto NEXT_POTENTIAL_PRIME;
        }
        /* If here, nothing divided into potential_prime */
        $last_prime = $primes[N_found++] = potential_prime;

        NEXT_POTENTIAL_PRIME: ;
    }
}

print "The first three primes are @{[unpack('lll', $primes)]}\n";
print "$N_primes-th prime is $last_prime\n";

There are a lot of things going on in this example that I want to point out. First, I use vec to allocate a prescribed block of memory in a scalar. Alternatively, you could use File::Map's map_anonymous function. Second, I store the results of the prime-number calculation in a packed Perl scalar, but hardly use it after the cblock. Data should only be stored in a Perl scalar if it is significantly easier to allocate that way, or if it will be used later. For example, I could have used Storable or File::Map to cache these primes for later use in another script. In this example, though, Newx and Safefree could allocate and clean up the working memory in C while keeping the scope of the information a bit tighter. Third, the funny-looking length_$primes actually gets mangled into a regular C variable that was declared and initialized in code provided by C::Blocks::Type::int_array. This variable gives the length of the array in multiples of the array type (not bytes). In this case that value should be the same as $N_primes, but the variable is particularly handy when you don't already have that piece of information, sparing you the nuisance of computing it.
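As a rough illustration of the second point, here is the same trial-division idea with the working memory allocated and released entirely in C. This is a sketch under assumptions: it is plain C using malloc and free as stand-ins for Newx and Safefree so that it compiles without the Perl headers, the function name nth_prime is hypothetical, and it replaces the sqrt call and goto of the listing above with an integer comparison and a flag.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the n-th prime (1-based).
 * Returns 0 if n_primes is 0 or the allocation fails. */
unsigned int nth_prime(size_t n_primes) {
    if (n_primes == 0) return 0;

    /* Working memory lives only for the duration of this call.
     * malloc/free stand in for Perl's Newx/Safefree here. */
    unsigned int *primes = malloc(n_primes * sizeof *primes);
    if (primes == NULL) return 0;

    primes[0] = 2;
    size_t n_found = 1;
    unsigned int last_prime = 2;

    for (unsigned int candidate = 3; n_found < n_primes; candidate += 2) {
        int is_prime = 1;
        /* Only test primes p with p*p <= candidate, written as
         * p <= candidate / p to avoid overflow and sqrt. */
        for (size_t i = 0;
             i < n_found && primes[i] <= candidate / primes[i];
             i++) {
            if (candidate % primes[i] == 0) {
                is_prime = 0;
                break;
            }
        }
        if (is_prime) last_prime = primes[n_found++] = candidate;
    }

    free(primes);
    return last_prime;
}
```

Called as nth_prime(100), this should agree with the $last_prime computed by the script above when it is run with its default of 100 primes. The trade-off is the one described in the text: nothing is left behind in a Perl scalar, so the list of primes cannot be reused afterwards.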

One final note: these sorts of types do not provide any compile-time constraint. They are merely used as hints to C::Blocks about how to unpack and repack the variable. This would not cause a compiler warning or error:

use strict;
use warnings;
use C::Blocks::Types qw(uint int_array);
use C::Blocks;
my uint $foo = 'hello';
cblock {
    $foo++;
}
print "foo is $foo\n";

Running this, I get:

$ perl test.pl 
Argument "hello" isn't numeric in rand at test.pl line 6.
foo is 1

As you can see, this script compiled and ran from start to finish. (You can also see that it claims that rand() gave trouble. This is an artifact of how C::Blocks sets up the ops it builds. I'm working on it.) It is possible to create your own type that checks the type of a variable at runtime, while unpacking it, but the basic types provided by C::Blocks::Types do not do this.
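The final value of 1 comes from Perl treating the non-numeric string as 0 when it is converted to a number, after which the cblock increments it. To show the difference between that lenient conversion and a checked one, here is a plain-C analogy built on strtoul. It is not how C::Blocks unpacks scalars and not the C::Blocks API for defining a type; the function names parse_uint_lenient and parse_uint_strict are hypothetical.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: lenient conversion. Text with no leading
 * digits, such as "hello", quietly becomes 0. */
unsigned long parse_uint_lenient(const char *text) {
    return strtoul(text, NULL, 10);
}

/* Hypothetical helper: checked conversion. Returns 1 and stores the
 * value in *out only if text is entirely decimal digits and in range;
 * returns 0 otherwise and leaves *out untouched. */
int parse_uint_strict(const char *text, unsigned long *out) {
    char *end;

    /* Require a leading digit: rejects "", signs, and whitespace. */
    if (!isdigit((unsigned char)text[0])) return 0;

    errno = 0;
    unsigned long value = strtoul(text, &end, 10);
    if (*end != '\0' || errno == ERANGE) return 0; /* junk or overflow */

    *out = value;
    return 1;
}
```

With the lenient version, "hello" converts to 0 and an increment gives 1, matching the output above. The strict version reports the failure to the caller instead, which is the kind of decision a runtime-checking type would have to make while unpacking.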

C::Blocks makes it easy to shuttle data between Perl and C using type annotations, and provides a number of basic types via C::Blocks::Types. What if you want to repeatedly unpack and repack your Perl variable in a different way? That'll be covered in a treat some time later in Advent, so you'll just have to wait to find out!

About David Mertens

This is my blog about numerical computing with Perl.