C::Blocks Advent Day 6

This is the C::Blocks Advent Calendar, in which I release a new treat each day about the C::Blocks library. Over the last few days I have illustrated how to use C::Blocks to write procedural code, how to get data across the Perl/C divide, including using types to do that concisely, and how to write code that is shared across modules and scripts. Yesterday I provided some benchmarks that I hope give a sense of the performance of C::Blocks, and start to indicate the circumstances in which it might be useful. Today I am going to illustrate the many ways you can generate and/or modify your C code using C::Blocks, including source filters, interpolation blocks, and the good old string eval.

Most folks in the Perl community think ill of source filters, so it is with trepidation that I introduce these first. However, they can be quite useful, and not merely for code manipulation. The default source filter, C::Blocks::Filter, does not manipulate the code at all but prints out the contents of each C::Blocks block just before compiling it. It can be invoked from the command line when something is not working and serves as a useful debugging aid. For example, this script:

use strict;
use warnings;
use C::Blocks;
use C::Blocks::PerlAPI;
cblock {
    printf("Merry Christmas from C::Blocks!\n");
}

can be run with C::Blocks::Filter, producing:

$ perl -MC::Blocks::Filter test.pl 
##################################################
void op_func(C_BLOCKS_THX_DECL) {
        printf("Merry Christmas from C::Blocks!\n");
    }
##################################################
Merry Christmas from C::Blocks!

This lets us peek under the hood to see what's really going on. Between the many # symbols, you can see that the code for our cblock was wrapped into a function called op_func. After C::Blocks compiles this little snippet of C code, it gets the pointer to this function and stores it in the op-tree. This is later invoked when the Perl interpreter encounters the location in your code where the cblock was positioned.

This gives us a useful tool to see exactly how C::Blocks does its magic. When applied to the first examples from Day 2 we get:

$ perl -MC::Blocks::Filter test.pl
##################################################
void op_func(C_BLOCKS_THX_DECL) {SV * _PERL_SCALAR_message = (SV*)PAD_SV(1); 
        char * message = SvPVbyte_nolen(_PERL_SCALAR_message);
        printf("%s from C::Blocks\n", message);
        sv_setpv(_PERL_SCALAR_message, "Feliz Navidad!");
    }
##################################################
Merry Christmas! from C::Blocks
After the cblock, the message is Feliz Navidad!

In this example I had used the variable $message directly in my cblock. Now we see how that works. At compile time, C::Blocks gets the location in the pad where $message lives (offset 1 in this case) and injects code to retrieve the SV* from that pad location. Through the rest of the code, $message is replaced with _PERL_SCALAR_message.
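To make that rewriting concrete, here is a small plain-Perl model of it. It is not how C::Blocks is actually implemented; it only reproduces the textual result described above. The sub name rewrite_scalars and the %pad_offset hash are invented for this sketch, the pad offset is supplied by hand, and the block body is my reconstruction of the Day 2 example from the filter output.

```perl
use strict;
use warnings;

# Model only: map each Perl scalar used in the block to its pad offset.
my %pad_offset = (message => 1);

# Reconstructed body of the Day 2 cblock (an assumption based on the output).
my $block_body = <<'END_C';

    char * message = SvPVbyte_nolen($message);
    printf("%s from C::Blocks\n", message);
    sv_setpv($message, "Feliz Navidad!");
END_C

# Hypothetical helper: build the declarations and rename the sigiled variables.
sub rewrite_scalars {
    my ($c_code, $offsets) = @_;
    my $prologue = '';
    for my $name (sort keys %$offsets) {
        # Declare a C variable holding the SV* found at that pad location.
        $prologue .= "SV * _PERL_SCALAR_$name = (SV*)PAD_SV($offsets->{$name}); ";
        # Replace every use of $name in the C code with that C variable.
        $c_code =~ s/\$\Q$name\E\b/_PERL_SCALAR_$name/g;
    }
    return 'void op_func(C_BLOCKS_THX_DECL) {' . $prologue . $c_code . '}';
}

print rewrite_scalars($block_body, \%pad_offset), "\n";
```

Comparing the printed text with the filter output above shows the same two ingredients: a declaration at the top of op_func, and _PERL_SCALAR_message wherever $message used to be.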

And how about types? When applied to the first examples from Day 3 we get:

$ perl -MC::Blocks::Filter test.pl
##################################################
void op_func(C_BLOCKS_THX_DECL) {SV * SV__PERL_SCALAR_limit = (SV*)PAD_SV(1); int _PERL_SCALAR_limit = SvIV(SV__PERL_SCALAR_limit); 
        for (int i = 0; i < _PERL_SCALAR_limit; i++) {
            printf("Ho ho ho! ");
        }
        printf("\n");
    sv_setiv(SV__PERL_SCALAR_limit, _PERL_SCALAR_limit);}
##################################################
Ho ho ho! Ho ho ho! ...

Here we see that C::Blocks puts the SV* for $limit in a slightly different C variable, SV__PERL_SCALAR_limit. As before, the variable name $limit is replaced with _PERL_SCALAR_limit, but that is now of type int. Also, there is an additional line at the end which sets the Perl variable's value to whatever is in _PERL_SCALAR_limit. This way, if we modify _PERL_SCALAR_limit, the effect is visible after the block has run.

However, source filters are generally meant for modifying code, not printing debug output. With a source filter we can easily add new quasi-keywords, such as loop in this example:

use strict;
use warnings;
use C::Blocks;
use C::Blocks::PerlAPI;

# loop->while filter
sub loop_to_while {
    s/loop/while(1)/g;
}
use C::Blocks::Filter qw(&loop_to_while);

cblock {
    int i = 0;
    loop {
        printf("This is number %d\n", i++);
        if (i == 10) break;
    }
}

and here's what we get:

$ perl test.pl 
This is number 0
This is number 1
This is number 2
This is number 3
This is number 4
This is number 5
This is number 6
This is number 7
This is number 8
This is number 9

In C, the break keyword behaves like the last keyword in Perl, so when i reaches 10, we exit the loop. To get an idea of how this really works, let's add use C::Blocks::Filter both before and after the custom filter:

use strict;
use warnings;
use C::Blocks;
use C::Blocks::PerlAPI;

# loop->while filter
sub loop_to_while {
    print "*** converting loop to while(1)...\n";
    s/loop/while(1)/g;
}
use C::Blocks::Filter;
use C::Blocks::Filter qw(&loop_to_while);
use C::Blocks::Filter;

cblock {
    int i = 0;
    loop {
        printf("This is number %d\n", i++);
        if (i == 10) break;
    }
}

Running that produces:

##################################################
void op_func(C_BLOCKS_THX_DECL) {
        int i = 0;
        loop {
            printf("This is number %d\n", i++);
            if (i == 10) break;
        }
    }
##################################################
*** converting loop to while(1)...
##################################################
void op_func(C_BLOCKS_THX_DECL) {
        int i = 0;
        while(1) {
            printf("This is number %d\n", i++);
            if (i == 10) break;
        }
    }
##################################################
This is number 0
This is number 1
...

The first C::Blocks::Filter, being the first in the series, gets the code as extracted by the C::Blocks code extractor, including the loop. The second C::Blocks::Filter is called after loop_to_while has been applied, letting us see its effect. It has replaced loop with while(1).
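One thing to watch for with a substitution as broad as s/loop/while(1)/g is that it rewrites every occurrence of loop, including those inside longer identifiers. The sketch below is plain Perl and needs no C::Blocks; it runs the original substitution and a word-boundary variant over a string of C code. The names loop_to_while_strict and apply_filter, and the sample C snippet, are mine for illustration and are not part of C::Blocks.

```perl
use strict;
use warnings;

# The filter from above: it operates on $_.
sub loop_to_while {
    s/loop/while(1)/g;
}

# Hypothetical stricter variant: only rewrite "loop" as a whole word.
sub loop_to_while_strict {
    s/\bloop\b/while(1)/g;
}

# Helper for this sketch only: run a filter on a copy of some code.
sub apply_filter {
    my ($filter, $code) = @_;
    local $_ = $code;
    $filter->();
    return $_;
}

# Sample C code with an identifier that happens to contain "loop".
my $c_code = <<'END_C';
int loop_count = 0;
loop {
    if (++loop_count == 10) break;
}
END_C

print "--- naive ---\n",  apply_filter(\&loop_to_while, $c_code);
print "--- strict ---\n", apply_filter(\&loop_to_while_strict, $c_code);
```

The naive version turns loop_count into while(1)_count, which will not compile. The strict version leaves the identifier alone, though it would still rewrite a standalone loop inside a string literal or comment, so it reduces the over-reach rather than eliminating it.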

Applying a filter by naming the function is useful, but repeatedly using the same filter in this way can lead to lots of typing. Since this is Perl, there is always More Than One Way to Do It. In particular, filters can also be implemented as modules. Getting their effect is as simple as including use My::Filter::Module in your script.

Like so many aspects of C::Blocks, the code to which any filter gets applied is limited by its lexical scope. This lets you apply filters with high levels of granularity, if you so wish.

Interpolation blocks are another tool for generating code: you can write snippets of Perl code that generate C code within your C::Blocks block. These are the philosophical counterpoint of source filters. Source filters are meant to modify your code, effectively enhancing the language, but source filters have a tendency to overreach, modifying bits of the code they shouldn't. This is why source filters are generally discouraged in the Perl community. Interpolation blocks are different: they cannot modify the contents of a block, but can generate code at a specific location. They are blocks of Perl code enclosed as ${ ... Perl code here ...}. The code is run as soon as it is extracted, and the return value is injected into the C code in its place. For example, you could programmatically build up the entries in a C struct as follows:

use strict;
use warnings;
use C::Blocks;
use C::Blocks::PerlAPI;
our @struct_fields;

# add points for x and y
BEGIN { push @struct_fields, 'float x', 'float y' }
# ...
# add a color
BEGIN { push @struct_fields, 'int color_idx' }
# ...
# add a name
BEGIN { push @struct_fields, 'char * name' }
# ...

clex {
    typedef struct My::Labeled::Point {
        ${ join ('', map "$_;\n", @struct_fields) }
    } My::Labeled::Point;
}
# ...

cblock {
    My::Labeled::Point test;
    test.x = 1.05;
    test.y = -3.4;
    test.color_idx = 27;
    test.name = "Frank";

    printf("test's position is (%f, %f)\n", test.x, test.y);
}

The interpolation block is the mess of code within the struct declaration, ${ join ('', map "$_;\n", @struct_fields) }. This produces a string of valid C code with the different fields. To see this in action, it's probably best to run this with C::Blocks::Filter like so:

$ perl -MC::Blocks::Filter test.pl 
##################################################

        typedef struct My__Labeled__Point {
            float x;
float y;
int color_idx;
char * name;

        } My__Labeled__Point;

##################################################
##################################################
void op_func(C_BLOCKS_THX_DECL) {
        My__Labeled__Point test;
        test.x = 1.05;
        test.y = -3.4;
        test.color_idx = 27;
        test.name = "Frank";

        printf("test's position is (%f, %f)\n", test.x, test.y);
    }
##################################################
test's position is (1.050000, -3.400000)

As promised, the contents of the struct declaration are filled with the fields that were assembled: float x, int color_idx, etc. The cblock then uses this struct type and assigns to the various fields in the struct.
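The Perl expression inside the interpolation block can also be tried on its own, without C::Blocks, to see the string it hands back. This sketch builds the same list of fields and prints the result of the join; the variable name $fields_code is mine.

```perl
use strict;
use warnings;

# Same field list that the BEGIN blocks above build up.
my @struct_fields = ('float x', 'float y', 'int color_idx', 'char * name');

# The expression from the interpolation block: one "field;\n" per entry.
my $fields_code = join('', map "$_;\n", @struct_fields);

print $fields_code;
```

Only the first field picks up the indentation that preceded the interpolation block in the C code, which is why the struct in the filter output looks ragged. The C compiler does not care.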

You probably also noticed that I used double-colons in My::Labeled::Point, and that these were replaced with double-underscores during the code extraction stage, producing My__Labeled__Point. There is nothing fancy going on here: C::Blocks just replaces the double-colons (invalid C syntax) with double-underscores (valid C syntax) as a notational convenience.
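The effect of that replacement can be mimicked in plain Perl. This is only a model of the outcome, not a description of how C::Blocks performs it internally, and the sub name c_safe_name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a Perl-style package name into a valid C identifier.
sub c_safe_name {
    my ($name) = @_;
    (my $c_name = $name) =~ s/::/__/g;   # copy first, then substitute on the copy
    return $c_name;
}

print c_safe_name('My::Labeled::Point'), "\n";   # prints My__Labeled__Point
```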

These mechanisms for generating code are not without their warts. Keeping track of line numbers is currently shaky with C::Blocks, and a number of line counting issues have yet to be fully resolved with these approaches. Those kinds of issues are a major target for work once C::Blocks reaches Beta.

The third method for generating code on the fly is the humble string eval. There is a very clever use of string evals, but this Advent entry is already too long, so I will not demonstrate it yet.

Today I explained the many facilities for generating or modifying C code with C::Blocks. It is possible to modify C code with a source filter, and to generate code at a specific point using interpolation blocks. Incidentally, it's possible to put these together. Interpolation blocks inject code as soon as their closing bracket is detected, and this generated code is part of what gets sent to the filters. (That's why using C::Blocks::Filter was able to print it out.) Both of these mechanisms were implemented with specific, distinct tasks in mind, and I'm sure they will be used for a variety of purposes as the distribution matures and sees new and unexpected uses.

About David Mertens

This is my blog about numerical computing with Perl.