C::Blocks Advent Day 13

This is the C::Blocks Advent Calendar, in which I release a new treat each day about the C::Blocks library. At the time of writing, we are actually in the season of Christmas, not Advent. I hope you'll forgive these late posts. :-)

Yesterday I used C::Blocks to play around with Perl's C API and mess with keywords. Today I will focus on a couple of neat C tricks that can help clean up the C-end of your library API.

One of the chief goals of C::Blocks is to make it easy to share C code with others. It's one thing to share code, but quite another to share useful code, code with an API that is easy to use and easy to read. This is difficult partly because good API design is inherently hard, and partly because C provides only one way to pass arguments to functions: by position. Fortunately, there are two preprocessor tricks we can use to alleviate this problem.

Argument Hiding

The first trick is quite common in Perl's own C API: hiding arguments. Consider the simple sv_setiv, which sets a scalar to the given integer value. For example:

my $var;
cblock {
    sv_setiv($var, 5);
}
print "\$var = $var\n";  # prints '$var = 5'

Notice that I use this like a function with two arguments, but it's actually a macro that wraps the real function called Perl_sv_setiv:

#define sv_setiv(a, b) Perl_sv_setiv(aTHX_ a, b)

This makes for a good example for two reasons. First, if you are working with a Perl compiled with MULTIPLICITY (the case if your Perl is threaded), then aTHX is the current Perl interpreter, and aTHX_ is the same thing followed by a comma. You are always going to call this function with the current Perl interpreter, so sv_setiv can save us some keystrokes by always adding that as the first argument to the real function. Second, if your Perl is not compiled with MULTIPLICITY, then the Perl interpreter is maintained as a global variable, in which case aTHX_ is a macro that expands to nothing at all! In other words, the actual signature for Perl_sv_setiv depends on how you compiled Perl! The complete details for how this works are discussed in perlguts, but you can use sv_setiv in blissful ignorance of these details because the macro wrapper takes care of everything for us. This is good API design.
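To make the mechanism concrete outside of Perl, here is a minimal sketch in plain C. Everything in it is hypothetical: Interp, Scalar, Real_set_iv, set_iv, and the pCTX_/aCTX_ macros are stand-ins that I modeled loosely on Perl's pTHX_/aTHX_ convention, not Perl's actual definitions. It assumes that a variable named my_interp is in scope wherever the wrapper is used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an interpreter and a scalar. */
typedef struct { int set_count; } Interp;
typedef struct { long iv; } Scalar;

#ifdef NO_CONTEXT_ARG
/* One global interpreter: the context macros expand to nothing. */
static Interp global_interp;
#define pCTX_
#define aCTX_
#define CTX (&global_interp)
#else
/* Many interpreters: the context is an explicit first argument. */
#define pCTX_ Interp *my_interp,
#define aCTX_ my_interp,
#define CTX (my_interp)
#endif

/* The real function. Its signature depends on the build configuration. */
static void Real_set_iv(pCTX_ Scalar *sv, long value) {
    assert(sv != NULL);
    sv->iv = value;
    CTX->set_count++;
}

/* The public name. Callers never mention the context. */
#define set_iv(sv, value) Real_set_iv(aCTX_ sv, value)

/* Example caller. In the explicit-context build, set_iv silently picks
 * up the local variable named my_interp. */
static long demo(Interp *my_interp) {
    Scalar sv = { 0 };
    set_iv(&sv, 5);
    (void)my_interp;  /* unused when NO_CONTEXT_ARG is defined */
    return sv.iv;
}
```

The caller's code is identical in both configurations, which is the whole point: only the macro definitions know whether a context argument exists.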

A simpler but arguably more useful example would be a situation where you want to give useful feedback when your function croaks. In that case you can write a function that takes the current line number and file name, in addition to the rest of its arguments. The macro wrapper would supply those automatically, like so:

use strict;
use warnings;
use C::Blocks;
use C::Blocks::PerlAPI;

clex {
    #define munge_input(input) munge_input_(__LINE__, __FILE__, input)

    void munge_input_ (int line, const char * file, const char * input) {
        /* make sure input is non-null */
        if (input == 0) {
            croak("In %s line %d, munge_input called with null input\n",
                file, line);
        }
        printf("munge_input not yet implemented...\n");
    }
}

cblock {
    munge_input("to be munged");
    munge_input(0);
}

When I run that, I get this output:

$ perl test.pl
munge_input not yet implemented...
In test.pl line 21, munge_input called with null input

Just like Perl's Carp provides useful dying behavior, you can provide useful exceptions by wrapping function calls like this. If you have many public functions and a handful of private ones, your public functions can call the private functions explicitly, passing along the values of line and file they received when they were called. This way, somebody using your C API will get useful error reporting regardless of how you internally implement your code.
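Here is a rough sketch of that public/private arrangement in plain C. All of the names (count_chars, count_chars_, require_nonnull_, last_error) are made up for illustration, and since croak is not available outside of Perl, the sketch assumes an error buffer in its place.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error sink standing in for croak(): it records the
 * message instead of throwing a Perl exception. */
static char last_error[1024];

/* Private helper: receives the caller's location explicitly. */
static int require_nonnull_(int line, const char *file,
                            const char *func, const void *ptr) {
    assert(file != NULL && func != NULL);
    if (ptr != NULL) return 1;
    snprintf(last_error, sizeof last_error,
             "In %s line %d, %s called with null input", file, line, func);
    return 0;
}

/* Public function: forwards the line and file that it was given. */
static int count_chars_(int line, const char *file, const char *input) {
    if (!require_nonnull_(line, file, "count_chars", input)) return -1;
    return (int)strlen(input);
}

/* Public macro: captures the location of the call site. */
#define count_chars(input) count_chars_(__LINE__, __FILE__, input)
```

Only the macro ever mentions __LINE__ and __FILE__, so the reported location is always where the user called count_chars, no matter how many private helpers sit underneath.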

Named Arguments

The Tiny C Compiler is a nearly compliant C99 compiler, which means we can use macro tricks to emulate named arguments. This uses compound literals and variadic macros, both C99 features. The Tiny C Compiler's compound literal handling wasn't quite right for this task until it was fixed very recently (i.e. within the last month). The update is currently only available through a developer's release of Alien::TinyCCx, but should be in the next point release. Take this as a sign of things to come.

First, I'd like to show you how this works. Suppose I wanted to have a GUI command that draws a label somewhere on a canvas. It could have many optional arguments, like padding width, border width, border color, etc. Using a single struct, macro, and function declaration, I can call my function like this:

draw_label(canvas, 1, 5, "X marks the spot");
draw_label(canvas, .y = 5, .x = 1,
    .label = "X marks the spot");
draw_label(canvas, 1, 5,
    .label = "X marks the spot", .x = 3);

This almost looks like Perl's key/value pair calling convention, except that the keys are prefaced with a period. So, how does this work?

The original idea breaks a normal function declaration into three pieces. First, define a preprocessor macro with the actual name that you are going to use in your public API. For example:

#define draw_label(c, ...) draw_label_(c, (struct draw_label_args_){ __VA_ARGS__ })

Notice that this is a variadic macro, and it simply dumps the contents of the ... into a so-called compound literal declaration. Before we can understand that, we need to look at the struct layout:

struct draw_label_args_ {
    const float x;
    const float y;
    const char * label;
};

Now look carefully at the different ways we called the function. Judging from those examples, the following would all be valid ways of initializing a draw_label_args_ struct:

struct draw_label_args_ my_args
    = { 1, 5, "X marks the spot" };
struct draw_label_args_ my_args
    = { .y = 5, .x = 1, .label = "X marks the spot" };
struct draw_label_args_ my_args
    = { 1, 5, .label = "X marks the spot", .x = 3 };

The first is a classic positional struct initialization, the sort of thing you'd see in C89 code. (Any field that is not mentioned is initialized to zero.) In the second case, all fields are named explicitly, and can be out of order! In the third case, we mix sequential positional values and named fields. In fact, a named field overrides any earlier initializer for the same field, whether that earlier one was named or positional!
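These rules are easy to check with a few file-scope objects. The sketch below uses the same struct; the variable names and the check_initializers helper are my own additions for illustration. Be aware that some compilers warn about overridden initializers when extra warnings are enabled, even though the behavior is well defined.

```c
#include <assert.h>
#include <string.h>

struct draw_label_args_ {
    const float x;
    const float y;
    const char * label;
};

/* Positional: values fill the fields in declaration order. */
static const struct draw_label_args_ positional
    = { 1, 5, "X marks the spot" };
/* Named: order no longer matters. */
static const struct draw_label_args_ named
    = { .y = 5, .x = 1, .label = "X marks the spot" };
/* Mixed: the later .x = 3 overrides the positional 1. */
static const struct draw_label_args_ mixed
    = { 1, 5, .label = "X marks the spot", .x = 3 };
/* Partial: x and label are never mentioned, so they are zero. */
static const struct draw_label_args_ partial
    = { .y = 5 };

/* Asserts the behavior described in the comments above. */
static void check_initializers(void) {
    assert(positional.x == 1 && positional.y == 5);
    assert(named.x == 1 && named.y == 5);
    assert(strcmp(named.label, positional.label) == 0);
    assert(mixed.x == 3 && mixed.y == 5);
    assert(partial.x == 0 && partial.y == 5 && partial.label == NULL);
}
```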

What sort of function do we need? We need a function that accepts a canvas as its first argument and the struct as its second:

static void draw_label_(Canvas * c, struct draw_label_args_ args) {
    if (args.label == 0) args.label = "(none)";
    // default x,y of 0 is OK
    ...
}

Within the body of the function, the label is accessed as args.label. Likewise the x- and y-positions are accessible via args.x and args.y. If any of those were not specified, they will default to zero, a situation that is fine and suitable for x and y, but not for the label. This can be easily detected and fixed for the label.
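Putting the three pieces together, a complete version might look like the sketch below. The Canvas type here is purely hypothetical: it just remembers the last label "drawn" so that the effect of each call can be inspected. A real canvas would of course do something more interesting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical canvas: it only records the most recent label. */
typedef struct {
    float last_x, last_y;
    char last_label[64];
} Canvas;

/* Piece 1: the argument struct. */
struct draw_label_args_ {
    const float x;
    const float y;
    const char * label;
};

/* Piece 2: the real function, taking the struct by value. */
static void draw_label_(Canvas * c, struct draw_label_args_ args) {
    assert(c != NULL);
    /* A zero label means the caller did not supply one. */
    if (args.label == 0) args.label = "(none)";
    /* default x,y of 0 is OK */
    c->last_x = args.x;
    c->last_y = args.y;
    snprintf(c->last_label, sizeof c->last_label, "%s", args.label);
}

/* Piece 3: the public macro that builds the compound literal. */
#define draw_label(c, ...) \
    draw_label_(c, (struct draw_label_args_){ __VA_ARGS__ })
```

Calling draw_label(&canvas, .x = 2) leaves y at zero and the label at "(none)", while draw_label(&canvas, 1, 5, "X marks the spot") fills all three fields positionally.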

Named Arguments with Nonzero Defaults

We can take the previous example one step further. The default value for any uninitialized member is zero. A position of zero is valid for x and y, but what if we want to explicitly indicate an unspecified position? Or, what if we want to provide a different default?

We can specify defaults that are different from zero, but there are multiple ways to do it, with varying trade-offs. The key is to remember that later initializers in a struct initialization override earlier ones for the same field. So, one simple approach for specifying nonzero defaults is to revise the macro to something more like this:

#define draw_label(c, ...) \
    draw_label_(c, \
    (struct draw_label_args_){ \
        .x = 100, \
        .y = 100, \
        __VA_ARGS__ \
    })

(Note that I split the macro definition across multiple lines by ending the line with a backslash.) When this is later used by somebody calling the function, they can override the defaults:

draw_label(canvas, .x = 50);
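As a quick check of the named-override behavior, here is a small self-contained sketch using this macro. The Canvas type is again a hypothetical stand-in that only records the position it was given.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical canvas that only records the last position. */
typedef struct { float last_x, last_y; } Canvas;

struct draw_label_args_ {
    const float x;
    const float y;
    const char * label;
};

static void draw_label_(Canvas * c, struct draw_label_args_ args) {
    assert(c != NULL);
    c->last_x = args.x;
    c->last_y = args.y;
}

/* Defaults come first so that the caller's initializers override them. */
#define draw_label(c, ...) \
    draw_label_(c, \
    (struct draw_label_args_){ \
        .x = 100, \
        .y = 100, \
        __VA_ARGS__ \
    })
```

With this in place, draw_label(&canvas, .x = 50) records x as 50 and keeps the default y of 100.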

Unfortunately, the naive implementation here breaks positional arguments, i.e. the following would no longer work:

draw_label(canvas, 1, 5, "X marks the spot");

After naming a field, you can continue to list values in succession. Since we ended our defaults at .y, the next valid unnamed field would be the label, not x. To fix this properly, we need to add an additional item to the beginning of our arg struct, something that the user will not need to override. In this case, by moving the canvas into the arg struct, we can get the behavior we want:

struct draw_label_args_ {
    Canvas * c_;
    const float x;
    const float y;
    const char * label;
};
#define draw_label(c, ...) \
    draw_label_( \
    (struct draw_label_args_){ \
        .x = 100, \
        .y = 100, \
        .label = "(none)", \
        .c_ = c, \
        __VA_ARGS__ \
    })
static void draw_label_(struct draw_label_args_ args) {
    // canvas is args.c_...
}
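To see the canvas-in-struct approach handle both positional and named calls, here is a self-contained sketch. As before, the Canvas type is hypothetical and only records what it was asked to draw. Note that the real function takes the struct as its only argument, and that .c_ is the last default listed, so the caller's first positional value lands in x.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical canvas that records the last label request. */
typedef struct {
    float last_x, last_y;
    const char * last_label;
} Canvas;

struct draw_label_args_ {
    Canvas * c_;         /* filled in by the macro, not by callers */
    const float x;
    const float y;
    const char * label;
};

static void draw_label_(struct draw_label_args_ args) {
    assert(args.c_ != NULL);
    args.c_->last_x = args.x;
    args.c_->last_y = args.y;
    args.c_->last_label = args.label;
}

/* .c_ is named last, so positional values continue with x, y, label. */
#define draw_label(c, ...) \
    draw_label_( \
    (struct draw_label_args_){ \
        .x = 100, \
        .y = 100, \
        .label = "(none)", \
        .c_ = c, \
        __VA_ARGS__ \
    })
```

Here draw_label(&canvas, 1, 5, "X marks the spot") sets all three fields positionally, while draw_label(&canvas, .y = 7) keeps the default x of 100 and the default label.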

Perhaps more interestingly, we can add a couple of extra members to our argument struct for the calling line and file. This would let us produce an error message naming the calling context. I've mostly been illustrating with snippets of code, so here I'll provide a full working example with C::Blocks:

use strict;
use warnings;
use C::Blocks;
use C::Blocks::PerlAPI;

clex {
    #define salutations(...) salutations_( \
        (struct salutations_args_){ \
        .message = "Hello", \
        .calling_line = __LINE__, \
        .calling_file = __FILE__, \
        __VA_ARGS__ })

    struct salutations_args_
    {
        int calling_line;
        char * calling_file;
        char * name;
        char * message;
        int is_exclamation;
    };

    void salutations_(struct salutations_args_ args)
    {
        /* Croak if no name given.  */
        if (!args.name) {
            croak("salutations called without specifying a name at %s:%d\n",
                args.calling_file, args.calling_line);
        }
        printf("%s %s%s\n", args.message, args.name,
            args.is_exclamation ? "!" : ".");
    }
}

cblock {
    salutations("David");
    salutations("David", .is_exclamation = 1);
    salutations("David", "Merry Christmas");
    salutations("David", "Merry Christmas", 1);
    salutations("David", "Merry Christmas", .is_exclamation = 1);
    salutations(.message = "Merry Christmas",
        .name = "David");
    // runtime error:
    salutations(.message = "Merry Christmas");
}

When run, I get this output:

$ perl test.pl
Hello David.
Hello David!
Merry Christmas David.
Merry Christmas David!
Merry Christmas David!
Merry Christmas David.
salutations called without specifying a name at test.pl:44

When I began learning about the distinctions between C and C++, I remember that named arguments and argument defaults were two big niceties in C++ that were missing in C. I did not discover this trick until much later. On the one hand, you need to write a lot of boilerplate for named arguments with defaults in C. On the other hand, the code shown in the cblock looks really good. In some respects it is even more flexible than key/value pairs in Perl functions because you are not required to specify keys for your arguments if you don't want to. And of course, we're not talking about strict C: we're talking about C::Blocks, which is capable of automating this kind of code with interpolation blocks or source filters.

Caveats

Before wrapping things up I want to point out two important caveats about these sorts of tricks. Using macros to reduce the argument count is an old and reliable technique. Named arguments and defaults are much newer. I am not sure if the C99 features they rely on are universally implemented by modern compilers, specifically Microsoft's Visual C. An eventual goal of C::Blocks is to provide an optimizing compiler back-end, in addition to the TCC back-end. If that ever becomes a reality, then named arguments may be a tripping point. (It sorta looks like MS may have gotten compound literals working, but I haven't found a definitive answer on the subject.) But that is probably some ways off, and for now I'd say if you like them, you should use them!

Also, I am new to these sorts of function machinations. I think they're great, but please do not take my code as examples of best practice. They are merely my musings, with the hope that folks will find them illuminating and exciting.

Conclusion

Today I showed how to use macros in C::Blocks to create C APIs that are versatile and easy to use. Named arguments and defaults rely on newly added behavior in tcc, but it should roll out onto the CPAN soon.

About David Mertens

This is my blog about numerical computing with Perl.