XS versus clang: Infinite warnings

Around the beginning of 2022 I started noticing a large number of warnings when compiling XS modules under macOS 12 Monterey. These looked like warning: '(' and '{' tokens introducing statement expression appear in different macro expansion contexts [-Wcompound-token-split-by-macro], and appeared to originate fairly deeply in Perl's macro stack.

This week I was moved to address them for my one lone XS distribution, Mac-Pasteboard. Not only are they really annoying, but they would make it difficult or impossible to find anything more serious.

A little web searching seemed to say that this warning was added in clang 12.0, and is enabled by default. Beyond that, I did not find much. A Ruby ticket turned up, but the patch involved rewriting the relevant macros so that the warning was not tickled. A desultory check of a few other XS modules that came to mind did not provide any help -- they all showed the same behavior.

There is some work going on in the Perl core. A brief thread in the p5p mailing list, "Backport token-split-by-macro fix?", refers to GitHub issue 18780, "clang12: Using as compiler generates 30K warnings of -Wcompound-token-split-by-macro". As I read both the mail thread and the issue, it is fixed in blead, but back-porting may be problematic. This seems to mean that module authors who want to silence the warnings in currently-existing versions of Perl need to address the problem themselves.

The question is how to turn off the warning. I considered a C pragma, since my copy of Kernighan and Ritchie's The C Programming Language says that unrecognized pragmas are ignored. But some experimentation showed that gcc complained, and some web searching showed that it was not the only C compiler that would do so. This means replacing a large number of warnings with a single warning per compilation from compilers that do not support the required pragma, unless the pragma can be properly conditionalized. But it appears at least one author has gone this route. My version of this implementation is:

#if defined(__clang__) && defined(__clang_major__) && __clang_major__ > 11
#pragma clang diagnostic ignored "-Wcompound-token-split-by-macro"
#endif

This is to be inserted at the very beginning of the C or XS source, or at least before any Perl headers are included. Maybe the defined(__clang__) is not needed, but I am cautious.
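A variant of the guard, offered only as a sketch: clang documents a __has_warning extension that asks the compiler directly whether it recognizes a given warning option, so the test does not depend on the version number. The fallback definition is there so that compilers without the extension can still parse the #if line. SPLIT_WARNING_SILENCED is a made-up name used purely for illustration; it is not part of Perl or clang.

```c
/* __has_warning is a clang extension. Other compilers do not provide it,
   so give them a fallback that always answers "no". */
#ifndef __has_warning
#define __has_warning(x) 0
#endif

/* Silence the warning only when the compiler says it knows the option. */
#if defined(__clang__) && __has_warning("-Wcompound-token-split-by-macro")
#pragma clang diagnostic ignored "-Wcompound-token-split-by-macro"
#define SPLIT_WARNING_SILENCED 1  /* hypothetical marker, illustration only */
#else
#define SPLIT_WARNING_SILENCED 0  /* hypothetical marker, illustration only */
#endif
```

As with the version-based guard, this would need to appear before any Perl headers are included. The marker macro can be dropped; it is only there to make it easy to check which branch was taken.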

The other way to suppress the warning is to supply a command line option, -Wno-compound-token-split-by-macro. The problem here is how to know when the option should be applied. I know nothing about the arcane art of probing C compiler configuration, but the brute-force approach is to write a trivial C program (essentially the true command), find a place for it to live in my distribution (the inc/ directory), and then spawn the following command:

system "$Config{cc} -Werror -Wno-compound-token-split-by-macro -o /dev/null inc/true.c 2>/dev/null";

If this succeeds, I know I can add the -Wno-compound-token-split-by-macro option to the compiler command. Otherwise, I cannot. The -Werror turns warnings into errors, so I do not have to parse STDERR. There are two problems with this:

  • It is not portable. That is not a problem for my specific case, which only runs under macOS anyway. But a general solution will need to take care of environments like MSWin32 and VMS.
  • It uses a single-argument system() call with shell meta-characters, and so requires spawning a shell and is a potential security risk -- though maybe no more of a risk than including any Perl module, since if the @INC directories (or @INC itself) can be hacked the bad actor probably has better ways to attack a system than waiting for someone to spawn $Config{cc}. But again VMS DCL will be a problem.
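For reference, the trivial probe program described above can be as small as the following sketch. The file name inc/true.c comes from the command shown earlier; the contents are an assumption about what such a file would hold.

```c
/* inc/true.c: does nothing and exits successfully, like the true command.
   It exists only to give the compiler something valid to compile. */
int main(void) {
    return 0;
}
```

Because the program itself produces no diagnostics, any failure of the probe command can be attributed to the compiler rejecting the option rather than to the source.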

I have been back and forth on the proper way to handle this. Currently I like the pragma approach, but the limited number of C compilers in my stable makes it hard to choose definitively between them. Linux favors gcc, BSD favors clang -- at least, macOS, FreeBSD, and OpenBSD do. But I have been personally unable to test under a wider range of compilers. Caveat coder.

1 Comment

Thanks, this was very useful to me. I applied the command line version to Sereal today.

About Tom Wyant

I blog about Perl.