Wide character (U+XXXX) in substitution (s///)

There is a “use locale” somewhere in the code you are running.

3 Comments

If you replace use warnings; with use diagnostics;, you get the longer error messages from perldiag:

(S utf8) Perl met a wide character (ordinal >255) when it wasn't expecting one. This warning is by default on for I/O (like print).

If this warning does come from I/O, the easiest way to quiet it is simply to add the ":utf8" layer, e.g., "binmode STDOUT, ':utf8'". Another way to turn off the warning is to add "no warnings 'utf8';" but that is often closer to cheating. In general, you are supposed to explicitly mark the filehandle with an encoding, see open and "binmode" in perlfunc.

If the warning comes from other than I/O, this diagnostic probably indicates that incorrect results are being obtained. You should examine your code to determine how a wide character is getting to an operation that doesn't handle them.

Oh heck, I copied the error above the one I wanted. You can see in the (W locale) that this is a locale error:

Wide character (U+%X) in %s

(W locale) While in a single-byte locale (i.e., a non-UTF-8 one), a multi-byte character was encountered. Perl considers this character to be the specified Unicode code point. Combining non-UTF-8 locales and Unicode is dangerous. Almost certainly some characters will have two different representations. For example, in the ISO 8859-7 (Greek) locale, the code point 0xC3 represents a Capital Gamma. But so also does 0x393. This will make string comparisons unreliable.

You likely need to figure out how this multi-byte character got mixed up with your single-byte locale (or perhaps you thought you had a UTF-8 locale, but Perl disagrees).
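
One way to see which locale Perl actually ends up with is to ask the C library for the current LC_CTYPE setting and its codeset. This is a minimal sketch, assuming a POSIX-like system where the core module I18N::Langinfo works. is_utf8_codeset is a hypothetical helper introduced here for illustration, and its name match is only a rough heuristic, not the check Perl itself performs:

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use I18N::Langinfo qw(langinfo CODESET);

# Hypothetical helper: rough guess at whether a codeset name means UTF-8.
sub is_utf8_codeset {
    my ($codeset) = @_;
    return 0 unless defined $codeset;
    return $codeset =~ /\AUTF-?8\z/i ? 1 : 0;
}

# Adopt the locale from the environment (LC_ALL, LC_CTYPE, LANG),
# then query what was actually selected.
setlocale(LC_CTYPE, '');
my $ctype   = setlocale(LC_CTYPE);
my $codeset = langinfo(CODESET);

print "LC_CTYPE: ", (defined $ctype ? $ctype : '(unknown)'), "\n";
print "Codeset:  ", (defined $codeset ? $codeset : '(unknown)'), "\n";
print is_utf8_codeset($codeset)
    ? "This looks like a UTF-8 locale.\n"
    : "This does not look like a UTF-8 locale.\n";
```

If the codeset reported is not UTF-8 and the data contains characters above U+00FF, that combination is the likely source of the warning.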

Nit-pick: Should it be binmode STDOUT, ':encoding(utf-8)';? I understood ':utf8' to be simply an assertion that the stream is UTF-8.
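
For what it's worth, the two layers can be compared by writing the same string through each into an in-memory filehandle. This is a minimal sketch, and encode_with_layer is a hypothetical helper name. For well-formed text both layers should produce the same bytes. The difference is that ':encoding(UTF-8)' goes through the Encode module's strict checking, while ':utf8' only marks the stream as Perl's lax internal format, which matters mostly on input and for code points that strict UTF-8 disallows:

```perl
use strict;
use warnings;

# Hypothetical helper: write $text through $layer into an in-memory
# file and return the resulting bytes as space-separated hex.
sub encode_with_layer {
    my ($layer, $text) = @_;
    my $bytes = '';
    open my $fh, ">$layer", \$bytes
        or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh or die "Cannot close in-memory handle: $!";
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

my $smiley = "\x{263A}";    # a wide character (ordinal > 255)
for my $layer (':utf8', ':encoding(UTF-8)') {
    printf "%-18s %s\n", $layer, encode_with_layer($layer, $smiley);
}
```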
