Once more unto the Wide character (U+XXXX) in substitution (s///)
I wrote very elliptically about this warning and received some helpful comments with the standard advice about how to proceed when encountering it. Except unfortunately that advice will be of no use when you encounter this warning.
Namely, I should have been less cute about it and made it clear that I was specifically talking about a warning about a wide character “in substitution”. How can a s/// even possibly trigger a wide character warning, you ask? Beats me, to be entirely honest, even now, but: if you have a use locale somewhere, it turns out that it can. Because defeating that is what fixed the warning I was getting:
BEGIN { local $INC{'locale.pm'} = 1; require Template }
Unfortunately I haven’t been able to produce a minimal reproduction (or else I would know how to answer the “how” question), and “run all of Template Toolkit” isn’t a useful one, but it is all I have. I ran into it in some template code, and the use locale in Template::Filters turned out to be the culprit. Somehow it manages to convince a s/// that it is an I/O operation and needs to complain about wide characters.
Friends don’t let friends use locale.
Update: Tony Cook has provided the answer to “how” in comments: a non-UTF8 locale has to be in effect, in which case s/// will warn about wide characters because the locale only defines meanings for byte values. Thus we now have a minimal reproduction:
LC_ALL=C perl -Mlocale -We '$_ = "\x{100}"; s/\w//'
It warns because the locale in this case only defines meanings for byte values.
You can avoid this by using a UTF-8 locale (or not using use locale at all).
Thank you!! I asked about this on #p5p at the time but nobody who happened to be around at that moment could solve the mystery for me either.
So the only thing I missed was that it’s necessary to set LC_ALL to something suitable to trigger the warning. And since I do have code in my project to set that to C, that explains why exactly the warning was getting triggered. And also lends itself to an even shorter and more portable repro: