What are your environment settings for Unicode?
How have you set up your environment to work with Unicode? I want to make a cheat sheet for the Perl newbies. There is some information in the googlesphere but it's dispersed and unfocused.
I'm working Unicode into the next edition of Learning Perl. The most frustrating part of this for newbies (Perl, Unicode, or otherwise) is getting all the pieces to cooperate. Even if you get it right inside your Perl program, your terminal might not handle Unicode. If your terminal handles it, you might not have the right fonts. And so on and so on.
I know what I've set up for my system and the programs I use, but that's just for the tools I use. What did you have to do?
- What environment variables did you set, and with what values? (LESSCHARSET, LANG, LC, LC_ALL)
- Where did you have to set the environment? For instance, the shells have their special files, but there are also files like ~/.MacOSX/environment.plist or the Windows registry.
- What did you set in your editor preferences?
- What did you set in your terminal program (besides the shell settings)?
- What fonts do you use? Did they magically show up or did you have to hunt them down? Does that font do something special with your language?
I'm especially interested in settings for languages besides the European ones (anyone have bidi settings?).
If you have a favorite resource, tell me about that too. :)
I don't remember setting up anything on Ubuntu 10.10 (and a lot of previous releases). It works out of the box:
claudio@adelaide:/etc$ env |grep ^L[CAE]
LANG=en_US.utf8
LESSOPEN=| /usr/bin/lesspipe %s
LESSCLOSE=/usr/bin/lesspipe %s %s
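To check the same thing on another machine, here is a minimal sketch in POSIX shell. It assumes the usual precedence (LC_ALL, then LC_CTYPE, then LANG) and only looks at the locale name, so treat the result as a hint; `locale charmap` gives the authoritative answer where it is available. The function names effective_ctype and is_utf8_locale are made up for this example.

```shell
#!/bin/sh
# Print the locale name that decides character handling.
# Assumed precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        printf '%s\n' "${LANG:-}"
    fi
}

# Succeed if a locale name mentions UTF-8 in either common spelling
# (for example en_US.utf8 or de_DE.UTF-8). This is a name-based heuristic.
is_utf8_locale() {
    case "$1" in
        *[Uu][Tt][Ff]-8*|*[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the current environment.
if is_utf8_locale "$(effective_ctype)"; then
    echo "looks like a UTF-8 locale: $(effective_ctype)"
else
    echo "does not look like a UTF-8 locale: $(effective_ctype)"
fi
```

With the environment shown above (only LANG set), effective_ctype would print en_US.utf8 and the check would succeed.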
Hi brian
Are you going to publish a program to test these settings, or do you know of one?
Cheers
Ron
I'm managing this kind of configuration with my zsh settings. I've got them in a separate git repository and each system has its own branch.
My .zshrc includes .zsh/env on every system. There I've put all the LC_* stuff. Why? - I've got loads of different systems and I don't want to deal with their unique way to set these things.
For Mac OS X I also did two things:
defaults write ~/.MacOSX/environment LC_ALL de_DE.utf-8
defaults write ~/.MacOSX/environment PATH ...
Why? - Because applications started from Finder don't inherit your shell's settings - now they do.
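For readers who want a starting point, a shared env file like the one described above might look roughly like this. The contents are a guess, not the poster's actual file: the locale name is taken from the defaults command above, and the LESSCHARSET line is an assumption added because the original question mentions that variable.

```shell
# ~/.zsh/env (hypothetical contents, sourced from ~/.zshrc)
# Locale name copied from the defaults command above; adjust to taste.
export LANG=de_DE.utf-8
export LC_ALL=de_DE.utf-8
# Assumption: tell less to treat its input as UTF-8.
export LESSCHARSET=utf-8
```

Keeping these in one file means every system gets the same values no matter how that system normally sets its locale.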
FWIW I wrote http://perlgeek.de/en/article/set-up-a-clean-utf8-environment a while ago, but it probably needs to consider many more aspects and systems.
Hi Brian,
are you going to cover the Win32 console and Unicode issues on Windows as well?
I will try to include whatever information people give me. Someone needs to tell me how they configured their windows setup first :)
Markus Kuhn has an excellent resource on UTF-8 on POSIX-like systems:
http://www.cl.cam.ac.uk/~mgk25/unicode.html
Included in that demo is
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
Which is a handy file to have around to test if your terminal is configured for UTF-8, and/or what fonts you are missing. Simply 'cat UTF-8-demo.txt' and you'll see quickly if your setup is good to go.
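If the demo file is not at hand, a much smaller smoke test can be sketched in shell. The function name utf8_sample is invented for this example and the three characters are an arbitrary choice; the bytes are written as octal escapes so the script itself stays pure ASCII.

```shell
#!/bin/sh
# Emit three UTF-8 encoded characters separated by spaces:
#   U+00E9 (e with acute) = C3 A9    -> \303\251
#   U+20AC (euro sign)    = E2 82 AC -> \342\202\254
#   U+65E5 (CJK "day")    = E6 97 A5 -> \346\227\245
utf8_sample() {
    printf '\303\251 \342\202\254 \346\227\245\n'
}

utf8_sample
```

On a terminal set to UTF-8 with suitable fonts this should show three single characters. Pairs or triples of odd Latin letters usually point to the terminal encoding, while empty boxes usually point to a missing font.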
That's a great resource. Thanks! I found it especially interesting that UTF-8 was invented on a diner placemat and implemented in a day. Everyone else will have to read the docs to find out the details. :)
Hi Brian,
Can you please confirm that you have received an email from me with a PDF attached? Thanks.