How to display Unicode in a Linux virtual terminal?
A console font can hold at most 512 glyphs, and most fonts ship with only 256.
Displaying Latin, Cyrillic, or other scripts that need fewer than 200 simple symbols is no problem.
However, for complex scripts, or scripts needing a lot of different symbols (like Japanese), you have no option other than an extra layer on top of the console to handle it.
Note that although the limit of 512 should be enough for ASCII and both kana sets, there is still the problem of width.
CJK and kana glyphs fit a square, so they are twice the width of Latin letters. That is not something the console can handle out of the box.
You could resort to the old and ugly “Halfwidth Katakana” (and maybe even find an old font of such a thing), or set your console to 40 columns and have Latin letters be as wide as kana.
I don't know of any console font with kana; you would have to draw your own (there are tools to do so, and you can just copy the dots of a bitmap Japanese font).
Also, you could use iconv to transliterate kana into ASCII.
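As a rough check of the glyph budget discussed above, the sketch below counts the code positions for printable ASCII plus the Unicode Hiragana and Katakana blocks. It counts whole blocks, so it is an upper bound rather than an exact glyph list; the variable names are only illustrative.

```shell
# Inclusive code-point ranges; whole Unicode blocks are counted,
# so this is an upper bound on the glyphs a kana console font would need.
ascii=$(( 0x7E - 0x20 + 1 ))          # printable ASCII
hiragana=$(( 0x309F - 0x3040 + 1 ))   # Hiragana block
katakana=$(( 0x30FF - 0x30A0 + 1 ))   # Katakana block
total=$(( ascii + hiragana + katakana ))

echo "glyphs needed (upper bound): $total"
if [ "$total" -le 256 ]; then
    echo "fits a 256-glyph font"
elif [ "$total" -le 512 ]; then
    echo "needs a 512-glyph font"
else
    echo "does not fit in a console font"
fi
```

So the set does not fit in a 256-glyph font but is comfortably within 512, which leaves width as the real obstacle.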
In addition to LANG/LC_ALL, stty iutf8 is needed to tell the terminal what to do; you might then need setfont to load a useful font and mapping. If you still have problems, check your kernel config for CONFIG_NLS_xx settings; you may need to modprobe nls_utf8 if it doesn't load automatically (I think this is only required for Unicode filenames though).
Some Linux distributions provide unicode_start and unicode_stop scripts to automate this.
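The manual steps above can be sketched as a small shell function. This is only an illustration: is_utf8_locale and setup_utf8_console are hypothetical helper names, the default font is simply the one mentioned elsewhere in this thread, and setfont has to be run on a real virtual console.

```shell
# Succeeds if the given locale name selects UTF-8 (e.g. en_US.UTF-8).
is_utf8_locale() {
    case "$1" in
        *.UTF-8*|*.utf-8*|*.UTF8*|*.utf8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Put the current console into UTF-8 input mode and load a font.
# $1: console font name (assumed default: Lat2-Terminus16).
setup_utf8_console() {
    # Same precedence the C library uses: LC_ALL, then LC_CTYPE, then LANG.
    current="${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"
    if ! is_utf8_locale "$current"; then
        echo "locale '$current' is not UTF-8; set LANG or LC_ALL first" >&2
        return 1
    fi
    stty iutf8 || return 1           # tell the tty line discipline input is UTF-8
    setfont "${1:-Lat2-Terminus16}"  # load a font with a Unicode mapping
}
```

Run setup_utf8_console from a login on the virtual console itself; inside a terminal emulator or over SSH, setfont will normally refuse to run.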
If less causes problems, it may require the environment variable LESSCHARSET to be set (or unset if it's wrong).
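To see which case applies, a quick check of the variable can help. A minimal sketch; less_charset_status is a hypothetical helper name, and the fallback described in the message is my reading of the less manual rather than something tested on every version.

```shell
# Report what less will be told about the character set.
less_charset_status() {
    if [ -n "${LESSCHARSET:-}" ]; then
        echo "LESSCHARSET=$LESSCHARSET"
    else
        echo "LESSCHARSET is unset; less falls back to the locale settings"
    fi
}
```

If the reported value is wrong, either run `unset LESSCHARSET` or override it for a single invocation with `LESSCHARSET=utf-8 less file.txt`.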
Markus Kuhn's UTF-8 and Unicode FAQ for Unix/Linux is invaluable.
You need a font that actually has these characters. Arch Linux, for example, recommends Lat2-Terminus16.
To try it, just issue the following command in a virtual console: setfont Lat2-Terminus16.
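If setfont cannot find the font, it may help to check whether the file is installed at all. The sketch below searches a list of directories for a matching font file; find_console_font is a hypothetical helper, and the two directories in the usage line are assumptions about common locations that may differ on your distribution.

```shell
# Print the path of the first console font file matching NAME.
# Usage: find_console_font NAME DIR...
find_console_font() {
    name=$1
    shift
    for dir in "$@"; do
        # Matches .psf, .psfu and their compressed variants such as .psfu.gz
        for f in "$dir/$name".psf*; do
            if [ -e "$f" ]; then
                echo "$f"
                return 0
            fi
        done
    done
    return 1
}
```

For example: `find_console_font Lat2-Terminus16 /usr/share/kbd/consolefonts /usr/share/consolefonts`. If nothing is printed, the Terminus console font package is probably not installed.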
As for the rest, most modern distributions already support it out of the box.
By installing uniutils you can look up the Unicode code points and names of the characters in a string.
$ sudo apt-get install uniutils
Then use uniname:
ubuntu@shin-instance:~$ echo 岡田shin | uniname
No LINES variable in environment so unable to determine lines per page.
Using default of 24.
character  byte  UTF-32  encoded as  glyph  name
        0     0  005CA1  E5 B2 A1    岡     CJK character Nelson 621
        1     3  007530  E7 94 B0    田     CJK character Nelson 2994
        2     6  000073  73          s      LATIN SMALL LETTER S
        3     7  000068  68          h      LATIN SMALL LETTER H
        4     8  000069  69          i      LATIN SMALL LETTER I
        5     9  00006E  6E          n      LATIN SMALL LETTER N
        6    10  00000A  0A                 LINE FEED (LF)
ubuntu@shin-instance:~$
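If uniutils is not available, the "encoded as" column can be cross-checked with standard tools. A minimal sketch using od; utf8_bytes is a hypothetical helper name, and it only shows the raw UTF-8 bytes, not the code point or character name.

```shell
# Print the UTF-8 bytes of a string as lowercase hex, without separators.
utf8_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}
```

For example, `utf8_bytes 岡` prints e5b2a1, matching the E5 B2 A1 shown by uniname above.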