What's the SQL national character (NCHAR) datatype really for?
Solution 1:
"NATIONAL" in this case means characters specific to different nationalities. Far east languages especially have so many characters that one byte is not enough space to distinguish them all. So if you have an english(ascii)-only app or an english-only field, you can get away using the older CHAR and VARCHAR types, which only allow one byte per character.
That said, most of the time you should use NCHAR/NVARCHAR. Even if you don't think you need to support (or potentially support) multiple languages in your data, even english-only apps need to be able to sensibly handle security attacks using foreign-language characters.
In my opinion, about the only place where the older CHAR/VARCHAR types are still preferred is for frequently-referenced ascii-only internal codes and data on platforms like Sql Server that support the distinction — data that would be the equivalent of an enum
in a client language like C++ or C#.
Solution 2:
Meanwhile the SQL92 standard explains the feature even less helpfully, stating only that NATIONAL CHARACTER is stored in an implementation-defined character set. As opposed to a mere CHARACTER, which is stored in an implementation-defined character set. Which might be a different implementation-defined character set. Or not.
Coincidentally, this is the same "distinction" the C++ standard makes between char
and wchar_t
. A relic of the Dark Ages of Character Encoding when every language/OS combination has its own character set.
Should one use NVARCHAR for all character (non-binary) storage purposes?
It is not important whether the declared type of your column is VARCHAR
or NVARCHAR
. But it is important to use Unicode (whether UTF-8, UTF-16, or UTF-32) for all character storage purposes.
Are there currently-popular DBMSs in which it will do something undesirable
Yes: In MS SQL Server, using NCHAR
makes your (English) data take up twice as much space. Unfortunately, UTF-8 isn't supported yet.
EDIT: SQL Server 2019 finally introduced UTF-8 support.