How to write UTF-8 characters using bulk insert in SQL Server?
Solution 1:
I came here before looking for a solution for bulk inserting special characters. Didn't like the workaround with UTF-16 (that would double the size of csv file). I found out that you definitely CAN and it's very easy, you don't need a format file. This answer is for other people who are looking for the same, since it doesn't seem to be documented well anywhere, and I believe this is a very common issue for non-english speaking people. The solution is: just add CODEPAGE='65001' inside the with statement of the bulk insert. (65001=codepage number for UTF-8). Might not work for all unicode characters as suggested by Michael O, but at least it works perfect for latin-extended, greek and cyrillic, probably many others too.
Note: MSDN documentation says utf-8 is not supported, don't believe it, for me this works perfect in SQL server 2008, didn't try other versions however.
e.g.:
BULK INSERT #myTempTable
FROM 'D:\somefolder\myCSV.txt'+
WITH
(
CODEPAGE = '65001',
FIELDTERMINATOR = '|',
ROWTERMINATOR ='\n'
);
If all your special characters are in 160-255 (iso-8859-1 or windows-1252), you could also use:
BULK INSERT #myTempTable
FROM 'D:\somefolder\myCSV.txt'+
WITH
(
CODEPAGE = 'ACP',
FIELDTERMINATOR = '|',
ROWTERMINATOR ='\n'
);
Solution 2:
You can't. You should first use a N type data field, convert your file to UTF-16 and then import it. The database does not support UTF-8.