What charset does Microsoft Excel use when saving files?
I have a Java app which reads CSV files which have been created in Excel (e.g. 2007). Does anyone know what charset MS Excel uses to save these files in?
I would have guessed either:
- windows-1255 (Cp1255)
- ISO-8859-1
- UTF8
but I am unable to decode extended chars (e.g. french accentuated letters) using either of these charset types.
Solution 1:
From memory, Excel uses the machine-specific ANSI encoding. So this would be Windows-1252 for a EN-US installation, 1251 for Russian, etc.
Solution 2:
CSV files could be in any format, depending on what encoding option was specified during the export from Excel: (Save Dialog, Tools Button, Web Options Item, Encoding Tab)
UPDATE: Excel (including Office 2013) doesn't actually respect the web options selected in the "save as..." dialog, so this is a bug of some sort. I just use OpenOffice Calc now to open my XLSX files and export them as CSV files (edit filter settings, choose UTF-8 encoding).
Solution 3:
Waking up this old thread... We are now in 2017. And still Excel is unable to save a simple spreadsheet into a CSV format while preserving the original encoding ... Just amazing.
Luckily Google Docs lives in the right century. The solution for me is just to open the spreadsheet using Google Docs, than download it back down as CSV. The result is a correctly encoded CSV file (with all strings encoded in UTF8).