Importing multiline cells from a CSV file into Excel
I have a CSV file (comma-delimited and quoted). When the file is opened directly from Explorer, Excel correctly interprets the multiline cells, but it messes up the character encoding (UTF-8).
Therefore I have to use the import function (Data / Get External Data / From Text). However, when I use the text import function in Excel (where I can set the file encoding explicitly), it interprets each newline as the start of a new row instead of putting the multiline text into a single cell, which breaks the file layout.
Can I somehow overcome the situation by either
- forcing the Explorer open command to use the 65001: Unicode (UTF-8) encoding, or
- forcing the Text Import Wizard to ignore quoted line breaks as record delimiters?
Use LibreOffice to open the file, then save it in the desired format. I had exactly the same problem you described when trying to use Excel 2010 to read UTF-8 MySQL data with multi-line Japanese text in some fields, exported as quoted CSV with \r\n as the end-of-record marker (I also tried \r and \n, with no difference in Excel's behaviour). LibreOffice 4.1.3 imported the CSV file correctly, and I could save it in Excel 2010 xlsx format and thereafter open the xlsx file correctly in Excel.
What you ask is not possible. The only real solution is for Microsoft to either:
- create a setting that lets the user specify the default encoding for CSV files when OPENING them
- fix the bug in the Text Import Wizard so that it correctly handles multiline values
However there are some workarounds.
My favourite:
- Open your CSV in a text editor that understands the encoding (e.g. Sublime Text, Notepad++, etc.)
- Open a new Excel workbook
- Copy the entire content of the csv, and paste into cell A1
- Excel automatically does the right thing (encoding + newlines)
Other workarounds are less elegant and rely on manipulating the file:
To use the Text Import Wizard you would have to remove all the newline characters that occur inside cell values in your CSV file (the line breaks that end each record must stay). If you created the file programmatically, you could change the export to omit those newline characters; if you received it from elsewhere, it would be trivial to write a Python script to strip them out.
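A script along those lines might look like the sketch below. It is only an illustration: the function name `flatten_multiline_cells` and the choice of a single space as the replacement are assumptions, not something from the original file. It relies on Python's standard `csv` module to parse the quoted fields, so only line breaks inside cells are touched and the record delimiters are kept.

```python
import csv
import io
import re

# Matches any of the three common line-break forms.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def flatten_multiline_cells(text, replacement=" "):
    """Return CSV text with line breaks inside cells replaced.

    `text` is the decoded content of a comma-delimited, quoted CSV file.
    Every output field is quoted and records end with CRLF.
    """
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
    for row in reader:
        # Only the cell contents are rewritten; row boundaries stay intact.
        writer.writerow([_LINE_BREAK.sub(replacement, cell) for cell in row])
    return out.getvalue()


# Hypothetical usage with placeholder file names:
#
#   with open("export.csv", encoding="utf-8", newline="") as f:
#       flattened = flatten_multiline_cells(f.read())
#   with open("export_flat.csv", "w", encoding="utf-8", newline="") as f:
#       f.write(flattened)
```

Note that this loses the line breaks for good, so it only helps if the multiline layout inside the cells does not matter to you.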
To use the standard method of just opening the file (e.g. by double-clicking it) you should transcode it to whatever default encoding Excel prefers. If you can control the creation of the file (or ask the creator to produce it in the desired encoding), then it's easy. Otherwise a Python solution is trivial, but again more work.
(HINT: you can find out which encoding Excel is expecting by opening the Text Import Wizard and seeing what the preselected option is)
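The transcoding step could be sketched as follows. The function name `transcode_csv` is hypothetical, and the default target `cp1252` is only an assumption (a common default on Western-European Windows installs); substitute whatever the Text Import Wizard preselects on your machine. Characters that do not exist in the target encoding cannot survive the conversion, so this sketch replaces them with `?`.

```python
import shutil


def transcode_csv(src_path, dst_path, src_encoding="utf-8", dst_encoding="cp1252"):
    """Copy a CSV file, re-encoding its text from src_encoding to dst_encoding.

    newline="" on both files keeps every line break exactly as it was,
    including the ones inside quoted multiline cells.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="",
              errors="replace") as dst:  # unencodable characters become "?"
        shutil.copyfileobj(src, dst)


# Hypothetical usage with placeholder file names:
#
#   transcode_csv("export_utf8.csv", "export_for_excel.csv")
```

If the data contains text the target code page cannot represent (the Japanese text mentioned in the other answer would be an example with `cp1252`), this route is lossy and one of the other workarounds is the safer choice.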
Ultimately it depends on how often you receive or create these files. The best workaround is to create the files in the default encoding Excel expects, so you can just double-click to open them.
Relevant post on Stack Overflow: https://stackoverflow.com/questions/2668678/importing-csv-with-line-breaks-in-excel-2007