Formatting a comma-delimited CSV to force Excel to interpret value as a string
Solution 1:
For those who have control over the source data: apparently Excel will auto-detect the format of a CSV field unless the field is written in this format:
"=""Data Here"""
For example:
20,5.5%,"0404 123 351","3-6","=""123"""
[number] [percent] [number] [date] [string] <-- how Excel interprets
It also works in Google Sheets, but I am not sure whether other spreadsheet apps support this notation.
If you suspect that any of the data may itself contain quotes, you need to double-escape them, like this:
"=""She said """"Hello"""" to him"""
(EDIT: Updated with corrections, thanks DMA57361!)
Solution 2:
Like many, I have been struggling with the same decisions that Microsoft makes, and I have tried the various suggested solutions.
For Excel 2007, the following holds:
- Putting all values in double quotes does NOT help
- Putting an = before all values after putting them in double quotes DOES help, BUT makes the csv file useless for most other applications
- Putting parentheses around the double quotes around all values is rubbish
- Putting a space before all values before putting double quotes around them DOES prevent conversions to dates, but DOES NOT prevent trimming of leading or trailing zeroes.
- Putting a single quote in front of a value only works when entering data within Excel.
However:
Putting a tab before all values before putting double quotes around them DOES prevent conversions to dates AND DOES prevent trimming of leading or trailing zeroes, and the sheet does not even show the nasty warning markers in the upper left corner of each cell.
E.g.:
"<tab character><some value>","<tab character><some other value>"
Note that the tab character has to be within the double quotes. Edit: it turns out that the double quotes are not even necessary.
Double-clicking the csv file opens it as a spreadsheet in Excel, with all values prepared as described above treated as text data. Make sure Excel is set to use '.' as the decimal point and not ','; otherwise every line of the csv file will end up as a single text value in the first cell of each row. Apparently Microsoft thinks that CSV means "Not the decimal point" Separated Value.
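For anyone writing such a file from a script, the tab-prefix approach could look roughly like this in Python. This is a sketch under assumptions: the function name tab_prefixed_row is invented for the example, and it keeps the double quotes around every value (via csv.QUOTE_ALL) even though, as noted above, they turned out not to be necessary.

```python
import csv
import io


def tab_prefixed_row(fields):
    """Return one CSV line where every value starts with a tab character.

    Hypothetical helper: the tab sits inside the double quotes, matching
    the "<tab character><some value>" layout described above.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
    writer.writerow(["\t" + str(v) for v in fields])
    return buf.getvalue()


# Leading zeroes and date-like values are the cases this is meant to protect.
line = tab_prefixed_row(["0123", "3-6"])
print(repr(line))
```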
Solution 3:
Using Excel's import functionality allows you to specify the format (auto, text or date) each column should be interpreted as and does not require any modification to the data files.
You can find it as Data → Get External Data → From Text in Excel 2007/2010, or as Data → Import External Data → Import Data in Excel 2003.
In the Excel 2003 Text Import Wizard, using the example data given, I imported the latter two columns as text.
Solution 4:
The example from Simon did not work for me, and I suspect it is a language difference. In C# here is what my working format string looks like:
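// Omit the line break after the last row (this assumes i starts at 1).
var linebreak = (i++ == list.Count) ? "" : "\r\n";
// The first and last columns are written as ="..." so Excel keeps them as text.
csv += String.Format("=\"{0}\",{1},{2},{3},=\"{4}\"{5}",
    item.Value, item.Status, item.NewStatus, item.Carrier, c.Status, linebreak);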
and this is what the output file looks like:
="abababababab",INVALID,INVALID,USPS,="",
="9500100030492359000149",UNKNOWNSTATUS,DELIVERED,USPS,="3"
="9500100030492359000149",UNKNOWNSTATUS,DELIVERED,USPS,="3"
="9500100030492359000149",UNKNOWNSTATUS,DELIVERED,USPS,="3"
="9500100030492359000149",UNKNOWNSTATUS,DELIVERED,USPS,="3"
="9400110200793482982812",UNKNOWNSTATUS,DELIVERED,USPS,="3"
="9400110200793482982812",UNKNOWNSTATUS,DELIVERED,USPS,="3"
="9400110200793000216184",UNKNOWNSTATUS,INVALID,USPS,=""
As can be seen, the format in the output file is ="VALUE", not "=""VALUE""", which I believe may be a Visual Basic convention.
I am using Excel 2010. Incidentally, Google Sheets will not open/convert a file formatted this way. It will work if you remove the equal sign, thus "VALUE", and Excel will still open the file but will ignore the fact that you want your columns to be strings.
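For comparison, here is a rough Python sketch that emits the same unquoted ="VALUE" shape as the output file above. The names format_row and build_csv are hypothetical, and the sketch assumes that none of the values contain commas or double quotes, since it does no CSV escaping at all.

```python
def format_row(value, status, new_status, carrier, code):
    """Format one line; the first and last columns are written as ="..."."""
    return '="{0}",{1},{2},{3},="{4}"'.format(
        value, status, new_status, carrier, code
    )


def build_csv(rows):
    """Join the lines with CRLF, leaving no line break after the last row."""
    return "\r\n".join(format_row(*row) for row in rows)


rows = [
    ("9500100030492359000149", "UNKNOWNSTATUS", "DELIVERED", "USPS", "3"),
    ("9400110200793000216184", "UNKNOWNSTATUS", "INVALID", "USPS", ""),
]
print(build_csv(rows))
```

Joining the finished lines avoids having to track a row counter just to decide whether the last line break should be written.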