pandas to_csv output quoting issue
Solution 1:
You could pass quoting=csv.QUOTE_NONE, for example:
>>> df.to_csv('foo.txt',index=False,header=False)
>>> !cat foo.txt
123,"this is ""out text"""
>>> import csv
>>> df.to_csv('foo.txt',index=False,header=False, quoting=csv.QUOTE_NONE)
>>> !cat foo.txt
123,this is "out text"
That said, in my experience it's better to quote more rather than less.
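As a rough illustration of the "quote more" advice, here is a minimal sketch that uses the standard csv module rather than pandas (the QUOTE_* constants are the same ones to_csv accepts). The helper name render and the sample row are made up for this example.

```python
import csv
import io


def render(row, quoting):
    """Write one row with the given csv.QUOTE_* mode and return the text."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(row)
    return buf.getvalue()


# Hypothetical sample row: a number and a text field with embedded quotes.
row = [123, 'this is "out text"']

minimal = render(row, csv.QUOTE_MINIMAL)  # quotes only fields that need it
quote_all = render(row, csv.QUOTE_ALL)    # quotes every field

# Parse the fully quoted line back to check the text field survives.
parsed = next(csv.reader(io.StringIO(quote_all)))

print(minimal, end="")
print(quote_all, end="")
print(parsed)
```

Note that the reader hands every field back as a string, so 123 comes back as '123'.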
Solution 2:
Note: there is currently a small error in the Pandas to_string documentation. It says:
- quoting : int, Controls whether quotes should be recognized. Values are taken from csv.QUOTE_* values. Acceptable values are 0, 1, 2, and 3 for QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONE, and QUOTE_NONNUMERIC, respectively.
But this reverses how csv defines the QUOTE_NONE and QUOTE_NONNUMERIC variables.
In [13]: import csv
In [14]: csv.QUOTE_NONE
Out[14]: 3
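To check all four values at once instead of just QUOTE_NONE, something like the following should work. It only uses the standard csv module; the dictionary name quote_constants is just for this sketch.

```python
import csv

# Look up the integer value of each quoting constant in the csv module.
names = ("QUOTE_MINIMAL", "QUOTE_ALL", "QUOTE_NONNUMERIC", "QUOTE_NONE")
quote_constants = {name: getattr(csv, name) for name in names}

for name, value in quote_constants.items():
    print(f"csv.{name} = {value}")
```

Passing the named constant (e.g. quoting=csv.QUOTE_NONE) rather than a bare integer avoids depending on the numbering at all.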
Solution 3:
To use quoting=csv.QUOTE_NONE, you need to set the escapechar, e.g.
# Create a tab-separated file with quotes
$ echo abc$'\t'defg$'\t'$'"xyz"' > in.tsv
$ cat in.tsv
abc defg "xyz"
# Gotcha: the quotes around `"..."` disappear when the file is read
$ python3
>>> import pandas as pd
>>> import csv
>>> df = pd.read_csv("in.tsv", sep="\t")
>>> df
Empty DataFrame
Columns: [abc, defg, xyz]
Index: []
# When reading in pandas, to read the `"..."` quotes,
# you have to explicitly say there's no `quotechar`
>>> df = pd.read_csv("in.tsv", sep="\t", quotechar='\0')
>>> df
Empty DataFrame
Columns: [abc, defg, "xyz"]
Index: []
# To print out without the quotes.
>>> df.to_csv("out.tsv", sep="\t", quoting=csv.QUOTE_NONE, quotechar="", escapechar="\\")
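To see why the escapechar matters, here is a small sketch using the standard csv module directly instead of pandas, on the assumption that to_csv hands these options to a csv writer. The helper write_row and the sample values are invented for the example. With QUOTE_NONE, a field that contains the separator cannot be written unless an escapechar is available.

```python
import csv
import io


def write_row(row, **kwargs):
    """Write one tab-separated row with QUOTE_NONE and return the text."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
        quotechar=None,  # no quote character at all, so '"' is ordinary text
        lineterminator="\n",
        **kwargs,
    )
    writer.writerow(row)
    return buf.getvalue()


# Literal quotes pass through untouched when nothing needs escaping.
plain = write_row(["abc", '"xyz"'])

# A field containing the separator fails without an escapechar ...
try:
    write_row(["abc", "de\tfg"])
    error = None
except csv.Error as exc:
    error = str(exc)

# ... and is written with a backslash before the tab once one is set.
escaped = write_row(["abc", "de\tfg"], escapechar="\\")

print(repr(plain))
print(error)
print(repr(escaped))
```

If your data never contains the separator, the escapechar is never actually used, but setting it keeps the write from failing on the one row that does.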
Solution 4:
To use QUOTE_NONE without an escapechar:
Replace the comma character , (Unicode: U+002C) in your df with a single low-9 quotation mark character ‚ (Unicode: U+201A).
After this, you can simply use:
import csv
df.to_csv('foo.txt', index=False, header=False, quoting=csv.QUOTE_NONE)
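The replacement step itself could look roughly like this. It is a sketch on plain Python lists with the standard csv module rather than on a DataFrame, and the names LOW_9_QUOTE, rows and cleaned are made up for the example.

```python
import csv
import io

LOW_9_QUOTE = "\u201a"  # single low-9 quotation mark, looks like a comma

# Hypothetical data: the second field contains real commas.
rows = [[123, "red, green, blue"]]

# Swap every comma inside a value for the look-alike character.
cleaned = [[str(value).replace(",", LOW_9_QUOTE) for value in row] for row in rows]

buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_NONE, quotechar=None, lineterminator="\n")
writer.writerows(cleaned)  # no escapechar needed: no field contains ',' any more
output = buf.getvalue()

print(output, end="")
```

Keep in mind that this changes the data: whoever reads the file gets U+201A, not a real comma, so it only suits output meant for display or for a consumer that knows to convert it back.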