How can I use io.StringIO() with the csv module?

I tried to backport a Python 3 program to 2.7, and I'm stuck with a strange problem:

>>> import io
>>> import csv
>>> output = io.StringIO()
>>> output.write("Hello!")            # Fail: io.StringIO expects Unicode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'
>>> output.write(u"Hello!")           # This works as expected.
6L
>>> writer = csv.writer(output)       # Now let's try this with the csv module:
>>> csvdata = [u"Hello", u"Goodbye"]  # Look ma, all Unicode! (?)
>>> writer.writerow(csvdata)          # Sadly, no.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'

According to the docs, io.StringIO() returns an in-memory stream for Unicode text. It works correctly when I try and feed it a Unicode string manually. Why does it fail in conjunction with the csv module, even if all the strings being written are Unicode strings? Where does the str come from that causes the Exception?

(I do know that I can use StringIO.StringIO() instead, but I'm wondering what's wrong with io.StringIO() in this scenario)


The Python 2.7 csv module doesn't support Unicode input: see the note at the beginning of the documentation.

It seems that you'll have to encode the Unicode strings to byte strings, and use io.BytesIO, instead of io.StringIO.

The examples section of the documentation includes examples for a UnicodeReader and UnicodeWriter wrapper classes (thanks @AlexeyKachayev for the pointer).


Please use StringIO.StringIO().

http://docs.python.org/library/io.html#io.StringIO

http://docs.python.org/library/stringio.html

io.StringIO is a class. It handles Unicode. It reflects the preferred Python 3 library structure.

StringIO.StringIO is a class. It handles strings. It reflects the legacy Python 2 library structure.


I found this when I tried to serve a CSV file via Flask directly without creating the CSV file on the file system. This works:

import io
import csv

data = [[u'cell one', u'cell two'], [u'cell three', u'cell four']]

output = io.BytesIO()
writer = csv.writer(output, delimiter=',')
writer.writerows(data)
your_csv_string = output.getvalue()

See also

  • More about CSV
  • The Flask part
  • A few notes about Strings / Unicode

From csv documentation:

The csv module doesn’t directly support reading and writing Unicode, but it is 8-bit-clean save for some problems with ASCII NUL characters. So you can write functions or classes that handle the encoding and decoding for you as long as you avoid encodings like UTF-16 that use NULs. UTF-8 is recommended.

You can find example of UnicodeReader, UnicodeWriter here http://docs.python.org/2/library/csv.html


To use CSV reader/writer with 'memory files' in python 2.7:

from io import BytesIO
import csv

csv_data = """a,b,c
foo,bar,foo"""

# creates and stores your csv data into a file the csv reader can read (bytes)
memory_file_in = BytesIO(csv_data.encode(encoding='utf-8'))

# classic reader
reader = csv.DictReader(memory_file_in)

# writes a csv file
fieldnames = reader.fieldnames  # here we use the data from the above csv file
memory_file_out = BytesIO()     # create a memory file (bytes)

# classic writer (here we copy the first file in the second file)
writer = csv.DictWriter(memory_file_out, fieldnames)
for row in reader:
    print(row)
    writer.writerow(row)