How to write UTF-8 in a CSV file
I am trying to create a text file in CSV format from a PyQt4 QTableWidget. I want to write the text with UTF-8 encoding because it contains special characters. I use the following code:
import codecs
...
myfile = codecs.open(filename, 'w','utf-8')
...
f = result.table.item(i,c).text()
myfile.write(f+";")
It works until a cell contains a special character. I also tried:
myfile = open(filename, 'w')
...
f = unicode(result.table.item(i,c).text(), "utf-8")
But it also stops when a special character appears. I have no idea what I am doing wrong.
It's very simple for Python 3.x (docs).
import csv
with open('output_file_name', 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.writer(csv_file, delimiter=';')
    # writerow expects a sequence of fields; a bare string would be split into single characters
    writer.writerow(['my_utf8_string'])
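As a slightly fuller sketch of the Python 3 approach, the snippet below writes a few rows containing non-ASCII text and reads them back. The `rows` list, the helper names `write_rows` and `read_rows`, and the file name `cells.csv` are made up for illustration; `rows` stands in for the text pulled from the table cells.

```python
import csv
import os
import tempfile

# Hypothetical data standing in for the table's cell text.
rows = [
    ["name", "city"],
    ["Zoë", "Kraków"],
    ["José", "São Paulo"],
]

def write_rows(path, rows):
    # newline='' lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file, delimiter=";")
        writer.writerows(rows)

def read_rows(path):
    # Read with the same encoding and delimiter used for writing.
    with open(path, newline="", encoding="utf-8") as csv_file:
        return list(csv.reader(csv_file, delimiter=";"))

path = os.path.join(tempfile.mkdtemp(), "cells.csv")
write_rows(path, rows)
round_trip = read_rows(path)
```

Passing `encoding='utf-8'` explicitly matters here because the default encoding of `open` depends on the platform.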
For Python 2.x, look here.
From your shell run:
pip2 install unicodecsv
And (unlike the original question) presuming you're using Python's built-in csv module, turn import csv into import unicodecsv as csv in your code.
Use this package, it just works: https://github.com/jdunck/python-unicodecsv.
For me, the UnicodeWriter class from the Python 2 csv module documentation didn't really work, as it breaks the csv.writer.writerow() interface.
For example:
csv_writer = csv.writer(csv_file)
row = ['The meaning', 42]
csv_writer.writerow(row)
works, while:
csv_writer = UnicodeWriter(csv_file)
row = ['The meaning', 42]
csv_writer.writerow(row)
will throw AttributeError: 'int' object has no attribute 'encode'.
As UnicodeWriter obviously expects all column values to be strings, we can convert the values ourselves and just use the default csv module:
def to_utf8(lst):
    return [unicode(elem).encode('utf-8') for elem in lst]
...
csv_writer.writerow(to_utf8(row))
Or we can even monkey-patch csv_writer to add a write_utf8_row function; the exercise is left to the reader.
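For comparison, on Python 3 a conversion helper like to_utf8 should not be needed: csv.writer formats non-string values with str(), and the encoding is handled by the file object rather than by the row values. A minimal sketch, using an in-memory io.StringIO buffer as a stand-in for a real file (the sample row is made up for illustration):

```python
import csv
import io

# In-memory text buffer standing in for a file opened with encoding='utf-8'.
buffer = io.StringIO()
writer = csv.writer(buffer)

# Mixed types and non-ASCII text in one row; no manual encoding step.
writer.writerow(["The meaning", 42, "naïve"])

line = buffer.getvalue()
```

With a real file, the same row would be encoded on write by opening it with `open(path, 'w', newline='', encoding='utf-8')`.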
The examples in the Python documentation show how to write Unicode CSV files: http://docs.python.org/2/library/csv.html#examples
(can't copy the code here because it's protected by copyright)