How to efficiently handle European decimal separators using the pandas read_csv function?
Solution 1:
For European-style numbers, use the thousands and decimal parameters of pandas.read_csv.
For example:
pandas.read_csv('data.csv', thousands='.', decimal=',')
From the docs:
thousands : str, optional
    Thousands separator.
decimal : str, default ‘.’
    Character to recognize as decimal point (e.g. use ‘,’ for European data).
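To try this without a file on disk, here is a minimal sketch that feeds an in-memory string to read_csv. The sample rows and the csv_text name are made up for illustration. The numeric fields are quoted so that the comma inside each number is not mistaken for the column delimiter.

```python
import io

import pandas as pd

# Hypothetical sample data. The numbers are quoted because they contain
# commas, which would otherwise be read as column delimiters.
csv_text = '"x","y"\n"one","1.234,56"\n"two","2.000,00"\n'

# thousands='.' strips the grouping dots; decimal=',' treats the comma
# as the decimal point, so column y is parsed as a float column.
df = pd.read_csv(io.StringIO(csv_text), thousands='.', decimal=',')

print(df.dtypes)
print(df)
```

If the file uses semicolons as the field delimiter, which is common in European exports, passing sep=';' as well should cover that case.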
Solution 2:
You can use the converters keyword argument of read_csv. Given a file /tmp/data.csv like this:
"x","y"
"one","1.234,56"
"two","2.000,00"
you can do:
In [20]: pandas.read_csv('/tmp/data.csv', converters={'y': lambda x: float(x.replace('.','').replace(',','.'))})
Out[20]:
     x        y
0  one  1234.56
1  two  2000.00
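The same idea can be written with a named function instead of a lambda, which is easier to reuse across columns and to test on its own. This is only a sketch: parse_european_number and csv_text are hypothetical names, and the data is read from an in-memory string rather than /tmp/data.csv.

```python
import io

import pandas as pd


def parse_european_number(text):
    """Convert a string such as '1.234,56' to the float 1234.56."""
    # Drop the thousands separators, then turn the decimal comma into a dot.
    return float(text.replace('.', '').replace(',', '.'))


# Hypothetical sample data mirroring the file shown above.
csv_text = '"x","y"\n"one","1.234,56"\n"two","2.000,00"\n'

# The converter is called once per cell in column y.
df = pd.read_csv(io.StringIO(csv_text), converters={'y': parse_european_number})
print(df)
```

Because the converter runs as a Python call for every value, it is likely to be slower on large files than the built-in thousands and decimal parameters. It is mainly useful when the column needs cleanup that those parameters cannot express.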