Pandas read_csv with float values results in weird rounding and extra decimal digits
I have a csv file containing numerical values such as 1524.449677. There are always exactly 6 decimal places.
When I import the csv file (along with other columns) via pandas read_csv, the column automatically gets the datatype object. My issue is that the values are shown as 2470.6911370000003, which actually should be 2470.691137. Likewise, the value 2484.30691 is shown as 2484.3069100000002.
This seems to be a datatype issue in some way. I tried to explicitly provide the data type when importing via read_csv by giving the dtype argument as {'columnname': np.float64}. Still, the issue did not go away.
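A minimal sketch of what I tried (the file and column names are placeholders):

import pandas as pd
import numpy as np

df = pd.read_csv('data.csv', dtype={'columnname': np.float64})
print(df['columnname'].iloc[0])  # still prints e.g. 2470.6911370000003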
How can I get the values imported and shown exactly as they are in the source csv file?
Pandas uses a dedicated decimal-to-binary converter that sacrifices accuracy for the sake of speed.
Passing float_precision='round_trip' to read_csv fixes this.
See the float_precision notes in the pandas read_csv documentation for more detail.
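For illustration, a minimal reproduction (the first print's exact output depends on your pandas version and its default converter):

import io
import pandas as pd

csv = "x\n2470.691137\n"

# Default converter: fast, but may not round-trip the decimal text exactly.
print(pd.read_csv(io.StringIO(csv))['x'].iloc[0])

# Round-trip converter: parses to the float whose repr matches the source.
print(pd.read_csv(io.StringIO(csv), float_precision='round_trip')['x'].iloc[0])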
After processing your data, if you want to save it back to a csv file, you can pass float_format="%.nf" (where n is the desired number of decimal places) to to_csv.
A full example:
import pandas as pd
df_in = pd.read_csv(source_file, float_precision='round_trip')
df_out = ... # some processing of df_in
df_out.to_csv(target_file, float_format="%.3f") # for 3 decimal places
I realise this is an old question, but maybe this will help someone else:
I had a similar problem, but couldn't quite use the same solution. Unfortunately, the float_precision option only exists when using the C engine, not with the python engine. So if you have to use the python engine for some other reason (for example, because the C engine can't deal with regular expressions as delimiters), this little "trick" worked for me:
In the pd.read_csv arguments, define dtype='str' and then convert your dataframe to whatever dtype you want, e.g. df = df.astype('float64').
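A minimal sketch of that approach (the file name, regex separator, and column name are placeholders):

import pandas as pd

# A regex separator forces the python engine, which has no float_precision option.
df = pd.read_csv('data.csv', sep=r';\s*', engine='python', dtype='str')

# Python's string-to-float conversion is correctly rounded, so the values round-trip.
df['value'] = df['value'].astype('float64')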
Bit of a hack, but it seems to work. If anyone has any suggestions on how to solve this in a better way, let me know.