Pandas - compare loaded data to processed data [duplicate]
The issue with floating point numbers is precision. As you guessed, your numbers are very close but not exactly identical:
df.iloc[0,0]
-0.41676538151302184
df2.iloc[0,0]
-0.4167653815130218
with pd.option_context('display.float_format', '{:.20f}'.format):
    display(df2.val.compare(df.val))
                     self                   other
0 -0.41676538151302178203 -0.41676538151302183755
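The same effect can be reproduced without pandas. This is a minimal sketch using the classic 0.1 + 0.2 case (an illustrative assumption, not the data from the question): two values that should be equal mathematically differ in the last bits, so an exact comparison fails.

```python
import sys

# Two ways of arriving at "0.3"; these values are illustrative only.
a = 0.1 + 0.2
b = 0.3

# Exact comparison fails because the binary representations differ.
exactly_equal = (a == b)

# The gap is tiny: smaller than machine epsilon for 64-bit floats.
difference = abs(a - b)
below_epsilon = difference < sys.float_info.epsilon

print(exactly_equal, difference, below_epsilon)
```

A difference this small is typical when the same numbers reach you through two different computation or serialization paths.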
One option is to use numpy.isclose or numpy.allclose, which are specifically designed to test whether numbers are close. Both accept two parameters, rtol and atol, to specify a custom relative or absolute tolerance (the defaults are rtol=1e-05 and atol=1e-08).
import numpy as np
np.isclose(df, df2).all()
# or
np.allclose(df, df2)
output: True
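If the default tolerances are too loose or too strict for your data, you can pass rtol and atol explicitly. The sketch below uses hypothetical frames standing in for df and df2 (the second is the first scaled by a relative error of about 1e-12, which is an assumption made for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for df and df2; df2 carries a ~1e-12 relative error.
df = pd.DataFrame({"val": [-0.41676538151302184, 1.5, 1000.0]})
df2 = df * (1 + 1e-12)

# Exact comparison: the frames are not identical.
exact_equal = df.equals(df2)

# Default tolerances (rtol=1e-05, atol=1e-08) treat them as equal.
default_close = np.allclose(df, df2)

# A much stricter relative tolerance with no absolute slack rejects them.
strict_close = np.allclose(df, df2, rtol=1e-15, atol=0.0)

print(exact_equal, default_close, strict_close)
```

Note that atol matters most for values near zero, where a purely relative tolerance becomes vanishingly small.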