Pandas - compare loaded data to processed data [duplicate]

The issue is floating-point precision. As you guessed, your numbers are very close but not exactly identical; they differ only in the last significant digit:

df.iloc[0,0]
-0.41676538151302184

df2.iloc[0,0]
-0.4167653815130218

with pd.option_context('display.float_format', '{:.20f}'.format):
    display(df2.val.compare(df.val))

                     self                   other
0 -0.41676538151302178203 -0.41676538151302183755
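
To get a feel for how small that gap is, here is a minimal sketch that compares the two values printed above with the spacing between adjacent floats at that magnitude. The literals are copied from the output above and are assumed to round-trip to the same floats; the names a, b, diff and ulp are only illustrative.

```python
import numpy as np

# The two values as printed above (copied literals; assumed to round-trip).
a = np.float64(-0.41676538151302184)   # df.iloc[0, 0]
b = np.float64(-0.4167653815130218)    # df2.iloc[0, 0]

diff = abs(a - b)          # absolute difference between the two values
ulp = np.spacing(abs(a))   # gap to the next representable float above |a|

print(a == b)      # exact comparison of the two floats
print(diff, ulp)   # compare the difference with one float spacing
```

A difference on the order of a single float spacing is typical of a round trip through text (for example writing to and reading from CSV) or of the same arithmetic done in a different order.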

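One option is to use numpy.isclose or numpy.allclose, which are designed to test whether numbers are approximately equal. Both accept two parameters, rtol and atol, to set a custom relative or absolute tolerance (defaults: rtol=1e-05, atol=1e-08); two values a and b are considered close when |a - b| <= atol + rtol * |b|.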

import numpy as np
np.isclose(df, df2).all()

# or 
np.allclose(df, df2)

output: True
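
If the default tolerances are too loose or too strict for your data, you can pass your own, and the boolean mask from isclose also tells you where any real mismatches are. This is only a sketch: the two small frames below are hypothetical stand-ins for your df and df2, and the rtol/atol values are arbitrary choices for illustration, not recommendations.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the original (df) and reloaded (df2) frames.
df = pd.DataFrame({"val": [-0.41676538151302184, 1.5]})
df2 = pd.DataFrame({"val": [-0.4167653815130218, 1.5]})

# Strict comparison: fails as soon as a single last digit differs.
exact_equal = df.equals(df2)

# Elementwise tolerance test: |a - b| <= atol + rtol * |b|
# (rtol and atol here are illustrative values, tighter than the defaults).
close_mask = np.isclose(df.to_numpy(), df2.to_numpy(), rtol=1e-9, atol=0.0)
all_close = bool(close_mask.all())

# Index labels of rows with at least one cell outside the tolerance.
mismatched_rows = df.index[~close_mask.all(axis=1)]

print(exact_equal, all_close, list(mismatched_rows))
```

Setting atol=0.0 makes the test purely relative, which can matter when the values themselves are very small; with the default atol, numbers close to zero compare as close almost regardless of rtol.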