How can I match the missing values (nan) of two dataframes?

How can I set the values in df1 to missing wherever the value at the same position in df2 is missing?

Data df1:
Index     Data
1          3
2          8
3          9


Data df2:
Index     Data
1          nan
2          2
3          nan

desired output:
Index     Data
1          nan
2          8
3          nan

So I would like to keep the data of df1, but only at the positions where df2 also has data entries. Wherever df2 has a NaN, I would like the value in df1 to become NaN as well.

I tried the following, but this replaced all data points with nan.

df1 = df1.where(df2== np.nan, np.nan)

Thank you very much for your help.

Use mask, which does exactly the inverse of where: it replaces the values where the condition is True (with NaN by default):

df3 = df1.mask(df2.isna())

output:

   Index  Data
0      1   NaN
1      2   8.0
2      3   NaN
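
For reference, here is a minimal self-contained sketch of the same approach. It assumes the "Index" column from the question is the actual DataFrame index (the output above treats it as a regular column instead); the construction of df1 and df2 is only an illustration of the question's data.

```python
import numpy as np
import pandas as pd

# Assumption: "Index" from the question is used as the DataFrame index.
df1 = pd.DataFrame({"Data": [3, 8, 9]}, index=[1, 2, 3])
df2 = pd.DataFrame({"Data": [np.nan, 2, np.nan]}, index=[1, 2, 3])

# mask() replaces values where the condition is True; the default
# replacement is NaN, so no second argument is needed here.
df3 = df1.mask(df2.isna())
print(df3)
```

Note that the integer column becomes float in the result, because NaN cannot be stored in a plain integer column.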

In your attempt, where keeps the values for which the condition is True and replaces all others with NaN. Because equality is not the correct way to check for NaN (np.nan == np.nan yields False), df2 == np.nan is False everywhere, so every value was replaced with NaN.
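
The NaN comparison issue can be checked directly. The short sketch below rebuilds df2 from the question (the variable names eq_mask and isna_mask are just illustrative) and compares the equality test with isna():

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({"Data": [np.nan, 2, np.nan]})

# NaN never compares equal to anything, including itself.
print(np.nan == np.nan)

# Element-wise equality against NaN is therefore False in every cell...
eq_mask = df2 == np.nan
print(eq_mask)

# ...whereas isna() flags the missing cells as intended.
isna_mask = df2.isna()
print(isna_mask)
```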

Replace df2 == np.nan with df2.notna():

df3 = df1.where(df2.notna(), np.nan)
print(df3)

# Output
   Index  Data
0      1   NaN
1      2   8.0
2      3   NaN
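
As a quick sanity check, the where/notna form and the mask/isna form should give the same result. This sketch again assumes "Index" is the DataFrame index, and the names via_where and via_mask are only for illustration:

```python
import numpy as np
import pandas as pd

# Assumption: "Index" from the question is used as the DataFrame index.
df1 = pd.DataFrame({"Data": [3, 8, 9]}, index=[1, 2, 3])
df2 = pd.DataFrame({"Data": [np.nan, 2, np.nan]}, index=[1, 2, 3])

# where() keeps values where the condition is True, replacing the rest.
via_where = df1.where(df2.notna(), np.nan)

# mask() replaces values where the condition is True.
via_mask = df1.mask(df2.isna())

# DataFrame.equals treats NaNs in the same location as equal.
print(via_where.equals(via_mask))
```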
