Replace null values in a dataframe using another dataframe
I'm trying to replace null values in dataframe d
using dataframe f
.
d
and f
are linked by EGI
. In d
, EGI
is a column and is not unique. In f
, EGI
is unique and is this dataframe's index.
For each row in d
, I need the values in that row to be 'masked' by the row in f
with corresponding EGI
.
Sample data:
d = pd.DataFrame({'EGI':['a1','b2','a1','d4'],'A': ['x', np.nan, 'z', 'e'], 'B': [pd.NaT, 6, 7, 9], 'C': [2, 1, None, 9], 'D': [2, None, np.nan, None]})
EGI A B C D
0 a1 x NaT 2.0 2.0
1 b2 NaN 6 1.0 NaN
2 a1 z 7 NaN NaN
3 d4 e 9 9.0 NaN
f = pd.DataFrame({'B': [5, 8, 9], 'A': ['w', 'y', np.nan], 'D': [None, np.nan, 8], 'test': [5, 8, 9]}, index=['b2', 'a1', 'c3'])
B A D test
b2 5 w NaN 5
a1 8 y NaN 8
c3 9 NaN 8.0 9
Expected output:
EGI A B C D
0 a1 x 8 2.0 2.0
1 b2 w 6 1.0 NaN
2 a1 z 7 NaN NaN
3 d4 e 9 9.0 NaN
What I tried:
m = d.isnull()
m.index = d['EGI'].tolist()
m = m.drop(['EGI'], axis = 1)
d.mask(m, f)
EGI A B C D
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN
If dataframes d
and f
have matching row indexes, we can just fillna
:
d.fillna(f)
But in OP's example, the indexes do not match up, so we just need to align them first.
One-liner
Use set_index
and reindex
to align the indexes to EGI
and then fillna
:
d.set_index('EGI').fillna(f.reindex(d.EGI))
# EGI A B C D
# 0 a1 x 8.0 2.0 2.0
# 1 b2 w 6 1.0 NaN
# 2 a1 z 7 NaN NaN
# 3 d4 e 9 9.0 NaN
Step-by-step
-
Use
set_index
to setd
's index toEGI
:d = d.set_index('EGI') # A B C D # EGI # a1 x NaT 2.0 2.0 # b2 NaN 6 1.0 NaN # a1 z 7 NaN NaN # d4 e 9 9.0 NaN
-
Use
reindex
to alignf
's index tod
's index:f = f.reindex(d.index) # B A D test # EGI # a1 8.0 y NaN 8.0 # b2 5.0 w NaN 5.0 # a1 8.0 y NaN 8.0 # d4 NaN NaN NaN NaN
-
Use
fillna
to filld
's NaNs withf
:d.fillna(f) # EGI A B C D # 0 a1 x 8.0 2.0 2.0 # 1 b2 w 6 1.0 NaN # 2 a1 z 7 NaN NaN # 3 d4 e 9 9.0 NaN
Note that the column indexes of d
and f
are not aligned and do not need to be. We only need to align the row indexes, and fillna
will handle the rest.