Is there a better way to fill missing values between two dataframes?
I'm trying to fill missing values in one dataframe based on the data in other dataframes. My first dataframe looks like this
[df1]:
Key Value
1 15
2 30
23 NaN
1 15
4 60
2 30
5 NaN
2 NaN
4 NaN
25 NaN
62 NaN
32 NaN
21 315
21 NaN
And the dataframe I want to extract the information from looks like this one:
[df2]:
Key Value
23 "fine"
5 "five"
2 30
4 60
25 "two"
62 "bar"
32 "foo"
21 315
Key Value
2 30.0
23 "fine"
1 15.0
4 60.0
2 30.0
5 "five"
2 30.0
4 60.0
25 "two"
62 "bar"
32 "foo"
21 315.0
21 315.0
I've managed to do this iterating over the rows of both dataframes, using
for i, value in df1.iterrows():
for j, value2 in df2.iterrows():
if df1.loc[i,"Key"]==df2.loc[j,"Key"]:
df1.loc[i,"Value"]=df2.loc[j,"Value"]
else:
pass
but using this with my actual information (Dataframes of around 100000 rows and 20+ columns each) takes too long and I believe there's a better way to do it with a pandas built-in function. Thanks in advance!
Solution 1:
You can achieve that with only one line of code:
df1 = df1.fillna(df2)