How to replace an entire column in a pandas DataFrame
I would like to replace an entire column of a pandas DataFrame with a column taken from another DataFrame. An example will clarify what I am looking for:
import pandas as pd
dic = {'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]}
df = pd.DataFrame(dic)
df is
'A' 'B' 'C'
1 9 0
4 2 0
1 5 5
4 3 3
Now I have another DataFrame called df1 with a column "E", that is:
df1['E'] = [ 4, 4, 4, 0]
and I would like to replace column "B" of df with column "E" of df1:
'A' 'E' 'C'
1 4 0
4 4 0
1 4 5
4 0 3
I tried to use the .replace() method in many ways, but I didn't get anything good. Can you help me?
Solution 1:
If the indices of df and df1 match, then:
df['B'] = df1['E']
should work. Otherwise:
df['B'] = df1['E'].values
will work, as long as both columns have the same number of elements.
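To see why the two forms differ, here is a minimal sketch. It assumes a hypothetical df1 whose index (10 to 13) shares no labels with df's default index; that index is made up for illustration and is not from the question. Assigning a Series aligns on index labels, so mismatched labels produce NaN, while .values hands over a plain NumPy array that is assigned by position.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})
# Hypothetical df1: its index 10..13 does not overlap df's default 0..3 index.
df1 = pd.DataFrame({'E': [4, 4, 4, 0]}, index=[10, 11, 12, 13])

aligned = df.copy()
aligned['B'] = df1['E']             # Series is aligned on index labels
print(aligned['B'].isna().all())    # True: no labels in common, so all NaN

positional = df.copy()
positional['B'] = df1['E'].values   # NumPy array is assigned by position
print(positional['B'].tolist())     # [4, 4, 4, 0]
```

If both frames use the default 0..n-1 index, as in the question, the two assignments give the same result.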
Solution 2:
If you don't mind getting a new DataFrame returned instead of updating the original, pandas' .assign() will avoid SettingWithCopyWarning. For your example:
df = df.assign(B=df1['E'])
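A small sketch of what this does, using the data from the question and assuming df1 has the default index. Note that .assign() also aligns a Series on the index, just like plain column assignment. The rename step at the end is an addition of mine to get the column called "E" as in the desired output; it is not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})
df1 = pd.DataFrame({'E': [4, 4, 4, 0]})

new_df = df.assign(B=df1['E'])   # returns a new DataFrame
print(new_df['B'].tolist())      # [4, 4, 4, 0]
print(df['B'].tolist())          # [9, 2, 5, 3], the original is unchanged

# Optional extra step (an assumption about the desired result):
# rename the replaced column so the header reads A, E, C.
renamed = new_df.rename(columns={'B': 'E'})
print(list(renamed.columns))     # ['A', 'E', 'C']
```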
Solution 3:
For those that struggle with the "SettingWithCopy" warning, here's a workaround which may not be so efficient, but still gets the job done.
Suppose you wish to overwrite column_1 and column_3, but retain column_2 and column_4:
columns_to_overwrite = ["column_1", "column_3"]
First delete the columns that you intend to replace...
original_df.drop(labels=columns_to_overwrite, axis="columns", inplace=True)
... then re-insert the columns, this time using the values that you want to overwrite them with
original_df[columns_to_overwrite] = other_data_frame[columns_to_overwrite]
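Put together, the steps above might look like the sketch below. The two small frames are made-up data for illustration, and it assumes a reasonably recent pandas version, where assigning a DataFrame to a list of new column names is supported. One side effect worth knowing: the re-inserted columns are appended at the end, so the last line (my addition) restores the original order.

```python
import pandas as pd

# Hypothetical data, only for illustration.
original_df = pd.DataFrame({
    "column_1": [1, 2], "column_2": [3, 4],
    "column_3": [5, 6], "column_4": [7, 8],
})
other_data_frame = pd.DataFrame({"column_1": [10, 20], "column_3": [50, 60]})

original_order = list(original_df.columns)
columns_to_overwrite = ["column_1", "column_3"]

# Drop the columns to be replaced, then add them back from the other frame.
original_df.drop(labels=columns_to_overwrite, axis="columns", inplace=True)
original_df[columns_to_overwrite] = other_data_frame[columns_to_overwrite]
print(list(original_df.columns))
# ['column_2', 'column_4', 'column_1', 'column_3']

# Optional: put the columns back in their original order.
restored = original_df[original_order]
print(restored["column_1"].tolist())  # [10, 20]
```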