pandas SettingWithCopyWarning when dropping and converting columns

I am trying to delete some columns and convert the values in one column with:

df2.drop(df2.columns[[0, 1, 3]], axis=1, inplace=True)
df2['date'] = df2['date'].map(lambda x: str(x)[1:])
df2['date'] = df2['date'].str.replace(':', ' ', 1)
df2['date'] = pd.to_datetime(df2['date'])

and for each of these lines I get:

  df2.drop(df2.columns[[0, 1, 3]], axis=1, inplace=True)
C:/Users/<username>/Desktop/projects/youtube_log/filter.py:11: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

What is the problem here?


Solution 1:

Your df2 is a slice of another dataframe. You need to copy it explicitly with df2 = df2.copy() just before you call drop.

Consider the following dataframe:

import pandas as pd
import numpy as np


df1 = pd.DataFrame(np.arange(20).reshape(4, 5), list('abcd'), list('ABCDE'))

df1

Let me assign a slice of df1 to df2:

df2 = df1[['A', 'C']]

df2 is now a slice of df1 and should trigger those pesky SettingWithCopyWarnings if we try to change things in df2. Let's take a look.

df2.drop('c')
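
As a minimal sketch using the same df1 and df2 as above: without inplace=True, drop returns a new object and leaves df2 as it was, so nothing is being set on the slice. The name `dropped` is only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(20).reshape(4, 5), list('abcd'), list('ABCDE'))
df2 = df1[['A', 'C']]

# Without inplace=True, drop returns a new DataFrame; df2 itself is not modified.
dropped = df2.drop('c')

print(list(dropped.index))  # ['a', 'b', 'd']
print(list(df2.index))      # ['a', 'b', 'c', 'd']
```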

No problems. How about:

df2.drop('c', inplace=True)

There it is: that call triggers the SettingWithCopyWarning.

The problem is that pandas tries to be efficient and tracks that df2 is pointing to the same data as df1. It is preserving that relationship. The warning is telling you that you shouldn't be trying to mess with the original dataframe via the slice.

Notice that when we look at df2, row 'c' has been dropped.

df2

And looking at df1 we see that row 'c' is still there.

df1

pandas made a copy of df2 and then dropped row 'c' from that copy. That may not be what we intended, given that we created df2 as a slice of df1 pointing to the same data. So pandas is warning us.
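
The same check can be scripted. This is only a sketch that reuses df1 and df2 from above; the warnings filter is my addition to keep the output quiet and is not part of the steps shown earlier. Whether a warning is emitted at all may depend on your pandas version.

```python
import warnings

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(20).reshape(4, 5), list('abcd'), list('ABCDE'))
df2 = df1[['A', 'C']]

with warnings.catch_warnings():
    # Assumption: silence any SettingWithCopyWarning just for this demonstration.
    warnings.simplefilter('ignore')
    df2.drop('c', inplace=True)

# Row 'c' is gone from df2 but still present in df1.
print(list(df2.index))  # ['a', 'b', 'd']
print(list(df1.index))  # ['a', 'b', 'c', 'd']
```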

To avoid the warning, make the copy yourself.

df2 = df2.copy()
# or
df2 = df1[['A', 'C']].copy()
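
Applied to the code in the question, it might look like the sketch below. The question does not show how df2 was created, so I am assuming it came from filtering a larger frame. The frame `raw`, its column names and values, and the timestamp format are all made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the original data: five columns, with the raw
# timestamp in a column named 'date' at position 2.
raw = pd.DataFrame({
    'ip': ['1.2.3.4', '5.6.7.8'],
    'ident': ['-', '-'],
    'date': ['[10/Oct/2000:13:55:36', '[11/Oct/2000:08:01:02'],
    'tz': ['-0700]', '-0700]'],
    'request': ['GET /a', 'GET /b'],
})

# Copy explicitly so df2 owns its data instead of being a slice of raw.
df2 = raw[raw['request'].str.startswith('GET')].copy()

# The steps from the question, now applied to an independent DataFrame.
df2.drop(df2.columns[[0, 1, 3]], axis=1, inplace=True)
df2['date'] = df2['date'].map(lambda x: str(x)[1:])           # strip the leading '['
df2['date'] = df2['date'].str.replace(':', ' ', n=1, regex=False)  # first ':' only
# Assumed format for the made-up timestamps above (English month abbreviations).
df2['date'] = pd.to_datetime(df2['date'], format='%d/%b/%Y %H:%M:%S')

print(df2.dtypes)
```

The original frame is left untouched, and df2 ends up with only the 'date' and 'request' columns.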