Deleting multiple columns based on column names in Pandas
The by far the simplest approach is:
yourdf.drop(['columnheading1', 'columnheading2'], axis=1, inplace=True)
I don't know what you mean by inefficient but if you mean in terms of typing it could be easier to just select the cols of interest and assign back to the df:
df = df[cols_of_interest]
Where cols_of_interest
is a list of the columns you care about.
Or you can slice the columns and pass this to drop
:
df.drop(df.ix[:,'Unnamed: 24':'Unnamed: 60'].head(0).columns, axis=1)
The call to head
just selects 0 rows as we're only interested in the column names rather than data
update
Another method: It would be simpler to use the boolean mask from str.contains
and invert it to mask the columns:
In [2]:
df = pd.DataFrame(columns=['a','Unnamed: 1', 'Unnamed: 1','foo'])
df
Out[2]:
Empty DataFrame
Columns: [a, Unnamed: 1, Unnamed: 1, foo]
Index: []
In [4]:
~df.columns.str.contains('Unnamed:')
Out[4]:
array([ True, False, False, True], dtype=bool)
In [5]:
df[df.columns[~df.columns.str.contains('Unnamed:')]]
Out[5]:
Empty DataFrame
Columns: [a, foo]
Index: []