pandas crashes on repeated DataFrame.reset_index()
Very weird bug here: I'm using pandas to merge several dataframes. As part of the merge, I have to call reset_index several times. But when I do, it crashes unexpectedly on the second or third use of reset_index.
Here's a minimal example that reproduces the error:
import pandas
A = pandas.DataFrame({
'val' : ['aaaaa', 'acaca', 'ddddd', 'zzzzz'],
'extra' : range(10,14),
})
A = A.reset_index()
A = A.reset_index()
A = A.reset_index()
Here's the relevant part of the traceback:
....
A = A.reset_index()
File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2393, in reset_index
new_obj.insert(0, name, _maybe_cast(self.index.values))
File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1787, in insert
self._data.insert(loc, column, value)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 893, in insert
raise Exception('cannot insert %s, already exists' % item)
Exception: cannot insert level_0, already exists
Any idea what's going wrong here? How do I work around it?
Solution 1:
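Inspecting frame.py, it looks like reset_index inserts the old index as a new column. It names that column 'index', or 'level_0' if 'index' is already taken. If 'level_0' is taken as well, the insert fails with the error above. That's why the first call works (it adds 'index'), the second works (it adds 'level_0'), and the third crashes.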
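Fortunately, there's a "drop" option. With drop=True, the old index is discarded instead of being inserted as a column, so nothing named 'index' or 'level_0' gets added and there is no name to collide with. The catch is that the old index values are lost, which is fine when the index is just the default 0..n-1 range, as it is here. Existing columns, including one that happens to be named "index", are left alone.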
"Fixed" code:
import pandas
A = pandas.DataFrame({
'val' : ['aaaaa', 'acaca', 'ddddd', 'zzzzz'],
'extra' : range(10,14),
})
A = A.reset_index(drop=True)
A = A.reset_index(drop=True)
A = A.reset_index(drop=True)
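A quick way to check which column names reset_index picks is to look at the columns after each call. This is only a sketch against the example frame from the question; the variable names (first, second, third_failed, and so on) are mine, and the exact exception type depends on the pandas version, so the except clause is deliberately broad.

```python
import pandas

A = pandas.DataFrame({
    'val': ['aaaaa', 'acaca', 'ddddd', 'zzzzz'],
    'extra': list(range(10, 14)),
})

# First call: the old index becomes a column named 'index'.
first = A.reset_index()
cols_first = list(first.columns)

# Second call: 'index' is taken, so the new column is named 'level_0'.
second = first.reset_index()
cols_second = list(second.columns)

# Third call: 'level_0' is taken too, so the insert fails.
try:
    second.reset_index()
    third_failed = False
except Exception as exc:  # broad on purpose: the type varies by pandas version
    third_failed = True
    message = str(exc)

# With drop=True no column is inserted, so repeated calls are harmless.
dropped = A.reset_index(drop=True).reset_index(drop=True).reset_index(drop=True)
cols_dropped = list(dropped.columns)

print(cols_first)
print(cols_second)
print(third_failed)
print(cols_dropped)
```

If you actually need to keep the old index as a column, renaming it after each reset (for example with DataFrame.rename) avoids the collision as well.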
Solution 2:
You can also reset the index in place, which avoids the reassignment:
A.reset_index(drop=True, inplace=True)
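One thing to watch with inplace=True: the call modifies A and returns None, so don't assign the result back to A. A small sketch, using a made-up two-row frame with a non-default index (the frame and variable names are only for illustration):

```python
import pandas

# Hypothetical frame with a non-default index, for illustration only.
A = pandas.DataFrame({'val': ['aaaaa', 'acaca']}, index=[5, 7])

# inplace=True mutates A and returns None.
result = A.reset_index(drop=True, inplace=True)

index_after = list(A.index)      # fresh 0..n-1 index
columns_after = list(A.columns)  # no 'index' column was added

print(result)
print(index_after)
print(columns_after)
```

Writing A = A.reset_index(drop=True, inplace=True) would leave A set to None.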