pandas crashes on repeated DataFrame.reset_index()

Very weird bug here: I'm using pandas to merge several dataframes. As part of the merge, I have to call reset_index several times, but it crashes unexpectedly on the second or third call.

Here's minimal code to reproduce the error:

import pandas
A = pandas.DataFrame({
    'val' :  ['aaaaa', 'acaca', 'ddddd', 'zzzzz'],
    'extra' : range(10,14),
})
A = A.reset_index()
A = A.reset_index()
A = A.reset_index()

Here's the relevant part of the traceback:

....
    A = A.reset_index()
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2393, in reset_index
    new_obj.insert(0, name, _maybe_cast(self.index.values))
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1787, in insert
    self._data.insert(loc, column, value)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 893, in insert
    raise Exception('cannot insert %s, already exists' % item)
Exception: cannot insert level_0, already exists

Any idea what's going wrong here? How do I work around it?


Solution 1:

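Inspecting frame.py, it looks like reset_index inserts the old index into the frame as a new column. For an unnamed index it names that column 'index', or 'level_0' if 'index' is already taken. If both names are already taken, it throws the error. That's why the first two calls succeed and the third one crashes.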
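To see the naming behavior step by step, here's a small sketch built on the question's own data. The variable names (first, second, cols_first, and so on) are just for illustration, and the exact exception class varies by pandas version, so the sketch catches a generic Exception:

```python
import pandas

A = pandas.DataFrame({
    'val': ['aaaaa', 'acaca', 'ddddd', 'zzzzz'],
    'extra': range(10, 14),
})

# First call: the old index is inserted as a column named 'index'.
first = A.reset_index()
cols_first = list(first.columns)

# Second call: 'index' is already taken, so the column is named 'level_0'.
second = first.reset_index()
cols_second = list(second.columns)

# Third call: both 'index' and 'level_0' are taken, so pandas raises.
try:
    second.reset_index()
    error_message = None
except Exception as exc:
    error_message = str(exc)

print(cols_first)
print(cols_second)
print(error_message)
```

The new column is always inserted at position 0, so it shows up first in each printed column list.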

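Fortunately, there's a "drop" option. With drop=True, the old index is simply discarded instead of being inserted as a column, and the frame gets a fresh 0..n-1 index. No new column is added, so there's no name clash and you can call it as many times as you like. The catch is that you lose the old index values, so only use it if you don't need them. An existing column that happens to be named "index" is left alone.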

"Fixed" code:

import pandas
A = pandas.DataFrame({
    'val' :  ['aaaaa', 'acaca', 'ddddd', 'zzzzz'],
    'extra' : range(10,14),
})
A = A.reset_index(drop=True)
A = A.reset_index(drop=True)
A = A.reset_index(drop=True)
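If you want to convince yourself that drop=True doesn't touch existing columns, here's a quick check. The frame B, its 'index' column, and the 10/20/30/40 index labels are made up for the example:

```python
import pandas

B = pandas.DataFrame({
    'index': ['w', 'x', 'y', 'z'],   # hypothetical column that happens to be called 'index'
    'val': ['aaaaa', 'acaca', 'ddddd', 'zzzzz'],
}, index=[10, 20, 30, 40])

# drop=True throws away the old index labels instead of inserting them.
C = B.reset_index(drop=True)

cols_before = list(B.columns)
cols_after = list(C.columns)
new_index = list(C.index)

print(cols_after)
print(new_index)
```

The columns come out unchanged and only the row labels are renumbered from 0.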

Solution 2:

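You can also reset the index in place. With inplace=True the call modifies A directly and returns None, so don't assign the result back to A: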

A.reset_index(drop=True, inplace=True)
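A short sketch of the in-place form, assuming a frame with a non-default index (the labels 10, 20, 30, 40 are invented here) so the effect is visible:

```python
import pandas

A = pandas.DataFrame({
    'val': ['aaaaa', 'acaca', 'ddddd', 'zzzzz'],
    'extra': range(10, 14),
}, index=[10, 20, 30, 40])

# inplace=True mutates A and returns None.
result = A.reset_index(drop=True, inplace=True)
index_after = list(A.index)

# Calling it again is harmless because no column is ever inserted.
A.reset_index(drop=True, inplace=True)

print(result)
print(index_after)
```

Writing A = A.reset_index(drop=True, inplace=True) would set A to None, which is an easy mistake to make when switching between the two forms.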