Pandas pivot warning about repeated entries on index

The pandas documentation for the pivot method gives this example:

Examples
--------
>>> df
    foo   bar  baz
0   one   A    1.
1   one   B    2.
2   one   C    3.
3   two   A    4.
4   two   B    5.
5   two   C    6.

>>> df.pivot('foo', 'bar', 'baz')
     A   B   C
one  1   2   3
two  4   5   6

My DataFrame is structured like this:

   name   id     x
----------------------
0  john   1      0
1  john   2      0
2  mike   1      1
3  mike   2      0

And I want something like this:

      1    2   # (this is the id as columns)
----------------------
mike  0    0   # (and this is the 'x' as values)
john  1    0

But when I run the pivot method, it is saying:

*** ReshapeError: Index contains duplicate entries, cannot reshape

This doesn't make sense to me: even in the documentation example there are repeated entries in the foo column. I'm using the name column as the index of the pivot, i.e. the first argument of the pivot call.

As far as I can tell with updates to pandas, you have to use pivot_table() instead of pivot().

pandas.pivot_table(df,values='count',index='site_id',columns='week')
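To illustrate why this helps, here is a minimal sketch on made-up data (the site_id, week and count values below are invented for the example). pivot_table aggregates rows that share the same index/column pair instead of raising, and since the default aggregation is the mean, it is probably worth passing aggfunc explicitly:

```python
import pandas as pd

# Invented data: (site_id=1, week=1) appears twice, which pivot() would reject.
df = pd.DataFrame({
    "site_id": [1, 1, 1, 2],
    "week": [1, 1, 2, 1],
    "count": [10, 5, 7, 3],
})

# pivot_table combines the duplicate rows; here they are summed.
table = pd.pivot_table(df, values="count", index="site_id",
                       columns="week", aggfunc="sum")
print(table)
```

Pairs with no data at all (site 2, week 2 here) come out as NaN.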

Try this,

# drop_duplicates removes rows that repeat an earlier ('foo', 'bar') pair
df = df.drop_duplicates(['foo','bar'])
# keyword arguments: pandas 2.0+ no longer accepts these positionally
df.pivot(index='foo', columns='bar', values='baz')
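One caveat: drop_duplicates keeps the first row of each duplicate group by default and silently throws the others away, so this is only safe when the duplicates really are redundant. A small sketch with invented values, where ('one', 'A') occurs twice with different baz:

```python
import pandas as pd

# Invented data: ('one', 'A') appears twice with different baz values.
df = pd.DataFrame({
    "foo": ["one", "one", "one", "two"],
    "bar": ["A", "A", "B", "A"],
    "baz": [1.0, 9.0, 2.0, 4.0],
})

# keep="first" is the default, so the row with baz == 9.0 is discarded.
deduped = df.drop_duplicates(["foo", "bar"])
wide = deduped.pivot(index="foo", columns="bar", values="baz")
print(wide)
```

If the discarded values matter, an aggregation via pivot_table may be a better fit than dropping rows.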

Works fine for me? Can you post the exact pivot method call you're using?

In [4]: df.pivot('name', 'id', 'x')
Out[4]: 
id    1  2
name      
john  0  0
mike  1  0

Friends, I've had the same problem. In my case the problem was in the data: my 'information' column contained only one unique value, and that caused the error.

UPD: for pivot to work correctly, the (id_user, information) pairs must not contain duplicates.

This works:

df2 = pd.DataFrame({'id_user':[1,2,3,4,4,5,5], 
'information':['phon','phon','phone','phone1','phone','phone1','phone'], 
'value': [1, '01.01.00', '01.02.00', 2, '01.03.00', 3, '01.04.00']})
df2.pivot(index='id_user', columns='information', values='value')

This doesn't work:

df2 = pd.DataFrame({'id_user':[1,2,3,4,4,5,5], 
'information':['phone','phone','phone','phone','phone','phone','phone'], 
'value': [1, '01.01.00', '01.02.00', 2, '01.03.00', 3, '01.04.00']})
df2.pivot(index='id_user', columns='information', values='value')
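A quick way to see which rows are responsible, sketched on the failing frame above: duplicated(..., keep=False) flags every row whose (id_user, information) pair occurs more than once. The try/except is only there to show the failure; recent pandas versions raise a plain ValueError rather than ReshapeError.

```python
import pandas as pd

df2 = pd.DataFrame({
    "id_user": [1, 2, 3, 4, 4, 5, 5],
    "information": ["phone"] * 7,
    "value": [1, "01.01.00", "01.02.00", 2, "01.03.00", 3, "01.04.00"],
})

# Every row whose (id_user, information) pair is not unique.
dupes = df2[df2.duplicated(["id_user", "information"], keep=False)]
print(dupes)

try:
    df2.pivot(index="id_user", columns="information", values="value")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print(exc)
```

Here the offending rows are the two entries each for id_user 4 and 5.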

My data has no duplicated pivot pairs and still pivot_table throws a key error :( drop_duplicates() truncates my data to the first value of the pair.

Input:

      Well  Reading  Filter 4
0       A2        1    116041
1       B2        1    105191
2       C2        1     93942
3       D2        1     96821
4       E2        1     85622
5       F2        1     90227
6       G2        1     95801
7       H2        1    107833
8       A2        2    115765
9       B2        2    104395
10      C2        2     93986
...
1630    G2      204    388682
1631    H2      204    444708

1632 rows × 3 columns

df_X2.pivot_table('Reading', 'Well', 'Filter 4')

throws: KeyError: 'Reading'

df_X2_uniq=df_X2.drop_duplicates(['Well', 'Reading']) truncates the data to the first 8 rows:

   Well  Reading  Filter 4
0    A2        1    116041
1    B2        1    105191
2    C2        1     93942
3    D2        1     96821
4    E2        1     85622
5    F2        1     90227
6    G2        1     95801
7    H2        1    107833

After 2 hours of combing through the posts I'm none the wiser... any hints of what I should try to get the pivot to work?
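Two things that might be worth checking, sketched on a hypothetical stand-in for df_X2. First, the positional order of pivot_table is (values, index, columns), so the call above asks for 'Reading' as the values and 'Filter 4' as the columns, which may be the reverse of what was intended. Second, a KeyError on a label that looks present often means the real column name differs slightly. The leading space in " Reading" below is purely an assumption to reproduce that situation, not something known about the actual data.

```python
import pandas as pd

# Hypothetical stand-in for df_X2; the leading space in " Reading" is an
# assumption made only to illustrate a mismatched label.
df_X2 = pd.DataFrame({
    "Well": ["A2", "B2", "A2", "B2"],
    " Reading": [1, 1, 2, 2],
    "Filter 4": [116041, 105191, 115765, 104395],
})

# Inspect the exact labels, including any stray whitespace.
print(df_X2.columns.tolist())

# Normalise the labels, then pass the arguments by keyword.
df_X2.columns = df_X2.columns.str.strip()
wide = df_X2.pivot_table(values="Filter 4", index="Well", columns="Reading")
print(wide)
```

Passing values, index and columns by keyword avoids having to remember the positional order.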
