Transposing a column in a pandas dataframe while keeping other column intact with duplicates
My data frame is as follows:
selection_id last_traded_price
430494 1.46
430494 1.48
430494 1.56
430494 1.57
430495 2.45
430495 2.67
430495 2.72
430495 2.87
I have lots of rows with repeated selection IDs. I need to keep the selection_id column as it is, but transpose the data in last_traded_price so that each ID has one row, like this:
selection_id last_traded_price
430494 1.46 1.48 1.56 1.57 etc.
430495 2.45 2.67 2.72 2.87 etc.
I've tried to use a pivot:
df.pivot(index='selection_id', columns='last_traded_price', values='last_traded_price')
Pivot isn't working because selection_id contains duplicate values. Is it possible to transpose the data first and drop the duplicates afterwards?
Solution 1:
Option 1: groupby + apply
v = df.groupby('selection_id').last_traded_price.apply(list)
pd.DataFrame(v.tolist(), index=v.index)
                 0     1     2     3
selection_id
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
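For anyone who wants to run Option 1 end to end, here is a minimal sketch. The frame is rebuilt from the sample rows in the question, and the name `wide` is only an illustrative choice, not something from the original answer.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "selection_id": [430494] * 4 + [430495] * 4,
    "last_traded_price": [1.46, 1.48, 1.56, 1.57, 2.45, 2.67, 2.72, 2.87],
})

# Collect each group's prices into a Python list, indexed by selection_id.
v = df.groupby("selection_id")["last_traded_price"].apply(list)

# Expand the lists into columns, one row per selection_id.
wide = pd.DataFrame(v.tolist(), index=v.index)
print(wide)
```

The intermediate `v` is a Series of lists, so this route is convenient when you also want the lists themselves.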
Option 2: pivot
You can do this with pivot, as long as you add a counter column to pivot on. pivot needs something to spread the values across, and every (index, column) pair has to be unique, which selection_id alone cannot provide.
df['Count'] = df.groupby('selection_id').cumcount()
df.pivot(index='selection_id', columns='Count', values='last_traded_price')
Count            0     1     2     3
selection_id
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
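To see why the counter column makes pivot work, it may help to look at what cumcount produces. This is a sketch on the sample data from the question; `wide` is an illustrative name.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "selection_id": [430494] * 4 + [430495] * 4,
    "last_traded_price": [1.46, 1.48, 1.56, 1.57, 2.45, 2.67, 2.72, 2.87],
})

# cumcount numbers the rows inside each selection_id group, starting at 0.
df["Count"] = df.groupby("selection_id").cumcount()
print(df["Count"].tolist())

# Every (selection_id, Count) pair is now unique, so pivot can reshape.
wide = df.pivot(index="selection_id", columns="Count", values="last_traded_price")
print(wide)
```

The counter restarts at 0 for each selection_id, so it doubles as the column label in the reshaped frame.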
Solution 2:
You can use cumcount to build a counter for the new column names, then reshape with set_index + unstack or with pivot:
g = df.groupby('selection_id').cumcount()
df = df.set_index(['selection_id',g])['last_traded_price'].unstack()
print (df)
                 0     1     2     3
selection_id
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
Similar solution with pivot, applied to the original df:
df = (df.assign(Count=df.groupby('selection_id').cumcount())
        .pivot(index='selection_id', columns='Count', values='last_traded_price'))
print(df)
Count            0     1     2     3
selection_id
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
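The question does not say whether every selection_id has the same number of rows. As a sketch under that assumption (the shortened data below is hypothetical, not from the question), the set_index + unstack approach still works when groups differ in length, leaving NaN where a group has no value:

```python
import pandas as pd

# Hypothetical data: 430495 has one row fewer than 430494.
df = pd.DataFrame({
    "selection_id": [430494, 430494, 430494, 430495, 430495],
    "last_traded_price": [1.46, 1.48, 1.56, 2.45, 2.67],
})

# Per-group row counter, used as the second index level.
g = df.groupby("selection_id").cumcount()

# Move the counter level into the columns; missing positions become NaN.
wide = df.set_index(["selection_id", g])["last_traded_price"].unstack()
print(wide)
```

If the gaps are a problem downstream, `wide.fillna(...)` or keeping the data in long form are possible alternatives.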