How to create a pie-chart from pandas DataFrame?
Replace Topic with 'Other' for every row outside the top N using Series.where, aggregate the counts with sum, and then plot the result with Series.plot.pie:
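import matplotlib.pyplot as plt

N = 10
# keep Topic only where Count is among the N largest; everything else becomes 'Other'
df['Topic'] = df['Topic'].where(df['Count'].isin(df['Count'].nlargest(N)), 'Other')
s = df.groupby('Topic')['Count'].sum()
# plot the aggregated Series, not the original DataFrame
pie = s.plot.pie(legend=False)
#https://stackoverflow.com/a/44076433/2901002
# share of the total per topic, scaled to percent
labels = [f'{topic}, {pct:0.1f}%' for topic, pct in zip(s.index, s / s.sum() * 100)]
plt.legend(bbox_to_anchor=(0.85, 1), loc='upper left', labels=labels)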
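To try the steps above end to end, here is a minimal sketch on made-up data. The sample DataFrame and N = 3 are assumptions for illustration only; grouping by a separate `topics` Series (rather than overwriting `df['Topic']`) is a small variation that leaves the original column untouched.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data; the real df comes from the question.
df = pd.DataFrame({
    'Topic': ['a', 'b', 'c', 'd', 'e', 'f'],
    'Count': [40, 30, 20, 5, 3, 2],
})

N = 3
# Relabel everything outside the N largest counts as 'Other' (without modifying df).
topics = df['Topic'].where(df['Count'].isin(df['Count'].nlargest(N)), 'Other')
# Sum counts per label and put the biggest slice first.
s = df.groupby(topics)['Count'].sum().sort_values(ascending=False)

ax = s.plot.pie(legend=False)
# Legend entries of the form "topic, percent of total".
labels = [f'{topic}, {pct:0.1f}%' for topic, pct in zip(s.index, s / s.sum() * 100)]
ax.legend(bbox_to_anchor=(0.85, 1), loc='upper left', labels=labels)
```

Note that `isin(nlargest(N))` matches on values, so rows that tie with one of the top N counts are kept as well.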
You need to build a new DataFrame. This assumes your counts are sorted in descending order (if not, first run df.sort_values(by='Count', ascending=False, inplace=True)):
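import pandas as pd

TOP = 10
df2 = df.iloc[:TOP]
# DataFrame.append was removed in pandas 2.0; build the 'Other' row and use pd.concat
other = pd.DataFrame({'Topic': ['Other'], 'Count': [df['Count'].iloc[TOP:].sum()]})
df2 = pd.concat([df2, other], ignore_index=True)
df2.set_index('Topic').plot.pie(y='Count', legend=False)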
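The same idea can be wrapped in a small helper that also takes care of the sorting. This is only a sketch: the function name `top_n_plus_other`, its parameters, and the sample data are hypothetical and not part of pandas or the original answer.

```python
import pandas as pd

def top_n_plus_other(df, n, label_col='Topic', value_col='Count'):
    """Return the n largest rows plus one 'Other' row holding the sum of the rest."""
    ordered = df.sort_values(by=value_col, ascending=False)
    top = ordered.iloc[:n]
    if len(ordered) <= n:
        # Nothing left over, so no 'Other' row is needed.
        return top.reset_index(drop=True)
    other = pd.DataFrame({label_col: ['Other'],
                          value_col: [ordered[value_col].iloc[n:].sum()]})
    return pd.concat([top, other], ignore_index=True)

# Hypothetical unsorted sample data.
df = pd.DataFrame({'Topic': list('abcdef'), 'Count': [3, 40, 5, 30, 2, 20]})
df2 = top_n_plus_other(df, 3)
df2.set_index('Topic').plot.pie(y='Count', legend=False)
```

Because the helper sorts a copy, the caller's DataFrame keeps its original row order.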
Example (N=10, N=5):
Percentages in the legend:
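import pandas as pd
import matplotlib.pyplot as plt

N = 5
df2 = df.iloc[:N]
other = pd.DataFrame({'Topic': ['Other'], 'Count': [df['Count'].iloc[N:].sum()]})
df2 = pd.concat([df2, other], ignore_index=True)
df2.set_index('Topic').plot.pie(y='Count', legend=False)
# the legend shows the Count values as-is (percentages only if Count already holds them)
leg = plt.legend(labels=df2['Count'])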
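If Count holds raw counts rather than percentages, the legend labels can be derived from each row's share of the total. A minimal sketch, assuming an already reduced frame like `df2`; the helper name `percent_labels` and the sample values are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

def percent_labels(frame, label_col='Topic', value_col='Count'):
    """Build 'label (xx.x%)' strings from each row's share of the total."""
    pct = frame[value_col] / frame[value_col].sum() * 100
    return [f'{label} ({p:.1f}%)' for label, p in zip(frame[label_col], pct)]

# Hypothetical reduced data: top 3 topics plus 'Other'.
df2 = pd.DataFrame({'Topic': ['b', 'd', 'f', 'Other'], 'Count': [45, 30, 15, 10]})
labels = percent_labels(df2)

ax = df2.set_index('Topic').plot.pie(y='Count', legend=False)
ax.legend(labels=labels, bbox_to_anchor=(1, 1), loc='upper left')
```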