- It's typically not required to use seaborn to plot grouped bars; it's just a matter of shaping the dataframe, usually with .pivot or .pivot_table. See How to create a grouped bar plot for more examples.
- Using pandas.DataFrame.plot with a wide dataframe is easier in this case than using a long dataframe with seaborn.barplot, because the column / bar order and the order of the totals coincide.
- This reduces the code from 16 to 8 lines.
- See this answer for adding annotations as a percent of the entire population.
- Tested in python 3.8.11, pandas 1.3.1, and matplotlib 3.4.2
Imports and DataFrame Transformation
import pandas as pd
import matplotlib.pyplot as plt
# transform the sample data from the OP with pivot_table
dfp = all_call.pivot_table(index='Type_of_Caller', columns='with_client_nmbr', values='Call_ID', aggfunc='nunique')
# display(dfp)
with_client_nmbr  False   True
Type_of_Caller
Agency              994   4593
EE                10554  27455
ER                 2748  11296
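The OP's all_call dataframe is not reproduced here, so the following is only a minimal sketch of the reshaping step. It assumes all_call is a long dataframe with one row per call record; the column names come from the code above, but the rows are made up for illustration.

```python
import pandas as pd

# hypothetical stand-in for all_call: one row per call record (values are made up)
all_call = pd.DataFrame({
    'Call_ID': [1, 2, 2, 3, 4, 5, 6, 7],
    'Type_of_Caller': ['Agency', 'Agency', 'Agency', 'EE', 'EE', 'ER', 'ER', 'ER'],
    'with_client_nmbr': [False, True, True, False, True, False, True, True],
})

# long -> wide: count the distinct Call_ID values per caller type and flag
dfp = all_call.pivot_table(index='Type_of_Caller', columns='with_client_nmbr',
                           values='Call_ID', aggfunc='nunique')
print(dfp)
```

Because aggfunc='nunique' counts distinct values, the repeated Call_ID 2 contributes only once to the Agency / True cell.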
Use matplotlib.pyplot.bar_label
- Requires matplotlib >= 3.4.2
- Each column is plotted in order, and the pandas.Series created by dfp.sum() has the same order as the dataframe columns. Therefore, zip totals to the plot containers and use the value, tot, in labels to calculate the percentage by hue group.
- Add custom annotations based on percent by hue group by using the labels parameter.
- (v.get_height()/tot)*100 in the list comprehension calculates the percentage.
- See this answer for other options using .bar_label
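As a quick check of the arithmetic described in the list above, the same percentages can be computed without plotting. This is a sketch that re-enters the pivoted counts by hand so it stands alone; pct is a name introduced here for illustration.

```python
import pandas as pd

# the pivoted counts from the table above, typed in so the sketch is self-contained
dfp = pd.DataFrame([[994, 4593], [10554, 27455], [2748, 11296]],
                   index=['Agency', 'EE', 'ER'], columns=[False, True])

# one total per column, in the same order as the columns
totals = dfp.sum()

# same arithmetic as (v.get_height()/tot)*100, applied to every cell at once
pct = dfp.div(totals, axis=1).mul(100).round(2)
print(pct)
```

Each column of pct should add up to roughly 100, which confirms the percentages are relative to the hue group and not to the whole population.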
# get the total value for the column
totals = dfp.sum()
# plot
p1 = dfp.plot(kind='bar', figsize=(8, 4), rot=0, color=['orangered', 'skyblue'], ylabel='Value of Bar', title="The value and percentage (by hue group)")
# add annotations
for tot, p in zip(totals, p1.containers):
    labels = [f'{(v.get_height()/tot)*100:0.2f}%' for v in p]
    p1.bar_label(p, labels=labels, label_type='edge', fontsize=8, rotation=0, padding=2)
p1.margins(y=0.2)
plt.show()
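For labels as a percent of the entire population instead of the hue group, one possible variation is to divide by a single grand total. This is a sketch under that assumption, not the code from the linked answer; grand_total and ax are names introduced here, and the counts are re-entered by hand.

```python
import pandas as pd
import matplotlib.pyplot as plt

# the pivoted counts from the table above, typed in so the sketch is self-contained
dfp = pd.DataFrame([[994, 4593], [10554, 27455], [2748, 11296]],
                   index=['Agency', 'EE', 'ER'], columns=[False, True])

# a single total over every cell, instead of one total per column
grand_total = dfp.sum().sum()

ax = dfp.plot(kind='bar', figsize=(8, 4), rot=0, color=['orangered', 'skyblue'])

# every container is labeled against the same denominator
for c in ax.containers:
    labels = [f'{(v.get_height()/grand_total)*100:0.2f}%' for v in c]
    ax.bar_label(c, labels=labels, label_type='edge', fontsize=8, padding=2)

ax.margins(y=0.2)
# call plt.show() to display the figure
```

No zip is needed here because the denominator does not depend on the column.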