Better way to plot Gender count using Python

I am making a graph to plot Gender count for the time series data that look like following data. Each row represent hourly data of each respective patient.

HR	SBP	DBP	Sepsis	Gender	P_ID
92	120	80	0	0	0
98	115	85	0	0	0
93	125	75	1	1	1
95	130	90	1	1	1
102	120	80	0	0	2
109	115	75	0	0	2
94	135	100	0	0	2
97	100	70	1	1	3
85	120	80	1	1	3
88	115	75	1	1	3
93	125	85	1	1	3
78	130	90	1	0	4
115	140	110	1	0	4
102	120	80	0	1	5
98	140	110	0	1	5

This is my code:

gender = df_n['Gender'].value_counts()
plt.figure(figsize=(7, 6))
ax = gender.plot(kind='bar', rot=0, color="c")
ax.set_title("Bar Graph of Gender", y = 1)
ax.set_xlabel('Gender')
ax.set_ylabel('Number of People')
ax.set_xticklabels(('Male', 'Female'))

for rect in ax.patches:
    y_value = rect.get_height()
    x_value = rect.get_x() + rect.get_width() / 2
    space = 1
    label = format(y_value)
    ax.annotate(label, (x_value, y_value), xytext=(0, space), textcoords="offset points", ha='center', va='bottom')    
plt.show()

Graph

Now what is happening is the code is calculating total number of instances (0: Male, 1: Female) and plotting it. But I want to plot the total males and females, not the total number of 0s and 1s, as the Same patient is having multiple rows of data (as per P_ID). Like how many patients are male and how many are female?

Can someone help me out? I guess maybe sns.countplot can be used. But I don't know how.

Thanks for helping me out >.<

__________ Udpate ________________

How I can group those Genders that are sepsis (1) or no sepsis (0)?

__________ Update 2 ___________

So, I got the total actual count of Male and Female, thanks to @Shaido. male female count

In the whole dataset, there are only 2932 septic patients. Rest are non-septic. This is what I got from @JohanC answer.

graph

Now, the problem is that as there are only 2932 septic patients, by looking at the graph, it is assumed that only 426 (251 Male) and (175 Female) are septic patients (out of 2932), rest are non-septic. But this is not true. Please help. Thanks.

Solution 1:

I have a working example for selecting the unique IDS, it looks ugly so there is probably a better way, but it works...

import pandas as pd
# example of data:
data = {'gender': [0, 0, 1, 1, 1, 1, 0, 0], 'id': [1, 1, 2, 2, 3, 3, 4, 4]}
df = pd.DataFrame(data)
# get all unique ids:
ids = set(df.id)
# Go over all id, get first element of gender:
g = [list(df[df['id'] == i]['gender'])[0] for i in ids]

# count genders, laze way using pandas since the rest of the code also assumes a dataframe for plotting:
gender_counts = pd.DataFrame(g).value_counts()
# from here you can use your plot function.

# Or Counter
from collections import Counter
gender_counts = Counter(g)
# You have to create another method for plotting the gender.

Better way to plot Gender count using Python

Solution 1:

Related

Recent Posts

HR	SBP	DBP	Sepsis	Gender	P_ID
92	120	80	0	0	0
98	115	85	0	0	0
93	125	75	1	1	1
95	130	90	1	1	1
102	120	80	0	0	2
109	115	75	0	0	2
94	135	100	0	0	2
97	100	70	1	1	3
85	120	80	1	1	3
88	115	75	1	1	3
93	125	85	1	1	3
78	130	90	1	0	4
115	140	110	1	0	4
102	120	80	0	1	5
98	140	110	0	1	5

HR	SBP	DBP	Sepsis	Gender	P_ID
92	120	80	0	0	0
98	115	85	0	0	0
93	125	75	1	1	1
95	130	90	1	1	1
102	120	80	0	0	2
109	115	75	0	0	2
94	135	100	0	0	2
97	100	70	1	1	3
85	120	80	1	1	3
88	115	75	1	1	3
93	125	85	1	1	3
78	130	90	1	0	4
115	140	110	1	0	4
102	120	80	0	1	5
98	140	110	0	1	5

HR	SBP	DBP	Sepsis	Gender	P_ID
92	120	80	0	0	0
98	115	85	0	0	0
93	125	75	1	1	1
95	130	90	1	1	1
102	120	80	0	0	2
109	115	75	0	0	2
94	135	100	0	0	2
97	100	70	1	1	3
85	120	80	1	1	3
88	115	75	1	1	3
93	125	85	1	1	3
78	130	90	1	0	4
115	140	110	1	0	4
102	120	80	0	1	5
98	140	110	0	1	5