How do you group pandas dataframe rows based on permutation of booleans?

Imagine there is a pandas dataframe with five columns and n rows. Each column holds a boolean value.

With five boolean columns there are 2^5 = 32 possible combinations of values per row.

How do I group them by the permutation of boolean values associated with each row so I can get a count on each group or return other properties?

For example, how do I find out how many rows are TTTTT or TTTTF, or whatever permutation I'm interested in?

There are a couple of ways of doing this. One way is to group by all the columns you care about at once. To get the number of rows in each group, call the GroupBy.size method on the result (GroupBy.count only counts non-null values in the columns that are not grouping keys, so it returns nothing useful when every column is a key):

df.groupby(['c1', 'c2', 'c3', 'c4', 'c5']).size()

Or more simply, if all the columns are of interest:

df.groupby(list(df.columns)).size()
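As a minimal sketch of the group-by approach, the example below builds a small made-up frame (the data and the column names c1 to c5 are assumptions for illustration), counts rows per combination with GroupBy.size, and then fills in zeros for the combinations that never occur:

```python
import itertools

import pandas as pd

cols = ['c1', 'c2', 'c3', 'c4', 'c5']

# Hypothetical sample data: four rows, three distinct combinations.
df = pd.DataFrame(
    [
        [True, True, True, True, True],
        [True, True, True, True, False],
        [True, True, True, True, True],
        [False, False, True, False, True],
    ],
    columns=cols,
)

# One entry per combination that actually occurs in the data.
counts = df.groupby(cols).size()

# Plain dict keyed by tuples of booleans, convenient for single lookups.
counts_by_combo = counts.to_dict()

# groupby only reports observed combinations; add the missing ones as 0.
all_counts = {
    combo: counts_by_combo.get(combo, 0)
    for combo in itertools.product([False, True], repeat=len(cols))
}

print(counts_by_combo[(True, True, True, True, True)])
print(len(all_counts))
```

Looking up a single permutation is then a dictionary access on a tuple such as (True, True, True, True, False).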

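You could also convert each row of booleans to a number, treating the columns as bits, and group on that. Note that the sum has to run across the columns (axis=1) to give one number per row:

df['Num'] = (df.to_numpy() << [4, 3, 2, 1, 0]).sum(axis=1)
df.groupby('Num').size()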
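To make the bit encoding concrete, here is a hedged sketch on made-up data. It uses an explicit matrix product with powers of two instead of the shift, which is an equivalent formulation rather than the code above, and the helper code_to_label is a hypothetical name introduced only for this example:

```python
import numpy as np
import pandas as pd

cols = ['c1', 'c2', 'c3', 'c4', 'c5']

# Hypothetical sample data.
df = pd.DataFrame(
    [
        [True, True, True, True, True],
        [True, True, True, True, False],
        [True, True, True, True, True],
        [False, False, True, False, True],
    ],
    columns=cols,
)

# Bit weights, with c1 as the most significant bit: [16, 8, 4, 2, 1].
weights = 1 << np.arange(len(cols) - 1, -1, -1)

# One integer per row: the row's booleans read as a binary number.
codes = df[cols].to_numpy().astype(int) @ weights

counts = pd.Series(codes).value_counts()


def code_to_label(code, width=len(cols)):
    """Turn an integer code back into a string such as 'TTTTF'."""
    bits = format(int(code), '0{}b'.format(width))
    return bits.replace('1', 'T').replace('0', 'F')


labelled = {code_to_label(code): int(n) for code, n in counts.items()}
print(labelled)
```

Each code is an integer between 0 and 31, so every possible row maps to exactly one group key.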

A more general solution that does not require creating a new column could use value_counts. Here the first column is the lowest bit, which works just as well as a grouping key:

names = ['c1', 'c2', 'c3', 'c4', 'c5']
pd.Series((df[names].to_numpy() << np.arange(len(names))).sum(axis=1)).value_counts()

Which you can rewrite more compactly as

pd.Series.value_counts((df[names].to_numpy() << np.arange(len(names))).sum(axis=1))
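The question also asks about returning other properties for each group. A rough sketch of that, assuming the frame carries an extra numeric column (called value here purely for illustration), is to group on the boolean columns and aggregate the extra column:

```python
import pandas as pd

flags = ['c1', 'c2', 'c3', 'c4', 'c5']

# Hypothetical sample data with one extra, non-boolean column.
df = pd.DataFrame(
    [
        [True, True, True, True, True, 10.0],
        [True, True, True, True, False, 20.0],
        [True, True, True, True, True, 30.0],
        [False, False, True, False, True, 40.0],
    ],
    columns=flags + ['value'],
)

# Per-combination row count, mean and maximum of the extra column.
summary = df.groupby(flags)['value'].agg(['count', 'mean', 'max'])

# Flatten the grouping keys back into ordinary columns for easy filtering.
flat = summary.reset_index()

# Select the all-True combination with a boolean mask.
all_true = flat[flat[flags].all(axis=1)]
print(all_true)
```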
