Summarizing by subgroup percentage in R

Solution 1:

Per your comment, if each subgroup appears only once within a group, you can divide each value by its group total:

library(dplyr)
group_by(df, group) %>% mutate(percent = value/sum(value))
#   group subgroup value   percent
# 1     A        a     1 0.1250000
# 2     A        b     4 0.5000000
# 3     A        c     2 0.2500000
# 4     A        d     1 0.1250000
# 5     B        a     1 0.1666667
# 6     B        b     2 0.3333333
# 7     B        c     3 0.5000000
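
To try this yourself, the input can be reconstructed from the printed rows above. This is only a sketch: the original df is not shown, so the column types (character and numeric) are assumptions, and totals is an illustrative name.

```r
# Sample data reconstructed from the printed output above
# (column types are assumed; the original df is not shown)
df <- data.frame(
  group    = c("A", "A", "A", "A", "B", "B", "B"),
  subgroup = c("a", "b", "c", "d", "a", "b", "c"),
  value    = c(1, 4, 2, 1, 1, 2, 3)
)

# Group totals, i.e. the denominators used for each group
totals <- tapply(df$value, df$group, sum)
totals  # A = 8, B = 6
```

So the first row of group A is 1/8 = 0.125 and the first row of group B is 1/6 = 0.1666667, matching the percent column.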

Or, to drop the value column and add the percent column in one step, use transmute:

group_by(df, group) %>% transmute(subgroup, percent = value/sum(value))
#   group subgroup   percent
# 1     A        a 0.1250000
# 2     A        b 0.5000000
# 3     A        c 0.2500000
# 4     A        d 0.1250000
# 5     B        a 0.1666667
# 6     B        b 0.3333333
# 7     B        c 0.5000000
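
If a subgroup could appear more than once within a group, one possible approach (not part of the original answer) is to sum the values per group and subgroup first, then divide by the group total. The data frame df_dup and the name result below are hypothetical, made up only to illustrate the idea.

```r
library(dplyr)

# Hypothetical data in which subgroup "a" appears twice within group A
df_dup <- data.frame(
  group    = c("A", "A", "A", "B", "B"),
  subgroup = c("a", "a", "b", "a", "b"),
  value    = c(1, 3, 4, 2, 6)
)

result <- df_dup %>%
  group_by(group, subgroup) %>%
  # collapse duplicates; "drop_last" keeps the result grouped by group
  summarise(value = sum(value), .groups = "drop_last") %>%
  # share of each subgroup within its group
  mutate(percent = value / sum(value)) %>%
  ungroup()

result
```

Here the two "a" rows of group A are combined into a single row with value 4 before the shares are computed.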

Solution 2:

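We can use prop.table, which divides each element of a vector by the vector's sum, to compute each value's share within its group. Note that the result is a proportion between 0 and 1; multiply by 100 if you need an actual percentage.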
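
As a quick illustration on a plain vector, here is what prop.table does with the values of group A. The names x and p are just for this sketch.

```r
x <- c(1, 4, 2, 1)   # the values of group A

# For a vector, prop.table(x) gives the same result as x / sum(x)
p <- prop.table(x)
p                    # 0.125 0.500 0.250 0.125
all.equal(p, x / sum(x))

# Scale to percentages if that is what you need
round(100 * p, 1)    # 12.5 50.0 25.0 12.5
```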

Base R:

transform(df, percent = ave(value, group, FUN = prop.table))

#  group subgroup value percent
#1     A        a     1   0.125
#2     A        b     4   0.500
#3     A        c     2   0.250
#4     A        d     1   0.125
#5     B        a     1   0.167
#6     B        b     2   0.333
#7     B        c     3   0.500

dplyr:

library(dplyr)
df %>% group_by(group) %>% mutate(percent = prop.table(value))

data.table:

library(data.table)
# setDT() converts df to a data.table by reference; := adds the column in place
setDT(df)[, percent := prop.table(value), by = group]
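
Whichever variant you use, a simple sanity check is that the shares within each group add up to 1. A minimal sketch with the base R version, assuming df looks like the sample data printed above (out and group_sums are illustrative names):

```r
# Sample data reconstructed from the printed output (types are assumed)
df <- data.frame(
  group    = c("A", "A", "A", "A", "B", "B", "B"),
  subgroup = c("a", "b", "c", "d", "a", "b", "c"),
  value    = c(1, 4, 2, 1, 1, 2, 3)
)

out <- transform(df, percent = ave(value, group, FUN = prop.table))

# Sum of the shares per group; each should be 1 (up to floating point error)
group_sums <- tapply(out$percent, out$group, sum)
group_sums
```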