Summarizing by subgroup percentage in R
Solution 1:
Per your comment, if the subgroups are unique you can do
library(dplyr)
group_by(df, group) %>% mutate(percent = value/sum(value))
# group subgroup value percent
# 1 A a 1 0.1250000
# 2 A b 4 0.5000000
# 3 A c 2 0.2500000
# 4 A d 1 0.1250000
# 5 B a 1 0.1666667
# 6 B b 2 0.3333333
# 7 B c 3 0.5000000
Or, to remove the value column and add the percent column at the same time, use transmute:
group_by(df, group) %>% transmute(subgroup, percent = value/sum(value))
# group subgroup percent
# 1 A a 0.1250000
# 2 A b 0.5000000
# 3 A c 0.2500000
# 4 A d 0.1250000
# 5 B a 0.1666667
# 6 B b 0.3333333
# 7 B c 0.5000000
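If the subgroups are not unique within a group (the case set aside above), one option is to total the values per subgroup first and then compute the share. This is only a minimal sketch: `df2` is a hypothetical data frame, not the one from the question, in which subgroup "a" appears twice in group "A".

```r
library(dplyr)

# Hypothetical data (an assumption): subgroup "a" occurs twice within group "A"
df2 <- data.frame(
  group    = c("A", "A", "A", "B", "B"),
  subgroup = c("a", "a", "b", "a", "b"),
  value    = c(1, 3, 4, 2, 6)
)

res <- df2 %>%
  group_by(group, subgroup) %>%
  summarise(value = sum(value), .groups = "drop") %>%  # one row per group/subgroup pair
  group_by(group) %>%
  mutate(percent = value / sum(value))                 # share within each group

res
```

The `.groups` argument assumes dplyr 1.0.0 or later. After the `summarise()` step each subgroup occurs once per group, so the `mutate()` call is the same one used above.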
Solution 2:
We can use prop.table to calculate the percentage/ratio.
Base R:
transform(df, percent = ave(value, group, FUN = prop.table))
# group subgroup value percent
#1 A a 1 0.125
#2 A b 4 0.500
#3 A c 2 0.250
#4 A d 1 0.125
#5 B a 1 0.167
#6 B b 2 0.333
#7 B c 3 0.500
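To see why this works: for a plain numeric vector, `prop.table(x)` gives the same result as `x / sum(x)`, and `ave()` applies the function within each level of `group` while returning a vector aligned with the original rows. A small sketch, assuming `df` holds the values shown in the output above (the original data frame is not given, so it is reconstructed here):

```r
# df reconstructed from the printed output (an assumption about the original data)
df <- data.frame(
  group    = c("A", "A", "A", "A", "B", "B", "B"),
  subgroup = c("a", "b", "c", "d", "a", "b", "c"),
  value    = c(1, 4, 2, 1, 1, 2, 3)
)

# On a plain vector, prop.table() divides each element by the overall total
same <- all.equal(prop.table(df$value), df$value / sum(df$value))

# ave() applies FUN per group and keeps the original row order
percent <- ave(df$value, df$group, FUN = prop.table)

# The shares within each group add up to 1
group_totals <- tapply(percent, df$group, sum)
```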
dplyr:
library(dplyr)
df %>% group_by(group) %>% mutate(percent = prop.table(value))
data.table:
library(data.table)
setDT(df)[, percent := prop.table(value), group]
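Note that `setDT()` converts `df` by reference and `:=` adds the column in place, so `df` itself changes. If you would rather leave the original data frame as it is, a possible variant is to work on a separate data.table. A sketch, again assuming `df` matches the printed output above; `dt` is just an illustrative name:

```r
library(data.table)

# df reconstructed from the printed output (an assumption about the original data)
df <- data.frame(
  group    = c("A", "A", "A", "A", "B", "B", "B"),
  subgroup = c("a", "b", "c", "d", "a", "b", "c"),
  value    = c(1, 4, 2, 1, 1, 2, 3)
)

dt <- as.data.table(df)                          # new object; df stays a plain data.frame
dt[, percent := prop.table(value), by = group]   # add the per-group share by reference to dt
dt[]                                             # print the result
```

Here `by = group` is the named form of the third argument used in the one-liner above.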