cumsum by group [duplicate]
library(data.table)
data <- data.table(group1=c('A','A','A','B','B'),sum=c(1,2,4,3,7))
data[,list(cumsum = cumsum(sum)),by=list(group1)]
In addition to using data.table, tapply in base R works fine for both of these cases:
dta <- read.table(text="
group1 group2 num
A sg 1
A sh 2
A sg 4
B at 3
B al 7", header=TRUE)
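# tapply() returns one vector per group, in level order (A, then B), so
# concatenating them only lines up with rows already sorted by group1
dta$cumsum <- do.call(c, tapply(dta$num, dta$group1, FUN=cumsum))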
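The tapply approach depends on the rows already being sorted by group. As a minimal sketch of an alternative (not part of the original answer), base R's ave() computes a per-group result and puts it back in the original row positions. The vectors `grp` and `num` below are hypothetical, deliberately unsorted example data:

```r
# Hypothetical example data, deliberately not sorted by group
grp <- c("B", "A", "B", "A", "A")
num <- c(3, 1, 7, 2, 4)

# ave() applies FUN within each group and returns the results
# in the original row positions
by_ave <- ave(num, grp, FUN = cumsum)

# tapply() returns one vector per group in level order (A, then B),
# so the concatenated result follows group order, not row order
by_tapply <- unname(do.call(c, tapply(num, grp, FUN = cumsum)))

by_ave     # 3 1 10 3 7  (row order)
by_tapply  # 1 3 7 3 10  (group order)
```

On data that is already sorted by group, as dta is above, the two give the same values.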
Calculating the cumulative sum by two groups requires sorting the rows by both grouping columns first, so that the row order matches the group order tapply returns:
dta <- dta[order(dta$group1, dta$group2, dta$num),]
dta$cumsum2 <- do.call(c, tapply(dta$num,
                                 paste0(dta$group1, dta$group2),
                                 FUN=cumsum))
dta
  group1 group2 num cumsum cumsum2
1      A     sg   1      1       1
3      A     sg   4      7       5
2      A     sh   2      3       2
5      B     al   7     10       7
4      B     at   3      3       3
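If sorting the rows is not wanted, one option (an alternative sketch, not the method used above) is to pass both grouping columns to base R's ave(), which groups by their combination and keeps the rows where they are. The data frame is rebuilt here so the example runs on its own:

```r
dta <- read.table(text = "
group1 group2 num
A sg 1
A sh 2
A sg 4
B at 3
B al 7", header = TRUE)

# Each extra argument before FUN is another grouping variable;
# the result stays aligned with the original row order
dta$cumsum2 <- ave(dta$num, dta$group1, dta$group2, FUN = cumsum)

dta$cumsum2  # 1 2 5 3 7
```

These are the same values as the cumsum2 column above, listed in the original row order 1 to 5.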
And if you need the original order back, sort on the row names, which still hold the original row numbers:
dta[order(as.numeric(rownames(dta))),]
  group1 group2 num cumsum cumsum2
1      A     sg   1      1       1
2      A     sh   2      3       2
3      A     sg   4      7       5
4      B     at   3      3       3
5      B     al   7     10       7
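A note on restoring the order: in this data the sort only swapped rows 2/3 and 4/5, so the permutation is its own inverse. A short sketch with a hypothetical three-row data frame (`df`, `key` and `val` are made-up names) suggests that, in general, sorting on the numeric row names undoes a reorder, while indexing by them directly does not:

```r
# Hypothetical three-row frame; row names default to "1", "2", "3"
df <- data.frame(key = c("c", "a", "b"), val = c(30, 10, 20),
                 stringsAsFactors = FALSE)

# Sorting by key puts the rows in the order 2, 3, 1 (a cyclic shift,
# not a simple swap)
sorted <- df[order(df$key), ]

# Sorting on the numeric row names restores the original order
restored <- sorted[order(as.numeric(rownames(sorted))), ]

# Indexing by the row names instead applies the permutation a second time
reapplied <- sorted[as.numeric(rownames(sorted)), ]

restored$key   # "c" "a" "b"  (original order)
reapplied$key  # "b" "c" "a"  (not the original order)
```

This relies on the row names still being the default row numbers; if the data frame has custom row names, storing an explicit index column before sorting is the safer route.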