Group by multiple columns and sum other multiple columns

Solution 1:

The data.table way is :

DT[, lapply(.SD,sum), by=list(col1,col2,col3,...)]

DT[, lapply(.SD,sum), by=colnames(DT)[1:10]]

where .SD is the (S)ubset of (D)ata excluding group columns. (Aside: If you need to refer to group columns generically, they are in .BY.)

Solution 2:

In base R this would be...

aggregate( as.matrix(df[,11:200]), as.list(df[,1:10]), FUN = sum)

EDIT: The aggregate function has come a long way since I wrote this. None of the casting above is necessary.

aggregate( df[,11:200], df[,1:10], FUN = sum )

And there are a variety of ways to write this. Assuming the first 10 columns are named a1 through a10 I like the following, even though it is verbose.

aggregate(. ~ a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8 + a9 + a10, data = dat, FUN = sum)

(You could use paste to construct the formula and use formula)

Group by multiple columns and sum other multiple columns

Solution 1:

Solution 2:

Related

Recent Posts