Split dataframe using two columns of data and apply common transformation on list of resulting dataframes

Solution 1:

Put all the factors you want to split by in a list, e.g.:

split(mtcars,list(mtcars$cyl,mtcars$gear))

The result is a list of data frames, so you can use lapply on it to apply whatever transformation you need to each piece.
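As a rough sketch of that split-then-lapply pattern: the transformation below (centering mpg on its group mean) and the column name mpg_centered are arbitrary choices for illustration, not something the question asked for.

```r
# Split mtcars into one data frame per cyl/gear combination
groups <- split(mtcars, list(mtcars$cyl, mtcars$gear))

# Apply the same transformation to every piece.
# mpg_centered is a made-up column name used only for this example.
centered <- lapply(groups, function(d) {
  d$mpg_centered <- d$mpg - mean(d$mpg)
  d
})

# Optionally combine the pieces back into a single data frame
result <- do.call(rbind, centered)
```

Combining with do.call(rbind, ...) is one option; if you only need per-group results, you can keep working with the list directly.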

To avoid zero-row data frames in the result, set the drop argument to TRUE. Note that its default in split() is FALSE, the opposite of the default for the drop argument of the "[" function.

split(mtcars,list(mtcars$cyl,mtcars$gear), drop=TRUE)
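To see what drop changes, you can compare the two results. This is just a quick check on mtcars; the variable names are my own.

```r
by_all  <- split(mtcars, list(mtcars$cyl, mtcars$gear))
by_used <- split(mtcars, list(mtcars$cyl, mtcars$gear), drop = TRUE)

# List names are built as "<cyl>.<gear>"
empty <- names(by_all)[sapply(by_all, nrow) == 0]

empty            # combinations that have no rows in the data
length(by_all)   # every combination of the factor levels
length(by_used)  # only the combinations that actually occur
```

In mtcars there are no 8-cylinder cars with 4 gears, so that combination shows up as a zero-row data frame unless drop=TRUE is set.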

Solution 2:

How about this:

 library(plyr)
 ddply(df, .(category1, category2), summarize, value1 = lag(value1), value2=lag(value2))

This seems like an excellent job for the plyr package and its ddply() function. If there are still open questions, please provide some sample data. Splitting with split() works on several columns as well:

df <- data.frame(value=rnorm(100), class1=factor(rep(c('a','b'), each=50)), class2=factor(rep(c('1','2'), 50)))
# pass the grouping factors as a list; c() would concatenate them into a single vector of length 200
g <- list(df$class1, df$class2)
split(df$value, g)
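For the group-wise lag in the ddply() call, here is a base R sketch of the same idea. One caveat: if lag resolves to stats::lag, it only changes the time-series attributes and leaves the values of a plain vector in place, so the sketch uses a small helper instead. The data and the names lag1 and value1_lag are hypothetical, made up to match the column names used above.

```r
# Hypothetical sample data using the column names from the ddply() call
df <- data.frame(
  category1 = rep(c("a", "b"), each = 4),
  category2 = rep(c("x", "y"), times = 4),
  value1    = 1:8
)

# Hypothetical helper: shift a vector down by one position, padding with NA
lag1 <- function(x) c(NA, head(x, -1))

# Split by both columns, lag within each piece, then recombine
pieces <- split(df, list(df$category1, df$category2), drop = TRUE)
lagged <- lapply(pieces, function(d) {
  d$value1_lag <- lag1(d$value1)
  d
})
result <- do.call(rbind, lagged)
```

The first row of each group gets NA because there is no earlier value within that group. Note that rbind returns the rows grouped by piece, not in the original row order.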