dplyr issues when using group_by(multiple variables)
I had a similar problem. I found that simply detaching plyr
solved it:
detach(package:plyr)
library(dplyr)
Taking Dickoa's answer one step further -- as Hadley says "summarise peels off a single layer of grouping". It peels off grouping from the reverse order in which you applied it so you can just use
mtcars %>%
group_by(cyl, gear) %>%
summarise(newvar = sum(wt)) %>%
summarise(newvar2 = sum(newvar) + 5)
Note that this will give a different answer if you use group_by(gear, cyl)
in the second line.
And to get your first attempt working:
df1 <- mtcars %>%
group_by(cyl, gear) %>%
summarise(newvar = sum(wt))
df2 <- df1 %>%
group_by(cyl) %>%
summarise(newvar2 = sum(newvar)+5)
If you translate your plyr
code into dplyr
using summarise
instead of mutate
you get the same results.
library(plyr)
df1 <- ddply(mtcars, .(cyl, gear), summarise, newvar = sum(wt))
df2 <- ddply(df1, .(cyl), summarise, newvar2 = sum(newvar) + 5)
df2
## cyl newvar2
## 1 4 30.143
## 2 6 26.820
## 3 8 60.989
detach(package:plyr)
library(dplyr)
mtcars %.%
group_by(cyl, gear) %.%
summarise(newvar = sum(wt)) %.%
group_by(cyl) %.%
summarise(newvar2 = sum(newvar) + 5)
## cyl newvar2
## 1 4 30.143
## 2 8 60.989
## 3 6 26.820
EDIT
Since summarise
drops the last group (gear
) you can skip the second group_by
(see @hadley comment below)
library(dplyr)
mtcars %.%
group_by(cyl, gear) %.%
summarise(newvar = sum(wt)) %.%
summarise(newvar2 = sum(newvar) + 5)
## cyl newvar2
## 1 4 30.143
## 2 8 60.989
## 3 6 26.820