Select first and last row from grouped data
Question
Using dplyr, how do I select the top and bottom observations/rows of grouped data in one statement?
Data & Example
Given a data frame:
df <- data.frame(id = c(1,1,1,2,2,2,3,3,3),
                 stopId = c("a","b","c","a","b","c","a","b","c"),
                 stopSequence = c(1,2,3,3,1,4,3,1,2))
I can get the top and bottom observations from each group using slice, but using two separate statements:
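library(dplyr)

# First stop of each id (lowest stopSequence)
firstStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(1) %>%
  ungroup()

# Last stop of each id (highest stopSequence)
lastStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(n()) %>%
  ungroup()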
Can I combine these two statements into one that selects both top and bottom observations?
There may be a faster way, but this does it in one statement:
df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  filter(row_number() == 1 | row_number() == n())
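One thing to watch with the statement above: arrange() ignores grouping by default, so the result comes back ordered by stopSequence across the whole data frame rather than id by id. A minimal self-contained sketch, assuming dplyr is installed, that sorts by id first so each group's two rows stay together (the name firstLast is only for illustration):

```r
library(dplyr)

df <- data.frame(id = c(1,1,1,2,2,2,3,3,3),
                 stopId = c("a","b","c","a","b","c","a","b","c"),
                 stopSequence = c(1,2,3,3,1,4,3,1,2))

# Sort by id, then by stopSequence within each id, so every group's
# rows are contiguous; then keep the first and last row per group.
firstLast <- df %>%
  arrange(id, stopSequence) %>%
  group_by(id) %>%
  filter(row_number() == 1 | row_number() == n()) %>%
  ungroup()

print(firstLast)
```

Because filter() works on a logical condition, a group with a single row is returned once, not twice.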
Just for completeness: You can pass slice a vector of indices:
df %>% arrange(stopSequence) %>% group_by(id) %>% slice(c(1,n()))
which gives
  id stopId stopSequence
1  1      a            1
2  1      c            3
3  2      b            1
4  2      c            4
5  3      b            1
6  3      a            3
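A caveat with the slice approach: if a group has only one row, c(1, n()) is c(1, 1), so that row is returned twice. One way around it is to wrap the indices in unique(). The sketch below uses a small hypothetical data frame (small, not part of the question) in which id 2 has a single row:

```r
library(dplyr)

# Hypothetical data for illustration: id 2 has only one row.
small <- data.frame(id = c(1, 1, 2),
                    stopSequence = c(2, 1, 5))

# unique() collapses c(1, 1) to 1 for single-row groups,
# so such a row appears once instead of twice.
ends <- small %>%
  arrange(stopSequence) %>%
  group_by(id) %>%
  slice(unique(c(1, n()))) %>%
  ungroup()

print(ends)
```

Whether a single-row group should appear once or twice depends on what the result feeds into, so treat unique() as an option rather than a required fix.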