Select first and last row from grouped data

Question

Using dplyr, how do I select the top and bottom observations/rows of grouped data in one statement?

Data & Example

Given a data frame:

library(dplyr)

df <- data.frame(id=c(1,1,1,2,2,2,3,3,3),
                 stopId=c("a","b","c","a","b","c","a","b","c"),
                 stopSequence=c(1,2,3,3,1,4,3,1,2))

I can get the top and bottom observations from each group using slice, but only with two separate statements:

firstStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(1) %>%
  ungroup()

lastStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(n()) %>%
  ungroup()

Can I combine these two statements into one that selects both top and bottom observations?

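There is probably a faster way, but this does it in a single statement by keeping the rows whose position within the group is either 1 or n():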

df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  filter(row_number()==1 | row_number()==n())
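If the sort is the expensive part, one possible variant skips arrange entirely and looks up the positions of the smallest and largest stopSequence in each group with base R's which.min and which.max. This is only a sketch under the assumption that stopSequence has no missing values and no ties at the extremes within a group; the name ends is made up for illustration.

```r
library(dplyr)

df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                 stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2))

# 'ends' is a hypothetical name. No arrange() is needed here:
# which.min()/which.max() return the within-group positions of the
# first and last stop, and slice() picks those rows in that order.
ends <- df %>%
  group_by(id) %>%
  slice(c(which.min(stopSequence), which.max(stopSequence))) %>%
  ungroup()
```

For the example data this should pick the same first and last stop per id as the filter version. I have not benchmarked it, so whether it is actually faster on real data is an open question.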

Just for completeness: You can pass slice a vector of indices:

df %>% arrange(stopSequence) %>% group_by(id) %>% slice(c(1,n()))

which gives

  id stopId stopSequence
1  1      a            1
2  1      c            3
3  2      b            1
4  2      c            4
5  3      b            1
6  3      a            3
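One thing to watch with the slice approach: if a group has only one row, then 1 and n() are the same index and slice(c(1, n())) returns that row twice. Wrapping the indices in unique() avoids the duplicate. A minimal sketch, using a made-up data frame df1 in which id 2 has a single row:

```r
library(dplyr)

# Hypothetical data: id 2 has only one observation
df1 <- data.frame(id = c(1, 1, 2),
                  stopSequence = c(2, 1, 5))

# Without unique(), the single row of id 2 is selected twice
dup <- df1 %>%
  arrange(stopSequence) %>%
  group_by(id) %>%
  slice(c(1, n())) %>%
  ungroup()

# With unique(), each group contributes at most two distinct rows
res <- df1 %>%
  arrange(stopSequence) %>%
  group_by(id) %>%
  slice(unique(c(1, n()))) %>%
  ungroup()
```

The filter(row_number()==1 | row_number()==n()) version does not have this problem, since a row is either kept or dropped and never repeated.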
