Filtering a data frame on a vector [duplicate]

Solution 1:

You can use the %in% operator:

> df <- data.frame(id=c(LETTERS, LETTERS), x=1:52)
> L <- c("A","B","E")
> subset(df, id %in% L)
   id  x
1   A  1
2   B  2
5   E  5
27  A 27
28  B 28
31  E 31

If your IDs are unique, you can use match():

> df <- data.frame(id=c(LETTERS), x=1:26)
> df[match(L, df$id), ]
  id x
1  A 1
2  B 2
5  E 5

or make them the rownames of your dataframe and extract by row:

> rownames(df) <- df$id
> df[L, ]
  id x
A  A 1
B  B 2
E  E 5

Finally, for more advanced users, and if speed is a concern, I'd recommend looking into the data.table package.

Solution 2:

I reckon you need to use 'match'. It matches the values in one vector to the values in another vector, and gives NA where there's no match. So then you subset based on !is.na of the match.

See ?match and you can probably work it out for yourself, in which case you'll learn more than from the exact answer someone will do shortly which will just encourage you to cut n paste :)