Opposite of %in%: exclude rows with values specified in a vector
A categorical variable V1 in a data frame D1 can take values represented by the letters A to Z. I want to create a subset D2 that excludes some of those values, say B, N and T. Basically, I want a command that does the opposite of %in%, which, as written here, keeps only those values:
D2 = subset(D1, V1 %in% c("B", "N", "T"))
You can use the ! operator to turn every TRUE into FALSE and every FALSE into TRUE, so:
D2 = subset(D1, !(V1 %in% c('B','N','T')))
EDIT: You can also make an operator yourself:
'%!in%' <- function(x,y)!('%in%'(x,y))
c(1,3,11)%!in%1:10
[1] FALSE FALSE TRUE
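To see the negation on actual data, here is a minimal sketch. The contents of D1 are made up for illustration; only the names D1, V1 and D2 come from the question.

```r
# Toy data: D1 and V1 mirror the question, the values are invented.
D1 <- data.frame(V1 = c("A", "B", "C", "N", "T", "Z"), stringsAsFactors = FALSE)
excluded <- c("B", "N", "T")

# Negate %in% inside subset()
D2 <- subset(D1, !(V1 %in% excluded))

# Same rows selected with plain logical indexing
D2_idx <- D1[!(D1$V1 %in% excluded), , drop = FALSE]

print(D2$V1)
# [1] "A" "C" "Z"
```

If V1 is a factor, the excluded values remain in levels(D2$V1) after subsetting; droplevels(D2) removes them.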
How about:
`%ni%` <- Negate(`%in%`)
c(1,3,11) %ni% 1:10
# [1] FALSE FALSE TRUE
Here is a version using filter in dplyr that applies the same technique as the accepted answer, negating the logical vector with !:
D2 <- D1 %>% dplyr::filter(!V1 %in% c('B','N','T'))
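The missing parentheses around V1 %in% ... are not a problem, because %in% binds more tightly than unary ! in R's operator precedence. A small base R check (the vectors are made up for illustration):

```r
x <- c("A", "B", "N", "Z")
excluded <- c("B", "N", "T")

with_parens    <- !(x %in% excluded)
without_parens <- !x %in% excluded   # parsed as !(x %in% excluded)

print(identical(with_parens, without_parens))
# [1] TRUE
print(without_parens)
# [1]  TRUE FALSE FALSE  TRUE
```

Keeping the parentheses is still a reasonable choice for readability.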
If you look at the source of %in%:
function (x, table) match(x, table, nomatch = 0L) > 0L
then you should be able to write your own version of the opposite. I use:
`%not in%` <- function (x, table) is.na(match(x, table, nomatch=NA_integer_))
Another way is:
function (x, table) match(x, table, nomatch = 0L) == 0L
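As a quick sanity check, the two match-based definitions can be compared against plain negation of %in%. This is only a sketch: the name %notin0% is hypothetical, chosen here so the unnamed function above can be called as an operator, and the test vector is made up.

```r
# Definition from the answer above
`%not in%` <- function(x, table) is.na(match(x, table, nomatch = NA_integer_))

# Hypothetical name for the unnamed "another way" function
`%notin0%` <- function(x, table) match(x, table, nomatch = 0L) == 0L

x <- c(1, 3, 11, NA)

a     <- x %not in% 1:10
b     <- x %notin0% 1:10
c_ref <- !(x %in% 1:10)

print(a)
# [1] FALSE FALSE  TRUE  TRUE
print(identical(a, b) && identical(a, c_ref))
# [1] TRUE
```

Note that an NA in x counts as "not in" unless the table itself contains NA, since match() treats NA as matching only NA.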