Find complement of a data frame (anti - join)

I have two data frames(df and df1). df1 is subset of df. I want to get a data frame which is complement of df1 in df, i.e. return rows of the first data set which are not matched in the second. For example let,

data frame df:

heads
row1
row2
row3
row4
row5

data frame df1:

heads
row3
row5

Then the desired output df2 is:

heads
row1
row2
row4

Solution 1:

You could also do some type of anti join with data.tables binary join

library(data.table)
setkey(setDT(df), heads)[!df1]
#    heads
# 1:  row1
# 2:  row2
# 3:  row4

EDIT: Starting data.table v1.9.6+ we can join data.tables without setting keys while using on

setDT(df)[!df1, on = "heads"]

EDIT2: Starting data.table v1.9.8+ fsetdiff was introduced which is basically a variation of the solution above, just over all the column names of the x data.table, e.g. x[!y, on = names(x)]. If all set to FALSE (the default behavior), then only unique rows in x will be returned. For the case of only one column in each data.table the following will be equivalent to the previous solutions

fsetdiff(df, df1, all = TRUE)

Solution 2:

Try anti_join from dplyr

library(dplyr)
anti_join(df, df1, by='heads')

Solution 3:

Try the %in% command and reverse it with !

df[!df$heads %in% df1$heads,]