Unique rows, considering two columns, in R, without order
Solution 1:
If it's just two columns, you can also use pmin and pmax, like this:
library(data.table)
unique(as.data.table(df)[, c("V1", "V2") := list(pmin(V1, V2),
                                                 pmax(V1, V2))],
       by = c("V1", "V2"))
# V1 V2
# 1: a b
# 2: b d
# 3: c e
A similar approach using "dplyr" might be:
library(dplyr)
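data.frame(df, stringsAsFactors = FALSE) %>%
  # paste0() takes no sep argument, so it is dropped here
  mutate(key = paste0(pmin(X1, X2), pmax(X1, X2))) %>%
  # .keep_all = TRUE keeps X1 and X2 alongside key in current dplyr
  distinct(key, .keep_all = TRUE)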
# X1 X2 key
# 1 a b ab
# 2 b d bd
# 3 c e ce
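The same pmin/pmax idea also works in base R without any packages. The following is only a sketch: the sample df is hypothetical (the original data is not shown here), and the helper names lo, hi and keep are made up for illustration.

```r
# Hypothetical sample data; the df from the question is not shown.
df <- data.frame(X1 = c("a", "b", "b", "c", "e"),
                 X2 = c("b", "a", "d", "e", "c"),
                 stringsAsFactors = FALSE)

# Put the smaller value of each pair first and the larger second,
# so that (a, b) and (b, a) end up as the same ordered pair.
lo <- pmin(df$X1, df$X2)
hi <- pmax(df$X1, df$X2)

# Keep only the first occurrence of each ordered pair.
keep <- !duplicated(data.frame(lo, hi))
result <- df[keep, ]
```

Because the rows are selected from the original df, each kept row retains the column order in which it first appeared.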
Solution 2:
There are lots of ways to do this; here is one:
unique(t(apply(df, 1, sort)))
duplicated(t(apply(df, 1, sort)))
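The first call returns the unique rows (each sorted within the row); the second returns a logical mask that is TRUE for every row duplicating an earlier one.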
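If you want the surviving rows in their original, unsorted form, the mask from duplicated() can be used to subset the input. A minimal sketch, assuming a small hypothetical character matrix m in place of the original data (the names sorted, mask and deduped are illustrative):

```r
# Hypothetical input; each row is an unordered pair.
m <- matrix(c("a", "b",
              "b", "a",
              "b", "d",
              "c", "e",
              "e", "c"),
            ncol = 2, byrow = TRUE)

# Sort within each row; apply() returns the rows as columns, so transpose back.
sorted <- t(apply(m, 1, sort))

# TRUE for rows that repeat an earlier row once order is ignored.
mask <- duplicated(sorted)

# Keep the first occurrence of each pair, as it appeared in the input.
deduped <- m[!mask, , drop = FALSE]
```

Since the sort is applied per row, this version is not limited to two columns.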