Unique rows, considering two columns, in R, without order

Solution 1:

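If it's just two columns, you can use pmin and pmax to put each pair into a canonical order (smaller value first, larger value second) and then take the unique rows, like this: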

library(data.table)
unique(as.data.table(df)[, c("V1", "V2") := list(pmin(V1, V2),
                         pmax(V1, V2))], by = c("V1", "V2"))
#    V1 V2
# 1:  a  b
# 2:  b  d
# 3:  c  e

A similar approach using "dplyr" might be:

library(dplyr)
data.frame(df, stringsAsFactors = FALSE) %>% 
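  # paste0() has no sep argument; note that a key without a separator
  # can collide for multi-character values (e.g. "ab" + "c" vs "a" + "bc")
  mutate(key = paste0(pmin(X1, X2), pmax(X1, X2))) %>% 
  # .keep_all = TRUE keeps X1 and X2 alongside key in current dplyr versions
  distinct(key, .keep_all = TRUE)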
#   X1 X2 key
# 1  a  b  ab
# 2  b  d  bd
# 3  c  e  ce
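
For comparison, the same pmin/pmax idea can be sketched in base R without extra packages. The input below is a hypothetical stand-in for df (the original data is not shown here), and the names lo, hi and res are only illustrative:

```r
# Hypothetical input: a two-column character matrix in which
# some pairs appear again in reversed order.
df <- matrix(c("a", "b",
               "b", "a",
               "b", "d",
               "c", "e",
               "d", "b"),
             ncol = 2, byrow = TRUE)

# Put the smaller value of each pair first and the larger second,
# so that (a, b) and (b, a) become identical rows.
lo <- pmin(df[, 1], df[, 2])
hi <- pmax(df[, 1], df[, 2])

# Keep the first occurrence of each unordered pair.
res <- unique(data.frame(V1 = lo, V2 = hi, stringsAsFactors = FALSE))
print(res)
```

Because the pairs are compared as two separate columns rather than as a pasted key, this variant does not depend on choosing a safe separator.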

Solution 2:

There are lots of ways to do this; here is one:

unique(t(apply(df, 1, sort)))
duplicated(t(apply(df, 1, sort)))

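The first call returns the unique rows (with the values sorted within each row); the second returns a logical mask that is TRUE for every row that duplicates an earlier one. The t() is needed because apply() returns each sorted row as a column.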
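
The mask is handy when you want to keep the original, unsorted rows. A minimal sketch, assuming df is a two-column character matrix (the data below is hypothetical, as are the names sorted_rows, mask and kept):

```r
# Hypothetical input: a two-column character matrix in which
# some pairs appear again in reversed order.
df <- matrix(c("a", "b",
               "b", "a",
               "b", "d",
               "c", "e",
               "d", "b"),
             ncol = 2, byrow = TRUE)

# Sort the values within each row; apply() returns the sorted rows
# as columns, so transpose back to one row per pair.
sorted_rows <- t(apply(df, 1, sort))

# TRUE marks a row whose unordered pair was already seen earlier.
mask <- duplicated(sorted_rows)

# Keep only the first occurrence of each pair, in its original order.
kept <- df[!mask, , drop = FALSE]
print(kept)
```

Using drop = FALSE keeps the result a matrix even if only one row survives.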
