Can dplyr join on multiple columns or composite key?
I realize that dplyr
v3.0 allows you to join on different variables:
left_join(x, y, by = c("a" = "b")
will match x.a
to y.b
However, is it possible to join on a combination of variables or do I have to add a composite key beforehand?
Something like this:
left_join(x, y, by = c("a c" = "b d")
to match the concatenation of [x.a
and x.c
] to [y.b
and y.d
]
Updating to use tibble()
You can pass a named vector of length greater than 1 to the by
argument of left_join()
:
library(dplyr)
d1 <- tibble(
x = letters[1:3],
y = LETTERS[1:3],
a = rnorm(3)
)
d2 <- tibble(
x2 = letters[3:1],
y2 = LETTERS[3:1],
b = rnorm(3)
)
left_join(d1, d2, by = c("x" = "x2", "y" = "y2"))