How to combine filter(across(starts_with("foo"), ~ . logical-condition)) with mutate(bar = map2(...))?
I want to use dplyr's filter() in combination with selection helpers such as starts_with().
This post is a follow-up to this answer, but with a somewhat more sophisticated data structure that involves list-columns and map2() from the {purrr} package.
Consider the following my_mtcars data frame:
library(tibble)
my_mtcars <-
  mtcars %>%
  rownames_to_column("cars")
I want to filter on any column whose name starts with/contains the string "cars", keeping only the following cars:
cars_to_keep <- c("Merc 240D", "Fiat X1-9", "Ferrari Dino")
From this answer we learned how to use selection helpers with filter(), such that:
library(dplyr)
filter(my_mtcars, across(contains("cars"), ~ . %in% cars_to_keep))
## cars mpg cyl disp hp drat wt qsec vs am gear carb
## 1 Merc 240D 24.4 4 146.7 62 3.69 3.190 20.0 1 0 4 2
## 2 Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.9 1 1 4 1
## 3 Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.5 0 1 5 6
So far so good.
The problem arises with the following data structure:
higher_level_tibble <-
  tibble(my_data = list(my_mtcars),
         the_cars_i_want = list(cars_to_keep))
## # A tibble: 1 x 2
## my_data the_cars_i_want
## <list> <list>
## 1 <df [32 x 12]> <chr [3]>
Although the following works:
library(purrr)
higher_level_tibble %>%
  mutate(my_filtered_data = map2(.x = my_data,
                                 .y = the_cars_i_want,
                                 .f = ~ filter(.x, cars %in% .y)))
## # A tibble: 1 x 3
## my_data the_cars_i_want my_filtered_data
## <list> <list> <list>
## 1 <df [32 x 12]> <chr [3]> <df [3 x 12]>
This doesn't:
higher_level_tibble %>%
  mutate(my_filtered_data = map2(.x = my_data,
                                 .y = the_cars_i_want,
                                 .f = ~ filter(.x, across(starts_with("cars"), ~ . %in% .y))))
Error: Problem with `mutate()` column `my_filtered_data`.
i `my_filtered_data = map2(...)`.
x Problem with `filter()` input `..1`.
i Input `..1` is `across(starts_with("cars"), ~. %in% .y)`.
x the ... list contains fewer than 2 elements
How can I use tidyselect helpers in filter(), all within purrr::map2()?
EDIT: desired output (the filter() call below is pseudocode)
higher_level_tibble %>%
mutate(my_filtered_data = map2(.x = my_data,
.y = the_cars_i_want,
.f = ~ .x %>% filter( from the col in .x whose header starts with "cars", return only values that appear in .y )))
## # A tibble: 1 x 3
## my_data the_cars_i_want my_filtered_data
## <list> <list> <list>
## 1 <df [32 x 12]> <chr [3]> <df [3 x 12]>
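A note on the likely cause, consistent with the "the ... list contains fewer than 2 elements" message: a formula lambda (~ ...) defines its own .x, .y, ..1, ..2, so the inner ~ . %in% .y no longer sees the .y of the outer map2() lambda. The inner lambda is called with a single argument, so its own .y is empty. The sketch below reproduces this with purrr alone; the vectors values and wanted are made up for illustration and are not from the question, and the exact behaviour inside across() may differ between dplyr versions.

```r
library(purrr)

# Illustrative inputs (assumptions, not from the original post)
values <- list(c("a", "b", "c"))
wanted <- list(c("a", "c"))

# Nested formula lambdas: the inner `~` has its own `.y`, which is unbound
# because map_lgl() passes only one argument to it, so this call errors.
nested_formula <- tryCatch(
  map2(values, wanted, ~ map_lgl(.x, ~ . %in% .y)),
  error = function(e) "error"
)

# Inner lambda written as an ordinary function: it has no `.y` of its own,
# so `.y` is looked up in the outer map2() lambda.
explicit_inner <- map2(values, wanted, ~ map_lgl(.x, \(v) v %in% .y))
```

The \(v) shorthand needs R 4.1 or later; function(v) v %in% .y should behave the same on older versions.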
A possible solution, using purrr::pmap_dfr():
library(tidyverse)
my_mtcars <-
  mtcars %>%
  rownames_to_column("cars")
cars_to_keep <- c("Merc 240D", "Fiat X1-9", "Ferrari Dino")
higher_level_tibble <-
  tibble(my_data = list(my_mtcars),
         the_cars_i_want = list(cars_to_keep))
higher_level_tibble %>%
  pmap_dfr(~ ..1 %>% filter(across(contains("cars"), \(x) x %in% ..2))) %>%
  nest(my_filtered_data = everything()) %>%
  bind_cols(higher_level_tibble, .)
#> # A tibble: 1 × 3
#> my_data the_cars_i_want my_filtered_data
#> <list> <list> <list>
#> 1 <df [32 × 12]> <chr [3]> <tibble [3 × 12]>
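If the goal is to stay with mutate() and map2(), so that each row of higher_level_tibble keeps its own filtered data frame, one option is to give the arguments explicit names instead of relying on .x/.y. The following is a minimal sketch under some assumptions: filter_cars is a hypothetical helper name introduced here, and if_any() requires dplyr 1.0.4 or later (it is the replacement dplyr suggests for across() inside filter()).

```r
library(dplyr)
library(purrr)
library(tibble)

my_mtcars <- rownames_to_column(mtcars, "cars")
cars_to_keep <- c("Merc 240D", "Fiat X1-9", "Ferrari Dino")

higher_level_tibble <- tibble(
  my_data = list(my_mtcars),
  the_cars_i_want = list(cars_to_keep)
)

# Hypothetical helper (not from the original post): keep the rows where any
# column whose name starts with "cars" has a value found in `keep`.
# With a single matching column, if_any() and if_all() give the same rows.
filter_cars <- function(df, keep) {
  filter(df, if_any(starts_with("cars"), \(col) col %in% keep))
}

# Named arguments avoid the `.y` clash between nested formula lambdas
result <- higher_level_tibble %>%
  mutate(my_filtered_data = map2(my_data, the_cars_i_want, filter_cars))
```

Because map2() returns one list element per row, this should also carry over to a higher_level_tibble with more than one row, whereas pmap_dfr() row-binds all the filtered data frames into one.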