Using functions of multiple columns in a dplyr mutate_at call
Solution 1:
This was answered by @eipi10 in a comment on the question, but I'm writing it up here for posterity.
The solution here is to use:
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = list(~ ifelse(x, ., NA)))
You can also use the new across() function with mutate(), like so:
df %>%
  mutate(across(c(y, z), ~ ifelse(x, ., NA)))
The use of the formula operator (as in ~ ifelse(...)) here indicates that ifelse(x, ., NA) is an anonymous function that is being defined within the call to mutate_at(), with . standing for the column currently being transformed.
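To make the shorthand concrete, the sketch below applies the same operation three ways: with `.`, with `.x` (another placeholder the formula shorthand accepts), and with an explicit anonymous function. It assumes dplyr 1.0.0 or later; the data frame is the example data used in Solution 2, and `col` is only an illustrative argument name.

```r
library(dplyr)

df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE  # keep strings as character on R < 4.0 too
)

# Formula shorthand: `.` stands for the column currently being processed
res_dot <- df %>%
  mutate(across(c(y, z), ~ ifelse(x, ., NA)))

# Same lambda written with `.x` instead of `.`
res_dotx <- df %>%
  mutate(across(c(y, z), ~ ifelse(x, .x, NA)))

# Explicit anonymous function; `col` is an illustrative name
res_fun <- df %>%
  mutate(across(c(y, z), function(col) ifelse(x, col, NA)))

# All three spellings should produce the same data frame
identical(res_dot, res_dotx)
identical(res_dot, res_fun)
```

In every variant, `x` refers to the column of the data frame being mutated, because the function is evaluated inside `mutate()`.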
The formula works similarly to a function defined outside of the call to mutate_at(), like so:
temp_fn <- function(input) ifelse(test = df[["x"]],
                                  yes = input,
                                  no = NA)

df %>%
  mutate_at(.vars = vars(y, z),
            .funs = temp_fn)
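One caveat with temp_fn as written: it reads df[["x"]] from the enclosing environment, so it is tied to that particular data frame. A possible variation, sketched below, passes the condition in as an argument instead. The names `keep_if`, `keep`, and `mask_cols` are hypothetical and not part of dplyr; the example data is the one used in Solution 2.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): the condition is an argument,
# so nothing is read from a data frame in the enclosing environment.
keep_if <- function(input, keep) ifelse(test = keep, yes = input, no = NA)

df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE  # keep strings as character on R < 4.0 too
)

# Hypothetical wrapper: `x` inside the lambda resolves to the `x` column
# of whichever data frame is passed in.
mask_cols <- function(data) {
  data %>%
    mutate_at(.vars = vars(y, z),
              .funs = list(~ keep_if(., x)))
}

out <- mask_cols(df)
out_reversed <- mask_cols(df[3:1, ])  # same rows in reverse order
```

Because the lambda is evaluated inside mutate_at(), the same helper follows the rows of whatever data it is given, which temp_fn cannot do.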
Note on syntax changes in dplyr: Prior to dplyr version 0.8.0, you would simply write .funs = funs(ifelse(x, . , NA)), but the funs() function has been deprecated since 0.8.0 and may be removed from dplyr in a future release.
Solution 2:
To supplement the previous response, if you want to add new variables (instead of replacing the existing ones), with names such as z_1 and y_1 as in the original question, what you need depends on your dplyr version:
- dplyr >= 1 with across(): add .names = "{.col}_1", or alternatively use list(`1` = ~ifelse(x, ., NA)) (back ticks!)
- dplyr >= 0.8 and < 1: use list(`1` = ~ifelse(x, ., NA))
- dplyr < 0.8: use funs(`1` = ifelse(x, ., NA))
library(tidyverse)
df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam")
)

## Version >= 1
df %>%
  mutate(across(c(y, z),
                list(~ifelse(x, ., NA)),
                .names = "{.col}_1"))
#>       x     y      z   y_1   z_1
#> 1  TRUE Hello  World Hello World
#> 2  TRUE  Hola     ao  Hola    ao
#> 3 FALSE  Ciao HaOlam  <NA>  <NA>

## Version >= 0.8 and < 1
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = list(`1` = ~ifelse(x, ., NA)))
#>       x     y      z   y_1   z_1
#> 1  TRUE Hello  World Hello World
#> 2  TRUE  Hola     ao  Hola    ao
#> 3 FALSE  Ciao HaOlam  <NA>  <NA>

## Version < 0.8
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = funs(`1` = ifelse(x, ., NA)))
#> Warning: `funs()` is deprecated as of dplyr 0.8.0.
#> Please use a list of either functions or lambdas:
#>
#> # Simple named list:
#> list(mean = mean, median = median)
#>
#> # Auto named with `tibble::lst()`:
#> tibble::lst(mean, median)
#>
#> # Using lambdas
#> list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_warnings()` to see where this warning was generated.
#>       x     y      z   y_1   z_1
#> 1  TRUE Hello  World Hello World
#> 2  TRUE  Hola     ao  Hola    ao
#> 3 FALSE  Ciao HaOlam  <NA>  <NA>
Created on 2020-10-03 by the reprex package (v0.3.0)
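The named-list alternative mentioned for dplyr >= 1 is not part of the reprex. A minimal sketch, assuming dplyr 1.0.0 or later and relying on across() naming the outputs "{.col}_{.fn}" by default when .fns is a list; the "_masked" suffix in the second call is only an illustrative choice:

```r
library(dplyr)

df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE  # keep strings as character on R < 4.0 too
)

# Named list: the name `1` (back ticks needed, since it is not a
# syntactic name) becomes the suffix of the new columns
out_named <- df %>%
  mutate(across(c(y, z), list(`1` = ~ifelse(x, ., NA))))
names(out_named)

# Explicit .names pattern with an illustrative suffix
out_suffix <- df %>%
  mutate(across(c(y, z), ~ifelse(x, ., NA), .names = "{.col}_masked"))
names(out_suffix)
```

Both calls keep the original y and z and append the new columns after them.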
For more details and tricks, see: Create new variables with mutate_at while keeping the original ones