Correct syntax for mutate_if
I would like to replace NA values with zeros via mutate_if in dplyr. The syntax below:
set.seed(1)
mtcars[sample(1:dim(mtcars)[1], 5),
       sample(1:dim(mtcars)[2], 5)] <- NA
require(dplyr)
mtcars %>%
  mutate_if(is.na, 0)
mtcars %>%
  mutate_if(is.na, funs(. = 0))
Returns the error:
Error in vapply(tbl, p, logical(1), ...) : values must be length 1,
 but FUN(X[[1]]) result is length 32
What's the correct syntax for this operation?
Solution 1:
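The "if" in mutate_if refers to choosing columns, not rows. The predicate has to return a single TRUE or FALSE per column; is.na returns one value per element (32 here), which is why vapply complains. E.g. mutate_if(data, is.numeric, ...) means to carry out a transformation on all numeric columns in your dataset.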
If you want to replace all NAs with zeros in numeric columns:
data %>% mutate_if(is.numeric, funs(ifelse(is.na(.), 0, .)))
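To make the column-versus-row point concrete, here is a minimal base R sketch. The data frame df is a made-up example, not the question's data. It shows that is.numeric yields one logical per column while is.na yields one per element, and then does the same replacement without dplyr.

```r
# Small illustrative data frame (hypothetical, not the question's mtcars)
df <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6), c = c("x", NA, "z"))

# A valid predicate returns exactly one TRUE/FALSE per column
num_cols <- vapply(df, is.numeric, logical(1))

# is.na returns one value per element, so it cannot be used to pick columns
na_lengths <- lengths(lapply(df, is.na))

# Replace NA with 0 in the numeric columns only; column c is left untouched
df[num_cols] <- lapply(df[num_cols], function(x) ifelse(is.na(x), 0, x))
```

The last line is roughly what mutate_if(is.numeric, ...) does for you: select columns with the predicate, then apply the function to each selected column.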
Solution 2:
I learned this trick from the purrr tutorial, and it also works in dplyr.
There are two ways to solve this problem:
First, define custom functions outside the pipe and use them in mutate_if():
any_column_NA <- function(x) {
  any(is.na(x))
}
replace_NA_0 <- function(x) {
  if_else(is.na(x), 0, x)
}
mtcars %>% mutate_if(any_column_NA, replace_NA_0)
Second, use the combination of ~ and . or .x (.x can be replaced with ., but not with any other character or symbol):
mtcars %>% mutate_if(~ any(is.na(.x)), ~ if_else(is.na(.x), 0, .x))
# This also works
mtcars %>% mutate_if(~ any(is.na(.)), ~ if_else(is.na(.), 0, .))
In your case, you can also use mutate_all():
mtcars %>% mutate_all(~ if_else(is.na(.x), 0, .x))
Using ~, we can define an anonymous function, where .x or . stands for its argument. In the mutate_if() case, . or .x is each column in turn.
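As a rough check that the ~ shorthand is just an anonymous function, the sketch below runs the same mutate_if() call twice on a small made-up data frame (df is hypothetical), once with formulas and once with explicit function(x) definitions, and compares the results.

```r
library(dplyr)

# Hypothetical example data: only column a contains NA
df <- data.frame(a = c(1, NA, 3), b = c(4, 5, 6))

# Formula shorthand: .x is the column being tested or transformed
with_formula <- df %>%
  mutate_if(~ any(is.na(.x)), ~ if_else(is.na(.x), 0, .x))

# The same thing written with explicit anonymous functions
with_function <- df %>%
  mutate_if(function(x) any(is.na(x)),
            function(x) if_else(is.na(x), 0, x))

# Both forms should produce the same data frame
same_result <- identical(with_formula, with_function)
```

Which form to use is mostly a matter of taste; the explicit version is longer, but it makes clear that the argument name is arbitrary there, whereas the formula version only understands . and .x.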
Solution 3:
replace_na comes from tidyr, so load it alongside dplyr:
library(tidyr)
mtcars %>% mutate_if(is.numeric, replace_na, 0)
or, with the more recent across() syntax:
mtcars %>% mutate(across(where(is.numeric), replace_na, 0))
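One caveat, as far as I know: from dplyr 1.1.0 onward, passing extra arguments to the function through across() is deprecated in favour of a lambda. A minimal sketch of that form, using a made-up data frame df (hypothetical) and assuming both dplyr (1.0.0 or later) and tidyr are installed:

```r
library(dplyr)
library(tidyr)

# Hypothetical example data with one numeric and one character column
df <- data.frame(a = c(1, NA, 3), b = c("x", NA, "z"))

# The lambda passes the replacement value explicitly instead of via `...`
out <- df %>%
  mutate(across(where(is.numeric), ~ replace_na(.x, 0)))
```

Only the numeric column is touched here; the NA in the character column b stays as it is.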
Solution 4:
We can use set from data.table:
library(data.table)
setDT(mtcars)
for (j in seq_along(mtcars)) {
  set(mtcars, i = which(is.na(mtcars[[j]])), j = j, value = 0)
}
Solution 5:
I always struggle with the replace_na function (which is in tidyr, not dplyr), so I use base replace() inside the pipe instead:
mtcars %>% replace(is.na(.), 0)
This works for me for what you are trying to do.
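In a pipe, the data frame is supplied as the first argument of replace(). Spelled out without the pipe, the same idea looks like the base R sketch below; df is a made-up example rather than the question's data.

```r
# Hypothetical example data with NA in both columns
df <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6))

# replace(x, list, values): is.na(df) is a logical matrix marking the cells to overwrite
out <- replace(df, is.na(df), 0)
```

Note that this replaces NA in every column, so it suits data like mtcars where all columns are numeric.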