case_when in mutate pipe

Solution 1:

As of version 0.7.0 of dplyr, case_when works within mutate as follows:

library(dplyr) # >= 0.7.0
mtcars %>% 
  mutate(cg = case_when(carb <= 2 ~ "low",
                        carb > 2  ~ "high"))

For more information: http://dplyr.tidyverse.org/reference/case_when.html

Solution 2:

We can use .$

mtcars %>%  
     mutate(cg = case_when(.$carb <= 2 ~ "low",  .$carb > 2 ~ "high")) %>%
    .$cg %>%
    table()
# high  low 
#  15   17 

Solution 3:

With thanks to @sumedh: @hadley has explained that this is a known shortcoming of case_when:

case_when() is still somewhat experiment and does not currently work inside mutate(). That will be fixed in a future version.

Solution 4:

In my case, quasiquotation helped a lot. You can create in advance a set of quoted formulae that define the mutation rules (and either use known column names as in the first formula or benefit from !! and create rules dynamically as in the second formula), which is then utilized within mutate - case_when combination like here

    library(dplyr)
    library(rlang)
    pattern <- quos(gear == 3L ~ "three", !!sym("gear") == 4L ~ "four", gear == 5L ~ "five")
    # Or
    # pattern <- list(
    #     quo(gear == 3L ~ "three"), 
    #     quo(!!sym("gear") == 4L ~ "four"),
    #     quo(gear == 5L ~ "five"))
    #
    mtcars %>% mutate(test = case_when(!!!pattern)) %>% head(10L)
#>     mpg cyl  disp  hp drat    wt  qsec vs am gear carb  test
#> 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4  four
#> 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4  four
#> 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1  four
#> 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 three
#> 5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2 three
#> 6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 three
#> 7  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4 three
#> 8  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2  four
#> 9  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2  four
#> 10 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4  four

I prefer such solution because it allows creating complex rules, e.g. using map2 with LHS conditions and RHS values to generate quoted formulas

    library(rlang)
    library(purrr)
    map2(c(3, 4, 5), c("three", "four", "five"), ~quo(gear == !!.x ~ !!.y))
#> [[1]]
#> <quosure>
#> expr: ^gear == 3 ~ "three"
#> env:  0000000014286520
#> 
#> [[2]]
#> <quosure>
#> expr: ^gear == 4 ~ "four"
#> env:  000000001273D0E0
#> 
#> [[3]]
#> <quosure>
#> expr: ^gear == 5 ~ "five"
#> env:  00000000125870E0

and using it in different places, applying to different data sets without the need to manually type in all the rules every time you need a complex mutation.

As a final answer to the problem, 7 additional symbols and two parentheses solve it

library(rlang)
library(dplyr)
mtcars %>% 
    mutate(test = case_when(!!!quos(gear == 3L ~ "three", gear != 3L ~ "not three"))) %>% 
    head(10L)
#>     mpg cyl  disp  hp drat    wt  qsec vs am gear carb      test
#> 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 not three
#> 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 not three
#> 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 not three
#> 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1     three
#> 5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2     three
#> 6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1     three
#> 7  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4     three
#> 8  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2 not three
#> 9  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2 not three
#> 10 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4 not three

Created on 2019-01-16 by the reprex package (v0.2.1.9000)