dplyr mutate rowwise max of range of columns
I can use the following to return the maximum of 2 columns:
newiris<-iris %>%
rowwise() %>%
mutate(mak=max(Sepal.Width,Petal.Length))
What I want to do is find that maximum across a range of columns, so I don't have to name each one. Something like this:
newiris<-iris %>%
rowwise() %>%
mutate(mak=max(Sepal.Width:Petal.Length))
Any ideas?
Instead of rowwise(), this can be done with pmax:
iris %>%
mutate(mak=pmax(Sepal.Width,Petal.Length, Petal.Width))
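If the columns form a contiguous range, one way to avoid typing each name is to select the range first and pass the resulting data frame to pmax through do.call. This is only a sketch: the helper name `cols` is made up for illustration, and it assumes every selected column is numeric.

```r
library(dplyr)

# Pick the range of columns by name; a data frame is a list of columns,
# so do.call() hands each column to pmax() as a separate argument.
cols <- iris %>% select(Sepal.Width:Petal.Width)

newiris <- iris %>%
  mutate(mak = do.call(pmax, cols))

head(newiris)
```

Because pmax is vectorised over rows, this avoids rowwise() entirely, which tends to matter on larger data frames.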
Maybe we can use interp from library(lazyeval) if we want to reference column names stored in a vector.
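library(lazyeval)
nm1 <- names(iris)[2:4]
# as.name() keeps only the first element of a vector, so interpolate each name separately
iris %>%
  mutate_(mak = interp(~pmax(v1, v2, v3),
                       v1 = as.name(nm1[1]), v2 = as.name(nm1[2]), v3 = as.name(nm1[3])))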
With rlang and quasiquotation we have another dplyr option. First, get the column names that we want to compute the parallel max for:
iris_cols <- iris %>% select(Sepal.Length:Petal.Width) %>% names()
Then we can use !!! and rlang::syms to compute the parallel max for every row of those columns:
iris %>%
mutate(mak=pmax(!!!rlang::syms(iris_cols)))
- rlang::syms takes a character vector (the column names) and turns it into a list of symbols
- !!! unquotes and splices its argument, here the column names
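To see what those two pieces do on their own, it can help to build the call without evaluating it. A small sketch, assuming rlang is installed; the names `cols`, `sym_list` and `built` are just for illustration:

```r
library(rlang)

cols <- c("Sepal.Width", "Petal.Length")

# syms() converts each string into a symbol and returns them as a list
sym_list <- syms(cols)

# !!! splices the list into the call, one symbol per argument
built <- expr(pmax(!!!sym_list))

print(built)
```

The printed call is the same expression you would have typed by hand inside mutate().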
Which gives:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species mak
1          5.1         3.5          1.4         0.2  setosa 5.1
2          4.9         3.0          1.4         0.2  setosa 4.9
3          4.7         3.2          1.3         0.2  setosa 4.7
4          4.6         3.1          1.5         0.2  setosa 4.6
5          5.0         3.6          1.4         0.2  setosa 5.0
h/t: https://stackoverflow.com/a/47773379/1036500