Most elegant way to calculate rowSums of columns that start AND end with certain strings, using dplyr

r dplyr

I am working with a dataset in which I want to calculate the rowSums of columns that start with one string and end with another, using dplyr (in my example: starts_with('c_') & ends_with('_f')).

My current code is as follows (and works fine):

df <- df %>% mutate(row.sum = rowSums(select(select(., starts_with('c_')), ends_with('_f'))))

However, as you can see, nesting one select() call inside another is a bit messy. Is there a way to combine starts_with() and ends_with() within a single select() call? Or do you have other ideas for making this line of code more elegant using dplyr?

EDIT: To make the example reproducible:

names <- c('c_first_f', 'c_second_o', 't_third_f', 'c_fourth_f')
values <- c(5, 3, 2, 5)
df <- as.data.frame(t(values))  # dplyr verbs need a data frame, not a matrix
colnames(df) <- names
> df
  c_first_f c_second_o t_third_f c_fourth_f
1         5          3         2          5

Thus, here I want to sum the first and fourth columns, giving a row sum of 10.

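We could use select_at() with matches(), which selects columns by regular expression (note that the result keeps only the matched columns plus row.sum):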

library(dplyr)
df %>% select_at(vars(matches("^c_.*_f$"))) %>% mutate(row.sum = rowSums(.))

And with base R:

df$row.sum <- rowSums(df[grep("^c_.*_f$", names(df))])
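To see why the pattern "^c_.*_f$" picks the right columns, here is a small sketch that applies it to the column names from the question. The variable names col_names, is_match and matched are only illustrative. The anchor ^ ties "c_" to the start of the name, $ ties "_f" to the end, and .* allows anything in between.

```r
# Column names from the question's example
col_names <- c("c_first_f", "c_second_o", "t_third_f", "c_fourth_f")

# TRUE where a name starts with "c_" and ends with "_f"
is_match <- grepl("^c_.*_f$", col_names)

# Keep only the matching names
matched <- col_names[is_match]
print(matched)

# A regex-free check for these names, using base R's startsWith()/endsWith()
same_result <- startsWith(col_names, "c_") & endsWith(col_names, "_f")
print(identical(is_match, same_result))
```

For these four names, both checks select c_first_f and c_fourth_f.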

We can use a tidyverse approach with purrr::reduce():

library(dplyr)
library(purrr)
df %>%
     select_at(vars(matches("^c_.*_f$")))  %>%
     mutate(rowSum = reduce(., `+`))

Or, with newer versions of dplyr, select() can take matches() directly:

df %>%
    select(matches("^c_.*_f$")) %>%
    mutate(rowSum = reduce(., `+`))
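As a possible alternative that stays closest to the original question, the sketch below combines starts_with() and ends_with() with `&` in a single selection. It assumes dplyr 1.0.0 or later, where selection helpers can be combined with boolean operators and across() is available inside mutate(). The data frame is rebuilt here so the example is self-contained, and the name result is only illustrative.

```r
library(dplyr)

# Example data from the question, built directly as a data frame
df <- data.frame(c_first_f = 5, c_second_o = 3, t_third_f = 2, c_fourth_f = 5)

# `&` keeps only columns that satisfy both helpers;
# across() hands those columns to rowSums() while mutate() keeps all columns
result <- df %>%
  mutate(row.sum = rowSums(across(starts_with("c_") & ends_with("_f"))))

print(result)

# The same combined selection also works in a single select() call
selected <- df %>% select(starts_with("c_") & ends_with("_f"))
print(names(selected))
```

Unlike the select()-then-mutate() pipelines above, this keeps c_second_o and t_third_f in the output and only adds row.sum.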
