Convert column names to title case [duplicate]

I'm working on an Rmd that will be turned into an HTML report using knitr. I imported data from an xls file, used clean_names() for the column names, and finished the manipulations. This is a sample of the data:

df <- data.frame(precinct = c("a_b_c", "b_c_d", "e_f_g"), steve_alpha = c(309, 337, 294), mike_bravo = c(120, 151, 240), allan_charlie = c(379, 442, 597))

Now I want to present the data aesthetically in a table using kable() but the column names and the contents of the "precinct" column need to be in title case. Is there a function that will do this all at once?

Solution 1:

There are a couple of wrinkles here. First, I'm interpreting title case as meaning that each word starts with a capital letter and underscores are replaced by spaces. Second, some solutions will work on the data frame's names but not on the precinct column, because tools::toTitleCase, which underlies some other functions including the snakecase function that I initially suggested, assumes that a single letter shouldn't be capitalized.

snakecase::to_title_case(names(df))
#> [1] "Precinct"      "Steve Alpha"   "Mike Bravo"    "Allan Charlie"
snakecase::to_title_case(df$precinct, sep_out = " ", sep_in = "_")
#> [1] "A b c" "B c d" "E f g"

That doesn't look like the right outcome for the precincts. Knowing that every precinct consists only of single-letter words, you could just replace the underscores and then convert to all caps, but that won't hold for any other words. Alternatively, stringr::str_to_title doesn't keep single-letter words lowercase, so do the replacement and then pass the result to that.

stringr::str_to_title(stringr::str_replace_all(df$precinct, "_", " "))
#> [1] "A B C" "B C D" "E F G"
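
For comparison, the all-caps shortcut mentioned above might look like the following base R sketch. The precinct vector just repeats the sample values, and "north_side" is a made-up value added to show why the shortcut doesn't generalize.

```r
# Sample precinct values from the question
precinct <- c("a_b_c", "b_c_d", "e_f_g")

# Replace underscores, then upper-case everything
all_caps <- toupper(gsub("_", " ", precinct, fixed = TRUE))
all_caps
#> [1] "A B C" "B C D" "E F G"

# Hypothetical multi-letter value: every letter is capitalized, not just the first
longer <- toupper(gsub("_", " ", "north_side", fixed = TRUE))
longer
#> [1] "NORTH SIDE"
```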

I mentioned in a comment having made a similar function for a package at work, which handles a variety of cases and which people should feel free to copy. This is a greatly pared-down version that replaces underscores with spaces, then replaces any lowercase letter at the start of a word with its uppercase counterpart, so it works in both instances.

clean_titles <- function(x) {
  x <- gsub("_", " ", x)
  x <- gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
  x
}

clean_titles(names(df))
#> [1] "Precinct"      "Steve Alpha"   "Mike Bravo"    "Allan Charlie"
clean_titles(df$precinct)
#> [1] "A B C" "B C D" "E F G"
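
One caveat: \\b also matches right after an apostrophe or hyphen, so a letter that follows one gets capitalized too. If that matters for your data, a variant that only capitalizes at the start of the string or after whitespace might look like the sketch below. clean_titles_ws is a hypothetical name, and "don't_stop" is an invented input, not something from the question.

```r
# Pared-down function from above: capitalizes at every word boundary
clean_titles <- function(x) {
  x <- gsub("_", " ", x)
  x <- gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
  x
}

# Hypothetical variant: capitalizes only at the string start or after whitespace
clean_titles_ws <- function(x) {
  x <- gsub("_", " ", x)
  gsub("(^|\\s)([a-z])", "\\1\\U\\2", x, perl = TRUE)
}

boundary <- clean_titles("don't_stop")
boundary
#> [1] "Don'T Stop"

whitespace <- clean_titles_ws("don't_stop")
whitespace
#> [1] "Don't Stop"

# The sample data still comes out the same
sample_out <- clean_titles_ws(c("a_b_c", "steve_alpha"))
sample_out
#> [1] "A B C"       "Steve Alpha"
```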

Finally, because you now have a function that does this, you can use it both in dplyr::mutate to change that one column and in dplyr::rename_with to change all the column names.

library(dplyr)

df %>%
  mutate(precinct = clean_titles(precinct)) %>%
  rename_with(clean_titles)
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    A B C         309        120           379
#> 2    B C D         337        151           442
#> 3    E F G         294        240           597
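
Since the end goal is a kable() table, another option is to leave the snake_case names in the data and only title-case the header at display time through kable's col.names argument. This is a sketch under the assumption that knitr is installed; display_df and tbl are just illustrative names.

```r
library(knitr)

# Sample data from the question
df <- data.frame(
  precinct = c("a_b_c", "b_c_d", "e_f_g"),
  steve_alpha = c(309, 337, 294),
  mike_bravo = c(120, 151, 240),
  allan_charlie = c(379, 442, 597)
)

clean_titles <- function(x) {
  x <- gsub("_", " ", x)
  gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
}

# Title-case the precinct values in a copy that is used only for display
display_df <- df
display_df$precinct <- clean_titles(display_df$precinct)

# col.names changes the header shown in the table, not the data frame's names
tbl <- kable(display_df, col.names = clean_titles(names(display_df)))
tbl
```

The data frame keeps its clean_names() style column names this way, which can be convenient if more manipulation happens after the table is printed.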

Solution 2:

Can't this be done with pure regex, just passing perl = TRUE to gsub and using the \U modifier?

gsub("(^|_)([[:alpha:]])", "\\1\\U\\2", names(df), perl = TRUE)
## [1] "Precinct"      "Steve_Alpha"   "Mike_Bravo"    "Allan_Charlie"

Hence:

to_title <- function(x) {
  gsub("_", " ", gsub("(^|_)([[:alpha:]])", "\\1\\U\\2", x, perl = TRUE))
}
df$precinct <- to_title(df$precinct)
names(df) <- to_title(names(df))
df
##   Precinct Steve Alpha Mike Bravo Allan Charlie
## 1    A B C         309        120           379
## 2    B C D         337        151           442
## 3    E F G         294        240           597
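
If the input might not be all lowercase, the same idea can be extended. According to the regex help page, with perl = TRUE the replacement may also contain \L (lower-case the rest of the replacement) and \E (end case conversion). The sketch below uses \L on the remainder of each word; to_title_mixed is a hypothetical name and the mixed-case inputs are invented.

```r
# Hypothetical extension: upper-case the first letter of each word and
# lower-case the remaining letters, then swap underscores for spaces
to_title_mixed <- function(x) {
  x <- gsub("(^|_)([[:alpha:]])([[:alpha:]]*)", "\\1\\U\\2\\L\\3", x, perl = TRUE)
  gsub("_", " ", x, fixed = TRUE)
}

mixed <- to_title_mixed(c("STEVE_alpha", "mike_BRAVO", "a_b_c"))
mixed
## [1] "Steve Alpha" "Mike Bravo"  "A B C"
```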

Solution 3:

There are a few ways you can do this.

There's a function in the tools package called toTitleCase. With this and sub, you can rename the columns like this (only the column names change; the precinct values are left as they are):

names(df) <- tools::toTitleCase(sub("_", " ", names(df)))

df
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597

An equivalent way using the function str_to_title from the excellent stringr package:

names(df) <- stringr::str_to_title(sub("_", " ", names(df)))
df
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597

Finally, if you are keen to use pipes and dplyr:

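library(dplyr)

# rename_all() still works, but dplyr has superseded it with rename_with()
df <- df |>
  rename_all(~ gsub("_", " ", .)) |>
  rename_all(stringr::str_to_title)
df
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597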
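
One thing to watch with the underscore replacement: sub() replaces only the first match in each string, while gsub() replaces all of them. The sample column names contain a single underscore each, so both give the same result there, but a name with more than one underscore would not. A quick check, using a made-up column name:

```r
# Hypothetical column name with two underscores
nm <- "mary_ann_delta"

# sub() stops after the first underscore
first_only <- sub("_", " ", nm, fixed = TRUE)
first_only
#> [1] "mary ann_delta"

# gsub() replaces every underscore
all_of_them <- gsub("_", " ", nm, fixed = TRUE)
all_of_them
#> [1] "mary ann delta"
```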

Solution 4:

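Another option is to split each name on the underscore, title-case the pieces, and paste them back together. Note that this keeps the underscores in the names and leaves the precinct column unchanged: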

library(stringr)

names(df) <- sapply(strsplit(names(df), "_"), \(x) {
  paste0(str_to_title(x), collapse = "_")
})

df
#>   Precinct Steve_Alpha Mike_Bravo Allan_Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597
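
If spaces are wanted instead of underscores, and the same helper should also handle the precinct values, the split-and-paste idea could be adapted roughly as below. This sketch uses base R rather than stringr, and title_words is a made-up name, so treat it as one possible variant rather than the method above.

```r
# Hypothetical helper: split on "_", capitalize each piece, rejoin with `sep`
title_words <- function(x, sep = " ") {
  vapply(strsplit(x, "_", fixed = TRUE), function(words) {
    # Upper-case the first letter of each piece and lower-case the rest
    paste0(toupper(substring(words, 1, 1)),
           tolower(substring(words, 2)),
           collapse = sep)
  }, character(1))
}

col_titles <- title_words(c("precinct", "steve_alpha", "mike_bravo", "allan_charlie"))
col_titles
#> [1] "Precinct"      "Steve Alpha"   "Mike Bravo"    "Allan Charlie"

precinct_titles <- title_words(c("a_b_c", "b_c_d", "e_f_g"))
precinct_titles
#> [1] "A B C" "B C D" "E F G"
```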