Convert column names to title case [duplicate]
I'm working on an Rmd file which will be turned into an HTML report using knitr. I imported data from an xls file, used clean_names() for the column names, and finished the manipulations. This is a sample of the data:
df <- data.frame(precinct = c("a_b_c", "b_c_d", "e_f_g"), steve_alpha = c(309, 337, 294), mike_bravo = c(120, 151, 240), allan_charlie = c(379, 442, 597))
Now I want to present the data aesthetically in a table using kable() but the column names and the contents of the "precinct" column need to be in title case. Is there a function that will do this all at once?
Solution 1:
There are a couple of wrinkles here. First, I'm interpreting title case as meaning each word starts with a capital letter, and underscores are replaced by spaces. Second, some solutions will work on the data frame's names but not on the precinct column, because tools::toTitleCase, which underlies some other functions including the snakecase function that I initially suggested, assumes that a single letter shouldn't be capitalized.
snakecase::to_title_case(names(df))
#> [1] "Precinct" "Steve Alpha" "Mike Bravo" "Allan Charlie"
snakecase::to_title_case(df$precinct, sep_out = " ", sep_in = "_")
#> [1] "A b c" "B c d" "E f g"
That doesn't look like the right outcome for precincts. Knowing that every precinct has only single-letter words, you could just replace the underscores and then convert to all caps, but that won't hold for any other words. Alternatively, stringr::str_to_title doesn't keep single-letter words lowercase, so do the replacement and then pass the result to that.
stringr::str_to_title(stringr::str_replace_all(df$precinct, "_", " "))
#> [1] "A B C" "B C D" "E F G"
I mentioned in a comment having made a similar function for a package at work, which handles a variety of cases and which people should feel free to copy. This is a greatly pared down version that replaces underscores with spaces, then replaces any lowercase letter at the start of a word with its uppercase counterpart, so it will work on both the column names and the precinct values.
clean_titles <- function(x) {
  x <- gsub("_", " ", x)
  x <- gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
  x
}
clean_titles(names(df))
#> [1] "Precinct" "Steve Alpha" "Mike Bravo" "Allan Charlie"
clean_titles(df$precinct)
#> [1] "A B C" "B C D" "E F G"
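To make the behaviour of the `\\b([a-z])` pattern concrete, here is a small sketch on a few made-up inputs (they are not from the question's data). It shows that existing capitals are left alone, and that a word boundary also occurs after an apostrophe, which may or may not be what you want.

```r
# Same helper as above, repeated so this block runs on its own.
clean_titles <- function(x) {
  x <- gsub("_", " ", x)
  gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
}

# Hypothetical inputs chosen to probe edge cases.
multi_word <- clean_titles("north_east_ward")
multi_word
#> [1] "North East Ward"

# Letters that are already uppercase are not touched.
acronym <- clean_titles("USA_total")
acronym
#> [1] "USA Total"

# An apostrophe creates a word boundary, so the letter after it is capitalized.
apostrophe <- clean_titles("don't_know")
apostrophe
#> [1] "Don'T Know"
```

If apostrophes can appear in your labels, a pattern anchored on the start of the string and on spaces rather than on `\\b` would avoid the last result.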
Finally, because you have a function that does this, you can use it both in dplyr::mutate to change that one column, and in dplyr::rename_with to change all column names.
library(dplyr)
df %>%
  mutate(precinct = clean_titles(precinct)) %>%
  rename_with(clean_titles)
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    A B C         309        120           379
#> 2    B C D         337        151           442
#> 3    E F G         294        240           597
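Since the end goal is a kable() table, another option is to leave the snake_case names in the data and apply the function only at presentation time through kable()'s col.names argument. This is a minimal sketch, assuming knitr is installed; `report_df` and `pretty_names` are names I made up for illustration, and the helper is repeated so the block runs on its own.

```r
# Helper from this answer, repeated so the block is self-contained.
clean_titles <- function(x) {
  x <- gsub("_", " ", x)
  gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
}

df <- data.frame(
  precinct = c("a_b_c", "b_c_d", "e_f_g"),
  steve_alpha = c(309, 337, 294),
  mike_bravo = c(120, 151, 240),
  allan_charlie = c(379, 442, 597)
)

# Tidy the values of the one text column; the names stay snake_case in the data.
report_df <- transform(df, precinct = clean_titles(precinct))

# Display headers are computed once and handed to kable() for output only.
pretty_names <- clean_titles(names(report_df))

# Guarded so the sketch still runs where knitr is not available.
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(report_df, col.names = pretty_names))
}
```

Keeping the original names in the data frame means later code can still refer to columns without backticks, while the rendered table shows the title-case headers.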
Solution 2:
Can't this be done with pure regex, just passing perl = TRUE to gsub and using the \U modifier?
gsub("(^|_)([[:alpha:]])", "\\1\\U\\2", names(df), perl = TRUE)
## [1] "Precinct" "Steve_Alpha" "Mike_Bravo" "Allan_Charlie"
Hence:
to_title <- function(x) {
  gsub("_", " ", gsub("(^|_)([[:alpha:]])", "\\1\\U\\2", x, perl = TRUE))
}
df$precinct <- to_title(df$precinct)
names(df) <- to_title(names(df))
df
##   Precinct Steve Alpha Mike Bravo Allan Charlie
## 1    A B C         309        120           379
## 2    B C D         337        151           442
## 3    E F G         294        240           597
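Because the pattern only fires at the start of the string or right after an underscore, letters that follow other characters are left as they are. A short sketch on two made-up inputs (not from the question's data) illustrates this:

```r
# Same function as above, repeated so this block runs on its own.
to_title <- function(x) {
  gsub("_", " ", gsub("(^|_)([[:alpha:]])", "\\1\\U\\2", x, perl = TRUE))
}

# The letter after an apostrophe is not capitalized.
with_apostrophe <- to_title("don't_know")
with_apostrophe
## [1] "Don't Know"

# A space that is already in the input does not trigger capitalization.
with_space <- to_title("north east_ward")
with_space
## [1] "North east Ward"
```

So this version suits input that is strictly snake_case; text that already mixes spaces and underscores would need the pattern extended, for example to `(^|_| )`.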
Solution 3:
There are a few ways you can do this. The tools package has a function called toTitleCase. Combined with sub, you can rename the columns like this (sub replaces only the first underscore in each name, which is enough for these column names):
names(df) <- tools::toTitleCase(sub("_", " ", names(df)))
df
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597
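One thing to keep in mind with the sub call above: sub replaces only the first match in each string, while gsub replaces all of them. A quick sketch of the difference, using one column name and one precinct value from the question:

```r
x <- c("steve_alpha", "a_b_c")

# sub() stops after the first underscore in each element.
first_only <- sub("_", " ", x)
first_only
#> [1] "steve alpha" "a b_c"

# gsub() replaces every underscore.
all_matches <- gsub("_", " ", x)
all_matches
#> [1] "steve alpha" "a b c"
```

For names with a single underscore the two give the same result, but gsub is the safer default if a name might contain more than one.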
An equivalent way uses the function str_to_title from the excellent stringr package:
names(df) <- stringr::str_to_title(sub("_", " ", names(df)))
df
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597
Finally, if you are keen to use pipes and dplyr:
library(dplyr)
df <- df |>
  rename_with(~ gsub("_", " ", .x)) |>
  rename_with(stringr::str_to_title)
df
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597
Solution 4:
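Another option is to split each name on the underscore, title-case the pieces, and paste them back together. As written, this keeps the underscores in the names and leaves the precinct column unchanged: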
library(stringr)
names(df) <- sapply(strsplit(names(df), "_"), \(x) {
  paste0(str_to_title(x), collapse = "_")
})
df
#>   Precinct Steve_Alpha Mike_Bravo Allan_Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597
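The same split-and-paste idea can be wrapped in a function that joins with a space, so it can be applied to the precinct values as well as to the names. This is only a sketch: `title_each` and its arguments are names I made up, and it swaps str_to_title for base R's toupper on the first character, so unlike str_to_title it does not lowercase the rest of each word.

```r
# Hypothetical helper: split on sep_in, capitalize each piece, join with sep_out.
title_each <- function(x, sep_in = "_", sep_out = " ") {
  vapply(strsplit(x, sep_in, fixed = TRUE), function(parts) {
    # Uppercase the first character of each piece and keep the remainder as is.
    paste0(toupper(substring(parts, 1, 1)), substring(parts, 2), collapse = sep_out)
  }, character(1))
}

new_names <- title_each(c("precinct", "steve_alpha", "mike_bravo", "allan_charlie"))
new_names
#> [1] "Precinct"      "Steve Alpha"   "Mike Bravo"    "Allan Charlie"

new_precincts <- title_each(c("a_b_c", "b_c_d", "e_f_g"))
new_precincts
#> [1] "A B C" "B C D" "E F G"
```

Using vapply with character(1) rather than sapply guarantees a character vector comes back, even for zero-length input.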