R string split and compress empty space
This is supposed to be a simple question, but I couldn't figure it out. I'm trying to use a given list of variable names to select data variables; however, when I apply a string split I get 8 elements instead of 5. Clearly, the extra spaces were split into empty strings at positions 1, 3, and 5. Any hints on how to solve this?
list <- " ethnicity_source_value  race_source_value  gender_source_value dx_age site"
unlist(strsplit(list, " "))
[1] "" "ethnicity_source_value" ""
[4] "race_source_value" "" "gender_source_value"
[7] "dx_age" "site"
We could also use str_squish() from the stringr package; besides trimming leading and trailing whitespace, it also reduces repeated whitespace inside a string to a single space:
library(stringr)
unlist(strsplit(str_squish(list), " "))
[1] "ethnicity_source_value" "race_source_value"
[3] "gender_source_value" "dx_age"
[5] "site"
The string has a leading space, so we use trimws to remove the leading/trailing spaces, and then use strsplit with split set to one or more whitespace characters (\\s+), since there may be more than a single space between the words.
unlist(strsplit(trimws(list), "\\s+"))
[1] "ethnicity_source_value" "race_source_value" "gender_source_value" "dx_age"
[5] "site"
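To see why trimws is needed even when splitting on \\s+, here is a minimal sketch. It assumes the input has a leading space and doubled spaces between the first three names, and it uses the name vars (not from the original post) rather than list to avoid confusion with base R's list().

```r
# Assumed input: leading space plus doubled spaces between some names.
vars <- " ethnicity_source_value  race_source_value  gender_source_value dx_age site"

# Without trimws(), the match at the start of the string
# produces an empty first element.
untrimmed <- strsplit(vars, "\\s+")[[1]]
length(untrimmed)    # 6
untrimmed[1] == ""   # TRUE

# With trimws(), only the five variable names remain.
trimmed <- strsplit(trimws(vars), "\\s+")[[1]]
length(trimmed)      # 5
```

strsplit keeps an empty string when the separator matches at the beginning of the input but not when it matches at the end, so it is the leading space that matters here.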
Another option is scan, which splits on whitespace automatically:
scan(text = list, what = "", quiet = TRUE)
[1] "ethnicity_source_value" "race_source_value" "gender_source_value" "dx_age"
[5] "site"
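Since the goal in the question is to pick data variables from this list, the cleaned vector can then be used for column selection. The following is only a sketch: the data frame df, its values, and the names vars, keep, and selected are hypothetical and do not come from the original post.

```r
vars <- " ethnicity_source_value  race_source_value  gender_source_value dx_age site"
keep <- scan(text = vars, what = "", quiet = TRUE)

# Hypothetical data frame: the five named columns plus one extra column.
df <- data.frame(
  ethnicity_source_value = "a",
  race_source_value      = "b",
  gender_source_value    = "c",
  dx_age                 = 1,
  site                   = "x",
  other                  = 0
)

# intersect() guards against names that are absent from the data;
# drop = FALSE keeps the result a data frame even for a single column.
selected <- df[, intersect(keep, names(df)), drop = FALSE]
names(selected)
```

Any of the splitting approaches above would produce the same keep vector; scan is used here only for brevity.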