How to get the second sub element of every element in a list
I know I've come across this problem before, but I'm having a bit of a mental block at the moment. and as I can't find it on SO, I'll post it here so I can find it next time.
I have a dataframe that contains a field representing an ID label. This label has two parts, an alpha prefix and a numeric suffix. I want to split it apart and create two new fields with these values in.
structure(list(lab = c("N00", "N01", "N02", "B00", "B01", "B02",
"Z21", "BA01", "NA03")), .Names = "lab", row.names = c(NA, -9L
), class = "data.frame")
df$pre<-strsplit(df$lab, "[0-9]+")
df$suf<-strsplit(df$lab, "[A-Z]+")
Which gives
lab pre suf 1 N00 N , 00 2 N01 N , 01 3 N02 N , 02 4 B00 B , 00 5 B01 B , 01 6 B02 B , 02 7 Z21 Z , 21 8 BA01 BA , 01 9 NA03 NA , 03
So, the first strsplit works fine, but the second gives a list, each having two elements, an empty string and the result I want, and stuffs them both into the dataframe column.
How can I select the second sub-element from each element of the list ? (or, is there a better way to do this)
Solution 1:
To select the second element of each list item:
R> sapply(df$suf, "[[", 2)
[1] "00" "01" "02" "00" "01" "02" "21" "01" "03"
An alternative approach using regular expressions:
df$pre <- sub("^([A-Z]+)[0-9]+", "\\1", df$lab)
df$suf <- sub("^[A-Z]+([0-9]+)", "\\1", df$lab)
Solution 2:
with purrr::map this would be
df$suf %>% map_chr(c(2))
for further info on purrr::map
Solution 3:
First of all: if you use str(df)
you'll see that df$pre
is list
. I think you want vector
(but I might be wrong).
Return to problem - in this case I will use gsub
:
df$pre <- gsub("[0-9]", "", df$lab)
df$suf <- gsub("[A-Z]", "", df$lab)
This guarantee that both columns are vectors, but it fail if your label is not from key (i.e. 'AB01B'
).