Coalesce two string columns with alternating missing values to one
I have a data frame with two columns "a" and "b" with alternating missing values (NA
)
a b
dog <NA>
mouse <NA>
<NA> cat
bird <NA>
I want to "merge" / combine them to a new column c that looks like this, i.e. the non-NA
element in each row is selected:
c
dog
mouse
cat
bird
I tried merge
and join
, but neither worked as I wanted. Maybe because I do not have an id with which to merge? For integers I would just circumvent this and add both columns, but how in my case?
You may try pmax
df$c <- pmax(df$a, df$b)
df
# a b c
# 1 dog <NA> dog
# 2 mouse <NA> mouse
# 3 <NA> cat cat
# 4 bird <NA> bird
...or ifelse
:
df$c <- ifelse(is.na(df$a), df$b, df$a)
For more general solutions in cases with more than two columns, you find several ways to implement coalesce in R here.
I wrote a coalesce() function for this type of task which works much like the SQL coalesce function. You would use it like
dd<-read.table(text="a b
dog NA
mouse NA
NA cat
bird NA", header=T)
dd$c <- with(dd, coalesce(a,b))
dd
# a b c
# 1 dog <NA> dog
# 2 mouse <NA> mouse
# 3 <NA> cat cat
# 4 bird <NA> bird
dpyr
has exactly what you are looking for, function coalesce()
library(dplyr)
a<-c("dog","mouse",NA,"bird")
b<-c(NA,NA,"cat",NA)
coalesce(a,b)
[1] "dog" "mouse" "cat" "bird"