Find multiple strings using str_extract_all

I have a character vector of strings to search for:

tofind<-c("aaa","bbb","ccc","ddd")

I also have a vector of strings to search in:

n<-c("aaabbb","aaa","aaacccddd","eee")

I want to find all matches of the tofind strings in each element of n, so that the output is:

aaa,bbb
aaa
aaa,ccc,ddd

I think I can use str_extract_all, but it doesn't give me the expected output:

library(stringr)
sapply(n, function(x) str_extract_all(x, tofind))

How do I get the expected output?


Solution 1:

You could collapse the search strings into a single regex, using | for alternation, so that every string is matched in one pass:

tofind <- paste(c("aaa","bbb","ccc","ddd"), collapse="|")

str_extract_all(n, tofind)
[[1]]
[1] "aaa" "bbb"

[[2]]
[1] "aaa"

[[3]]
[1] "aaa" "ccc" "ddd"

[[4]]
character(0)
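
If you need the comma-separated form shown in the question rather than a list, one possible follow-up is to paste each element's matches together and drop the empty results. This is only a sketch: the names pattern, matches and joined are my own, and it assumes the search strings contain no regex metacharacters.

```r
library(stringr)

tofind <- c("aaa", "bbb", "ccc", "ddd")
n <- c("aaabbb", "aaa", "aaacccddd", "eee")

# Build the alternation regex under a separate name (an assumption of this
# sketch) so that tofind itself is left untouched
pattern <- paste(tofind, collapse = "|")
matches <- str_extract_all(n, pattern)

# Join the matches of each element with commas; an element with no
# matches becomes the empty string ""
joined <- vapply(matches, paste, character(1), collapse = ",")

# Keep only the elements that had at least one match
result <- joined[joined != ""]
result
#> [1] "aaa,bbb"     "aaa"         "aaa,ccc,ddd"
```

Dropping the empty strings loses the link back to the positions in n, so keep joined as it is if you need to know which element produced which result.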

Solution 2:

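The str_detect function can help here. It is vectorised over the pattern, so str_detect(x, tofind) returns one TRUE/FALSE per search string, which can then be used to subset tofind: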


suppressPackageStartupMessages(library(tidyverse))
library(stringr)

tofind <- c("aaa", "bbb", "ccc", "ddd")
n <- c("aaabbb", "aaa", "aaacccddd", "eee")

sapply(n, function(x) tofind[str_detect(x, tofind)], USE.NAMES = FALSE)
#> [[1]]
#> [1] "aaa" "bbb"
#> 
#> [[2]]
#> [1] "aaa"
#> 
#> [[3]]
#> [1] "aaa" "ccc" "ddd"
#> 
#> [[4]]
#> character(0)

# or the tidyverse alternative, using purrr::map
n %>%
  map(function(x, y) y[str_detect(x, y)], tofind)
#> [[1]]
#> [1] "aaa" "bbb"
#> 
#> [[2]]
#> [1] "aaa"
#> 
#> [[3]]
#> [1] "aaa" "ccc" "ddd"
#> 
#> [[4]]
#> character(0)
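
Note that the two approaches give the same result on the example data but are not equivalent in general. As a small illustration (the input string x below is made up for this purpose), the regex approach returns every match in order of appearance, repeats included, while the str_detect approach returns each search string at most once, in the order of tofind:

```r
library(stringr)

tofind <- c("aaa", "bbb", "ccc", "ddd")
x <- "bbbaaaaaa"  # hypothetical input: "bbb" first, then "aaa" twice

# Single-regex approach: matches in order of appearance, with repeats
by_regex <- str_extract_all(x, paste(tofind, collapse = "|"))[[1]]
by_regex
#> [1] "bbb" "aaa" "aaa"

# str_detect approach: each search string at most once, in tofind order
by_detect <- tofind[str_detect(x, tofind)]
by_detect
#> [1] "aaa" "bbb"
```

Which one is preferable depends on whether you care about how often and where a string occurs, or only about whether it occurs at all.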