R grep(): return the whole matching line (similar to Unix grep -A/-B)?
I've been searching for a way to ask grep to return the whole line for a matching pattern. Is there functionality for this in R's grep()? I am imagining something like the Unix grep option -A n.
Some context: for a paper I've written, I want to create a data table or a vector of all citations in the paper. Extracting everything within parentheses using qdapRegex::rm_round() sometimes returns only a year (for citations written like 'As put forth by Smith (2020)'). It would be nice to grab the whole sentence instead of just '2020'.
Any thoughts? Thank you!
Solution 1:
grep() has an argument, value, which you can set to TRUE to get the whole string back.
Consider this example where you are looking for numbers.
x <- c('This is 2022', 'This is not a year', '2021 was last year')
grep('\\d+', x)
#[1] 1 3
By default, grep() returns the indices of the elements where a match is found.
If you need the complete strings as output, set value = TRUE:
grep('\\d+', x, value = TRUE)
#[1] "This is 2022" "2021 was last year"
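To connect this to the citation use case in the question, here is a minimal sketch that splits a text into sentences and keeps those containing a year in parentheses. The sample text, the variable names (paper, sentences, cited) and the sentence-splitting regex are assumptions for illustration; the split is naive and will break on abbreviations such as "et al.".

```r
# Hypothetical sample text standing in for the paper
paper <- paste(
  "Prior work is limited.",
  "As put forth by Smith (2020), this matters.",
  "We agree (Jones 2019).",
  "No citation here."
)

# Naive sentence split: break on whitespace that follows ., ! or ?
sentences <- unlist(strsplit(paper, "(?<=[.!?])\\s+", perl = TRUE))

# Keep whole sentences that contain a four-digit year inside parentheses
cited <- grep("\\([^)]*\\d{4}[^)]*\\)", sentences, value = TRUE)
cited
```

Here cited holds the two sentences with a parenthesised year, while the sentences without one are dropped.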
Solution 2:
s <- c("As put forth by Smith (2020)",
"As put forth by Smith 2020",
"As put forth by Smith",
"As put forth (Smith 2020)")
s[grep(pattern = "\\(.*\\)", x = s)]
#> [1] "As put forth by Smith (2020)" "As put forth (Smith 2020)"
Created on 2022-01-14 by the reprex package (v2.0.1)
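R's grep() has no direct counterpart to the Unix -A/-B context options, but the indices it returns can be expanded by hand. The following is a rough sketch under that assumption; grep_context() and its before/after parameters are hypothetical names, not part of base R.

```r
# Hypothetical helper: return matching elements plus surrounding context,
# loosely mimicking Unix grep -B <before> -A <after>
grep_context <- function(pattern, x, before = 0, after = 0) {
  hits <- grep(pattern, x)
  # For each hit, build the index window, clamped to the bounds of x
  idx <- as.integer(unlist(lapply(hits, function(i) {
    seq(max(1, i - before), min(length(x), i + after))
  })))
  # Drop duplicates from overlapping windows and keep the original order
  x[sort(unique(idx))]
}

lines <- c("Intro sentence.",
           "Background sentence.",
           "As put forth by Smith (2020).",
           "Follow-up sentence.",
           "Closing sentence.")

# One element of context on each side of the match
grep_context("\\(\\d{4}\\)", lines, before = 1, after = 1)
```

With before = 0 and after = 0 the helper behaves like grep(..., value = TRUE), and it returns an empty character vector when nothing matches.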