Adding a string between some pattern in R
I would like to add the string "AND" between the words "STREET" and "HENRY" into the following string:
WEST 156 STREET HENRY HUDSON PARKWAY
so that it reads WEST 156 STREET AND HENRY HUDSON PARKWAY. Essentially, I am trying to geocode intersections, so I would like to add "AND" between a street type (AVENUE, STREET, BLVD, etc.) and whatever word comes after it, creating an intersection like the one above.
Here are a couple more examples (just made up):
strings = c("WEST 135TH AVE BROADWAY", # want WEST 135TH AVE AND BROADWAY,
"SUNSET BLVD MAIN ST", # SUNSET BLVD AND MAIN ST
"W 45TH ST LAKESHORE BLVD", #...
"HIGH ST BROAD ST") # ...
I would greatly appreciate any help! I am somewhat familiar with regular expressions, but I am not familiar with how to insert another word in this manner.
Solution 1:
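Capture the street type as a group and replace it with the backreference (\\1) followed by the substring "AND". The third and fourth strings are left unchanged because "ST" is not among the alternatives in the pattern, and the "BLVD" in the third string sits at the end of the string, where \\s+ (one or more whitespace characters) has nothing to match.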
sub("(AVENUE|AVE|STREET|BLVD)\\s+", "\\1 AND ", strings)
-output
[1] "WEST 135TH AVE AND BROADWAY" "SUNSET BLVD AND MAIN ST"
[3] "W 45TH ST LAKESHORE BLVD" "HIGH ST BROAD ST"
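To also cover the "ST" cases, one possible extension is sketched below. It is a variation on the pattern above, not part of the original answer: adding ST to the alternation and a leading \\b are my assumptions about what is wanted. The \\b matters once "ST" is included, because otherwise the "ST" at the end of "WEST" would match first. The names `pattern` and `result` are only illustrative.

```r
strings <- c("WEST 135TH AVE BROADWAY",
             "SUNSET BLVD MAIN ST",
             "W 45TH ST LAKESHORE BLVD",
             "HIGH ST BROAD ST",
             "WEST 156 STREET HENRY HUDSON PARKWAY")

# \\b anchors the street type at the start of a word, so the "ST" inside
# "WEST" is skipped; the trailing \\s+ means a street type at the very end
# of the string is never matched.
pattern <- "\\b(AVENUE|AVE|STREET|ST|BLVD)\\s+"

# sub() replaces only the first match, so the second street type is untouched.
result <- sub(pattern, "\\1 AND ", strings, perl = TRUE)
print(result)
```

Because sub() only touches the first street type it finds, this sketch assumes the first street name does not itself begin with a street-type word (for example "ST MARKS PL 2ND AVE" would be split in the wrong place), so real address data may need extra handling.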