Apostrophes and regular expressions; Cleaning text in R
You can use
gsub("(?i)\\b(?<!')(?![AOI])\\p{L}\\b", "", x, perl=TRUE)
Details:
- (?i) - case insensitive matching on
- \b - a word boundary
- (?<!') - no ' is allowed immediately on the left
- (?![AOI]) - the next char cannot be A, I, or O (in either case, since (?i) is on)
- \p{L} - any Unicode letter
- \b - a word boundary
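
As a rough illustration of how the pattern behaves, here is a minimal sketch on ASCII sample text. The helper name remove_single_letters and the sample strings are made up for this example, and the whitespace clean-up at the end is an extra step that is not part of the original pattern.

```r
# Hypothetical helper; the name is not from the original answer.
remove_single_letters <- function(x) {
  # Drop single-letter words, except A/O/I (any case) and letters
  # that directly follow an apostrophe (the "t" in "don't", the "s" in "Tom's").
  out <- gsub("(?i)\\b(?<!')(?![AOI])\\p{L}\\b", "", x, perl = TRUE)
  # Assumption beyond the original: collapse the gaps left by removed letters.
  trimws(gsub("\\s+", " ", out))
}

x <- c("I don't see a b or c in Tom's x file",
       "O say, can you see j k")
remove_single_letters(x)
# [1] "I don't see a or in Tom's file" "O say, can you see"
```

The stray b, c, x, j and k are removed, while I, a, O and the letters after apostrophes are kept.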