How to match something with regex that is not between two special characters?
Solution 1:
Assuming the quotes are correctly balanced and there are no escaped quotes, then it's easy:
result = subject.gsub(/a(?=(?:[^"]*"[^"]*")*[^"]*\Z)/, '')
This replaces every a with the empty string if and only if an even number of quotes follows the matched a, which means the a is not inside a quoted section.
Explanation:
a # Match a
(?= # only if it's followed by...
(?: # ...the following:
[^"]*" # any number of non-quotes, followed by one quote
[^"]*" # the same again, ensuring an even number
)* # any number of times (0, 2, 4 etc. quotes)
[^"]* # followed by only non-quotes until
\Z # the end of the string.
) # End of lookahead assertion
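As a minimal sketch of the breakdown above, the same pattern can be written in Ruby's extended (x) mode and wrapped in a small helper. The method name remove_a_outside_quotes and the balance check are illustrative assumptions, not part of the original answer; the check simply enforces the "correctly balanced" precondition.

```ruby
# Hypothetical helper: removes every a that is not between double quotes.
# Assumes balanced quotes and no escaped quotes, as stated in the answer.
def remove_a_outside_quotes(subject)
  # Guard for the balanced-quotes assumption (an addition for illustration).
  raise ArgumentError, 'unbalanced quotes' unless subject.count('"').even?

  pattern = /
    a              # match a
    (?=            # only if it is followed by
      (?:
        [^"]*"     # non-quotes, then one quote
        [^"]*"     # and again, so quotes come in pairs
      )*           # zero or more such pairs
      [^"]*        # then only non-quotes
      \Z           # up to the end of the string
    )
  /x

  subject.gsub(pattern, '')
end

puts remove_a_outside_quotes('a b "a c" a').inspect
```

The a inside the quotes survives because exactly one quote follows it, so the lookahead cannot pair the quotes up.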
If you can have escaped quotes within quotes (a "length: 2\""), it's still possible but will be more complicated:
result = subject.gsub(/a(?=(?:(?:\\.|[^"\\])*"(?:\\.|[^"\\])*")*(?:\\.|[^"\\])*\Z)/, '')
This is in essence the same regex as above, only substituting (?:\\.|[^"\\]) for [^"]:
(?: # Match either...
\\. # an escaped character
| # or
[^"\\] # any character except backslash or quote
) # End of alternation
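To make the difference concrete, here is a small comparison of the two patterns on a string that contains an escaped quote. The variable names and the sample string are assumptions chosen for illustration.

```ruby
# Both patterns from the answer, side by side.
simple       = /a(?=(?:[^"]*"[^"]*")*[^"]*\Z)/
escape_aware = /a(?=(?:(?:\\.|[^"\\])*"(?:\\.|[^"\\])*")*(?:\\.|[^"\\])*\Z)/

# Single-quoted Ruby string, so \" stays a literal backslash plus quote.
subject = 'a "length: 2\" a" a'

# The simple pattern counts the escaped quote as a real one, sees an odd
# number of quotes after the first a, and wrongly leaves it in place.
simple_result = subject.gsub(simple, '')

# The escape-aware pattern consumes \" as a single escaped character,
# so the first a is correctly treated as outside the quotes.
aware_result = subject.gsub(escape_aware, '')

puts simple_result.inspect
puts aware_result.inspect
```

In both cases the a inside the quoted section is kept; only the handling of the leading a differs.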
Solution 2:
js-coder, resurrecting this ancient question because it had a simple solution that wasn't mentioned. (Found your question while doing some research for a regex bounty quest.)
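The regex is tiny compared with the one in the accepted answer: ("[^"]*")|a
The trick is to match what should be left alone first: the left side of the alternation matches whole quoted strings and captures them in group 1, so the plain a on the right side can only match outside quotes.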
subject = 'a b c a b " a b " b a " a "'
regex = /("[^"]*")|a/
# Put group 1 back for every match; it is nil (empty) when a bare a matched.
replaced = subject.gsub(regex) { $1 }
puts replaced
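For anyone wondering what the block sees on each call, the following sketch records every match together with capture group 1. The shorter sample string and the seen array are assumptions added for illustration only.

```ruby
subject = 'a b " a b " a'
regex = /("[^"]*")|a/

# Collect each match and the value of group 1 at that point.
seen = []
replaced = subject.gsub(regex) do |match|
  seen << [match, $1]
  $1.to_s # nil.to_s is "", so a bare a disappears
end

seen.each do |match, group|
  puts "#{match.inspect} -> group 1: #{group.inspect}"
end
puts replaced.inspect
```

Quoted strings arrive with group 1 set and are written back unchanged, while each a outside quotes arrives with group 1 unset and is replaced by nothing.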
Reference
How to match pattern except in situations s1, s2, s3
How to match a pattern unless...