How to negate the whole regex?
I have a regex, for example (ma|(t){1}). It matches ma and t and doesn't match bla.
I want to negate the regex, so that it matches bla but not ma or t, by adding something to this regex. I know I can write bla, but the actual regex is more complex.
Use negative lookaround: (?!pattern)
Positive lookarounds can be used to assert that a pattern matches. Negative lookarounds are the opposite: they are used to assert that a pattern DOES NOT match. Some flavors support assertions; some put limitations on lookbehind, etc.
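As a quick illustration of the difference, here is a minimal sketch using Python's re module. The patterns foo(?=bar) and foo(?!bar) and the sample strings are made up for this example; they are not from the question.

```python
import re

# Hypothetical patterns, chosen only to show the two kinds of lookahead.
followed_by_bar = re.compile(r"foo(?=bar)")      # positive: "foo" must be followed by "bar"
not_followed_by_bar = re.compile(r"foo(?!bar)")  # negative: "foo" must NOT be followed by "bar"

# Lookaheads are zero-width: only "foo" is consumed, "bar" is merely inspected.
positive_hit = followed_by_bar.search("foobar")
negative_hit = not_followed_by_bar.search("foobaz")
negative_miss = not_followed_by_bar.search("foobar")

print(positive_hit.group())   # the text consumed by the positive case
print(negative_hit.group())   # the text consumed by the negative case
print(negative_miss)          # no match when "bar" follows
```

In both successful cases the match itself is just foo, since the lookahead only checks what comes next without consuming it.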
Links to regular-expressions.info
- Lookahead and Lookbehind Zero-Width Assertions
- Flavor comparison
See also
- How do I convert CamelCase into human-readable names in Java?
- Regex for all strings not containing a string?
- A regex to match a substring that isn’t followed by a certain other substring.
More examples
These are attempts to come up with regex solutions to toy problems as exercises; they should be educational if you're trying to learn the various ways you can use lookarounds (nesting them, using them to capture, etc):
- codingBat plusOut using regex
- codingBat repeatEnd using regex
- codingbat wordEnds using regex
Assuming you only want to disallow strings that match the regex completely (i.e., mmbla is okay, but mm isn't), this is what you want:
^(?!(?:m{2}|t)$).*$
(?!(?:m{2}|t)$) is a negative lookahead; it says "starting from the current position, the next few characters are not mm or t, followed by the end of the string." The start anchor (^) at the beginning ensures that the lookahead is applied at the beginning of the string. If that succeeds, the .* goes ahead and consumes the string.
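To try this out, here is a small sketch in Python. The helper name negate_whole is hypothetical (it is not a library function); it just wraps an arbitrary pattern in the ^(?!(?:...)$).*$ template shown above.

```python
import re

def negate_whole(pattern: str) -> re.Pattern:
    """Hypothetical helper: build a regex that accepts every single-line
    string EXCEPT those that `pattern` matches in full."""
    # The non-capturing group keeps a top-level alternation (a|b) together
    # so that the trailing $ applies to every alternative.
    return re.compile(r"^(?!(?:" + pattern + r")$).*$")

negated = negate_whole(r"m{2}|t")

for candidate in ["mm", "t", "mmbla", "bla"]:
    verdict = "accepted" if negated.match(candidate) else "rejected"
    print(candidate, verdict)
```

The same helper can be applied to the regex from the question, e.g. negate_whole(r"(ma|(t){1})"), which rejects ma and t and accepts bla.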
FYI, if you're using Java's matches() method, you don't really need the ^ and the final $, but they don't do any harm. The $ inside the lookahead is required, though.
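The same point can be sketched in Python, using re.fullmatch as a rough stand-in for Java's matches() (an assumption made for this example: both require the entire input to match, so the outer anchors become redundant). The variable names are made up.

```python
import re

# Outer ^ and $ dropped; fullmatch already forces a whole-string match.
with_inner_dollar = re.compile(r"(?!(?:m{2}|t)$).*")
# Same pattern, but without the $ inside the lookahead.
without_inner_dollar = re.compile(r"(?!(?:m{2}|t)).*")

# "mm" is rejected either way.
print(with_inner_dollar.fullmatch("mm"))
print(without_inner_dollar.fullmatch("mm"))

# "mmbla" only STARTS with mm. With the inner $ it is accepted;
# without it, the lookahead fires on the prefix and the string is rejected.
print(with_inner_dollar.fullmatch("mmbla"))
print(without_inner_dollar.fullmatch("mmbla"))
```

Without the inner $, the lookahead rejects any string that merely begins with mm or t, which is not what "negate a complete match" means.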
\b(?=\w)(?!(ma|(t){1}))\b(\w*)
This is for the given regex.
The \b finds a word boundary.
The positive lookahead (?=\w) is there to avoid spaces.
The negative lookahead over the original regex prevents matches of it.
Finally, the (\w*) catches all the words that are left.
The group that holds the words is group 3.
The simple (?!pattern) will not work, as any substring will match.
The simple ^(?!(?:m{2}|t)$).*$ will not work, as its granularity is full lines.
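Here is a minimal Python sketch of the word-level pattern, to show where group 3 comes from and why a bare (?!pattern) is not enough. The sample text and variable names are made up for illustration.

```python
import re

# Groups: 1 = (ma|(t){1}), 2 = (t), 3 = (\w*). Groups 1 and 2 sit inside
# the negative lookahead, so they stay empty in any successful match.
word_level = re.compile(r"\b(?=\w)(?!(ma|(t){1}))\b(\w*)")

text = "ma t bla"
kept_words = [m.group(3) for m in word_level.finditer(text)]
print(kept_words)

# A bare negative lookahead still "matches" the string "ma": it fails at
# position 0 but succeeds as an empty match one character later.
bare = re.compile(r"(?!(ma|(t){1}))")
bare_hit = bare.search("ma")
print(bare_hit.span())
```

One thing to keep in mind: the lookahead here has no end-of-word check, so words that merely begin with ma or t (such as mat or table) are skipped as well.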