Which regular expression operator means 'Don't' match this character?
The *, ?, and + characters all mean match this character. Which character means 'don't' match this? Examples would help.
You can use negated character classes to exclude certain characters: for example, [^abcde] will match any single character except a, b, c, d, or e.
Instead of specifying all the characters literally, you can use shorthands inside character classes: [\w] (lowercase) will match any "word character" (letters, digits, and underscore), [\W] (uppercase) will match anything but word characters; similarly, [\d] will match the digits 0-9 while [\D] matches anything but the digits 0-9, and so on.
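To see these classes in action, here is a minimal sketch using Python's re module. The choice of Python and the sample strings are assumptions made for illustration, not part of the answer above. It collects the characters each class matches with re.findall:

```python
import re

text = "abc_42 x-y!"  # sample input, chosen only for illustration

# Negated class: every character that is NOT one of a, b, c, d, e.
not_a_to_e = re.findall(r"[^abcde]", "abcxyz")

# \w keeps word characters; \W keeps everything else.
word_chars = re.findall(r"\w", text)
non_word_chars = re.findall(r"\W", text)

# \d keeps digits; \D keeps everything else.
digits = re.findall(r"\d", text)
non_digits = re.findall(r"\D", "a1b2")

print(not_a_to_e)      # ['x', 'y', 'z']
print(word_chars)      # ['a', 'b', 'c', '_', '4', '2', 'x', 'y']
print(non_word_chars)  # [' ', '-', '!']
print(digits)          # ['4', '2']
print(non_digits)      # ['a', 'b']
```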
If you use PHP, you can take a look at the regex character classes documentation.
There are two ways to say "don't match": negated character classes (ranges), and zero-width negative lookahead/lookbehind.
The former: don't match a, b, c or 0: [^a-c0]
The latter: match any three-character string except foo and bar:
(?!foo|bar).{3}
or
.{3}(?<!foo|bar)
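As a quick check of both patterns, here is a small sketch in Python's re module (the choice of language and the test words are assumptions for illustration). Python only accepts fixed-width lookbehind, which works here because foo and bar are both three characters long; other engines may have different rules. re.fullmatch is used so that the whole string must be exactly three characters.

```python
import re

# Negative lookahead: before consuming three characters, assert that
# the text at this position is neither "foo" nor "bar".
lookahead = re.compile(r"(?!foo|bar).{3}")

# Negative lookbehind: after consuming three characters, assert that
# the text just consumed is neither "foo" nor "bar".
lookbehind = re.compile(r".{3}(?<!foo|bar)")

for word in ["foo", "bar", "baz"]:
    ahead_ok = lookahead.fullmatch(word) is not None
    behind_ok = lookbehind.fullmatch(word) is not None
    print(word, ahead_ok, behind_ok)
# foo False False
# bar False False
# baz True True
```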
Also, a correction for you: *, ? and + do not actually match anything. They are repetition operators, and always follow a matching operator. Thus, a+ means match one or more of a, [a-c0]+ means match one or more of a, b, c or 0, while [^a-c0]+ would match one or more of anything that wasn't a, b, c or 0.
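The three patterns above can be tried out like this. This is only a sketch in Python's re module, and the sample string is made up for the demonstration. The last part shows that a repetition operator on its own, with nothing to repeat, is rejected as an invalid pattern.

```python
import re

s = "aaab0cxyz"  # sample input, chosen only for illustration

one_or_more_a = re.match(r"a+", s).group()
one_or_more_in_set = re.match(r"[a-c0]+", s).group()
one_or_more_not_in_set = re.search(r"[^a-c0]+", s).group()

print(one_or_more_a)           # aaa
print(one_or_more_in_set)      # aaab0c
print(one_or_more_not_in_set)  # xyz

# A quantifier with nothing before it is not a valid pattern.
try:
    re.compile(r"+")
    bare_plus_is_valid = True
except re.error:
    bare_plus_is_valid = False
print(bare_plus_is_valid)      # False
```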
^ as the first character inside [ ] (as in [^...]) is negation in a regular expression, whereas ^ outside brackets means "beginning of string".
[^a-z] matches any single character that is not in the range "a" to "z".
^[a-z] means the string starts with a character in the range "a" to "z".
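The difference between the two positions of ^ can be seen in a short sketch using Python's re module (the sample strings are assumptions picked for illustration):

```python
import re

# [^a-z]: one character that is NOT a lowercase ASCII letter, anywhere in the string.
negated_hit = re.search(r"[^a-z]", "abc9def")
negated_miss = re.search(r"[^a-z]", "abcdef")

# ^[a-z]: a lowercase ASCII letter at the very start of the string.
anchored_hit = re.search(r"^[a-z]", "abc9def")
anchored_miss = re.search(r"^[a-z]", "9abc")

print(negated_hit.group())   # 9
print(negated_miss)          # None
print(anchored_hit.group())  # a
print(anchored_miss)         # None
```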
Use ^ at the beginning of a character class, or use negative lookahead/lookbehind assertions.
>>> import re
>>> re.match('[^f]', 'foo')
>>> re.match('[^f]', 'bar')
<_sre.SRE_Match object at 0x7f8b102ad6b0>
>>> re.match('(?!foo)...', 'foo')
>>> re.match('(?!foo)...', 'bar')
<_sre.SRE_Match object at 0x7f8b0fe70780>