Variable-length lookbehind-assertion alternatives for regular expressions
Most of the time, you can avoid variable-length lookbehinds by using \K.
s/(?<=foo.*)bar/moo/s;
would be
s/foo.*\Kbar/moo/s;
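Anything up to the last \K encountered is not considered part of the match (e.g. for the purposes of replacement, $&, etc.). Note that because .* is greedy, the \K version replaces the last bar after foo, whereas the lookbehind would pick the first; use .*? if the difference matters.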
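Python's standard re module does not recognise \K, so there the same effect is usually obtained with a capture group that is written back into the replacement. A minimal sketch of that idea (the function name is made up for illustration):

```python
import re

def replace_bar_after_foo(text):
    # Group 1 plays the role of everything before \K: it is matched,
    # then written back unchanged, so only "bar" is actually replaced.
    # re.S corresponds to Perl's /s; count=1 mirrors s/// without /g.
    return re.sub(r'(foo.*)bar', r'\1moo', text, count=1, flags=re.S)

print(replace_bar_after_foo('foo and bar'))   # foo and moo
print(replace_bar_after_foo('bar then foo'))  # bar then foo
```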
Negative lookbehinds are a little trickier.
s/(?<!foo.*)bar/moo/s;
would be
s/^(?:(?!foo).)*\Kbar/moo/s;
because (?:(?!STRING).)* is to STRING as [^CHAR]* is to CHAR.
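The same substitution can be approximated in Python's standard re module, again with a capture group standing in for \K. This is only a sketch; the names are hypothetical:

```python
import re

# Tempered dot: consume one character at a time, but never step onto
# the start of "foo". Group 1 stands in for the part before \K.
NO_FOO_BEFORE_BAR = re.compile(r'^((?:(?!foo).)*)bar', re.S)

def replace_bar_without_foo_before(text):
    # Replace a "bar" that has no "foo" anywhere before it.
    return NO_FOO_BEFORE_BAR.sub(r'\1moo', text, count=1)

print(replace_bar_without_foo_before('bar foo bar'))  # moo foo bar
print(replace_bar_without_foo_before('foo bar'))      # foo bar
```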
If you're just matching, you might not even need the \K.
/foo.*bar/s
/^(?:(?!foo).)*bar/s
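For pure matching, both patterns carry over to Python unchanged. A small sketch with illustrative helper names:

```python
import re

def has_bar_after_foo(text):
    # Equivalent of /foo.*bar/s
    return re.search(r'foo.*bar', text, re.S) is not None

def has_bar_with_no_foo_before(text):
    # Equivalent of /^(?:(?!foo).)*bar/s; without re.M, ^ only
    # matches at the very start of the string.
    return re.search(r'^(?:(?!foo).)*bar', text, re.S) is not None
```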
For Python there's a regex implementation which supports variable-length lookbehinds:
http://pypi.python.org/pypi/regex
It's designed to be backwards-compatible with the standard re module.
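A short sketch of what that looks like, assuming the third-party regex package is installed (the import is guarded so the snippet still loads without it; the function name is made up):

```python
try:
    import regex  # third-party package: pip install regex
except ImportError:
    regex = None

def replace_bar_preceded_by_foo(text):
    if regex is None:
        raise RuntimeError('the third-party "regex" package is not installed')
    # Variable-length lookbehind, which the standard re module rejects
    # with "look-behind requires fixed-width pattern".
    return regex.sub(r'(?<=foo.*)bar', 'moo', text, count=1, flags=regex.S)
```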
You can reverse the string AND the pattern and use variable-length lookahead:
(rab(?!\w*oof)\w*)
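Of the sample words below, it matches raboo, rabof, rab and both occurrences of rabo (shown in bold in the original):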
raboof rab7790oof raboo rabof rab rabo raboooof rabo
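In Python, the reversal trick might be sketched as follows. The helper name is hypothetical; the text is reversed, the lookahead pattern is applied, and each hit is reversed back:

```python
import re

# The pattern from above, meant to be run against the reversed text.
REVERSED_PATTERN = re.compile(r'rab(?!\w*oof)\w*')

def bar_words_without_foo(text):
    reversed_text = text[::-1]
    hits = [m.group(0)[::-1] for m in REVERSED_PATTERN.finditer(reversed_text)]
    # Matches were found right to left, so flip them back into reading order.
    return hits[::-1]

print(bar_words_without_foo('foobar fobar bar'))  # ['fobar', 'bar']
```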
Original solution, as far as I know, by Jeff 'japhy' Pinyan.
The regexp you show will find any instance of bar which is not preceded by foo.
A simple alternative would be to first match foo against the string, and find the index of the first occurrence. Then search for bar, and see if you can find an occurrence which comes before that index.
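That index-based approach needs no regex at all. A minimal sketch using plain string searches (the function name is made up):

```python
def find_bar_before_foo(text):
    # Index of the first "foo", or the end of the text if there is none.
    foo_at = text.find('foo')
    end = len(text) if foo_at == -1 else foo_at
    # Look for "bar" only in the part that precedes the first "foo".
    # Returns the index of that "bar", or -1 if there is none.
    return text.find('bar', 0, end)

print(find_bar_before_foo('barfoo'))  # 0
print(find_bar_before_foo('foobar'))  # -1
```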
If you want to find instances of bar which are not directly preceded by foo, I could also provide a regexp for that (without using lookbehind), but it will be very ugly. Basically, invert the sense of /foo/, i.e. /[^f]oo|[^o]o|[^o]|$/.
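For comparison, the "directly preceded" case has a fixed width, so in engines that allow fixed-length lookbehind (Python's standard re module is one) the ugly inversion is not strictly needed. A small sketch with an illustrative helper name:

```python
import re

def bars_not_directly_after_foo(text):
    # (?<!foo) has a fixed width of three characters, which re accepts.
    return [m.start() for m in re.finditer(r'(?<!foo)bar', text)]

print(bars_not_directly_after_foo('foobar bar xbar'))  # [7, 12]
```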
foo.*|(bar)
If foo is in the string first, then the regex will match, but group 1 will be unset. Otherwise, it will find bar and assign it to group 1.
So you can use this regex and look for your results in the groups found:
>>> import re
>>> m = re.search('foo.*|(bar)', 'f00bar')
>>> if m: print(m.group(1))
bar
>>> m = re.search('foo.*|(bar)', 'foobar')
>>> if m: print(m.group(1))
None
>>> m = re.search('foo.*|(bar)', 'fobas')
>>> if m: print(m.group(1))
>>>