Python non-greedy regexes
How do I make a Python regex like "(.*)" such that, given "a (b) c (d) e", Python matches "b" instead of "b) c (d"?
I know that I can use "[^)]" instead of ".", but I'm looking for a more general solution that keeps my regex a little cleaner. Is there any way to tell Python "hey, match this as soon as possible"?
You seek the all-powerful *?
From the docs, Greedy versus Non-Greedy:
the non-greedy qualifiers *?, +?, ??, or {m,n}? [...] match as little text as possible.
>>> import re
>>> x = "a (b) c (d) e"
>>> re.search(r"\(.*\)", x).group()
'(b) c (d)'
>>> re.search(r"\(.*?\)", x).group()
'(b)'
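To get just "b" (without the parentheses), as the question asks, one option is to put the non-greedy part in a capturing group. The sketch below is only an illustration: the variable names are made up, and re.findall is used so that both parenthesized pieces show up. The negated character class from the question is included for comparison.

```python
import re

x = "a (b) c (d) e"

# Non-greedy capture: each match stops at the first closing parenthesis.
lazy = re.findall(r"\((.*?)\)", x)

# The negated character class mentioned in the question, for comparison.
negated = re.findall(r"\(([^)]*)\)", x)

# Greedy capture: runs on to the last closing parenthesis.
greedy = re.findall(r"\((.*)\)", x)

print(lazy)     # ['b', 'd']
print(negated)  # ['b', 'd']
print(greedy)   # ['b) c (d']
```

The two approaches agree on this input, but they are not identical in general: by default "." does not match a newline, while "[^)]" does.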
According to the docs:
The '*', '+', and '?' qualifiers are all greedy; they match as much text as possible. Sometimes this behavior isn’t desired; if the RE <.*> is matched against '<H1>title</H1>', it will match the entire string, and not just '<H1>'. Adding '?' after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using .*? in the previous expression will match only '<H1>'.
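The example from the docs can be checked directly. The snippet below is a small sketch of that, plus the other non-greedy qualifiers applied to a made-up string of digits; the variable names and the digit string are chosen here for illustration and do not come from the docs.

```python
import re

html = "<H1>title</H1>"

# Greedy: .* runs to the last '>' in the string.
greedy = re.match(r"<.*>", html).group()
# Non-greedy: .*? stops at the first '>'.
lazy = re.match(r"<.*?>", html).group()

print(greedy)  # <H1>title</H1>
print(lazy)    # <H1>

# The same '?' suffix works on the other qualifiers.
digits = "12345"
plus_lazy = re.match(r"\d+?", digits).group()          # one digit is enough
optional_lazy = re.match(r"\d??", digits).group()      # zero digits is enough
range_lazy = re.match(r"\d{2,4}?", digits).group()     # takes the minimum, 2
range_greedy = re.match(r"\d{2,4}", digits).group()    # takes the maximum, 4

print(repr(plus_lazy))      # '1'
print(repr(optional_lazy))  # ''
print(repr(range_lazy))     # '12'
print(repr(range_greedy))   # '1234'
```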