What do 'lazy' and 'greedy' mean in the context of regular expressions?
Solution 1:
Greedy will consume as much as possible. From http://www.regular-expressions.info/repeat.html we see the example of trying to match HTML tags with <.+>
. Suppose you have the following:
<em>Hello World</em>
You may think that <.+>
(.
means any non newline character and +
means one or more) would only match the <em>
and the </em>
, when in reality it will be very greedy, and go from the first <
to the last >
. This means it will match <em>Hello World</em>
instead of what you wanted.
Making it lazy (<.+?>
) will prevent this. By adding the ?
after the +
, we tell it to repeat as few times as possible, so the first >
it comes across, is where we want to stop the matching.
I'd encourage you to download RegExr, a great tool that will help you explore Regular Expressions - I use it all the time.
Solution 2:
'Greedy' means match longest possible string.
'Lazy' means match shortest possible string.
For example, the greedy h.+l
matches 'hell'
in 'hello'
but the lazy h.+?l
matches 'hel'
.