Regex lazy vs greedy confusion

Solution 1:

You are missing the fact that a regex engine works from left to right, trying each starting position in turn, and stops as soon as it finds a match at the current position.

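In your example, the first position where the pattern succeeds is at the second "a": the lookbehind is satisfied there because the first "a" sits right behind it.

Laziness only affects the right side of the match, that is, where it ends. It never moves the starting point further to the right.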
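To see the left-to-right behaviour, here is a small Python sketch. The pattern (?<=a).*?(?=b) is reconstructed from the pieces discussed in this thread, so treat it as an assumption about the original question. It tries to anchor a match at every start position and records what each attempt gives.

```python
import re

# Pattern reconstructed from the discussion (an assumption, not quoted from the question).
pattern = re.compile(r"(?<=a).*?(?=b)")
text = "aaxxxb"

# Try to anchor a match at each start position, left to right.
# pattern.match(text, pos) only succeeds if the match begins exactly at pos,
# and the lookbehind can still see the characters before pos.
attempts = []
for pos in range(len(text) + 1):
    m = pattern.match(text, pos)
    attempts.append(m.group() if m else None)
    print(pos, attempts[-1])

# search() reports the first position that succeeds, which is position 1.
found = pattern.search(text)
print(found.start(), found.group())  # 1 axxx
```

Position 2 would also succeed and give "xxx", but the engine never gets that far because position 1 already produced a match.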

If you want to obtain "xxx", a better way is to use a negated character class [^ab]* instead of .*? so that the match cannot contain an "a" at all.
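A quick comparison in Python, again assuming the pattern has the shape (?<=a)...(?=b) as discussed in this thread:

```python
import re

text = "aaxxxb"

# Lazy dot: starts at the first position where the lookbehind holds.
lazy = re.search(r"(?<=a).*?(?=b)", text)

# Negated class: cannot consume "a", so the attempt at position 1 fails
# and the engine moves on to position 2.
negated = re.search(r"(?<=a)[^ab]*(?=b)", text)

print(lazy.group())     # axxx
print(negated.group())  # xxx
```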

Note: not exactly related to the subject, but good to know: with an alternation, a DFA regex engine returns the longest possible match, while an NFA (backtracking) engine returns the first alternative that succeeds.
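Python's re module is a backtracking engine, so it can illustrate the "first alternative wins" behaviour. The leftmost-longest result is only emulated by hand here, as a rough illustration of what a DFA-style engine would report; it is not how such an engine actually works.

```python
import re

text = "ab"

# Backtracking engine: the first alternative that succeeds is kept.
first_wins = re.search(r"a|ab", text).group()
print(first_wins)  # a

# Hand-rolled emulation of leftmost-longest (illustration only):
# among the alternatives that match at position 0, keep the longest one.
alternatives = ["a", "ab"]
longest = max((alt for alt in alternatives if text.startswith(alt)), key=len)
print(longest)  # ab
```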

Solution 2:

user1277327, the (?<=a) part of your pattern means "preceded by an 'a'". When the regex engine starts on your string aaxxxb, the position in front of the first "a" doesn't fulfill the assertion of that lookbehind, because nothing precedes it. The position in front of the second "a" does, since the first "a" sits right behind it. Fine, but can the engine match that second "a"? Yes, the dot in your .*? allows the engine to match this "a". The lazy modifier ? tells the dot-star to eat up only as many characters as necessary until we are able to match what comes next. What comes next is a lookahead asserting that the next character is a "b". So the engine eats up the second "a" and then the three x characters. The total match is axxx.
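The lazy expansion can be sketched by hand. This is a simplified simulation of what the engine does for this particular string, not the engine's real implementation; the variable names are made up for the illustration.

```python
text = "aaxxxb"
start = 1  # first position where the lookbehind (?<=a) holds

# The lazy dot-star first tries to match nothing, then grows one character
# at a time, checking after each step whether the next character is "b".
candidates = []
end = start
while end < len(text) and text[end] != "b":
    candidates.append(text[start:end])  # rejected: next character is not "b"
    end += 1
candidates.append(text[start:end])      # accepted: the lookahead sees "b"

print(candidates)      # ['', 'a', 'ax', 'axx', 'axxx']
print(candidates[-1])  # axxx
```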

If you are finding greed / laziness confusing, you may want to have a look at the levels of regex greed. The accompanying tutorial on lookarounds may also help.