What is a regex "independent non-capturing group"?

From the Java 6 Pattern documentation:

Special constructs (non-capturing)

(?:X)   X, as a non-capturing group

(?>X)   X, as an independent, non-capturing group

What is the difference between (?:X) and (?>X)? What does "independent" mean in this context?


Solution 1:

It means that the group is atomic: once it has matched, the engine throws away all backtracking information for it. The group is therefore possessive; it won't back off even if doing so is the only way for the regex as a whole to succeed. It's "independent" in the sense that it doesn't cooperate, via backtracking, with other elements of the regex to ensure a match.

Solution 2:

I think this tutorial explains exactly what an "independent, non-capturing group", or "atomic grouping", is:

The regular expression a(bc|b)c (capturing group) matches abcc and abc. The regex a(?>bc|b)c (atomic group) matches abcc but not abc.

When applied to abc, both regexes will match a to a, bc to bc, and then c will fail to match at the end of the string. Here their paths diverge. The regex with the capturing group has remembered a backtracking position for the alternation. The group will give up its match, b then matches b and c matches c. Match found!

The regex with the atomic group, however, exited from an atomic group after bc was matched. At that point, all backtracking positions for tokens inside the group are discarded. In this example, the alternation's option to try b at the second position in the string is discarded. As a result, when c fails, the regex engine has no alternatives left to try.
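
The behaviour described above can be checked directly in Java. This is only a minimal sketch: the class name AtomicGroupDemo and the helper matches are made up for illustration, while Pattern.compile and Matcher.matches are the standard java.util.regex calls.

```java
import java.util.regex.Pattern;

public class AtomicGroupDemo {
    // Capturing group: the alternation may backtrack from "bc" to "b".
    static final Pattern CAPTURING = Pattern.compile("a(bc|b)c");
    // Atomic group: once "bc" has matched, the "b" alternative is discarded.
    static final Pattern ATOMIC = Pattern.compile("a(?>bc|b)c");

    // Hypothetical helper: true if the whole input matches the pattern.
    static boolean matches(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"abcc", "abc"}) {
            System.out.println(input
                    + " capturing=" + matches(CAPTURING, input)
                    + " atomic=" + matches(ATOMIC, input));
        }
    }
}
```

Both patterns accept abcc, but only the capturing version accepts abc, because the atomic group cannot give up bc in favour of b.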

Solution 3:

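If you have foo(?>(co)*)co, that will never match: the greedy (co)* inside the atomic group consumes every consecutive co and then refuses to give one back for the trailing co. I'm sure there are practical examples of when this would be useful; try O'Reilly's book.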
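
To see this in practice, here is a small sketch comparing the atomic pattern with its plain non-capturing counterpart. The class name NeverMatchDemo, the helper found, and the sample inputs are assumptions chosen for illustration; Matcher.find is the standard java.util.regex call.

```java
import java.util.regex.Pattern;

public class NeverMatchDemo {
    // Atomic: (co)* takes every "co" and never hands one back.
    static final Pattern ATOMIC = Pattern.compile("foo(?>(co)*)co");
    // Non-atomic: (co)* can backtrack one repetition for the trailing "co".
    static final Pattern NON_ATOMIC = Pattern.compile("foo(?:(co)*)co");

    // Hypothetical helper: true if the pattern matches anywhere in the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"fooco", "foococo"}) {
            System.out.println(input
                    + " atomic=" + found(ATOMIC, input)
                    + " nonAtomic=" + found(NON_ATOMIC, input));
        }
    }
}
```

The non-atomic pattern finds a match in both inputs, while the atomic one finds none, however many co repetitions follow foo.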