How does {m}{n} ("exactly n times" twice) work?
Solution 1:
IEEE-Standard 1003.1 says:
The behavior of multiple adjacent duplication symbols ( '*' and intervals) produces undefined results.
So every implementation can do as it pleases; just don't rely on any specific behavior.
Solution 2:
When I input your regex in RegexBuddy using the Java regex syntax, it displays the following message:
Quantifiers must be preceded by a token that can be repeated «{2}»
Changing the regex to explicitly use a grouping ^(\d{1}){2}
solves that error and works as you expect.
I assume that the Java regex engine simply ignores the erroneous part of the expression and works with what has been compiled so far.
Edit
The reference to the IEEE-Standard in @piet.t's answer seems to support that assumption.
Edit 2 (kudos to @fncomp)
For completeness, one would typically use a non-capturing group, (?:...), to avoid capturing the group. The complete regex then becomes ^(?:\d{1}){2}
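A minimal Java sketch of the explicit, non-capturing form, in case it helps to see it run. The class and method names (ExplicitGroupDemo, startsWithTwoDigits) are made up for illustration; only java.util.regex is used.

```java
import java.util.regex.Pattern;

class ExplicitGroupDemo {
    // The non-capturing group makes the outer {2} apply to the whole \d{1} unit.
    static final Pattern TWO_DIGITS = Pattern.compile("^(?:\\d{1}){2}");

    // Hypothetical helper: true if the input begins with two digits.
    static boolean startsWithTwoDigits(String input) {
        return TWO_DIGITS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(startsWithTwoDigits("12")); // true
        System.out.println(startsWithTwoDigits("1"));  // false
        // (?:...) does not create a capturing group.
        System.out.println(TWO_DIGITS.matcher("12").groupCount()); // 0
    }
}
```

Because the group is written out, the pattern no longer depends on how an engine treats two adjacent quantifiers.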
Solution 3:
Scientific approach:
Click on the patterns to see the example on regexplanet.com, then click on the green Java button.
- You've already shown \d{1}{2} matches "1", and doesn't match "12", so we know it isn't interpreted as (?:\d{1}){2}.
- Still, 1 is a boring number, and {1} might be optimized away, so let's try something more interesting: \d{2}{3}. This still only matches two characters (not six); {3} is ignored.
- Ok. There's an easy way to see what a regex engine does. Does it capture? Let's try (\d{1})({2}). Oddly, this works. The second group, $2, captures the empty string.
- So why do we need the first group? How about ({1})? Still works.
- And just {1}? No problem there. It looks like Java is being a little weird here.
- Great! So {1} is valid. We know Java expands * and + to {0,0x7FFFFFFF} and {1,0x7FFFFFFF}, so will * or + work? No:
    Dangling meta character '+' near index 0
    +
    ^
  The validation must come before * and + are expanded.
I didn't find anything in the spec that explains that; it looks like a quantifier must come at least after a character, brackets, or parentheses.
Most of these patterns are considered invalid by other regex flavors, and for a good reason - they do not make sense.
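If you would rather reproduce these experiments locally than on regexplanet.com, something along these lines should do. It is only a sketch: QuantifierProbe, probe and compiles are hypothetical names, and since the standard calls this behavior undefined, the output for the stacked-quantifier patterns may differ between JDK versions, so no expected results are listed for them.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class QuantifierProbe {
    // Reports whether the pattern compiles and, if so, whether it matches the whole input.
    static String probe(String regex, String input) {
        try {
            Matcher m = Pattern.compile(regex).matcher(input);
            return regex + " on \"" + input + "\": " + (m.matches() ? "matches" : "no match");
        } catch (PatternSyntaxException e) {
            return regex + ": rejected (" + e.getDescription() + ")";
        }
    }

    // True if the engine accepts the pattern at all.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Patterns taken from the list above, plus the explicit group for comparison.
        String[][] cases = {
            {"\\d{1}{2}", "1"}, {"\\d{1}{2}", "12"},
            {"\\d{2}{3}", "12"}, {"\\d{2}{3}", "123456"},
            {"(?:\\d{2}){3}", "123456"},
            {"(\\d{1})({2})", "1"}, {"({1})", ""}, {"{1}", ""},
            {"*", ""}, {"+", ""},
        };
        for (String[] c : cases) {
            System.out.println(probe(c[0], c[1]));
        }
    }
}
```

Catching PatternSyntaxException keeps the run going past the patterns the engine rejects outright, so accepted and rejected cases show up side by side.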