Character class subtraction, converting from Java syntax to RegexBuddy
Which regular expression engine does Java use?
In a tool like RegexBuddy, if I use
[a-z&&[^bc]]
the expression works in Java, but RegexBuddy does not understand it.
In fact it reports:
Match a single character present in the list below: [a-z&&[^bc]
- A character in the range between a and z: a-z
- One of the characters &[^bc: &&[^bc
- Match the character ] literally: ]
But I want to match a character between a and z, intersected with a character that is not b or c.
Like most regex flavors, java.util.regex.Pattern has its own specific features with syntax that may not be fully compatible with others; this includes character class union, intersection and subtraction:
- [a-d[m-p]]: a through d, or m through p: [a-dm-p] (union)
- [a-z&&[def]]: d, e, or f (intersection)
- [a-z&&[^bc]]: a through z, except for b and c: [ad-z] (subtraction)
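The three forms listed above can be checked directly against java.util.regex. The following is only a minimal sketch: the class name CharClassOps and the helper accepts are made up for illustration, and Pattern.matches is used so that each one-character input is tested against the whole class.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class CharClassOps {
    // True if the whole input (here: a single character) matches the class.
    public static boolean accepts(String charClass, String input) {
        return Pattern.matches(charClass, input);
    }

    public static void main(String[] args) {
        // Union: a through d, or m through p
        System.out.println(accepts("[a-d[m-p]]", "n"));    // true
        System.out.println(accepts("[a-d[m-p]]", "e"));    // false
        // Intersection: d, e, or f
        System.out.println(accepts("[a-z&&[def]]", "e"));  // true
        System.out.println(accepts("[a-z&&[def]]", "a"));  // false
        // Subtraction: a through z, except for b and c
        System.out.println(accepts("[a-z&&[^bc]]", "a"));  // true
        System.out.println(accepts("[a-z&&[^bc]]", "b"));  // false
        // The flat equivalent [ad-z] rejects b as well
        System.out.println(accepts("[ad-z]", "b"));        // false
    }
}
```

If a tool does not understand the nested syntax, the flat equivalents ([a-dm-p], [def], [ad-z]) should be safe to paste instead, since they use only plain ranges.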
The most important "caveat" of Java regex is that matches() attempts to match a pattern against the whole string. This is atypical of most engines, and can be a source of confusion at times.
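To make the whole-string behaviour concrete, here is a small sketch contrasting Matcher.matches() with Matcher.find(), using the pattern from the question. The class name MatchesVsFind and the two helper methods are assumptions made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class MatchesVsFind {
    // matches(): the pattern must cover the entire input.
    public static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // find(): the pattern may match anywhere inside the input.
    public static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String regex = "[a-z&&[^bc]]"; // one character: a-z except b and c

        System.out.println(wholeMatch(regex, "x"));     // true
        System.out.println(wholeMatch(regex, "xyz"));   // false: three characters, class matches one
        System.out.println(partialMatch(regex, "xyz")); // true: x is found
        System.out.println(partialMatch(regex, "bc"));  // false: both characters are excluded

        // String.matches has the same whole-string semantics.
        System.out.println("xyz".matches(regex));       // false
    }
}
```

Testers that search within the subject text behave like find(), so a pattern that "works" there can still return false from matches() in Java.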
See also
- regular-expressions.info/Flavor Comparison and Java Flavor Notes
On character class subtraction
Subtraction allows you to define, for example, "all consonants" in Java as [a-z&&[^aeiou]].
This syntax is specific to Java. In XML Schema, .NET, JGSoft and RegexBuddy, it's [a-z-[aeiou]]. Other flavors may not support this feature at all.
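As a sketch of the consonants example, the code below runs the Java form and, for comparison, a negative-lookahead rewrite, (?![aeiou])[a-z]. The lookahead version is not from the answer above; it is an assumption added here as one way to express the same set without any class-subtraction syntax. The class name Consonants and the helper keepMatches are likewise hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class Consonants {
    // Java-specific syntax: a-z intersected with "not a vowel".
    public static final Pattern JAVA_STYLE = Pattern.compile("[a-z&&[^aeiou]]");

    // Assumed alternative: reject vowels with a lookahead, then match a-z.
    public static final Pattern LOOKAHEAD_STYLE = Pattern.compile("(?![aeiou])[a-z]");

    // Concatenates every match of p found in input, in order.
    public static String keepMatches(Pattern p, String input) {
        Matcher m = p.matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepMatches(JAVA_STYLE, "regexbuddy"));      // rgxbddy
        System.out.println(keepMatches(LOOKAHEAD_STYLE, "regexbuddy")); // rgxbddy
    }
}
```

Both patterns treat y as a consonant and ignore uppercase letters, because the base class is just a-z.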
References
- regular-expressions.info/Character Classes in XML Regular Expressions
- MSDN - Regular Expression Character Classes - Subtraction
Related questions
- What is the point behind character class intersections in Java’s Regex?
Java uses its own regular expression engine, whose behaviour is defined in the Pattern class.
You can test it with an Eclipse plugin or online.
RegexBuddy does not yet support the character class union, intersection, and subtraction syntax that is unique to the Java regular expression flavor. This is the only part of the Java regex syntax that RegexBuddy does not yet support. We're planning to implement this in a future version of RegexBuddy. It has been postponed because no other regular expression flavor supports this syntax.
P.S.: If you have a question about RegexBuddy in particular, please add the "regexbuddy" tag to your question. Then the question automatically shows up in my RSS reader. I don't follow the "regex" tag because far too many questions use that tag, and most are already answered by the time I see them.