Punctuation Regex in Java
First, I read the documentation here:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html
I want to find any punctuation character EXCEPT @',& but I don't quite understand how to do it.
Here is my code:
public static void main( String[] args )
{
    // String to be scanned to find the pattern.
    String value = "#`~!#$%^";
    String pattern = "\\p{Punct}[^@',&]";
    // Create a Pattern object
    Pattern r = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
    // Now create matcher object.
    Matcher m = r.matcher(value);
    if (m.find()) {
        System.out.println("Found value: " + m.groupCount());
    } else {
        System.out.println("NO MATCH");
    }
}
The result is NO MATCH.
What is wrong with my pattern?
Thanks,
MRizq
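You're matching two characters, not one: \p{Punct} consumes one punctuation character, and [^@',&] then consumes a second character. A negative lookahead checks the character without consuming it, which should solve the task: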
(?![@',&])\\p{Punct}
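To make the difference concrete, here is a minimal sketch that runs both patterns over the same input. The class name LookaheadPunct, the helper findAll and the sample string are illustrative assumptions, not part of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadPunct {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "a@b!c,d#";
        // Original pattern: a punctuation character followed by a second character.
        System.out.println(findAll("\\p{Punct}[^@',&]", input));    // prints [@b, !c, ,d]
        // Lookahead version: a single punctuation character that is not @ ' , or &.
        System.out.println(findAll("(?![@',&])\\p{Punct}", input)); // prints [!, #]
    }
}
```

Note that the original pattern even matches "@b" and ",d": the exclusion applies to the second character only, not to the punctuation character itself.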
You may use character subtraction here:
String pat = "[\\p{Punct}&&[^@',&]]";
The whole pattern represents a character class, [...], that contains a \p{Punct} POSIX character class, the && intersection operator and a [^...] negated character class.
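If you also need to match non-ASCII (Unicode) punctuation, add the (?U) modifier, the embedded form of Pattern.UNICODE_CHARACTER_CLASS (Java 7+). Be aware that with this flag \p{Punct} means the Unicode Punctuation category, so ASCII symbols such as $, +, <, =, >, ^, `, | and ~ no longer match: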
String pat = "(?U)[\\p{Punct}&&[^@',&]]";
              ^^^^
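The effect of the modifier can be checked with a small comparison. This is only a sketch: the class name UnicodePunctDemo, the helper matched and the sample input are assumptions chosen for illustration. The non-ASCII characters are written as escapes so the source stays plain ASCII.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodePunctDemo {
    // Hypothetical helper: concatenates all matches of the regex in the input.
    static String matched(String regex, String input) {
        StringBuilder sb = new StringBuilder();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ASCII characters plus U+00AB (left guillemet) and U+2026 (horizontal ellipsis).
        String input = "$!@\u00AB~?\u2026";
        // Without (?U): ASCII punctuation only, so $ ! ~ ? match and the non-ASCII marks do not.
        System.out.println(matched("[\\p{Punct}&&[^@',&]]", input));
        // With (?U): Unicode Punctuation category, so ! U+00AB ? U+2026 match, but $ and ~ do not.
        System.out.println(matched("(?U)[\\p{Punct}&&[^@',&]]", input));
    }
}
```

In both runs @ is left out, because the negated class removes it from whichever set \p{Punct} denotes.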
The pattern matches any punctuation (with \p{Punct}) except @, ', , and &.
If you need to exclude more characters, add them to the negated character class. Just remember to always escape -, \, ^, [ and ] inside a Java regex character class/set. E.g. adding a backslash and - might look like "[\\p{Punct}&&[^@',&\\\\-]]" or "[\\p{Punct}&&[^@',&\\-\\\\]]".
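As a quick check of the escaping, the following sketch uses the first of the two extended patterns. The class name ExcludeMoreDemo, the helper matched and the sample input are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExcludeMoreDemo {
    // Punctuation except @ ' , & and additionally backslash and hyphen.
    // In the Java literal, \\\\ is one escaped backslash for the regex engine;
    // the hyphen is placed last in the class, where it is taken literally.
    static final Pattern PUNCT = Pattern.compile("[\\p{Punct}&&[^@',&\\\\-]]");

    // Hypothetical helper: concatenates all matches found in the input.
    static String matched(String input) {
        StringBuilder sb = new StringBuilder();
        Matcher m = PUNCT.matcher(input);
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The input is a-b\c!d@e# once the Java string escape is resolved.
        System.out.println(matched("a-b\\c!d@e#")); // prints !#
    }
}
```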
Java demo:
String value = "#`~!#$%^,";
String pattern = "(?U)[\\p{Punct}&&[^@',&]]";
Pattern r = Pattern.compile(pattern); // Create a Pattern object
Matcher m = r.matcher(value); // Now create matcher object.
while (m.find()) {
    System.out.println("Found value: " + m.group());
}
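Output:
Found value: #
Found value: !
Found value: #
Found value: %
The trailing comma is not reported because the negated class excludes it, and `, ~, $ and ^ are skipped because (?U) treats them as symbols rather than punctuation.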