Use Java and RegEx to convert casing in a string
Problem: Turn
"My Testtext TARGETSTRING My Testtext"
into
"My Testtext targetstring My Testtext"
Perl supports the "\L"-operation which can be used in the replacement-string.
The Pattern-Class does not support this operation:
Perl constructs not supported by this class: [...] The preprocessing operations \l \u, \L, and \U. https://docs.oracle.com/javase/10/docs/api/java/util/regex/Pattern.html
You can't do this in Java regex. You have to post-process the matches manually using String.toUpperCase() and toLowerCase() instead.
Here's an example of how to use regex to find the words of at least three characters in a sentence and convert them to uppercase:
String text = "no way oh my god it cannot be";
Matcher m = Pattern.compile("\\b\\w{3,}\\b").matcher(text);
StringBuilder sb = new StringBuilder();
int last = 0;
while (m.find()) {
    sb.append(text.substring(last, m.start()));
    sb.append(m.group(0).toUpperCase());
    last = m.end();
}
sb.append(text.substring(last));
System.out.println(sb.toString());
// prints "no WAY oh my GOD it CANNOT be"
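The same loop can be pointed at the string from the question by swapping toUpperCase() for toLowerCase(). The following is only a minimal sketch: the class name LowerCaseTarget, the helper lowerCaseMatches, and the literal pattern "TARGETSTRING" are assumptions made for illustration, and a real pattern would depend on what the target actually looks like.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LowerCaseTarget {

    // Hypothetical helper: lowercases every match of regex in text and
    // copies everything between the matches unchanged.
    static String lowerCaseMatches(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuilder sb = new StringBuilder();
        int last = 0;
        while (m.find()) {
            sb.append(text, last, m.start());    // text before the match
            sb.append(m.group().toLowerCase());  // the match, lowercased
            last = m.end();
        }
        sb.append(text, last, text.length());    // remaining tail
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "My Testtext TARGETSTRING My Testtext";
        // The pattern is an assumed stand-in for whatever identifies the target.
        System.out.println(lowerCaseMatches(input, "TARGETSTRING"));
        // prints "My Testtext targetstring My Testtext"
    }
}
```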
Note on appendReplacement and appendTail
Note that the above solution uses substring and manages a tail index, etc. In fact, you can do without these if you use Matcher.appendReplacement and appendTail.
StringBuffer sb = new StringBuffer();
while (m.find()) {
    m.appendReplacement(sb, m.group().toUpperCase());
}
m.appendTail(sb);
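Note how sb is now a StringBuffer instead of a StringBuilder. Before Java 9, Matcher had no StringBuilder overloads, so you were stuck with the slower StringBuffer if you wanted to use these methods. Java 9 added appendReplacement(StringBuilder, String) and appendTail(StringBuilder).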
It's up to you whether trading some efficiency for better readability is worth it.
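On Java 9 or later, the StringBuffer restriction no longer applies: appendReplacement and appendTail accept a StringBuilder, and Matcher.replaceAll accepts a Function<MatchResult, String>. The sketch below shows both variants on the sentence used above. The class and method names are made up for illustration, and Matcher.quoteReplacement is added as a precaution so that a "$" or "\" in a match would not be read as a group reference (the \w pattern here cannot produce either).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UpperCaseWords {

    // Words of at least three word characters, as in the example above.
    private static final Pattern WORD = Pattern.compile("\\b\\w{3,}\\b");

    // Java 9+: appendReplacement/appendTail with a StringBuilder.
    static String withStringBuilder(String text) {
        Matcher m = WORD.matcher(text);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(m.group().toUpperCase()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Java 9+: replaceAll with a function computing each replacement.
    static String withReplaceAll(String text) {
        return WORD.matcher(text)
                .replaceAll(mr -> Matcher.quoteReplacement(mr.group().toUpperCase()));
    }

    public static void main(String[] args) {
        String text = "no way oh my god it cannot be";
        System.out.println(withStringBuilder(text)); // prints "no WAY oh my GOD it CANNOT be"
        System.out.println(withReplaceAll(text));    // prints "no WAY oh my GOD it CANNOT be"
    }
}
```

The replaceAll variant is the shortest, while the explicit loop leaves room for extra per-match logic.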
See also: StringBuilder and StringBuffer in Java
To do this at the regexp level you have to use \U to switch on uppercase mode and \E to switch it off (\L does the same for lowercase). This works in tools that support these operations, such as the IntelliJ IDEA find-and-replace dialog, but not in java.util.regex. The original example used that dialog to transform a set of class fields into JUnit assertions, with the IDE tooltip showing the result of the find-and-replace transformation.
You could use a regexp capturing group (if you really need to use regex, that is, if "TARGETSTRING" is complex enough and "regular" enough to justify being detected by a regex).
You would then apply toLowerCase() to group #1.
import java.util.regex.*;

public class TargetToLowerCase {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(
                "my testtext TARGETSTRING my testtext");
        System.out.println(sb);
        // Group #1 captures the part that should be lowercased.
        String regex = "(TARGETSTRING)";
        Pattern p = Pattern.compile(regex); // Create the pattern.
        Matcher matcher = p.matcher(sb);    // Create the matcher.
        while (matcher.find()) {
            // For ASCII text the replacement has the same length,
            // so the matcher's offsets into sb stay valid.
            String buf = matcher.group(1).toLowerCase();
            sb.replace(matcher.start(1), matcher.end(1), buf);
        }
        System.out.println(sb);
    }
}