Regex split numbers and letter groups without spaces
If I have a string like "11E12C108N", which is a concatenation of letter groups and digit groups, how do I split it into those groups when there is no delimiter (such as a space) in between?
For example, I want the resulting split to be:
tokens[0] = "11"
tokens[1] = "E"
tokens[2] = "12"
tokens[3] = "C"
tokens[4] = "108"
tokens[5] = "N"
This is what I have right now:
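import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String stringToSplit = "11E12C108N";
        // Matches a run of digits followed by a run of non-digits as ONE token
        Pattern pattern = Pattern.compile("\\d+\\D+");
        Matcher matcher = pattern.matcher(stringToSplit);
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}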
Which gives me:
11E
12C
108N
Can I make the original regex do a complete split in one go, instead of having to run the regex again on the intermediate tokens?
Solution 1:
Use the following regex, which matches either a run of digits or a run of non-digits, and collect all of its matches. Those matches are the tokens you are looking for.
\d+|\D+
In Java, I think the code would look something like this:
Matcher matcher = Pattern.compile("\\d+|\\D+").matcher(theString);
while (matcher.find()) {
    // append matcher.group() to your list
}
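For completeness, here is a minimal runnable sketch of the same approach that collects the matches into a list. The class name DigitLetterTokenizer and the method name tokenize are made up for illustration; only the regex comes from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the original answer.
class DigitLetterTokenizer {
    // Each match is either a run of digits or a run of non-digits.
    private static final Pattern TOKEN = Pattern.compile("\\d+|\\D+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("11E12C108N")); // prints [11, E, 12, C, 108, N]
    }
}
```

Note that \D matches any non-digit, not just letters, so spaces or punctuation in the input would end up inside the non-digit tokens.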
Solution 2:
You can also use lookaround assertions in the regex passed to split:
// requires: import java.util.Arrays;
String stringToSplit = "11E12C108N";
String[] tokens = stringToSplit.split("(?<=\\d)(?=\\D)|(?=\\d)(?<=\\D)");
System.out.println(Arrays.toString(tokens));
Output:
[11, E, 12, C, 108, N]
The idea is to split at the positions that lie between a digit (\d) and a non-digit (\D). In other words, split at each position (an empty string) that has either:
- a digit before it, (?<=\d), and a non-digit after it, (?=\D)
- a non-digit before it, (?<=\D), and a digit after it, (?=\d)
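To see what each of the two boundary conditions contributes, here is a small sketch that applies them separately and then together. The class and method names are invented for this example; the patterns are the ones described above.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class LookaroundSplitDemo {
    // Split only where a digit is followed by a non-digit.
    static String[] digitThenNonDigit(String s) {
        return s.split("(?<=\\d)(?=\\D)");
    }

    // Split only where a non-digit is followed by a digit.
    static String[] nonDigitThenDigit(String s) {
        return s.split("(?<=\\D)(?=\\d)");
    }

    // Split at both kinds of boundary.
    static String[] both(String s) {
        return s.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
    }

    public static void main(String[] args) {
        String s = "11E12C108N";
        System.out.println(Arrays.toString(digitThenNonDigit(s))); // [11, E12, C108, N]
        System.out.println(Arrays.toString(nonDigitThenDigit(s))); // [11E, 12C, 108N]
        System.out.println(Arrays.toString(both(s)));              // [11, E, 12, C, 108, N]
    }
}
```

Because lookarounds are zero-width, split consumes no characters, so every digit and letter ends up in some token. The order of the lookbehind and lookahead within one alternative does not matter, since both are tested at the same position.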
More information about (?<=..) and (?=..), along with a few other lookaround constructs, is available at http://www.regular-expressions.info/lookaround.html