Java String split performance
Here is the current code in my application:
String[] ids = str.split("/");
When profiling the application, I see that a non-negligible amount of time is spent splitting strings. Also, the split
method takes a regular expression, which is superfluous here.
What alternative can I use to optimize the string splitting? Is StringUtils.split faster?
(I would've tried and tested myself but profiling my application takes a lot of time.)
String.split(String) won't create a regexp if your pattern is only one character long and that character is not a regex metacharacter (such as . or |). When splitting by a single character, it uses specialized code which is pretty efficient. StringTokenizer is not much faster in this particular case.
This fast path was introduced in OpenJDK 7 / Oracle JDK 7. Here's a bug report and a commit. I've made a simple benchmark here.
$ java -version
java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)
$ java Split
split_banthar: 1231
split_tskuzzy: 1464
split_tskuzzy2: 1742
string.split: 1291
StringTokenizer: 1517
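For comparison, a hand-rolled splitter built on indexOf might look like the sketch below. It is not the code from the linked benchmark (the split_* implementations are not reproduced here). The class name SplitSketch and the helper splitOnChar are hypothetical names chosen for illustration. Note one deliberate difference: this helper keeps trailing empty tokens, whereas str.split("/") drops them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitSketch {

    // Hypothetical helper: splits on a single character using indexOf,
    // so no regular expression is involved at any point.
    // Assumption: empty tokens (including trailing ones) are kept.
    static List<String> splitOnChar(String s, char sep) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = s.indexOf(sep, start)) != -1) {
            parts.add(s.substring(start, idx));
            start = idx + 1;
        }
        // Whatever follows the last separator (possibly an empty string).
        parts.add(s.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        String str = "12/345/6789";

        // Built-in split: a single non-metacharacter pattern takes the fast path.
        String[] ids = str.split("/");
        System.out.println(Arrays.toString(ids));

        // Hand-rolled equivalent for this input.
        System.out.println(splitOnChar(str, '/'));
    }
}
```

Given the numbers above, a helper like this is unlikely to beat String.split by much for a single-character separator, so it is mainly worth considering if you need different handling of empty tokens.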
If you can use third-party libraries, Guava's Splitter doesn't incur the overhead of regular expressions when you don't ask for it, and is very fast as a general rule. (Disclosure: I contribute to Guava.)
Iterable<String> split = Splitter.on('/').split(string);
(Also, Splitter is as a rule much more predictable than String.split.)
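To make the predictability remark concrete, here is a small JDK-only sketch of String.split behaviors that tend to surprise people. It does not use Guava. The class name SplitQuirks and the helper describe are hypothetical names for illustration.

```java
import java.util.Arrays;

public class SplitQuirks {

    // Hypothetical helper: returns the split result as a printable string.
    // A limit of 0 behaves like the one-argument split(regex).
    static String describe(String input, String regex, int limit) {
        return Arrays.toString(input.split(regex, limit));
    }

    public static void main(String[] args) {
        // Trailing empty strings are silently dropped by default.
        System.out.println(describe("a/b//", "/", 0));   // [a, b]

        // A negative limit keeps them.
        System.out.println(describe("a/b//", "/", -1));  // [a, b, , ]

        // A leading empty string, by contrast, is kept.
        System.out.println(describe("/a", "/", 0));      // [, a]

        // The argument is a regex: "." matches every character,
        // so every token is empty and all of them are dropped.
        System.out.println(describe("a.b", ".", 0));     // []

        // Escaping the dot gives the literal split.
        System.out.println(describe("a.b", "\\.", 0));   // [a, b]
    }
}
```

If any of these cases matter in your data, it is worth deciding explicitly how empty tokens and metacharacters should be handled, whichever splitting method you pick.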