Is it better to reuse a StringBuilder in a loop?
I have a performance-related question about using StringBuilder.
In a very long loop I'm manipulating a StringBuilder
and passing its contents to another method like this:
for (loop condition) {
StringBuilder sb = new StringBuilder();
sb.append("some string");
. . .
sb.append(anotherString);
. . .
passToMethod(sb.toString());
}
Is instantiating a StringBuilder on every loop iteration a good solution? Or is calling delete instead better, like the following?
StringBuilder sb = new StringBuilder();
for (loop condition) {
sb.delete(0, sb.length());
sb.append("some string");
. . .
sb.append(anotherString);
. . .
passToMethod(sb.toString());
}
The second one is about 25% faster in my mini-benchmark.
public class ScratchPad {
static String a;
public static void main( String[] args ) throws Exception {
long time = System.currentTimeMillis();
for( int i = 0; i < 10000000; i++ ) {
StringBuilder sb = new StringBuilder();
sb.append( "someString" );
sb.append( "someString2"+i );
sb.append( "someStrin4g"+i );
sb.append( "someStr5ing"+i );
sb.append( "someSt7ring"+i );
a = sb.toString();
}
System.out.println( System.currentTimeMillis()-time );
time = System.currentTimeMillis();
StringBuilder sb = new StringBuilder();
for( int i = 0; i < 10000000; i++ ) {
sb.delete( 0, sb.length() );
sb.append( "someString" );
sb.append( "someString2"+i );
sb.append( "someStrin4g"+i );
sb.append( "someStr5ing"+i );
sb.append( "someSt7ring"+i );
a = sb.toString();
}
System.out.println( System.currentTimeMillis()-time );
}
}
Results:
25265
17969
Note that this is with JRE 1.6.0_07.
Based on Jon Skeet's ideas in the edit, here's version 2. Same results though.
public class ScratchPad {
static String a;
public static void main( String[] args ) throws Exception {
long time = System.currentTimeMillis();
StringBuilder sb = new StringBuilder();
for( int i = 0; i < 10000000; i++ ) {
sb.delete( 0, sb.length() );
sb.append( "someString" );
sb.append( "someString2" );
sb.append( "someStrin4g" );
sb.append( "someStr5ing" );
sb.append( "someSt7ring" );
a = sb.toString();
}
System.out.println( System.currentTimeMillis()-time );
time = System.currentTimeMillis();
for( int i = 0; i < 10000000; i++ ) {
StringBuilder sb2 = new StringBuilder();
sb2.append( "someString" );
sb2.append( "someString2" );
sb2.append( "someStrin4g" );
sb2.append( "someStr5ing" );
sb2.append( "someSt7ring" );
a = sb2.toString();
}
System.out.println( System.currentTimeMillis()-time );
}
}
Results:
5016
7516
Faster still:
public class ScratchPad {
private static String a;
public static void main( String[] args ) throws Exception {
final long time = System.currentTimeMillis();
// Pre-allocate enough space to store all appended strings.
// StringBuilder, ultimately, uses an array of characters.
final StringBuilder sb = new StringBuilder( 128 );
for( int i = 0; i < 10000000; i++ ) {
// Resetting the string is faster than creating a new object.
// Since this is a critical loop, every instruction counts.
sb.setLength( 0 );
sb.append( "someString" );
sb.append( "someString2" );
sb.append( "someStrin4g" );
sb.append( "someStr5ing" );
sb.append( "someSt7ring" );
setA( sb.toString() );
}
System.out.println( System.currentTimeMillis() - time );
}
private static void setA( final String aString ) {
a = aString;
}
}
In the philosophy of writing solid code, the inner workings of the method are hidden from the client objects. Thus it makes no difference from the system's perspective whether you re-declare the StringBuilder within the loop or outside of the loop. Since declaring it outside of the loop is faster, and it does not make the code significantly more complicated, reuse the object.
Even if reuse made the code much more complicated, once you know for certain that object instantiation is the bottleneck, keep the optimization and comment it.
Three runs with this answer:
$ java ScratchPad
1567
$ java ScratchPad
1569
$ java ScratchPad
1570
Three runs with the other answer:
$ java ScratchPad2
1663
2231
$ java ScratchPad2
1656
2233
$ java ScratchPad2
1658
2242
Although the difference is not significant, setting the StringBuilder's initial buffer size to prevent memory reallocations gives a small performance gain.
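To make the reallocation point concrete, here is a minimal sketch that compares a pre-sized builder with a default-sized one using StringBuilder.capacity(). The class name CapacityDemo and the helper capacityAfterAppends are hypothetical names chosen for illustration; the appended literals are the ones from the benchmark above (54 characters in total).

```java
public class CapacityDemo {

    // Hypothetical helper: resets the builder, appends the five benchmark
    // literals, and reports the capacity of the internal buffer afterwards.
    public static int capacityAfterAppends(StringBuilder sb) {
        sb.setLength(0); // resets the length only; the buffer is kept
        sb.append("someString");
        sb.append("someString2");
        sb.append("someStrin4g");
        sb.append("someStr5ing");
        sb.append("someSt7ring");
        return sb.capacity();
    }

    public static void main(String[] args) {
        StringBuilder presized = new StringBuilder(128);
        StringBuilder defaultSized = new StringBuilder(); // default capacity is 16

        System.out.println("pre-sized before:     " + presized.capacity());
        System.out.println("pre-sized after:      " + capacityAfterAppends(presized));
        System.out.println("default-sized before: " + defaultSized.capacity());
        System.out.println("default-sized after:  " + capacityAfterAppends(defaultSized));
    }
}
```

The pre-sized builder should report the same capacity before and after, because 54 characters fit in the 128 it was given. The default-sized one has to grow its buffer at least once to hold them, and that growth is the reallocation the pre-sizing avoids.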
In the philosophy of writing solid code, it's always better to put your StringBuilder inside your loop. This way it doesn't leak outside the code it's intended for.
Secondly, the biggest improvement with StringBuilder comes from giving it an initial size, so that it doesn't have to grow while the loop runs:
for (loop condition) {
StringBuilder sb = new StringBuilder(4096);
}
Okay, I now understand what's going on, and it does make sense.
I was under the impression that toString just passed the underlying char[] into a String constructor which didn't take a copy. A copy would then be made on the next "write" operation (e.g. delete). I believe this was the case with StringBuffer in some previous version. (It isn't now.) But no - toString just passes the array (and index and length) to the public String constructor which takes a copy.
So in the "reuse the StringBuilder" case we genuinely create one copy of the data per string, using the same char array in the buffer the whole time. Obviously creating a new StringBuilder each time creates a new underlying buffer - and then that buffer is copied (somewhat pointlessly, in our particular case, but done for safety reasons) when creating a new string.
All this leads to the second version definitely being more efficient - but at the same time I'd still say it's uglier code.
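As a small illustration of why reuse is safe given that copy, the sketch below takes a string from a builder, resets the builder, and writes something else into it. The class name ToStringCopyDemo and the method buildTwo are hypothetical names used only for this example.

```java
public class ToStringCopyDemo {

    // Hypothetical helper: builds two strings from one reused StringBuilder.
    public static String[] buildTwo() {
        StringBuilder sb = new StringBuilder(32);

        sb.append("first");
        String first = sb.toString(); // takes its own copy of the characters

        sb.setLength(0);              // reuse the same builder
        sb.append("second");
        String second = sb.toString();

        return new String[] { first, second };
    }

    public static void main(String[] args) {
        String[] result = buildTwo();
        System.out.println(result[0]); // prints "first"
        System.out.println(result[1]); // prints "second"
    }
}
```

The first string is unaffected by the later writes to the builder, which is what makes it acceptable to hand sb.toString() to another method and then keep reusing sb.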
I don't think it's been pointed out yet: because of optimizations built into the Sun Java compiler, which automatically creates StringBuilders (StringBuffers pre-J2SE 5.0) when it sees String concatenations, the first example in the question is equivalent to:
for (loop condition) {
String s = "some string";
. . .
s += anotherString;
. . .
passToMethod(s);
}
This is more readable and, IMO, the better approach. Your attempts to optimize may yield gains on some platforms but losses on others.
But if you really are running into issues with performance, then sure, optimize away. I'd start with explicitly specifying the buffer size of the StringBuilder though, per Jon Skeet.
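For comparison, here is a minimal sketch of the two forms side by side, assuming a single appended value. The class name ConcatDemo and both method names are hypothetical. It only shows that the two forms produce the same string; how the compiler translates the concatenation depends on the compiler version, so it says nothing about relative speed.

```java
public class ConcatDemo {

    // Plain concatenation, as in the readable version above.
    public static String withConcat(String anotherString) {
        String s = "some string";
        s += anotherString;
        return s;
    }

    // Explicit StringBuilder, as in the first example in the question.
    public static String withBuilder(String anotherString) {
        StringBuilder sb = new StringBuilder();
        sb.append("some string");
        sb.append(anotherString);
        return sb.toString();
    }

    public static void main(String[] args) {
        String viaConcat = withConcat(" and more");
        String viaBuilder = withBuilder(" and more");
        System.out.println(viaConcat.equals(viaBuilder)); // prints "true"
    }
}
```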