How many String objects will be created

I have the following Java code:

public String makinStrings() {
  String s = "Fred";
  s = s + "47";
  s = s.substring(2, 5);
  s = s.toUpperCase();
  return s.toString();
}

The question is fairly simple: how many String objects will be created when this method is invoked?

At first I answered that 5 String objects are created, but my book says that only 3 objects are created, with no explanation given (this is an SCJP question).

From my point of view there are 5 objects: "Fred", "47", "Fred47", "ed4", "ED4".

I also found this question on an SCJP simulation exam, with the same answer: 3.

"Fred" and "47" will come from the string literal pool. As such they won't be created when the method is invoked. Instead they will be put there when the class is loaded (or earlier, if other classes use constants with the same value).

"Fred47", "ed4" and "ED4" are the 3 String objects that will be created on each method invocation.

Programs tend to contain a lot of String literals in their code. In Java, these constants are collected in something called the string table for efficiency. For instance, if you use the string "Name: " in ten different places, the JVM (typically) has just one instance of that String and in all ten places where it's used, the references all point to that one instance. This saves memory.
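
A minimal sketch of that sharing, using reference comparison (==) to tell instances apart. The class and method names are made up for illustration; the behavior relies on the Java Language Specification's rule that identical string literals refer to the same String instance, while strings concatenated at runtime are new objects.

```java
class LiteralPoolDemo {
    // Two occurrences of the same literal refer to one shared instance.
    static boolean literalsShareInstance() {
        String a = "Name: ";
        String b = "Name: ";
        return a == b; // identity comparison, not equals()
    }

    // A string built at runtime is a separate object, even with equal contents.
    static boolean runtimeStringIsDistinct() {
        String prefix = "Na";
        String built = prefix + "me: "; // prefix is not a constant, so this runs at runtime
        return built != "Name: " && built.equals("Name: ");
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());   // prints true
        System.out.println(runtimeStringIsDistinct()); // prints true
    }
}
```

Comparing strings with == is only useful for experiments like this one; ordinary code should compare contents with equals().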

This optimization is possible because String is immutable. If it were possible to change a String, changing it in one place would mean it changes in the other nine as well. That's why any operation that changes a String returns a new instance. That's why if you do this:

String s = "boink";
s.toUpperCase();
System.out.println(s);

it prints boink, not BOINK.

Now there's one more tricky bit: on older JVMs (before Java 7 update 6), multiple instances of java.lang.String may point to the same underlying char[] for their character data. In other words, they may be different views on the same char[], each using just a slice of the array. Again, this is an optimization for efficiency. The substring() method is one of the cases where this happens; newer JVMs copy the characters into a new array instead.

s1 = "Fred47";

//String s1: data=[ 'F', 'r', 'e', 'd', '4', '7'], offset=0, length=6
//                   ^........................^

s2 = s1.substring(2, 5);

//String s2: data=[ 'F', 'r', 'e', 'd', '4', '7'], offset=2, length=3
//                             ^.........^
// the two strings are sharing the same char[]!

In your SCJP question, all this boils down to:

  • The string "Fred" is taken from the String table.
  • The string "47" is taken from the String table.
  • The string "Fred47" is created during the method call. //1
  • The string "ed4" is created during the method call, sharing the same backing array as "Fred47" //2
  • The string "ED4" is created during the method call. //3
  • The s.toString() doesn't create a new one, it just returns this.
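
The list above can be checked, at least in part, with identity comparisons. The sketch below repeats the steps of makinStrings() but keeps every intermediate reference; the class name StepByStepDemo and its steps() helper are hypothetical and exist only for this illustration. It cannot show whether backing arrays are shared, only which references point to distinct objects.

```java
class StepByStepDemo {
    // Runs the same steps as makinStrings() and returns every intermediate reference.
    static String[] steps() {
        String fred = "Fred";                 // literal, taken from the string table
        String fred47 = fred + "47";          // new String #1
        String ed4 = fred47.substring(2, 5);  // new String #2
        String upper = ed4.toUpperCase();     // new String #3
        String returned = upper.toString();   // no new object: toString() returns this
        return new String[] { fred, fred47, ed4, upper, returned };
    }

    public static void main(String[] args) {
        String[] s = steps();
        System.out.println(String.join(", ", s)); // prints Fred, Fred47, ed4, ED4, ED4
        System.out.println(s[4] == s[3]);         // prints true
    }
}
```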

One interesting edge case of all this: consider what happens if you have a really long String, for example, a web page taken from the Internet, and let's say the length of the char[] is two megabytes. If you take the substring(0, 4) of this on a JVM that shares backing arrays, you get a new String that looks like it's just four characters long, but it still shares those two megabytes of backing data. This isn't all that common in the real world, but it can be a huge waste of memory! In the (rare) case that you run into this as a problem, you can use new String(hugeString.substring(0, 4)) to create a String with a new, small backing array. (Since Java 7 update 6, substring() copies the characters itself, so the workaround is no longer needed there.)
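
A small sketch of that workaround. The class name CompactCopyDemo and the firstFour() helper are hypothetical, and the input is assumed to be at least four characters long. The memory effect cannot be observed from the code itself; a heap profiler would be needed to confirm it on a given JVM.

```java
class CompactCopyDemo {
    // Returns the first four characters as a String of its own.
    static String firstFour(String hugeString) {
        // On JVMs where substring() shares the original backing array, the
        // copy constructor gives the result its own right-sized array.
        // On newer JVMs substring() already returns a compact String.
        return new String(hugeString.substring(0, 4));
    }

    public static void main(String[] args) {
        String page = "<html><body>imagine two megabytes of markup here</body></html>";
        System.out.println(firstFour(page)); // prints <htm
    }
}
```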

Finally, it's possible to force a String into the string table at runtime by calling intern() on it. The basic rule in this case: don't do it. The extended rule: don't do it unless you've used a memory profiler to ascertain that it's a useful optimization.
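
For completeness, a sketch of what intern() does, not a recommendation to use it. The class and method names are invented for the example. It builds "Fred47" at runtime, then shows that intern() returns the instance already held in the string table for the equal literal.

```java
class InternDemo {
    // Builds "Fred47" at runtime so the compiler cannot fold it into a constant.
    static String makeAtRuntime() {
        String fred = "Fred";
        return fred + "47";
    }

    // Returns true if intern() hands back the pooled literal instance.
    static boolean internReturnsPooledInstance() {
        String literal = "Fred47";      // already in the string table
        String built = makeAtRuntime(); // separate object with equal contents
        return built != literal && built.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsPooledInstance()); // prints true
    }
}
```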

Based on the javap output, it looks like a StringBuilder, not a String, is created during concatenation. Three Strings are then created by the calls to StringBuilder.toString(), substring() and toUpperCase().

The StringBuilder.toString() call (offset 19) is not redundant, because it turns the StringBuilder into a String. The final String.toString() call (offset 36) just returns the same object.

>javap -c Test
Compiled from ""

public java.lang.String makinStrings();
0:   ldc     #5; //String Fred
2:   astore_1
3:   new     #6; //class java/lang/StringBuilder
6:   dup
7:   invokespecial   #7; //Method java/lang/StringBuilder."<init>":()V
10:  aload_1
11:  invokevirtual   #8; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
14:  ldc     #9; //String 47
16:  invokevirtual   #8; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19:  invokevirtual   #10; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
22:  astore_1
23:  aload_1
24:  iconst_2
25:  iconst_5
26:  invokevirtual   #11; //Method java/lang/String.substring:(II)Ljava/lang/String;
29:  astore_1
30:  aload_1
31:  invokevirtual   #12; //Method java/lang/String.toUpperCase:()Ljava/lang/String;
34:  astore_1
35:  aload_1
36:  invokevirtual   #13; //Method java/lang/String.toString:()Ljava/lang/String;
39:  areturn
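
Read back into source form, the bytecode above corresponds roughly to the following. This is a hand-written sketch, not decompiler output, and the class and method names are made up. Compilers targeting Java 9 or later typically emit an invokedynamic call for the concatenation instead of an explicit StringBuilder, so the exact bytecode depends on the javac version.

```java
class DesugaredConcatDemo {
    // Same logic as makinStrings(), with the + operator spelled out by hand.
    static String makinStringsDesugared() {
        String s = "Fred";                                         // offsets 0-2
        s = new StringBuilder().append(s).append("47").toString(); // offsets 3-22
        s = s.substring(2, 5);                                     // offsets 23-29
        s = s.toUpperCase();                                       // offsets 30-34
        return s.toString();                                       // offsets 35-39
    }

    public static void main(String[] args) {
        System.out.println(makinStringsDesugared()); // prints ED4
    }
}
```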
