Concatenating null strings in Java [duplicate]
Why must it work?
The JLS 5, Section 15.18.1.1 JLS 8 § 15.18.1 "String Concatenation Operator +", leading to JLS 8, § 5.1.11 "String Conversion", requires this operation to succeed without failure:
...Now only reference values need to be considered. If the reference is null, it is converted to the string "null" (four ASCII characters n, u, l, l). Otherwise, the conversion is performed as if by an invocation of the toString method of the referenced object with no arguments; but if the result of invoking the toString method is null, then the string "null" is used instead.
How does it work?
Let's look at the bytecode! The compiler takes your code:
String s = null;
s = s + "hello";
System.out.println(s); // prints "nullhello"
and compiles it into bytecode as if you had instead written this:
String s = null;
s = new StringBuilder(String.valueOf(s)).append("hello").toString();
System.out.println(s); // prints "nullhello"
(You can do so yourself by using javap -c
)
The append methods of StringBuilder
all handle null just fine. In this case because null
is the first argument, String.valueOf()
is invoked instead since StringBuilder does not have a constructor that takes any arbitrary reference type.
If you were to have done s = "hello" + s
instead, the equivalent code would be:
s = new StringBuilder("hello").append(s).toString();
where in this case the append method takes the null and then delegates it to String.valueOf()
.
Note: String concatenation is actually one of the rare places where the compiler gets to decide which optimization(s) to perform. As such, the "exact equivalent" code may differ from compiler to compiler. This optimization is allowed by JLS, Section 15.18.1.2:
To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.
The compiler I used to determine the "equivalent code" above was Eclipse's compiler, ecj.
See section 5.4 and 15.18 of the Java Language specification:
String conversion applies only to the operands of the binary + operator when one of the arguments is a String. In this single special case, the other argument to the + is converted to a String, and a new String which is the concatenation of the two strings is the result of the +. String conversion is specified in detail within the description of the string concatenation + operator.
and
If only one operand expression is of type String, then string conversion is performed on the other operand to produce a string at run time. The result is a reference to a String object (newly created, unless the expression is a compile-time constant expression (§15.28))that is the concatenation of the two operand strings. The characters of the left-hand operand precede the characters of the right-hand operand in the newly created string. If an operand of type String is null, then the string "null" is used instead of that operand.
The second line is transformed to the following code:
s = (new StringBuilder()).append((String)null).append("hello").toString();
The append methods can handle null
arguments.