JavaScript String concatenation behavior with null or undefined values

You can use Array.prototype.join to ignore undefined and null:

['a', 'b', void 0, null, 6].join(''); // 'ab6'

According to the spec:

If element is undefined or null, let next be the empty String; otherwise, let next be ToString(element).


Given that,

  • What is the history behind the oddity that makes JS convert null or undefined to their string values in String concatenation?

    In fact, in some cases, the current behavior makes sense.

    function showSum(a,b) {
        alert(a + ' + ' + b + ' = ' + (+a + +b));
    }

    For example, if the function above is called without arguments, undefined + undefined = NaN is probably better than + = NaN.

    In general, I think that if you want to insert some variables in a string, displaying undefined or null makes sense. Probably, Eich thought that too.

    Of course, there are cases in which ignoring those would be better, such as when joining strings together. But for those cases you can use Array.prototype.join.

  • Is there any chance for a change in this behavior in future ECMAScript versions?

    Most likely not.

    Since there already is Array.prototype.join, modifying the behavior of string concatenation would only cause disadvantages, not advantages. Moreover, it would break old code, so it wouldn't be backwards compatible.

  • What is the prettiest way to concatenate String with potential null or undefined?

    Array.prototype.join seems to be the simplest one. Whether it's the prettiest or not may be opinion-based.
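
    A minimal sketch of that approach (the variable names here are just placeholders): building the message from an array lets join drop the null and undefined parts without any explicit checks.

    const first = "Hello";
    const middle = null;   // a value that might be missing
    const last = "world";

    // null and undefined become empty strings, per the spec quoted above
    const greeting = [first, middle, " ", last].join(""); // "Hello world"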


What is the prettiest way to concatenate String with potential null or undefined object without falling into this problem [...]?

There are several ways, and you partly mentioned them yourself. To make it short, the only clean way I can think of is a function:

const Strings = {};
Strings.orEmpty = function( entity ) {
    return entity || "";
};

// usage
const message = "This is a " + Strings.orEmpty( test );

Of course, you can (and should) change the actual implementation to suit your needs. And this is already why I think this method is superior: it introduces encapsulation.

Really, you only have to ask what the "prettiest" way is if you don't have encapsulation. You ask yourself this question because you already know that you are going to get yourself into a place where you cannot change the implementation anymore, so you want it to be perfect right away. But that's the thing: requirements, views and even environments change. They evolve. So why not allow yourself to change the implementation with as little as adapting one line and perhaps one or two tests?

You could call this cheating, because it doesn't really answer how to implement the actual logic. But that's my point: it doesn't matter. Well, maybe a little. But really, there is no need to worry because of how simple it would be to change. And since it's not inlined, it also looks a lot prettier – whether or not you implement it this way or in a more sophisticated way.
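
As a sketch of what that flexibility looks like in practice: suppose you later decide the || fallback is too coarse (it also turns false and 0 into ""). You only touch the function body; every call site stays exactly the same. The == null check used here is just one possible alternative implementation, not the only right one.

// possible revision: only null and undefined become "",
// other falsy values (false, 0, "") are kept and stringified
Strings.orEmpty = function( entity ) {
    return entity == null ? "" : String( entity );
};

// callers are unchanged
const message = "This is a " + Strings.orEmpty( test );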

If, throughout your code, you keep repeating the || inline, you run into two problems:

  • You duplicate code.
  • And because you duplicate code, you make it hard to maintain and change in the future.

And these are two points commonly known to be anti-patterns when it comes to high-quality software development.

Some people will say that this is too much overhead; they will talk about performance. It's nonsense. For one, this barely adds overhead. If this is what you are worried about, you chose the wrong language. Even jQuery uses functions. People need to get over micro-optimization.

The other thing is: you can use a code "compiler", i.e. a minifier. Good tools in this area will try to detect which statements to inline during the compilation step. This way, you keep your code clean and maintainable and can still get that last drop of performance if you still believe in it or really do have an environment where this matters.

Lastly, have some faith in browsers. They will optimize code and they do a pretty darn good job at it these days.


The ECMA Specification

Just to flesh out the reason it behaves this way in terms of the spec, this behavior has been present since version one. The definition there and in 5.1 are semantically equivalent; I'll show the 5.1 definitions.

Section 11.6.1: The Addition operator ( + )

The addition operator either performs string concatenation or numeric addition.

The production AdditiveExpression : AdditiveExpression + MultiplicativeExpression is evaluated as follows:

  1. Let lref be the result of evaluating AdditiveExpression.
  2. Let lval be GetValue(lref).
  3. Let rref be the result of evaluating MultiplicativeExpression.
  4. Let rval be GetValue(rref).
  5. Let lprim be ToPrimitive(lval).
  6. Let rprim be ToPrimitive(rval).
  7. If Type(lprim) is String or Type(rprim) is String, then
    a. Return the String that is the result of concatenating ToString(lprim) followed by ToString(rprim)
  8. Return the result of applying the addition operation to ToNumber(lprim) and ToNumber(rprim). See the Note below 11.6.3.

So, if either value ends up being a String, then ToString is used on both arguments (step 7) and those are concatenated (step 7a). ToPrimitive returns all non-object values unchanged, so null and undefined are untouched:

Section 9.1 ToPrimitive

The abstract operation ToPrimitive takes an input argument and an optional argument PreferredType. The abstract operation ToPrimitive converts its input argument to a non-Object type ... Conversion occurs according to Table 10:

For all non-Object types, including both Null and Undefined, [t]he result equals the input argument (no conversion). So ToPrimitive does nothing here.

Finally, Section 9.8 ToString

The abstract operation ToString converts its argument to a value of type String according to Table 13:

Table 13 gives "undefined" for the Undefined type and "null" for the Null type.
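
Tracing those steps for a few small expressions shows both branches; the comments simply restate what the spec steps above produce:

"value: " + null;       // "value: null"      – step 7: one operand is a String,
                        //                      so ToString(null) gives "null"
"value: " + undefined;  // "value: undefined" – same path, ToString(undefined)

1 + null;               // 1   – step 8: no String operand, ToNumber(null) is 0
1 + undefined;          // NaN – ToNumber(undefined) is NaN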

Will it change? Is it even an "oddity"?

As others have pointed out, this is very unlikely to change as it would break backward compatibility (and bring no real benefit), even more so given that this behavior has been the same since the 1997 version of the spec. I would also not really consider it an oddity.

If you were to change this behavior, would you change the definition of ToString for null and undefined or would you special-case the addition operator for these values? ToString is used many, many places throughout the spec and "null" seems like an uncontroversial choice for representing null. Just to give a couple of examples, in Java "" + null is the string "null" and in Python str(None) is the string "None".

Workaround

Others have given good workarounds, but I would add that I doubt you want to use entity || "" as your strategy since it resolves true to "true" but false to "". The array join in this answer has the more expected behavior, or you could change the implementation of this answer to check entity == null (both null == null and undefined == null are true).
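
To make that difference concrete, here is a small comparison of the two strategies (the helper names are purely illustrative):

const orEmptyFalsy = (entity) => entity || "";
const orEmptyNullish = (entity) => (entity == null ? "" : entity);

"x: " + orEmptyFalsy(null);        // "x: "
"x: " + orEmptyFalsy(false);       // "x: "      – false is swallowed
"x: " + orEmptyNullish(false);     // "x: false" – only null/undefined are dropped
"x: " + orEmptyNullish(undefined); // "x: "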


Is there any chance for a change in this behavior in future ECMAScript versions?

I would say the chances are very slim. And there are several reasons:

We already know what ES5 and ES6 look like

The future ES versions are already done or in draft. Neither one, afaik, changes this behavior. And the thing to keep in mind here is that it will take years for these standards to be established in browsers, in the sense that you can write applications with these standards without relying on proxy tools that compile them to actual Javascript.

Just try to estimate the duration. Not even ES5 is fully supported by the majority of browsers out there, and it will probably take another few years. ES6 is not even fully specified yet. As a rough guess, we are looking at at least another five years.

Browsers do their own thing

Browsers are known to make their own decisions on certain topics. You don't know whether all browsers will fully support this feature in exactly the same way. Of course you would know once it is part of the standard, but as of now, even if it was announced to become part of ES7, it would only be speculation at best.

And browsers may make their own decision here especially because:

This change is breaking

One of the biggest things about standards is that they usually try to be backwards compatible. This is especially true for the web, where the same code has to run on all kinds of environments.

If the standard introduces a new feature and it's not supported in old browsers, that's one thing. Tell your client to update their browser to use the site. But if you update your browser and suddenly half the internet breaks for you, that's a big no-no.

Sure, this particular change is unlikely to break a lot of scripts. But that's usually a poor argument, because a standard is universal and has to take every case into account. Just consider

"use strict";

as the instruction to switch to strict mode. It goes to show how much effort a standard puts into trying to keep everything compatible, because they could've made strict mode the default (and even the only mode). But with this clever instruction, you allow old code to run without a change and can still take advantage of the new, stricter mode.

Another example of backwards compatibility: the === operator. == is fundamentally flawed (though some people disagree), and its meaning could have simply been changed. Instead, === was introduced, allowing old code to still run without breaking, while at the same time allowing new programs to be written using a more strict check.
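
A quick illustration of the difference, using the same null/undefined values this question is about:

null == undefined;   // true  – the loose check treats them as equal
null === undefined;  // false – the strict check distinguishes them

0 == "";             // true  – one of the coercions that makes == surprising
0 === "";            // false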

And for a standard to break compatibility, there has to be a very good reason. Which brings us to

There is just no good reason

Yes, it bugs you. That's understandable. But ultimately, it is nothing that can't be solved very easily. Use ||, write a function – whatever. You can make it work at almost no cost. So what is really the benefit of investing all the time and effort into analyzing this change, which we know is breaking anyway? I just don't see the point.

Javascript has several weak points in its design, and these have become a bigger issue as the language has grown more and more important and powerful. But while there are very good reasons to change a lot of its design, other things just aren't meant to be changed.


Disclaimer: This answer is partly opinion-based.