Why can some ASCII characters not be expressed in the form '\uXXXX' in Java source code?

Solution 1:

Unicode escapes are replaced by the characters they denote before the code is parsed, so a line such as char error = '\u000a'; is seen by the compiler as:

char error = '
';

which is not a valid Java statement.

This is dictated by the Java Language Specification (§3.3):

A compiler for the Java programming language ("Java compiler") first recognizes Unicode escapes in its input, translating the ASCII characters \u followed by four hexadecimal digits to the UTF-16 code unit (§3.1) of the indicated hexadecimal value, and passing all other characters unchanged. Representing supplementary characters requires two consecutive Unicode escapes. This translation step results in a sequence of Unicode input characters.
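Because this translation runs before the compiler splits the source into tokens, Unicode escapes are not limited to character and string literals. The sketch below illustrates that with an escape inside an identifier; the class, field, and method names are made up for this example.

```java
public class UnicodeEscapeDemo {
    // The escape for U+0061 (the letter 'a') is translated before tokenization,
    // so the field declared on the next line is simply named "value".
    static int v\u0061lue = 42;

    // Inside a string literal the same early translation applies:
    // the escape for U+0041 becomes the letter 'A'.
    static String letterA() {
        return "\u0041";
    }

    public static void main(String[] args) {
        System.out.println(value);     // 42
        System.out.println(letterA()); // A
    }
}
```

The same mechanism is what breaks an escaped line feed inside a character literal: the compiler never sees the escape, only the character it stands for.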

This can lead to surprising results. For example, the following is valid Java code (it contains hidden Unicode characters), courtesy of Peter Lawrey:

public static void main(String[] args) {
    for (char c‮h = 0; c‮h < Character.MAX_VALUE; c‮h++) {
        if (Character.isJavaIdentifierPart(c‮h) && !Character.isJavaIdentifierStart(c‮h)) {
            System.out.printf("%04x <%s>%n", (int) c‮h, "" + c‮h);
        }
    }
}

Solution 2:

Unicode escape sequences like \u000a are replaced by the actual characters they represent before the Java compiler does anything else with the source code. So your program effectively becomes:

char ch = '
';

So the \u000a in your source code is replaced internally by a line feed character. Note that this happens before the compiler tokenizes and interprets your source code.

Referring to the Java Language Specification:

It is a compile-time error for a line terminator (§3.4) to appear after the opening ' and before the closing '.

And a line feed (the character that \n denotes) is a line terminator. Quoting the specification:

 LineTerminator:
    the ASCII LF character, also known as "newline"
    the ASCII CR character, also known as "return"
    the ASCII CR character followed by the ASCII LF character
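If the goal is simply a char holding a line feed, the usual way around this is to avoid the Unicode escape and use a form that is resolved after tokenization. A minimal sketch, with a hypothetical class name and constant names chosen only for illustration:

```java
public class LineFeedDemo {
    // Escape sequence: interpreted when the literal itself is processed,
    // so no raw line terminator ever appears between the quotes.
    static final char FROM_ESCAPE = '\n';

    // Octal escape for code point 10.
    static final char FROM_OCTAL = '\012';

    // Numeric value cast to char.
    static final char FROM_CAST = (char) 0x000a;

    public static void main(String[] args) {
        System.out.println(FROM_ESCAPE == FROM_OCTAL && FROM_OCTAL == FROM_CAST); // true
        System.out.println((int) FROM_ESCAPE); // 10
    }
}
```

All three constants hold the same character, and none of them places an actual line terminator inside the character literal.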

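Other characters cause similar problems when written as Unicode escapes inside a literal, for example \u005c (backslash), \u0027 (single quote), \u0022 (double quote) and \u000d (carriage return).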
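For these characters too, the ordinary escape sequences are the safe spelling. The following sketch (class and constant names are hypothetical) prints the code point of each one:

```java
public class SafeEscapes {
    // Each literal uses a regular escape sequence instead of a Unicode escape.
    static final char BACKSLASH = '\\';
    static final char SINGLE_QUOTE = '\'';
    static final char DOUBLE_QUOTE = '"';
    static final char CARRIAGE_RETURN = '\r';

    public static void main(String[] args) {
        char[] chars = { BACKSLASH, SINGLE_QUOTE, DOUBLE_QUOTE, CARRIAGE_RETURN };
        for (char c : chars) {
            // Print the code point as four hex digits.
            System.out.printf("%04x%n", (int) c);
        }
    }
}
```

This prints 005c, 0027, 0022 and 000d, one per line.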