Why does this Java code compile?

In method or class scope, the line below compiles (with warning):

int x = x = 1;

In class scope, where variables get their default values, the following gives an 'undefined reference' error:

int x = x + 1;

Shouldn't the first line, int x = x = 1, end up with the same 'undefined reference' error? Or should the second line, int x = x + 1, compile? Or is there something I am missing?


tl;dr

For fields, int b = b + 1 is illegal because the b in the initializer is an illegal forward reference to the field being declared. You can actually fix this by writing int b = this.b + 1, which compiles without complaints.

For local variables, int d = d + 1 is illegal because d is not initialized before use. This is not the case for fields, which are always default-initialized.

You can see the difference by attempting to compile

int x = (x = 1) + x;

as a field declaration and as a local variable declaration. The former will fail, but the latter will succeed, because of the difference in semantics.
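As a minimal sketch of the two variants that do compile (the class name TldrDemo and the method name local are made up for illustration, not taken from the question): the field initializer reads this.b while it still holds the default value 0, and the local initializer assigns x before reading it.

```java
// Hypothetical demo class; the names are illustrative only.
class TldrDemo {
    // Qualified access is exempt from the forward-reference rule;
    // this.b still holds the default value 0 at this point.
    int b = this.b + 1;

    static int local() {
        // Legal for a local variable: (x = 1) definitely assigns x before the final x is read.
        int x = (x = 1) + x;
        return x;
    }

    public static void main(String[] args) {
        System.out.println(new TldrDemo().b); // prints 1
        System.out.println(local());          // prints 2
    }
}
```

Moving int x = (x = 1) + x; into the class body as a field declaration should reproduce the illegal forward reference error described above.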

Introduction

First off, the rules for field and local variable initializers are very different. So this answer will tackle the rules in two parts.

We'll use this test program throughout:

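public class test {
    int a = a = 1;   // compiles
    int b = b + 1;   // error: illegal forward reference
    public static void main(String[] args) {
        int c = c = 1;   // compiles
        int d = d + 1;   // error: variable d might not have been initialized
    }
}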

The declaration of b is invalid and fails with an illegal forward reference error.
The declaration of d is invalid and fails with a "variable d might not have been initialized" error.

The fact that these errors are different should hint that the reasons for the errors are also different.

Fields

Field initializers in Java are governed by JLS §8.3.2, Initialization of Fields.

The scope of a field is defined in JLS §6.3, Scope of a Declaration.

Relevant rules are:

  • The scope of a declaration of a member m declared in or inherited by a class type C (§8.1.6) is the entire body of C, including any nested type declarations.
  • Initialization expressions for instance variables may use the simple name of any static variable declared in or inherited by the class, even one whose declaration occurs textually later.
  • Use of instance variables whose declarations appear textually after the use is sometimes restricted, even though these instance variables are in scope. See §8.3.2.3 for the precise rules governing forward reference to instance variables.

§8.3.2.3 says:

The declaration of a member needs to appear textually before it is used only if the member is an instance (respectively static) field of a class or interface C and all of the following conditions hold:

  • The usage occurs in an instance (respectively static) variable initializer of C or in an instance (respectively static) initializer of C.
  • The usage is not on the left hand side of an assignment.
  • The usage is via a simple name.
  • C is the innermost class or interface enclosing the usage.

You can actually refer to fields before they have been declared, except in certain cases. These restrictions are intended to prevent code like

int j = i;
int i = j;

from compiling. The Java spec says "the restrictions above are designed to catch, at compile time, circular or otherwise malformed initializations."

What do these rules actually boil down to?

In short, the rules basically say that you must declare a field in advance of a reference to that field if (a) the reference is in an initializer, (b) the reference is not being assigned to, (c) the reference is a simple name (no qualifiers like this.) and (d) it is not being accessed from within an inner class. So, a forward reference that satisfies all four conditions is illegal, but a forward reference that fails on at least one condition is OK.

int a = a = 1; compiles because it fails condition (b): the reference a is being assigned to, so it's legal to refer to a in advance of a's complete declaration.

int b = this.b + 1 also compiles because it fails condition (c): the reference this.b is not a simple name (it's qualified with this.). This odd construct is still perfectly well-defined, because this.b has the value zero.

So, basically, the restrictions on field references within initializers prevent int a = a + 1 from being successfully compiled.

Observe that the field declaration int b = (b = 1) + b will fail to compile, because the final b is still an illegal forward reference.
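The following sketch shows two forward references that the rules above allow. The class and field names (ForwardRefDemo, first, second) are hypothetical and chosen only for this illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class ForwardRefDemo {
    // Condition (c) does not hold: this.second is not a simple name, so the
    // forward reference is allowed. second has not been initialized yet and
    // reads as the default value 0.
    int first = this.second + 10;

    // Condition (b) does not hold: second appears only on the left-hand side
    // of an assignment, so this forward reference is allowed as well.
    { second = 5; }

    // Initializers run in textual order, so this overwrites the 5 stored above.
    int second = 7;

    public static void main(String[] args) {
        ForwardRefDemo demo = new ForwardRefDemo();
        System.out.println(demo.first + " " + demo.second); // prints 10 7
    }
}
```

Replacing this.second with the simple name second in the declaration of first would satisfy all four conditions and should be rejected as an illegal forward reference.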

Local variables

Local variable declarations are governed by JLS §14.4, Local Variable Declaration Statements.

The scope of a local variable is defined in JLS §6.3, Scope of a Declaration:

  • The scope of a local variable declaration in a block (§14.4) is the rest of the block in which the declaration appears, starting with its own initializer and including any further declarators to the right in the local variable declaration statement.

Note that initializers are within the scope of the variable being declared. So why doesn't int d = d + 1; compile?

The reason is Java's rule on definite assignment (JLS §16). Definite assignment basically says that every access to a local variable must be preceded by an assignment to that variable, and the Java compiler analyzes loops and branches to ensure that an assignment always occurs before any use (this is why definite assignment has an entire specification chapter dedicated to it). The basic rule is:

  • For every access of a local variable or blank final field x, x must be definitely assigned before the access, or a compile-time error occurs.

In int d = d + 1;, the name d resolves to the local variable without any problem, but since d has not been assigned before it is accessed, the compiler issues an error. In int c = c = 1;, the inner assignment c = 1 happens first, which assigns c, and then c is initialized with the result of that assignment (which is 1).

Note that because of definite assignment rules, the local variable declaration int d = (d = 1) + d; will compile successfully (unlike the field declaration int b = (b = 1) + b), because d is definitely assigned by the time the final d is reached.
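A short sketch of definite assignment for local variables, under the assumption that it is compiled with a standard javac. The class and method names (DefiniteAssignmentDemo, chained, branches, afterAssignment) are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
class DefiniteAssignmentDemo {
    static int chained() {
        // The inner assignment c = 1 runs first, so c is assigned
        // before the declaration initializes it with the result.
        int c = c = 1;
        return c;
    }

    static int branches(boolean flag) {
        int d;
        if (flag) {
            d = 10;
        } else {
            d = 20;
        }
        // OK: every path through the if/else assigns d before this read.
        // Without the else branch, this read would be a compile-time error.
        return d + 1;
    }

    static int afterAssignment() {
        // (d = 1) definitely assigns d, so the final d may be read.
        int d = (d = 1) + d;
        return d;
    }

    public static void main(String[] args) {
        System.out.println(chained());         // prints 1
        System.out.println(branches(true));    // prints 11
        System.out.println(afterAssignment()); // prints 2
    }
}
```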


int x = x = 1;

is equivalent to

int x = 1;
x = x; //warning here

while in

int x = x + 1; 

we first need to compute x + 1, but the value of x is not known at that point, so you get an error (the compiler can tell that x has not been given a value yet).