Uninitialized object leaked to another thread despite no code explicitly leaking it?
Let's see this simple Java program:
import java.util.*;
class A {
static B b;
static class B {
int x;
B(int x) {
this.x = x;
}
}
public static void main(String[] args) {
new Thread() {
void f(B q) {
int x = q.x;
if (x != 1) {
System.out.println(x);
System.exit(1);
}
}
@Override
public void run() {
while (b == null);
while (true) f(b);
}
}.start();
for (int x = 0;;x++)
b = new B(Math.max(x%2,1));
}
}
Main thread
The main thread creates an instance of B
with x
set to 1, then writes that instance to the static field A.b
. It repeats this action forever.
Polling thread
The spawned thread polls until it finds that A.b.x
is not 1.
?!?
Half the time it goes in an infinite loop as expected, but half the time I get this output:
$ java A
0
Why is the polling thread able to see a B
that has x
not set to 1?
x%2
instead of just x
is here simply because the issue is reproducible with it.
I'm running openjdk 6 on linux x64.
Here is what I think: because b is not final, the compiler is free to reorder the operations as it likes, right? So this, fundamentally is a reordering issue and as a result a unsafe publication issue Marking the variable as final will fix the problem.
More or less, it is the same example as provided here in the Java memory model docs.
The real question is how is this possible. I can also speculate here (since I have no idea how the compiler will reorder), but maybe the reference to B is written to the main memory (where it is visible to the other thread) BEFORE the write to x happens. In between these two operations the read happens, thus the zero value
Often the considerations surrounding concurrency focus on erroneous changes to state or on deadlocks. But visibility of state from different threads is equally important. There are many places in a modern computer where state can be cached. In registers, L1 cache on the processor, L2 cache between the processor and memory, etc. JIT compilers and the Java memory model are designed to take advantage of caching whenever possible or legal, because it can speed things up.
It can also give unexpected and counterintuitive results. I believe that is happening in this case.
When an instance of B is created, the instance variable x is briefly set to 0 before being set to whatever value was passed to the constructor. In this case, 1. If another thread tries to read the value of x, it could see the value 0 even if x has already been set to 1. It could be seeing a stale, cached value.
To ensure that the up-to-date value of x is seen, there are several things you can do. You could make x volatile, or you could protect the read of x with synchronization on the B instance (for example, by adding a synchronized getX()
method). You could even change x from an int to a java.util.concurrent.atomic.AtomicInteger
.
But by far the simplest way to correct the problem is to make x final. It is never going to change during the lifetime of B anyway. Java makes special guarantees for final fields, and one of them is that after the constructor completes, a final field set in the constructor will be visible to any other thread. That is, no other thread will see a stale value for that field.
Making fields immutable has many other benefits as well, but this is a great one.
See also Atomicity, Visibility, and Ordering by Jeremy Manson. Particularly the part where he says:
(Note: when I say synchronization in this post, I don't actually mean locking. I mean anything that guarantees visibility or ordering in Java. This can include final and volatile fields, as well as class initialization and thread starts and joins and all sorts of other good stuff.)