Solution 1:

A good resource for understanding why volatile is needed is the book Java Concurrency in Practice (JCIP). Wikipedia has a decent explanation of that material as well.

The real problem is that Thread A may publish the reference to instance before it has finished constructing the object: without volatile, the write to the field can be reordered ahead of the writes made inside the constructor. Thread B can then see a non-null instance and try to use it, so Thread B may fail because it is working with a partially constructed object.
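To make the failure mode concrete, here is a minimal sketch of double-checked locking with a volatile field. The class name DclSingleton, the value field and the number 42 are made up for illustration and are not from the question. The field is deliberately non-final, since final fields get extra visibility guarantees that would hide the problem being discussed.

```java
// Hypothetical class, for illustration only: double-checked locking with volatile.
final class DclSingleton {
    // volatile ensures that a thread which sees a non-null reference
    // also sees the writes made inside the constructor.
    private static volatile DclSingleton instance;

    // Non-final state that a reader could observe half-initialized
    // if the reference were published without volatile.
    private int value;

    private DclSingleton() {
        this.value = 42;
    }

    static DclSingleton getInstance() {
        DclSingleton local = instance; // one volatile read on the fast path
        if (local == null) {
            synchronized (DclSingleton.class) {
                local = instance; // re-check under the lock
                if (local == null) {
                    local = new DclSingleton();
                    instance = local; // publish only after construction
                }
            }
        }
        return local;
    }

    int getValue() {
        return value;
    }
}
```

Removing volatile from the field above is what reopens the window in which Thread B can observe the reference before the constructor's writes.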

Solution 2:

As @irreputable noted, volatile is not expensive. Even if it were, correctness should take priority over performance.

There is also a cleaner, more elegant way to implement a lazy singleton.

public final class Singleton {
    private Singleton() {}
    public static Singleton getInstance() {
        return LazyHolder.INSTANCE;
    }
    private static class LazyHolder {
        private static final Singleton INSTANCE = new Singleton();
    }
}

Source article: Initialization-on-demand holder idiom, from Wikipedia

In software engineering, the Initialization on Demand Holder (design pattern) idiom is a lazy-loaded singleton. In all versions of Java, the idiom enables a safe, highly concurrent lazy initialization with good performance.

When the JVM loads the class Singleton, the class goes through initialization. Since Singleton does not have any static variables to initialize, the initialization completes trivially.

The static class definition LazyHolder within it is not initialized until the JVM determines that LazyHolder must be executed.

The static class LazyHolder is only executed when the static method getInstance is invoked on the class Singleton, and the first time this happens the JVM will load and initialize the LazyHolder class.

This solution is thread-safe without requiring special language constructs (i.e. volatile or synchronized).
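A rough way to see both the laziness and the thread safety is sketched below. The names CountingSingleton, CONSTRUCTED, HolderDemo and raceForInstance are hypothetical and exist only for this demo. Note that the counter is a static variable on the outer class, so unlike the Singleton above its initialization is not trivial, but LazyHolder is still not initialized until getInstance is first called.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of the singleton above that counts constructor calls.
final class CountingSingleton {
    // Demo-only counter; reading it initializes CountingSingleton but not LazyHolder.
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();

    private CountingSingleton() {
        CONSTRUCTED.incrementAndGet();
    }

    static CountingSingleton getInstance() {
        return LazyHolder.INSTANCE;
    }

    private static class LazyHolder {
        private static final CountingSingleton INSTANCE = new CountingSingleton();
    }
}

// Hypothetical helper that calls getInstance() from several threads at once.
final class HolderDemo {
    // Returns how many times the constructor has run after all threads finish.
    static int raceForInstance(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(CountingSingleton::getInstance);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return CountingSingleton.CONSTRUCTED.get();
    }
}
```

Reading CountingSingleton.CONSTRUCTED before any call to getInstance should give 0, and HolderDemo.raceForInstance(8) should give 1, because the JVM runs the initializer of LazyHolder exactly once no matter how many threads race to trigger it.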