What is immutability and why should I worry about it?

What is Immutability?

  • Immutability is applied primarily to objects (strings, arrays, a custom Animal class)
  • Typically, if there is an immutable version of a class, a mutable version is also available. For instance, Objective-C and Cocoa define both an NSString class (immutable) and an NSMutableString class.
  • If an object is immutable, it can't be changed after it is created (basically read-only). You could think of it as "only the constructor can change the object".

This doesn't directly have anything to do with user input; not even your code can change the value of an immutable object. However, you can always create a new immutable object to replace it. Here's a pseudocode example; note that in many languages you can simply do myString = "hello"; instead of using a constructor as I did below, but I included it for clarity:

ImmutableString myString = new ImmutableString("hello");
myString.appendString(" world"); // Can't do this
myString.setValue("hello world"); // Can't do this
myString = new ImmutableString("hello world"); // OK

You mention "an object that just returns information"; this doesn't automatically make it a good candidate for immutability. Immutable objects tend to always return the same value that they were constructed with, so I'm inclined to say the current time wouldn't be ideal since that changes often. However, you could have a MomentOfTime class that is created with a specific timestamp and always returns that one timestamp in the future.
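As a rough illustration of the MomentOfTime idea, here is a minimal Java sketch. The class name comes from the paragraph above; the field name, the getter and the plusMillis helper are my own assumptions, added only to show that "changing" an immutable object means building a new one.

```java
class MomentOfTimeDemo {
    // Hypothetical immutable class: the timestamp is fixed at construction.
    static final class MomentOfTime {
        private final long epochMillis; // assigned once, never reassigned

        MomentOfTime(long epochMillis) {
            this.epochMillis = epochMillis;
        }

        long getEpochMillis() {
            return epochMillis;
        }

        // Returns a new object instead of modifying this one.
        MomentOfTime plusMillis(long delta) {
            return new MomentOfTime(epochMillis + delta);
        }
    }

    public static void main(String[] args) {
        MomentOfTime created = new MomentOfTime(System.currentTimeMillis());
        MomentOfTime later = created.plusMillis(60_000);
        // 'created' still reports the timestamp it was constructed with.
        System.out.println(created.getEpochMillis());
        System.out.println(later.getEpochMillis());
    }
}
```

There is no setter, so once a MomentOfTime exists the only way to get a different timestamp is to create another instance.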

Benefits of Immutability

  • If you pass an object to another function/method, you shouldn't have to worry about whether that object will have the same value after the function returns. For instance:

    String myString = "HeLLo WoRLd";
    String lowercasedString = lowercase(myString);
    print myString + " was converted to " + lowercasedString;
    

    What if the implementation of lowercase() changed myString as it was creating a lowercase version? The third line wouldn't give you the result you wanted. Of course, a good lowercase() function wouldn't do this, but you're guaranteed this fact if myString is immutable. As such, immutable objects can help enforce good object-oriented programming practices.

  • It's easier to make an immutable object thread-safe

  • It potentially simplifies the implementation of the class (nice if you're the one writing the class)
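To make the lowercase() point from the first bullet concrete, here is a small Java sketch. Both helper methods are hypothetical: lowercaseInPlace stands in for a badly behaved implementation working on a mutable StringBuilder, while lowercase works on an immutable String and so cannot touch its argument.

```java
import java.util.Locale;

class LowercaseDemo {
    // Badly behaved helper (hypothetical): lowercases its argument in place.
    static String lowercaseInPlace(StringBuilder text) {
        for (int i = 0; i < text.length(); i++) {
            text.setCharAt(i, Character.toLowerCase(text.charAt(i)));
        }
        return text.toString();
    }

    // With an immutable String, the helper can only return a new value.
    static String lowercase(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        StringBuilder mutable = new StringBuilder("HeLLo WoRLd");
        String lowered = lowercaseInPlace(mutable);
        // prints "hello world was converted to hello world"
        System.out.println(mutable + " was converted to " + lowered);

        String immutable = "HeLLo WoRLd";
        String loweredAgain = lowercase(immutable);
        // prints "HeLLo WoRLd was converted to hello world"
        System.out.println(immutable + " was converted to " + loweredAgain);
    }
}
```

In the first case the caller's original value is lost. In the second, the type itself rules that out.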

State

If you were to take all of an object's instance variables and write down their values on paper, that would be the state of that object at that given moment. The state of the program is the state of all its objects at a given moment. State changes rapidly over time; a program needs to change state in order to continue running.

Immutable objects, however, have fixed state over time. Once created, the state of an immutable object doesn't change, although the state of the program as a whole might. This makes it easier to keep track of what is happening (see also the benefits listed above).


Immutability

Simply put, memory is immutable when it is not modified after being initialised.

Programs written in imperative languages such as C, Java and C# may manipulate in-memory data at will. An area of physical memory, once set aside, may be modified in whole or part by a thread of execution at any time during the program's execution. In fact, imperative languages encourage this way of programming.

Writing programs in this way has been incredibly successful for single-threaded applications. However, as modern application development moves towards multiple concurrent threads of operation within a single process, a world of potential problems and complexity is introduced.

When there is only one thread of execution, you can imagine that this single thread 'owns' all of the data in memory, and so therefore can manipulate it at will. However, there is no implicit concept of ownership when multiple executing threads are involved.

Instead, this burden falls upon the programmer, who must go to great pains to ensure that in-memory structures are in a consistent state for all readers. Locking constructs must be used in careful measure to prohibit one thread from seeing data while it is being updated by another thread. Without this coordination, a thread would inevitably consume data that was only halfway through being updated. The outcome of such a situation is unpredictable and often catastrophic. Furthermore, making locking work correctly in code is notoriously difficult, and when done badly it can cripple performance or, in the worst case, cause deadlocks that halt execution irrecoverably.

Using immutable data structures removes the need to introduce complex locking around that data. When a section of memory is guaranteed not to change during the lifetime of a program, multiple readers may access the memory simultaneously. It is not possible for them to observe that particular data in an inconsistent state.
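A minimal Java sketch of the idea, under my own assumptions: the class and method names are hypothetical, and the list is made read-only by copying it and wrapping the copy with Collections.unmodifiableList so that no code holds a reference through which it could be changed. Several threads then read it at once without any lock.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SharedReadersDemo {
    // Sums the shared list from several threads; no locks are taken.
    static long[] sumFromThreads(List<Integer> shared, int threadCount) {
        long[] results = new long[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int slot = t;
            threads[t] = new Thread(() -> {
                long sum = 0;
                for (int value : shared) {
                    sum += value;
                }
                results[slot] = sum; // each thread writes only its own slot
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // after join, that thread's result is visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Private copy wrapped in a read-only view: nobody can modify it.
        List<Integer> shared = Collections.unmodifiableList(
                new ArrayList<>(Arrays.asList(1, 2, 3, 4)));
        for (long sum : sumFromThreads(shared, 4)) {
            System.out.println(sum); // prints 10 for every thread
        }
    }
}
```

The only coordination is starting and joining the threads. The reads themselves need none because the data never changes after it is published.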

Many functional programming languages, such as Lisp, Haskell, Erlang, F# and Clojure, encourage immutable data structures by their very nature. It is for this reason that they are enjoying a resurgence of interest as we move towards increasingly complex multi-threaded application development and many-computer architectures.

State

The state of an application can simply be thought of as the contents of all the memory and CPU registers at a given point in time.

Logically, a program's state can be divided into two:

  1. The state of the heap
  2. The state of the stack of each executing thread

In managed environments such as .NET (C#) and the JVM (Java), one thread cannot access the stack of another. Therefore, each thread 'owns' the state of its stack. The stack can be thought of as holding local variables and parameters of value type (a struct in C#, a primitive in Java), and the references to objects. These values are isolated from outside threads.

However, data on the heap is shareable amongst all threads, hence care must be taken to control concurrent access. All reference-type (class) object instances are stored on the heap.

In OOP, the state of an instance of a class is determined by its fields. These fields are stored on the heap and so are accessible from all threads. If a class defines methods that allow fields to be modified after the constructor completes, then the class is mutable (not immutable). If the fields cannot be changed in any way, then the type is immutable. It is important to note that a class with all C# readonly/Java final fields is not necessarily immutable. These constructs ensure the reference cannot change, but not the referenced object. For example, a field may have an unchangeable reference to a list of objects, but the actual content of the list may be modified at any time.
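The readonly/final caveat is easy to demonstrate. In this Java sketch both classes and their names are hypothetical. LeakyTeam has only a final field yet can still be changed from outside, while FrozenTeam takes a defensive copy and exposes only an unmodifiable view (its String elements are themselves immutable).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FinalFieldDemo {
    // Not immutable: the reference is final, but the list behind it is
    // shared with the caller and remains mutable.
    static final class LeakyTeam {
        private final List<String> members;

        LeakyTeam(List<String> members) {
            this.members = members;
        }

        List<String> getMembers() {
            return members;
        }
    }

    // Immutable: copies the input and exposes only an unmodifiable view.
    static final class FrozenTeam {
        private final List<String> members;

        FrozenTeam(List<String> members) {
            this.members = Collections.unmodifiableList(new ArrayList<>(members));
        }

        List<String> getMembers() {
            return members;
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("Ada");
        LeakyTeam leaky = new LeakyTeam(names);
        FrozenTeam frozen = new FrozenTeam(names);

        names.add("Grace"); // mutate the caller's list after construction

        System.out.println(leaky.getMembers().size());  // prints 2
        System.out.println(frozen.getMembers().size()); // prints 1
    }
}
```

Calling frozen.getMembers().add(...) throws UnsupportedOperationException, so the frozen state cannot be altered through the getter either.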

By defining a type as being truly immutable, its state can be considered frozen and therefore the type is safe for access by multiple threads.

In practice, it can be inconvenient to define all of your types as immutable. Modifying a value on an immutable type can involve a fair bit of memory copying. Some languages make this process easier than others, but either way the CPU will end up doing some extra work. Many factors determine whether the time spent copying memory outweighs the cost of lock contention.

A lot of research has gone into the development of immutable data structures such as lists and trees. When using such structures, say a list, the 'add' operation will return a reference to a new list with the new item added. References to the previous list do not see any change and still have a consistent view of the data.
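Here is a deliberately tiny Java sketch of such a structure. PList is a hypothetical name, and I assume 'add' prepends to the front, which is the cheapest choice for a singly linked list. Real persistent collections are considerably more sophisticated, but the sharing principle is the same.

```java
class PersistentListDemo {
    // Hypothetical minimal persistent (immutable) singly linked list.
    static final class PList<T> {
        private final T head;        // null only in the empty list
        private final PList<T> tail; // the rest of the list, shared, never copied
        private final int size;

        private PList(T head, PList<T> tail, int size) {
            this.head = head;
            this.tail = tail;
            this.size = size;
        }

        static <T> PList<T> empty() {
            return new PList<>(null, null, 0);
        }

        // Returns a new list whose tail is this list; this list is untouched.
        PList<T> add(T item) {
            return new PList<>(item, this, size + 1);
        }

        T first() {
            return head;
        }

        int size() {
            return size;
        }
    }

    public static void main(String[] args) {
        PList<String> one = PList.<String>empty().add("a");
        PList<String> two = one.add("b");

        System.out.println(one.size()); // prints 1
        System.out.println(two.size()); // prints 2
    }
}
```

Anyone still holding 'one' keeps seeing a one-element list, and 'two' reuses it as its tail, so adding an element copies nothing.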