How does ConcurrentHashMap work internally?
I would read the source of ConcurrentHashMap, as the details are rather complicated. In short, it:
- Has multiple partitions which can be locked independently (16 by default). This describes the implementation up to Java 7; from Java 8 on, individual hash bins are locked instead.
- Uses concurrent lock operations for thread safety instead of synchronizing on the whole map.
- Has thread-safe iterators. synchronizedCollection's iterators are not thread safe.
- Does not expose its internal locks. synchronizedCollection does.
The ConcurrentHashMap is very similar to the java.util.Hashtable class, except that ConcurrentHashMap offers better concurrency than Hashtable or synchronizedMap does. ConcurrentHashMap does not lock the Map while you are reading from it. Additionally, ConcurrentHashMap does not lock the entire Map when writing to it. Internally, it only locks the part of the Map that is being written to.
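To make the "many writers without one global lock" point concrete, here is a minimal sketch. The class name, the helper countWith and the key "hits" are made up for illustration, and it assumes Java 8 or later for merge():

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ConcurrentCountDemo {
    // Hypothetical helper: `threads` workers each add `perThread` increments to one shared key.
    static int countWith(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    // merge() is an atomic read-modify-write for this key,
                    // so no external synchronization is needed here.
                    counts.merge("hits", 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10_000)); // prints 40000
    }
}
```

Note that a separate get() followed by put() would not be atomic even on a ConcurrentHashMap; the atomic methods such as merge(), compute() or putIfAbsent() are what make the count reliable.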
Another difference is that ConcurrentHashMap does not throw ConcurrentModificationException if the ConcurrentHashMap is changed while being iterated, whereas synchronizedMap may throw ConcurrentModificationException. The Iterator is not designed to be used by more than one thread, though.
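The difference in iterator behaviour can be seen even in a single thread. This is only a sketch; the class and the method throwsOnModification are hypothetical names chosen for the example:

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IterationDemo {
    // Returns true if adding a new key during iteration throws ConcurrentModificationException.
    static boolean throwsOnModification(Map<Integer, String> map) {
        for (int i = 0; i < 10; i++) {
            map.put(i, "v" + i);
        }
        try {
            for (Integer key : map.keySet()) {
                // The first pass adds a new key (a structural change);
                // later passes only overwrite the same key.
                map.put(-1, "added while iterating");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnModification(new HashMap<>()));
        System.out.println(throwsOnModification(Collections.synchronizedMap(new HashMap<>())));
        System.out.println(throwsOnModification(new ConcurrentHashMap<>()));
    }
}
```

The first two calls hit the fail-fast iterator of the underlying HashMap, while the ConcurrentHashMap iterator carries on. Whether the newly added key shows up during that traversal is not guaranteed either way.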
This is the article that helped me understand it: "Why ConcurrentHashMap is better than Hashtable and just as good as a HashMap"
Hashtables offer concurrent access to their entries, with a small caveat: the entire map is locked to perform any sort of operation. While this overhead is negligible in a web application under normal load, under heavy load it can lead to delayed response times and overtaxing of your server for no good reason.
This is where ConcurrentHashMaps step in. They offer all the features of Hashtable with performance almost as good as a HashMap. ConcurrentHashMaps accomplish this by a very simple mechanism. Instead of a map-wide lock, the collection maintains a list of 16 locks by default, each of which is used to guard (or lock on) a single bucket of the map. This effectively means that 16 threads can modify the collection at the same time (as long as they're all working on different buckets). In fact, there is no operation performed by this collection that locks the entire map. The concurrency level of the collection, the number of threads that can modify it at the same time without blocking, can be increased. However, a higher number means more overhead for maintaining this list of locks.
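The concurrency level mentioned in the article is the third argument of the three-argument constructor. A small sketch follows; the class name, the helper newMap and the values 64 and 0.75f are just picked for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrencyLevelDemo {
    // Hypothetical helper: builds a map sized for `expectedWriters` concurrently updating threads.
    static ConcurrentHashMap<String, Integer> newMap(int expectedWriters) {
        // Arguments: initialCapacity, loadFactor, concurrencyLevel
        return new ConcurrentHashMap<>(64, 0.75f, expectedWriters);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = newMap(32);
        map.put("a", 1);
        System.out.println(map.get("a")); // prints 1
    }
}
```

Keep in mind that since Java 8 the Javadoc describes concurrencyLevel only as a sizing hint, so the "list of 16 locks" picture applies to the older segment-based implementation.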
The "scalability issues" for Hashtable are present in exactly the same way in Collections.synchronizedMap(Map): they use very simple synchronization, which means that only one thread can access the map at a time.
This is not much of an issue when you have simple inserts and lookups (unless you do them extremely intensively), but it becomes a big problem when you need to iterate over the entire Map, which can take a long time for a large Map. While one thread does that, all others have to wait if they want to insert or look up anything.
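For reference, this is roughly what safe iteration over a synchronizedMap looks like. The class and method names are made up for the example; the need to synchronize on the map itself during traversal comes from the Collections.synchronizedMap Javadoc:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SynchronizedIterationDemo {
    // Hypothetical helper: sums all values of a map created by Collections.synchronizedMap.
    static int sumValues(Map<String, Integer> syncMap) {
        int sum = 0;
        // The map's own monitor is held for the whole traversal,
        // so every other reader and writer is blocked until the loop ends.
        synchronized (syncMap) {
            for (int value : syncMap.values()) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = Collections.synchronizedMap(new HashMap<>());
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        System.out.println(sumValues(map)); // prints 6
    }
}
```

The synchronized block is exactly the part that scales badly: the larger the map, the longer every other thread waits.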
The ConcurrentHashMap uses very sophisticated techniques to reduce the need for synchronization and allow parallel read access by multiple threads without synchronization. More importantly, it provides an Iterator that requires no synchronization and even allows the Map to be modified during iteration (though it makes no guarantees whether or not elements that were inserted during iteration will be returned).