HashMap vs ConcurrentHashMap vs LoadingCache(Guava)

To cache some data locally in a Spring Boot application, which technique is better in terms of read/write operations: HashMap, ConcurrentHashMap, or LoadingCache (Guava library)? I tried write and read operations on each of these. HashMap was the fastest and LoadingCache was the slowest. So why should we use LoadingCache, and for what purpose?

Edit: The application is multithreaded. Features like a maximum cache size and expiry time can be sacrificed. The main motive is to increase read speed.


Solution 1:

If your application is multi-threaded and the cache is shared by the threads, then HashMap is not an option. It is not thread-safe, and if you guard it with locks, etc., the locking is liable to become a concurrency bottleneck.

If your application needs the cache to be LRU or have some other "smart" policy for deciding what to evict, then ConcurrentHashMap doesn't provide a good way to do that. If you want your cache to be sensitive to memory pressure (e.g. using weak referenced keys), then ConcurrentHashMap doesn't do that either.

That is where the Guava Cache classes come in. Read the Guava caching documentation to get an idea of the functionality they provide.
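To make the eviction point concrete, here is a minimal JDK-only sketch of an LRU-bounded cache. It is not Guava's implementation, and the class name LruCacheSketch is made up for illustration. It relies on an access-ordered LinkedHashMap guarded by a single lock, which is exactly the kind of locking that can become a bottleneck under contention. Guava's CacheBuilder offers size-based eviction through maximumSize without requiring you to write this yourself.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Guava or the JDK.
final class LruCacheSketch<K, V> {
    private final Map<K, V> map;

    LruCacheSketch(int maxEntries) {
        // accessOrder = true: entries are ordered from least to most recently accessed.
        LinkedHashMap<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Invoked after each insertion; returning true evicts the eldest entry.
                return size() > maxEntries;
            }
        };
        // A single lock guards every call, including get(), because reads
        // reorder entries in an access-ordered LinkedHashMap.
        this.map = Collections.synchronizedMap(lru);
    }

    V get(K key) {
        return map.get(key);
    }

    void put(K key, V value) {
        map.put(key, value);
    }

    int size() {
        return map.size();
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "b" is the least recently used entry at that point.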


The bottom line is that while HashMap is most performant in a single-threaded benchmark, it probably doesn't provide all of the functionality that you need. Performance is not everything.

My advice would be to get the code working properly first ... then look at ways to optimize it. An ultra-fast cache implementation that breaks occasionally or fills up memory is not a good idea. And there is a good chance that the cache read / write performance won't be critical anyway.

Solution 2:

With respect to performance, it depends on the size of your data and the ratio of modifications in between reads. Here is a suggestion:

Static data: If your data is static, initialize a read only map in the constructor like so:

final Map<Object, Object> map;

MyClass(Map<Object, Object> inputMap) {
  // Unmodifiable snapshot; Map.copyOf rejects null keys and values.
  map = Map.copyOf(inputMap);
}

Object get(Object key) {
  return map.get(key);
}

Rare modifications: If you have rare modifications and the data is not too big:

volatile Map<Object, Object> map = Map.of();

// Writers are serialized; each put publishes a fresh immutable snapshot.
synchronized void put(Object key, Object value) {
  Map<Object, Object> mutable = new HashMap<>(map);
  mutable.put(key, value);
  map = Map.copyOf(mutable);
}

Object get(Object key) {
  return map.get(key);
}

Map.copyOf is available since Java 10 (Map.of since Java 9). It creates an immutable hash table that uses an open addressing scheme, unlike HashMap, so lookups are likely to be even faster than with HashMap. You can also use a plain HashMap with the scheme above in a multi-threaded environment, as long as it is never modified after it has been created and published.

synchronized is needed to make sure that you do not lose an update if multiple threads call put at the same time. volatile is needed to make sure the update becomes visible to other threads.
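Put together as a compilable class, the copy-on-write scheme could look like the sketch below. The class name CopyOnWriteCacheSketch and the generic parameters are assumptions added for illustration. It requires Java 10 or later for Map.copyOf.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper for illustration; optimized for reads, slow for writes.
final class CopyOnWriteCacheSketch<K, V> {
    // volatile: a reader always sees the most recently published snapshot.
    private volatile Map<K, V> map = Map.of();

    // synchronized: concurrent writers cannot overwrite each other's updates.
    synchronized void put(K key, V value) {
        Map<K, V> mutable = new HashMap<>(map); // copy the current snapshot
        mutable.put(key, value);                // apply the change to the copy
        map = Map.copyOf(mutable);              // publish a new immutable snapshot
    }

    // Lock-free read against the current immutable snapshot.
    V get(K key) {
        return map.get(key);
    }

    int size() {
        return map.size();
    }
}
```

Every put copies the whole map, so the cost of a write grows with the number of entries. Note also that the immutable maps returned by Map.of and Map.copyOf reject null keys and values, so get(null) throws a NullPointerException here.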

Main motive is to increase read speed.

So, the solutions above would give the best read speed but compromise on update speed.

Lots of data and/or lots of modifications: Use the ConcurrentHashMap.

Even if the schemes above give a slight performance benefit, I recommend using ConcurrentHashMap, because:

  • It is less error-prone, and ConcurrentHashMap is proven to work. Will you write multi-threaded unit tests proving that your code works correctly?
  • Less code, fewer bugs.
  • Less code, less confusion for your fellow developers.
  • The usage pattern might change over time, and your home-grown "performance improvement" will turn into a "performance problem".
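For comparison, a ConcurrentHashMap-based cache needs very little code. The sketch below is only an illustration: the class name ComputingCacheSketch and the loader function are assumptions, standing in for whatever expensive lookup you want to cache. It has no size limit and no expiry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical wrapper for illustration; unbounded, entries never expire.
final class ComputingCacheSketch<K, V> {
    private final ConcurrentMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    ComputingCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // On ConcurrentHashMap, computeIfAbsent is atomic: the loader runs
        // at most once per absent key, even under concurrent access.
        return map.computeIfAbsent(key, loader);
    }

    void invalidate(K key) {
        map.remove(key);
    }
}
```

The loader should be short and must not touch the same map, because ConcurrentHashMap may block other updates to nearby keys while the computation is in progress.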

Footnotes:

Use of a Cache

Cache and LoadingCache: The Guava LoadingCache is meant to be used with a CacheLoader. A cache loader lets the cache populate itself automatically and/or refresh its entries. The Guava cache is outdated by now; I recommend looking at Caffeine or cache2k when you need a caching solution that works in the Java heap.

A cache always has additional overhead in the read path because it needs to do some bookkeeping to know which entries are currently accessed. That overhead is minimal in cache2k, at least according to my (disclaimer...) benchmarks.

Spring Boot

When used with the Spring cache abstraction, e.g. with @Cacheable, there will not be a big performance difference between the implementations, since the cache abstraction itself adds significant overhead.

The simple cache implementation in Spring that is based on ConcurrentHashMap is only meant for testing and prototyping. I recommend switching to a real cache implementation as early as possible and setting reasonable resource limits.

Profile and optimize the whole application

Every optimization you make has trade-offs, so you should always look at the whole application and compare "optimizations" against the simplest or most common solution possible.