TreeMap or HashMap? [duplicate]
When to use hashmaps or treemaps?
I know that I can use TreeMap to iterate over the elements when I need them to be sorted. But is that all? Is there no optimization when I just want to look things up in the map, or some specific use where one of them is the better choice?
Solution 1:
TreeMap provides guaranteed O(log n) lookup time (and insertion etc.), whereas HashMap provides O(1) lookup time if the hash code disperses keys appropriately.
Unless you need the entries to be sorted, I'd stick with HashMap. Or there's ConcurrentHashMap of course. I can't remember the details of the differences between all of them, but HashMap is a perfectly reasonable "default" option :)
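To make the ordering difference concrete, here is a minimal sketch. The class name MapOrderDemo, its helper methods and the sample keys are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; the names and sample data are assumptions.
class MapOrderDemo {
    // Returns the keys in whatever order the given map iterates over them.
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    // Inserts the same three entries, deliberately out of alphabetical order.
    static Map<String, Integer> fill(Map<String, Integer> map) {
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("mango", 2);
        return map;
    }

    public static void main(String[] args) {
        // TreeMap iterates in sorted key order (natural String ordering here).
        System.out.println(keysInIterationOrder(fill(new TreeMap<>())));
        // HashMap gives no ordering guarantee, so do not rely on what this prints.
        System.out.println(keysInIterationOrder(fill(new HashMap<>())));
        // A plain lookup is written the same way for both implementations.
        System.out.println(fill(new HashMap<>()).get("apple"));
    }
}
```

If the code only ever calls get and put, the two classes are interchangeable behind the Map interface, so starting with HashMap and switching later is cheap.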
For completeness, I should point out that there was a discussion on Stack Overflow a month or so ago about the internals of various maps. See the comments in this question, which I will copy into this answer if bestsss is happy for me to do so.
Solution 2:
Hashtables (usually) perform search operations (lookup) bounded within the complexity of O(1)<=T(n)<=O(n), with an average case complexity of O(1 + n/k), where k is the number of buckets; however, binary search trees (BSTs) perform search operations (lookup) bounded within the complexity of O(log_2(n))<=T(n)<=O(n), with an average case complexity of O(log_2(n)). The O(n) worst case for a tree applies when it is unbalanced; a self-balancing tree such as the red-black tree behind Java's TreeMap keeps lookups at O(log_2(n)). The implementation of each (and every) data structure should be known (by you), to understand the advantages, drawbacks, time complexity of operations, and code complexity.
For example, a hashtable often has some fixed number of buckets (some of which may not be filled at all), each holding a list of colliding entries. Trees, on the other hand, usually have two pointers (references) per node, but this can be more if the implementation allows more than two child nodes per node. This allows the tree to grow as nodes are added, but it may not allow duplicate keys. (The default implementation of a Java TreeMap does not allow duplicate keys.)
There are special cases to consider as well, for example, what if the number of elements in a particular data structure increases without bound or approaches the limit of an underlying part of the data structure? What about amortized operations that perform some rebalancing or cleanup operation?
For example, in a hashtable, when the number of elements in the table becomes sufficiently large, an arbitrary number of collisions can occur. On the other hand, trees usually require some re-balancing procedure after an insertion (or deletion).
So, if you have something like a cache (e.g. the number of elements is bounded, or the size is known), then a hashtable is probably your best bet; however, if you have something more like a dictionary (e.g. populated once and looked up many times), then I'd use a tree.
This is only the general case, however (no information was given). You have to understand which processes happen, and how they happen, to make the right choice of data structure.
When I need a multi-map (ranged lookup) or sorted flattening of a collection, then it can't be a hashtable.
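As a rough illustration of the ranged lookup mentioned above, the sketch below uses the NavigableMap operations that TreeMap implements. The class name RangeLookupDemo and the grade thresholds are invented for the example.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical demo class; the thresholds are made-up sample data.
class RangeLookupDemo {
    // Maps the lowest score of each band to its grade.
    static NavigableMap<Integer, String> sampleGrades() {
        NavigableMap<Integer, String> grades = new TreeMap<>();
        grades.put(0, "F");
        grades.put(60, "D");
        grades.put(70, "C");
        grades.put(80, "B");
        grades.put(90, "A");
        return grades;
    }

    // Finds the entry with the greatest key <= score.
    // floorEntry returns null if no such key exists (here: a negative score).
    static String gradeFor(NavigableMap<Integer, String> grades, int score) {
        return grades.floorEntry(score).getValue();
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> grades = sampleGrades();
        System.out.println(gradeFor(grades, 85));
        // A sorted view of all bands with 60 <= key < 80.
        System.out.println(grades.subMap(60, true, 80, false));
    }
}
```

A HashMap has no equivalent of floorEntry or subMap; answering the same question would mean scanning every key.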
Solution 3:
The largest difference between the two is the underlying structure used in the implementation.
HashMaps use an array and a hashing function to store elements. When you try to insert, look up or delete an item, the hashing function converts the key into an index in the array where the object is/should be stored (ignoring conflicts). While hashmaps are generally very fast because they don't need to iterate over large amounts of data, they slow down when they fill up, because they then need to copy all the key/value pairs into a new, larger array.
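The key-to-index step can be sketched as follows. This is a simplified toy, assuming a table with a fixed number of buckets; it is not the code java.util.HashMap actually runs, and BucketIndexDemo and bucketIndex are hypothetical names.

```java
// Hypothetical demo class; a simplified sketch, not HashMap's real internals.
class BucketIndexDemo {
    // Maps a key to a slot in a table with `capacity` buckets.
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    static int bucketIndex(Object key, int capacity) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    public static void main(String[] args) {
        int capacity = 16;
        // An Integer's hash code is its own value, so these are easy to follow.
        System.out.println(bucketIndex(42, capacity));
        // 26 lands in the same bucket as 42: this is a collision.
        System.out.println(bucketIndex(26, capacity));
        System.out.println(bucketIndex(-1, capacity));
    }
}
```

When two keys share a bucket, the map has to compare keys within that bucket, which is where the "ignoring conflicts" caveat comes from.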
TreeMaps store the data in a sorted tree structure. While this means that they'll never have to allocate more space and copy everything over to it, each operation requires walking over part of the data already stored, and sometimes changing large parts of the structure.
Of the two, HashMaps will generally have better performance when you don't need sorting.
Solution 4:
Inserting new elements into a HashMap will, on average, be a good deal faster than inserting elements into a TreeMap. Unless you need your elements sorted, I'd go with the HashMap.
Solution 5:
Don't forget there is also LinkedHashMap, which is nearly as fast as HashMap for add/contains/remove operations but also maintains the insertion order.
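A small sketch of that insertion-order behaviour, using a made-up class name (InsertionOrderDemo) and sample keys:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; the names and sample data are assumptions.
class InsertionOrderDemo {
    static List<String> insertionOrderKeys() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("mango", 2);
        // Re-inserting an existing key updates the value but keeps its position.
        map.put("pear", 30);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Keys come back in the order they were first inserted, not sorted.
        System.out.println(insertionOrderKeys());
    }
}
```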