Why does the get method of HashMap have a FOR loop?
I am looking at the source code for HashMap
in Java 7, and I see that the put
method will check if any entry is already present and if it is present then it will replace the old value with the new value.
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
So, basically it means that there would always be only one entry for the given key, I have seen this by debugging as well, but if I am wrong then please correct me.
Now, since there is only one entry for a given key, why does the get
method have a FOR loop, since it could have simply returned the value directly?
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
I feel the above loop is unnecessary. Please help me understand if I am wrong.
Solution 1:
table[indexFor(hash, table.length)]
is the bucket of the HashMap
that may contain the key we are looking for (if it is present in the Map
).
However, each bucket may contain multiple entries (either different keys having the same hashCode()
, or different keys with different hashCode()
that still got mapped to the same bucket), so you must iterate over these entries until you find the key you are looking for.
Since the expected number of entries in each bucket should be very small, this loop is still executed in expected O(1)
time.
Solution 2:
If you see the internal working of get method of HashMap.
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)];e != null;e = e.next)
{
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
- First, it gets the hash code of the key object, which is passed, and finds the bucket location.
- If the correct bucket is found, it returns the value (e.value)
- If no match is found, it returns null.
Some times there may be chances of Hashcode collision and for solving this collision Hashmap uses equals() and then store that element into LinkedList in same bucket.
Lets take example:
Fetch the data for key vaibahv: map.get(new Key("vaibhav"));
Steps:
Calculate hash code of Key {“vaibhav”}.It will be generated as 118.
Calculate index by using index method it will be 6.
Go to index 6 of array and compare first element’s key with given key. If both are equals then return the value, otherwise check for next element if it exists.
In our case it is not found as first element and next of node object is not null.
If next of node is null then return null.
If next of node is not null traverse to the second element and repeat the process 3 until key is not found or next is not null.
For this retrieval process for loop will be used. For more reference you can refer this
Solution 3:
For the record, in java-8, this is present also (sort of, since there are TreeNode
s also):
if ((e = first.next) != null) {
if (first instanceof TreeNode)
return ((TreeNode<K,V>)first).getTreeNode(hash, key);
do {
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
} while ((e = e.next) != null);
}
Basically (for the case when the bin is not a Tree
), iterate the entire bin, until you find the entry we are looking for.
Looking at this implementation you might understand why providing a good hash is good - so that not all entries end up in the same bucket, thus a bigger time to search for it.
Solution 4:
I think @Eran has already answered your query well and @Prashant has also made a good attempt along with other people who have answered, so let me explain it using an example so that concept becomes very clear.
Concepts
Basically what @Eran is trying to say that in a given bucket (basically at given index of the array) it is possible that there is more than one entry (nothing but Entry
object) and this is possible when 2 or more keys give different hash but give the same index/bucket location.
Now, in order to put the entry in the hashmap, this is what happens at a high level (read carefully because I have gone the extra mile to explain some good things which are otherwise not part of your question):
-
Get the hash: what happens here is that first hash is calculated for a given key (notice that this is not
hashCode
, a hash is calculated using thehashCode
and it is done as-as to mitigate the risk of poorly written hash function). - Get the index: This is basically the index of the array or in other words bucket. Now, why this index is calculated instead of directly using the hash as the index is because to mitigate the risk that hash could more than the size of the hashmap, so this index calculation step ensures that index will always be less than the size of the hashmap.
And when a situation occurs when 2 keys give different hash but the same index, then both those will go in the same bucket, and that is the reason that FOR loop is important.
Example
Below is a simple example I have created to demonstrate the concept to you:
public class Person {
private int id;
Person(int _id){
id = _id;
}
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
@Override
public int hashCode() {
return id;
}
}
Test class:
import java.util.Map;
public class HashMapHashingTest {
public static void main(String[] args) {
Person p1 = new Person(129);
Person p2 = new Person(133);
Map<Person, String> hashMap = new MyHashMap<>(2);
hashMap.put(p1, "p1");
hashMap.put(p2, "p2");
System.out.println(hashMap);
}
}
Debug screenshot (please click and zoom because it is looking small):
Notice, that in above example, both Person
object gives different hash value (136 and 140 respectively) but gives the same index of 0, so both objects go in the same bucket. In the screenshot, you can see that both objects are at index 0
and there you have a next
also populated which basically points to the second object.
Update: Another easiest way to see that more than one key is going into the same bucket is by creating a class and overriding the
hashCode
method to always return the same int value, now what would happen is that all the objects of that class would give the same index/bucket location but since you have not overridden the equals
method so they would not be considered same and hence will form a list at that index/bucket location.
Another twist in this would suppose you override the equals
method as well and compare all objects equal then only one object will be present at the index/bucket location because all objects are equal.