Consistency of hashCode() on a Java string

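The hash code of a Java String is documented (String.hashCode()) as the following sum, computed using int arithmetic, where s[i] is the i-th character, n is the length of the string, and ^ denotes exponentiation: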

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
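
For illustration, here is a minimal sketch of that formula evaluated with Horner's rule (multiply the running result by 31, then add the next character). The class name StringHashSketch and the method manualHash are hypothetical names chosen for this example and are not part of the JDK. The sketch relies on int overflow wrapping silently, which is what "int arithmetic" implies in Java.

```java
final class StringHashSketch {
    // Evaluates s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] in 32-bit
    // int arithmetic. Overflow wraps around, as it does for any Java int.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // Horner step: shift by one power of 31, add next char
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "This is a Java string";
        // Compares the hand-rolled value with the library's value
        // instead of hard-coding a number.
        System.out.println(manualHash(s) == s.hashCode());
    }
}
```

On a JVM whose String.hashCode() follows the documented formula, the comparison in main should print true.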

Are there any circumstances (say JVM version, vendor, etc.) under which the following expression will evaluate to false?

boolean expression = "This is a Java string".hashCode() == 586653468;

Update #1: If you claim that the answer is "yes, there are such circumstances" - then please give a concrete example of when "This is a Java string".hashCode() != 586653468. Try to be as specific/concrete as possible.

Update #2: We all know that relying on the implementation details of hashCode() is bad in general. However, I'm talking specifically about String.hashCode(), so please keep the answer focused on String.hashCode(). Object.hashCode() is totally irrelevant in the context of this question.


Solution 1:

I can see that documentation (the formula quoted in the question) as far back as Java 1.2.

While it's true that in general you shouldn't rely on a hash code implementation remaining the same, it's now documented behaviour for java.lang.String, so changing it would count as breaking existing contracts.

Wherever possible, you shouldn't rely on hash codes staying the same across versions and so on. In my mind, though, java.lang.String is a special case simply because the algorithm has been specified... so long as you're willing to abandon compatibility with releases before the algorithm was specified, of course.

Solution 2:

I found something about how JDK 1.0 and 1.1 differ from JDK 1.2 and later:

"In JDK 1.0.x and 1.1.x the hashCode function for long Strings worked by sampling every nth character. This pretty well guaranteed you would have many Strings hashing to the same value, thus slowing down Hashtable lookup. In JDK 1.2 the function has been improved to multiply the result so far by 31 then add the next character in sequence. This is a little slower, but is much better at avoiding collisions."

Source: http://mindprod.com/jgloss/hashcode.html

On a different note, since you seem to need a stable number: how about using CRC32 or MD5 instead of hashCode()? Then you are good to go, with no discussions and no worries at all.
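
As a rough sketch of the CRC32 suggestion, the standard java.util.zip.CRC32 class can produce a checksum that does not depend on String.hashCode() at all. The class name StableChecksumSketch and the method crc32Of are hypothetical, and encoding the string as UTF-8 is an assumption made here so that the bytes being checksummed are well defined.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class StableChecksumSketch {
    // Returns the CRC-32 of the string's UTF-8 bytes as an unsigned
    // 32-bit value stored in a long.
    static long crc32Of(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8)); // fixed charset keeps the input bytes stable
        return crc.getValue();
    }

    public static void main(String[] args) {
        // "123456789" is the conventional CRC-32 check input; its
        // checksum is 0xCBF43926.
        System.out.println(Long.toHexString(crc32Of("123456789")));
        System.out.println(crc32Of("This is a Java string"));
    }
}
```

Note that CRC32 is a checksum, not a cryptographic hash, and like any 32-bit value it will still collide for some inputs.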