Garbage collection behaviour for String.intern()

Solution 1:

String.intern() manages an internal, natively implemented pool with some special GC-related features. The code is old; if it were written today, it would probably use something like a java.util.WeakHashMap. Weak references let you keep a pointer to an object without preventing it from being collected, which is exactly what a unifying pool such as the interned-string pool needs.
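To make the WeakHashMap idea concrete, here is a minimal sketch of a weak interning pool. The class name WeakInternPool and its intern method are hypothetical, and this is not how the JVM actually implements String.intern(); it only illustrates how weak references let a pool forget strings nobody uses any more.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical weak interning pool; NOT the JVM's actual implementation. */
final class WeakInternPool {
    // Keys are held weakly by WeakHashMap. The value must be weak as well:
    // a strong value would reference its own key and pin the entry forever.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    /** Returns the canonical instance equal to s, registering s if none exists. */
    synchronized String intern(String s) {
        WeakReference<String> ref = pool.get(s);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            // First time this content is seen (or the old instance was collected).
            pool.put(s, new WeakReference<>(s));
            canonical = s;
        }
        return canonical;
    }

    public static void main(String[] args) {
        WeakInternPool p = new WeakInternPool();
        String a = p.intern(new String("hello"));
        String b = p.intern(new String("hello"));
        System.out.println(a == b); // true: both calls return the same instance
    }
}
```

Once no strong reference to a pooled string remains, the collector is free to clear the entry, and a later call with equal content simply registers a new instance.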

That interned strings are garbage collected can be demonstrated with the following Java code:

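public class InternedStringsAreCollected {

    public static void main(String[] args)
    {
        for (int i = 0; i < 30; i++) {
            foo();
            // Request a collection; the JVM treats this as a hint only.
            System.gc();
        }
    }

    private static void foo()
    {
        // Build the string at run time so that it is not a compile-time
        // constant kept alive by the class itself.
        char[] tc = new char[10];
        for (int i = 0; i < tc.length; i++)
            tc[i] = (char)(i * 136757);
        String s = new String(tc).intern();
        // The identity hash code lets us tell different instances apart.
        System.out.println(System.identityHashCode(s));
    }
}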

This code creates the same string 30 times, interning it each time, and uses System.identityHashCode() to show the hash code that Object.hashCode() would have returned for the interned string. When run, it typically prints distinct integer values (System.gc() is only a hint, so some values may repeat), which means that you do not get the same instance each time: the previously interned instance was collected in between.
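As a counterpart, the following sketch shows the opposite case: while the program holds a strong reference to the interned instance, it cannot be collected, so every later intern() call on equal content returns that same instance. The class and method names are made up for illustration.

```java
final class InternedStringStaysWhileReferenced {

    /** Builds the string at run time so that it is not a compile-time constant. */
    static String build() {
        char[] tc = new char[10];
        for (int i = 0; i < tc.length; i++) {
            tc[i] = (char) (i * 136757);
        }
        return new String(tc);
    }

    /** Returns true if intern() kept returning the instance we hold on to. */
    static boolean sameInstanceAcrossGc(int rounds) {
        // This strong reference keeps the pooled instance reachable.
        String kept = build().intern();
        for (int i = 0; i < rounds; i++) {
            System.gc(); // only a hint, but irrelevant here: 'kept' is reachable
            if (build().intern() != kept) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceAcrossGc(30)); // true
    }
}
```

The only difference from the demonstration above is the variable that outlives each iteration, which is what prevents the pool entry from being cleared.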

Anyway, usage of String.intern() is somewhat discouraged. It is a shared static pool, which means that it easily turns into a bottleneck on multi-core systems. Use String.equals() to compare strings, and you will live longer and happier.

Solution 2:

In fact, this is not a garbage collection optimisation but a string pool optimisation. When you call String.intern(), you replace the reference to your initial String with its canonical reference (the instance registered the first time this string was encountered, or your own instance if the string is not yet known).

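However, it can become a memory issue once your string is no longer of use in the application, since the interned string pool is shared by the whole JVM. Interned strings that are no longer referenced can be collected, but string literals stay in the pool for as long as the class that defines them remains loaded, and on Java 6 and earlier the pool lives in the limited PermGen space.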

As a rule of thumb, I consider it preferable never to call intern() yourself and to let the compiler use it only for constant Strings, those declared like this:

String myString = "a constant that will be interned";

This is better in the sense that it does not tempt you into the false assumption that == will work when it won't.

Besides, String.equals() itself checks == first as an optimisation, so the benefit of interned strings is obtained under the hood anyway. This is one more reason why == should never be used on Strings.
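A short sketch of the point about == and equals(): two identical literals share one pooled instance, a copy created with new String(...) does not, and intern() maps the copy back to the pooled instance. The class name and the compare helper are invented for this example.

```java
final class StringIdentityDemo {

    /** Reports both comparisons for a pair of strings. */
    static String compare(String a, String b) {
        return "== is " + (a == b) + ", equals is " + a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "a constant that will be interned";
        String sameLiteral = "a constant that will be interned";
        String copy = new String(literal); // distinct object, same characters

        System.out.println(compare(literal, sameLiteral));   // == is true, equals is true
        System.out.println(compare(literal, copy));          // == is false, equals is true
        System.out.println(compare(literal, copy.intern())); // == is true, equals is true
    }
}
```

equals() gives the right answer in all three cases, whereas == only does so when both sides happen to be the pooled instance.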

Solution 3:

This article provides the full answer.

In Java 6 the string pool resides in PermGen; since Java 7 it resides in heap memory.

Manually interned strings will be garbage-collected.
String literals will be garbage-collected only if the class that defines them is unloaded.

The string pool is a hash table with a fixed number of buckets, which was small in Java 6 and early versions of Java 7 but was increased to 60013 in Java 7u40.
The size can be changed with the -XX:StringTableSize=<new size> option and viewed with the -XX:+PrintFlagsFinal option.
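The effective value can also be read from inside a running program. The sketch below assumes a HotSpot-based JVM, where the com.sun.management.HotSpotDiagnosticMXBean interface is available; on other JVMs the lookup may fail. The class name StringTableSizeProbe is made up for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class StringTableSizeProbe {

    /** Returns the current value of the StringTableSize VM option as text. */
    static String stringTableSize() {
        // Assumption: running on HotSpot, which exposes this diagnostic bean.
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Throws IllegalArgumentException if the JVM has no such option.
        return bean.getVMOption("StringTableSize").getValue();
    }

    public static void main(String[] args) {
        System.out.println("StringTableSize = " + stringTableSize());
    }
}
```

Starting the JVM with a different -XX:StringTableSize value should change what this program prints, which is a quick way to check that the option was picked up.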