ThreadLocal & Memory Leak
Several posts mention that improper use of ThreadLocal causes a memory leak. I am struggling to understand how a memory leak would happen when using ThreadLocal.
The only scenario I have figured out is the following:
A web server maintains a pool of threads (e.g. for servlets). Those threads can cause a memory leak if the values stored in a ThreadLocal are not removed, because the threads do not die.
This scenario does not mention a "Perm Space" memory leak. Is that the only (major) case in which a memory leak occurs?
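The pooled-thread scenario described above can be reproduced on a small scale. The following is a minimal sketch, not taken from any real server: the class name StaleThreadLocalDemo, the field CURRENT_USER and the value "alice" are made up for illustration, and a one-thread executor stands in for the server's worker pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StaleThreadLocalDemo {
    // Hypothetical per-request state; the name is illustrative only.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs two "requests" back to back on a one-thread pool and returns
    // what the second request finds in the ThreadLocal.
    static String valueSeenBySecondRequest() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // Request 1 stores a value and never calls remove().
            Runnable firstRequest = () -> CURRENT_USER.set("alice");
            pool.submit(firstRequest).get();

            // Request 2 is served by the same pooled thread, which is still alive.
            Callable<String> secondRequest = CURRENT_USER::get;
            return pool.submit(secondRequest).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints "second request sees: alice"
        System.out.println("second request sees: " + valueSeenBySecondRequest());
    }
}
```

The value set by the first task is still there for the second one because the worker thread never died in between. With a large object instead of a short string, that retained value is the memory that "leaks".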
Solution 1:
PermGen exhaustions in combination with ThreadLocal are often caused by classloader leaks.
An example:
Imagine an application server which has a pool of worker threads.
They will be kept alive until application server termination.
A deployed web application uses a static ThreadLocal in one of its classes in order to store some thread-local data, an instance of another class of the web application (let's call it SomeClass). This is done within the worker thread (e.g. the action originates from an HTTP request).
Important:
By definition, a reference to a ThreadLocal value is kept until the "owning" thread dies or the ThreadLocal itself is no longer reachable.
If the web application fails to clear the reference to the ThreadLocal on shutdown, bad things will happen:
Because the worker thread will usually never die and the reference to the ThreadLocal is static, the ThreadLocal value still references the instance of SomeClass, a web application's class, even if the web application has been stopped!
As a consequence, the web application's classloader cannot be garbage collected, which means that all classes (and all static data) of the web application remain loaded (this affects the PermGen memory pool as well as the heap).
Every redeployment iteration of the web application will increase permgen (and heap) usage.
=> This is the permgen leak
One well-known example of this kind of leak is this bug in log4j (which has since been fixed).
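The retention rule stated under "Important" can be observed directly. Below is a minimal sketch under stated assumptions: RetainedValueDemo and its nested SomeClass are hypothetical stand-ins for the web application's classes, and a WeakReference is used only as a probe to check whether the value is still reachable.

```java
import java.lang.ref.WeakReference;

class RetainedValueDemo {
    // Stand-in for a class that belongs to the web application.
    static class SomeClass { }

    // Static ThreadLocal, as in the scenario described above.
    private static final ThreadLocal<SomeClass> HOLDER = new ThreadLocal<>();

    // Stores a value and returns only a weak probe to it, so no local
    // variable keeps the value alive after this method returns.
    private static WeakReference<SomeClass> storeValue() {
        SomeClass value = new SomeClass();
        HOLDER.set(value);
        return new WeakReference<>(value);
    }

    // Returns true if the value is still reachable after a GC request.
    static boolean valueSurvivesGc() {
        WeakReference<SomeClass> probe = storeValue();
        System.gc();
        // The current thread is alive and HOLDER is reachable, so the
        // thread's map still holds a strong reference to the value.
        boolean retained = probe.get() != null;
        HOLDER.remove(); // clean up so the value becomes collectable
        return retained;
    }

    public static void main(String[] args) {
        // prints "retained while set: true"
        System.out.println("retained while set: " + valueSurvivesGc());
    }
}
```

In an application server the retained object additionally pins its class and therefore the web application's classloader, which is what turns a small retained value into the redeployment leak described above.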
Solution 2:
The accepted answer to this question, and the "severe" logs from Tomcat about this issue are misleading. The key quote there is:
By definition, a reference to a ThreadLocal value is kept until the "owning" thread dies or if the ThreadLocal itself is no longer reachable. [My emphasis].
In this case the only references to the ThreadLocal are in the static final field of a class that has now become a target for GC, and the reference from the worker threads. However, the references from the worker threads to the ThreadLocal are WeakReferences!
The values of a ThreadLocal are not weak references, however. So, if you have references in the values of a ThreadLocal to application classes, then these will maintain a reference to the ClassLoader and prevent GC. However, if your ThreadLocal values are just integers or strings or some other basic object type (e.g., a standard collection of the above), then there should not be a problem (they will only prevent GC of the boot/system classloader, which is never going to happen anyway).
It is still good practice to explicitly clean up a ThreadLocal when you are done with it, but in the case of the cited log4j bug the sky was definitely not falling (as you can see from the report, the value is an empty Hashtable).
Here is some code to demonstrate. First, we create a basic custom classloader implementation with no parent that prints to System.out on finalization:
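import java.net.*;

// Classloader with no parent, so it must load Main$Foo itself.
public class CustomClassLoader extends URLClassLoader {
    public CustomClassLoader(URL... urls) {
        super(urls, null);
    }

    // finalize() is deprecated in recent JDKs; it is used here only to make GC visible.
    @Override
    protected void finalize() {
        System.out.println("*** CustomClassLoader finalized!");
    }
}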
We then define a driver application which creates a new instance of this class loader, uses it to load a class with a ThreadLocal, and then removes the reference to the classloader, allowing it to be GC'ed. First, the case where the ThreadLocal value is a reference to a class loaded by the custom classloader:
import java.net.*;

public class Main {
    public static void main(String... args) throws Exception {
        loadFoo();
        // Keep requesting GC so the classloader is finalized once it is unreachable.
        while (true) {
            System.gc();
            Thread.sleep(1000);
        }
    }

    private static void loadFoo() throws Exception {
        CustomClassLoader cl = new CustomClassLoader(new URL("file:/tmp/"));
        Class<?> clazz = cl.loadClass("Main$Foo");
        // Class.newInstance() is deprecated since Java 9; call the constructor reflectively.
        clazz.getDeclaredConstructor().newInstance();
        cl = null;
    }

    public static class Foo {
        private static final ThreadLocal<Foo> tl = new ThreadLocal<Foo>();

        public Foo() {
            // The value is an instance of a class loaded by the custom classloader.
            tl.set(this);
            System.out.println("ClassLoader: " + this.getClass().getClassLoader());
        }
    }
}
When we run this, we can see that the CustomClassLoader is indeed not garbage collected (as the thread local in the main thread has a reference to a Foo instance that was loaded by our custom classloader):
$ java Main
ClassLoader: CustomClassLoader@7a6d084b
However, when we change the ThreadLocal to instead contain a reference to a simple Integer rather than a Foo instance:
    public static class Foo {
        private static final ThreadLocal<Integer> tl = new ThreadLocal<Integer>();

        public Foo() {
            tl.set(42);
            System.out.println("ClassLoader: " + this.getClass().getClassLoader());
        }
    }
Then we see that the custom classloader is now garbage collected (as the thread local on the main thread only has a reference to an integer loaded by the system classloader):
$ java Main
ClassLoader: CustomClassLoader@e76cbf7
*** CustomClassLoader finalized!
(The same is true with Hashtable). So in the case of log4j they didn't have a memory leak or any kind of bug. They were already clearing the Hashtable and this was sufficient to ensure GC of the classloader. IMO, the bug is in Tomcat, which indiscriminately logs these "SEVERE" errors on shutdown for all ThreadLocals that have not been explicitly .remove()d, regardless of whether they hold a strong reference to an application class or not. It seems that at least some developers are investing time and effort on "fixing" phantom memory leaks on the say-so of the sloppy Tomcat logs.
Solution 3:
There is nothing inherently wrong with thread locals: They do not cause memory leaks. They are not slow. They are more local than their non-thread-local counterparts (i.e., they have better information hiding properties). They can be misused, of course, but so can most other programming tools…
Refer to this link by Joshua Bloch
Solution 4:
The previous posts explain the problem but don't provide any solution. I found that there is no way to "clear" a ThreadLocal. In a container environment where I'm handling requests, I finally just called .remove() at the end of every request. I realize that this could be problematic when using container-managed transactions.
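Calling .remove() at the end of every request is usually written with try/finally so that it also runs when the request fails. The following is a minimal sketch of that pattern, not container code: RequestScopeDemo, REQUEST_ID and handle are hypothetical names, and a one-thread executor again stands in for the container's worker pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RequestScopeDemo {
    // Hypothetical per-request state.
    private static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    // Hypothetical request handler: set at the start, remove in finally.
    static String handle(String requestId) {
        REQUEST_ID.set(requestId);
        try {
            return "handled " + REQUEST_ID.get();
        } finally {
            // Runs on success and on exception, so the pooled thread is left clean.
            REQUEST_ID.remove();
        }
    }

    // Handles one request on a pooled thread, then returns what a later
    // task on the same thread finds in the ThreadLocal.
    static String valueSeenByNextRequest() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> request = () -> handle("req-1");
            pool.submit(request).get();

            Callable<String> peek = REQUEST_ID::get;
            return pool.submit(peek).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints "next request sees: null"
        System.out.println("next request sees: " + valueSeenByNextRequest());
    }
}
```

In a servlet container the same try/finally would typically sit in one central place that wraps every request, so that individual handlers cannot forget it.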