Java 8: "java.lang.OutOfMemoryError: Metaspace"

After switching the JRE of our Java application (services running on Tomcat) from Java 7 to Java 8, we started to see java.lang.OutOfMemoryError: Metaspace after a few days of running under high traffic volume.

Heap usage was fine. During performance testing, Metaspace usage jumps after some time, even though the same code flow is executed repeatedly.

What could be possible causes of the metaspace memory issue?

The current settings are:

-server -Xms8g -Xmx8g -XX:MaxMetaspaceSize=3200m  -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:MaxGCPauseMillis=1000 
-XX:+DisableExplicitGC -XX:+PrintGCDetails 
-XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=7 -XX:NewSize=5004m 
-XX:MaxNewSize=5004m -XX:MaxTenuringThreshold=12 
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintFlagsFinal  
-XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution 
-XX:+PrintGCCause -XX:+PrintAdaptiveSizePolicy 
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=200M 

The application also makes heavy use of reflection, and we use a custom class loader. All of this worked fine on Java 7.


I assume you can reproduce the issue by sending the same request (or set of requests) over a period of time. It is good that you have MaxMetaspaceSize defined; otherwise the application would keep consuming native memory until none was left. I would start with the following steps:

  1. Check whether the number of classes loaded in the JVM keeps growing when you send the same request to the server multiple times. If it does, you may be generating classes dynamically, which makes the set of classes loaded in metaspace grow. To check the number of loaded classes, connect VisualVM to the server over JMX, or run the application locally to simulate the load. I will describe the steps for a local run; to attach remotely over JMX, add the following JVM parameters to the application, restart it, and connect to port 9999. The -XX:+UnlockDiagnosticVMOptions flag is included because a later step needs it.
   -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -XX:+UnlockDiagnosticVMOptions

Once VisualVM (jvisualvm) is connected to the JVM, click the Monitor tab to see the number of loaded classes. You can monitor the heap as well as metaspace there. Below I describe other tools for monitoring metaspace more closely.
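If you would rather log the same numbers than watch them in a GUI, the standard java.lang.management MXBeans expose them. The following is only a minimal sketch: the class name MetaspaceProbe and the host:port argument are assumptions for illustration, and the pool name "Metaspace" is what HotSpot on Java 8 and later reports, so other JVMs may name it differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper (not from the original answer): samples class counts and Metaspace usage.
final class MetaspaceProbe {

    // Takes one sample through any MBean server connection, local or remote.
    static String sample(MBeanServerConnection conn) {
        try {
            ClassLoadingMXBean classes = ManagementFactory.newPlatformMXBeanProxy(
                    conn, ManagementFactory.CLASS_LOADING_MXBEAN_NAME, ClassLoadingMXBean.class);
            long metaspaceUsed = -1; // stays -1 if the JVM exposes no pool named "Metaspace"
            for (MemoryPoolMXBean pool
                    : ManagementFactory.getPlatformMXBeans(conn, MemoryPoolMXBean.class)) {
                if ("Metaspace".equals(pool.getName())) {
                    metaspaceUsed = pool.getUsage().getUsed();
                }
            }
            return String.format("loaded=%d totalLoaded=%d unloaded=%d metaspaceUsedBytes=%d",
                    classes.getLoadedClassCount(), classes.getTotalLoadedClassCount(),
                    classes.getUnloadedClassCount(), metaspaceUsed);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) { // no argument: sample the current JVM
            System.out.println(sample(ManagementFactory.getPlatformMBeanServer()));
            return;
        }
        // args[0] is host:port, for example localhost:9999 to match the JMX flags above
        JMXServiceURL url =
                new JMXServiceURL("service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            for (int i = 0; i < 10; i++) { // ten samples, ten seconds apart
                System.out.println(sample(conn));
                Thread.sleep(10_000);
            }
        }
    }
}
```

If "loaded" keeps climbing while "unloaded" stays flat across repeated identical requests, that points to classes being generated and never released.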

  2. Once you are connected to the JVM, you may also want to take a heap dump and find out which classes are loaded using OQL. Before taking the heap dump, you can stop the requests to the server so that you do not capture in-flight requests and their associated objects, but this is not strictly necessary. After running the same set of requests multiple times, go to the "Monitor" tab in VisualVM and click "Heap Dump" at the top right. Open the snapshot and switch to the OQL console. The bottom-right panel lists some predefined OQL queries under "PermGen Analysis". Run the query named "classloader loaded class histogram"; it should give the count of classes loaded by each class loader, so you can see which class loader is loading the classes.

select map(sort(map(heap.objects('java.lang.ClassLoader'), '{loader: it, count: it.classes.elementCount }'), 'lhs.count < rhs.count'), 'toHtml(it) + "<br>"')

The query listed just above it, named "classloader loaded class", shows the actual classes loaded by each class loader, but it will be slow.

select { loader: cl,
             classes: filter(map(cl.classes.elementData, 'it'), 'it != null') }
    from instanceof java.lang.ClassLoader cl

  3. Then try to trace the growth in the metaspace area, using jconsole and something newer that ships with Java: jmc (Java Mission Control). Connect jconsole to the JVM (local or remote), go to the Memory tab, and monitor the non-heap growth there, which should include metaspace, code cache, and compressed class space.

Then start jmc and connect it to the VM. Once connected, click "Diagnostic Commands" at the top right of JMC. Since we have enabled UnlockDiagnosticVMOptions, GC.class_stats can be executed. You may want to run it with all columns shown and with CSV output, so the command looks like:

GC.class_stats -all=true -csv=true

You can then compare the class stats taken at different points in time and find out which classes are causing trouble (metaspace growth), or which classes have related data (methods, method data) in metaspace. To analyze CSV outputs collected over time, I would load each one into its own table with the same schema, in a database or some other place where SQL or other analysis tools can be run, and compare them. That would give a better idea of what exactly is growing in metaspace. The GC.class_stats output has the following columns:

Index,Super,InstSize,InstCount,InstBytes,Mirror,KlassBytes,K_secondary_supers,VTab,ITab,OopMap,IK_methods,IK_method_ordering,IK_default_methods,IK_default_vtable_indices,IK_local_interfaces,IK_transitive_interfaces,IK_fields,IK_inner_classes,IK_signers,class_annotations,class_type_annotations,fields_annotations,fields_type_annotations,methods_annotations,methods_parameter_annotations,methods_type_annotations,methods_default_annotations,annotations,Cp,CpTags,CpCache,CpOperands,CpRefMap,CpAll,MethodCount,MethodBytes,ConstMethod,MethodData,StackMap,Bytecodes,MethodAll,ROAll,RWAll,Total,ClassName,ClassLoader
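If loading the snapshots into a database is more than you need, the comparison can also be sketched in a few lines of Java. This is a rough illustration under stated assumptions, not a tested parser for the real output: it assumes each CSV starts with a header row using the column names listed above, that no field contains a quoted comma, and that Total is a plain integer. The class name ClassStatsDiff and the file arguments are made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: diffs two GC.class_stats CSV snapshots by the Total column.
final class ClassStatsDiff {

    // Sums Total per "ClassName @ ClassLoader" key (ClassLoader is optional).
    static Map<String, Long> totals(List<String> csvLines) {
        String[] header = csvLines.get(0).split(",", -1);
        int total = indexOf(header, "Total");
        int name = indexOf(header, "ClassName");
        int loader = indexOf(header, "ClassLoader");
        if (total < 0 || name < 0) {
            throw new IllegalArgumentException("Total or ClassName column missing");
        }
        Map<String, Long> result = new HashMap<>();
        for (String line : csvLines.subList(1, csvLines.size())) {
            String[] f = line.split(",", -1);
            if (f.length != header.length) continue; // skip blank, summary, or malformed rows
            String key = loader >= 0 ? f[name].trim() + " @ " + f[loader].trim() : f[name].trim();
            try {
                result.merge(key, Long.parseLong(f[total].trim()), Long::sum);
            } catch (NumberFormatException e) {
                // Total was not a plain integer: ignore this row
            }
        }
        return result;
    }

    private static int indexOf(String[] header, String column) {
        for (int i = 0; i < header.length; i++) {
            if (header[i].trim().equals(column)) return i;
        }
        return -1;
    }

    // Returns the keys whose Total grew between snapshots, largest growth first.
    static List<Map.Entry<String, Long>> growth(Map<String, Long> before, Map<String, Long> after) {
        List<Map.Entry<String, Long>> deltas = new ArrayList<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            if (delta > 0) {
                deltas.add(new AbstractMap.SimpleEntry<String, Long>(e.getKey(), delta));
            }
        }
        deltas.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return deltas;
    }

    public static void main(String[] args) throws IOException {
        // args[0] and args[1] are paths to the earlier and later CSV snapshots
        Map<String, Long> before = totals(Files.readAllLines(Paths.get(args[0])));
        Map<String, Long> after = totals(Files.readAllLines(Paths.get(args[1])));
        growth(before, after).stream().limit(20)
                .forEach(e -> System.out.println(e.getValue() + " bytes  " + e.getKey()));
    }
}
```

Classes that appear only in the later snapshot count as pure growth, which is usually what dynamically generated proxies or accessor classes look like.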

Hope it helps. Since the same code does not leak on 1.7, it also appears that the bug may be in Java 8 itself.

Also, classes will not be unloaded from metaspace while anything holds a reference to their class loader. If you know your class loaders are supposed to be garbage collected and nothing should hold a reference to them, go back to the heap dump in VisualVM, click on the class loader instance, and right-click to find the "nearest GC root", which will tell you what is holding the reference to the class loader.
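To make the reachability point concrete, here is a small sketch of how a single lingering reference pins a class loader. Everything in it is illustrative: LoaderLeakDemo and LEAKY_CACHE are made-up names, and the empty URLClassLoader loads no classes, so the loader itself stands in for the classes it would otherwise keep alive in metaspace.

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: a strong reference to a class loader prevents its collection.
final class LoaderLeakDemo {

    // Stand-in for a static cache, registry, or thread local that pins a class loader.
    static final List<ClassLoader> LEAKY_CACHE = new ArrayList<>();

    // Creates a throwaway loader and returns only a weak reference to it.
    static WeakReference<ClassLoader> createLoader(boolean leak) {
        ClassLoader loader = new URLClassLoader(new URL[0], null);
        if (leak) {
            LEAKY_CACHE.add(loader); // the "forgotten" strong reference
        }
        return new WeakReference<>(loader);
    }

    // Requests GC a few times and reports whether the referent was cleared.
    static boolean isCollected(WeakReference<?> ref) {
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc(); // only a hint; the JVM may ignore it
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("released loader collected: " + isCollected(createLoader(false)));
        System.out.println("leaked loader collected:   " + isCollected(createLoader(true)));
    }
}
```

The leaked loader can never be collected while LEAKY_CACHE holds it. The released one usually is, but System.gc() is only a request, and with -XX:+DisableExplicitGC (as in the question's settings) it does nothing, so that line may well print false.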


We had a similar issue. The root cause was that 60K class files were being loaded into metaspace and nothing was being unloaded. Adding the JVM argument below fixed our issue.

-Dcom.sun.xml.bind.v2.bytecode.ClassTailor.noOptimize=true

https://issues.apache.org/jira/browse/CXF-2939

Hope this helps.