What's the size cost of Java inheritance?

There are various articles out there on the interwebs that try to empirically estimate the overhead of a java.lang.Object in particular JVM implementations. For example, I've seen the size overhead of a bare Object estimated at 8 bytes in some JVMs.

What I would like to know is whether a typical JVM implementation of the extends relationship introduces incremental size overhead at every level of the class hierarchy. In other words, suppose you have a class hierarchy with N levels of subclasses. Is the overhead of the in-memory representation of a class instance O(1) or O(N)?

I imagine it is O(1) because although the size of some of the hidden fluffy stuff you need to be a Java Object (vtable, chain of classes) will grow as the inheritance hierarchy grows, they grow per-class, not per-instance, and the JVM implementation can store constant-size pointers to these entities in a constant-size header attached to every Object.

So in theory, the overhead directly attached to the in-memory representation of any Java object should be O(1) for inheritance depth N. Does anyone know if it's true in practice?


Solution 1:

When in doubt, look at the source (well, a source; each JVM is free to choose how to do it, as the standard does not mandate any internal representation). So I had a look, and found the following comment within the implementation of JDK 7-u60's hotspot JVM:

// A Klass is the part of the klassOop that provides:
//  1: language level class object (method dictionary etc.)
//  2: provide vm dispatch behavior for the object
// Both functions are combined into one C++ class. The toplevel class "Klass"
// implements purpose 1 whereas all subclasses provide extra virtual functions
// for purpose 2.

// One reason for the oop/klass dichotomy in the implementation is
// that we don't want a C++ vtbl pointer in every object.  Thus,
// normal oops don't have any virtual functions.  Instead, they
// forward all "virtual" functions to their klass, which does have
// a vtbl and does the C++ dispatch depending on the object's

The way I read it, this means that, for this (very popular) implementation, object instances only store a pointer to their class. The cost, per-instance of a class, of that class having a longer or shorter inheritance chain is effectively 0. The classes themselves do take up space in memory though (but only once per class). Run-time efficiency of deep inheritance chains is another matter.

Solution 2:

The JVM specification states

The Java Virtual Machine does not mandate any particular internal structure for objects.

So the specification does not care how you do it. But...

In some of Oracle’s implementations of the Java Virtual Machine, a reference to a class instance is a pointer to a handle that is itself a pair of pointers: one to a table containing the methods of the object and a pointer to the Class object that represents the type of the object, and the other to the memory allocated from the heap for the object data.

So in typical Oracle implementations it is O(1) for methods. This method table is the Method Area which is per class.

The Java Virtual Machine has a method area that is shared among all Java Virtual Machine threads. The method area is analogous to the storage area for compiled code of a conventional language or analogous to the "text" segment in an operating system process. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors, including the special methods (§2.9) used in class and instance initialization and interface initialization.

Also, about method entries

The method_info structures represent all methods declared by this class or interface type, including instance methods, class methods, instance initialization methods (§2.9), and any class or interface initialization method (§2.9). The methods table does not include items representing methods that are inherited from superclasses or superinterfaces.