C++ object size with virtual methods
Solution 1:
This is all implementation defined. I'm using VC10 Beta2. The key to help understanding this stuff (the implementation of virtual functions), you need to know about a secret switch in the Visual Studio compiler, /d1reportSingleClassLayoutXXX. I'll get to that in a second.
The basic rule is the vtable needs to be located at offset 0 for any pointer to an object. This implies multiple vtables for multiple inheritance.
Couple questions here, I'll start at the top:
Does it mean that only one vptr is there even both of class B and class A have virtual function? Why there is only one vptr?
This is how virtual functions work, you want the base class and derived class to share the same vtable pointer (pointing to the implementation in the derived class.
It seems that in this case, two vptrs are in the layout.....How does this happen? I think the two vptrs one is for class A and another is for class B....so there is no vptr for the virtual function of class C?
This is the layout of class C, as reported by /d1reportSingleClassLayoutC:
class C size(20):
+---
| +--- (base class A)
0 | | {vfptr}
4 | | a
| +---
| +--- (base class B)
8 | | {vfptr}
12 | | b
| +---
16 | c
+---
You are correct, there are two vtables, one for each base class. This is how it works in multiple inheritance; if the C* is casted to a B*, the pointer value gets adjusted by 8 bytes. A vtable still needs to be at offset 0 for virtual function calls to work.
The vtable in the above layout for class A is treated as class C's vtable (when called through a C*).
The sizeof B is 16 bytes -------------- Without virtual it should be 4 + 4 + 4 = 12. why there is another 4 bytes here? What's the layout of class B ?
This is the layout of class B in this example:
class B size(20):
+---
0 | {vfptr}
4 | {vbptr}
8 | b
+---
+--- (virtual base A)
12 | {vfptr}
16 | a
+---
As you can see, there is an extra pointer to handle virtual inheritance. Virtual inheritance is complicated.
The sizeof D is 32 bytes -------------- it should be 16(class B) + 12(class C) + 4(int d) = 32. Is that right?
No, 36 bytes. Same deal with the virtual inheritance. Layout of D in this example:
class D size(36):
+---
| +--- (base class B)
0 | | {vfptr}
4 | | {vbptr}
8 | | b
| +---
| +--- (base class C)
| | +--- (base class A)
12 | | | {vfptr}
16 | | | a
| | +---
20 | | c
| +---
24 | d
+---
+--- (virtual base A)
28 | {vfptr}
32 | a
+---
My question is , why there is an extra space when virtual inheritance is applied?
Virtual base class pointer, it's complicated. Base classes are "combined" in virtual inheritance. Instead of having a base class embedded into a class, the class will have a pointer to the base class object in the layout. If you have two base classes using virtual inheritance (the "diamond" class hierarchy), they will both point to the same virtual base class in the object, instead of having a separate copy of that base class.
What's the underneath rule for the object size in this case?
Important point; there are no rules: the compiler can do whatever it needs to do.
And a final detail; to make all these class layout diagrams I am compiling with:
cl test.cpp /d1reportSingleClassLayoutXXX
Where XXX is a substring match of the structs/classes you want to see the layout of. Using this you can explore the affects of various inheritance schemes yourself, as well as why/where padding is added, etc.
Solution 2:
Quote> My question is, what's the rule about the number of vptrs in inheritance?
There are no rulez, every compiler vendor is allowed to implement the semantics of inheritance the way he sees fit.
class B: public A {}, size = 12. That's pretty normal, one vtable for B that has both virtual methods, vtable pointer + 2*int = 12
class C : public A, public B {}, size = 20. C can arbitrarily extend the vtable of either A or B. 2*vtable pointer + 3*int = 20
Virtual inheritance: that's where you really hit the edges of undocumented behavior. For example, in MSVC the #pragma vtordisp and /vd compile options become relevant. There's some background info in this article. I studied this a few times and decided the compile option acronym was representative for what could happen to my code if I ever used it.
Solution 3:
A good way to think about it is to understand what has to be done to handle up-casts. I'll try to answer your questions by showing the memory layout of objects of the classes you describe.
Code sample #2
The memory layout is as follows:
vptr | A::a | B::b
Upcasting a pointer to B to type A will result in the same address, with the same vptr being used. This is why there's no need for additional vptr's here.
Code sample #3
vptr | A::a | vptr | B::b | C::c
As you can see, there are two vptr's here, just like you guessed. Why? Because it's true that if we upcast from C to A we don't need to modify the address, and thus can use the same vptr. But if we upcast from C to B we do need that modification, and correspondingly we need a vptr at the start of the resulting object.
So, any inherited class beyond the first will require an additional vptr (unless that inherited class has no virtual methods, in which case it has no vptr).
Code sample #4 and beyond
When you derive virtually, you need a new pointer, called a base pointer, to point to the location in the memory layout of the derived classes. There can be more than one base pointer, of course.
So how does the memory layout look? That depends on the compiler. In your compiler it's probably something like
vptr | base pointer | B::b | vptr | A::a | C::c | vptr | A::a \-----------------------------------------^
But other compilers may incorporate base pointers in the virtual table (by using offsets - that deserves another question).
You need a base pointer because when you derive in a virtual fashion, the derived class will appear only once in the memory layout (it may appear additional times if it's also derived normally, as in your example), so all its children must point to the exact same location.
EDIT: clarification - it all really depends on the compiler, the memory layout I showed can be different in different compilers.