memory layout C++ objects [closed]

I am wondering how C++ lays out objects in memory. I have heard that dynamic_cast simply adjusts the object's pointer by an offset, and that reinterpret_cast more or less lets us do anything with the pointer. I don't really understand this. Details would be appreciated!


Solution 1:

Memory layout is mostly left to the implementation. The key exception is that member variables for a given access specifier will be in order of their declaration.

§ 9.2/14 (C++11)

Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified (11). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).
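To make the quoted rule concrete, here is a minimal sketch. The type Sample and the helper offsets_ascend are made up for illustration; all members share the same access control, so the declaration order must show up as increasing offsets.

```cpp
#include <cassert>   // assert, for the run-time check mentioned in the note below
#include <cstddef>   // offsetof

// Hypothetical example type: every member has the same (public) access control.
struct Sample {
    char   a;
    int    b;
    short  c;
    double d;
};

// True when each member is placed at a higher offset than the member declared before it.
constexpr bool offsets_ascend() {
    return offsetof(Sample, a) < offsetof(Sample, b) &&
           offsetof(Sample, b) < offsetof(Sample, c) &&
           offsetof(Sample, c) < offsetof(Sample, d);
}

// The ordering is known at compile time, so it can be checked there as well.
static_assert(offsets_ascend(), "members with the same access control keep declaration order");
```

The same fact can be checked at run time with assert(offsets_ascend()). The exact offsets are not asserted because padding between members is up to the implementation.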

Besides the member variables themselves, a class or struct needs space for base class subobjects, virtual function management (e.g. a pointer to a virtual table), and the padding needed to align all of these. This is up to the implementation, but the Itanium ABI specification is a popular choice; gcc and clang adhere to it (at least to a degree).

http://mentorembedded.github.io/cxx-abi/abi.html#layout
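As a rough illustration of the space taken by virtual function management, the sketch below compares two made-up types, Plain and Polymorphic, that differ only in having a virtual function. The helper virtual_overhead is hypothetical, and the size difference it reports is whatever your implementation chooses; the standard does not fix it.

```cpp
#include <cassert>   // assert, for the run-time check mentioned in the note below
#include <cstddef>   // std::size_t

// Hypothetical type with a single data member and no virtual functions.
struct Plain {
    int k;
};

// Same data member, plus one virtual function.
struct Polymorphic {
    virtual void grault() {}
    int k;
};

// Extra bytes this implementation spends on Polymorphic compared to Plain
// (typically a vtable pointer plus any padding it forces).
constexpr std::size_t virtual_overhead() {
    return sizeof(Polymorphic) - sizeof(Plain);
}
```

On implementations that use a vtable pointer, assert(virtual_overhead() > 0) holds; on a typical 64-bit Itanium ABI target the difference is commonly 12 bytes (8 for the pointer, 4 of padding), but that number is an observation, not a guarantee.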

The Itanium ABI is of course not part of the C++ standard and is not binding. For more detail you need to turn to your implementor's documentation and tools. clang provides a tool to view the memory layout of classes. As an example, take the following:

class VBase {
    virtual void corge();
    int j;
};

class SBase1 {
    virtual void grault();
    int k;
};

class SBase2 {
    virtual void grault();
    int k;
};

class SBase3 {
    void grault();
    int k;
};

class Class : public SBase1, SBase2, SBase3, virtual VBase {
public:
    void bar();
    virtual void baz();
    // virtual member function templates not allowed, thinking about memory
    // layout and vtables will tell you why
    // template<typename T>
    // virtual void quux();
private:
    int i;
    char c;
public:
    float f;
private:
    double d;
public:
    short s;
};

class Derived : public Class {
    virtual void qux();
};

int main() {
    return sizeof(Derived);
}

clang only dumps the layout of a class whose layout is actually needed, which is why main uses sizeof(Derived). With such a source file in place, clang will reveal the memory layout.

$ clang -cc1 -fdump-record-layouts layout.cpp

The layout for Class:

*** Dumping AST Record Layout
   0 | class Class
   0 |   class SBase1 (primary base)
   0 |     (SBase1 vtable pointer)
   8 |     int k
  16 |   class SBase2 (base)
  16 |     (SBase2 vtable pointer)
  24 |     int k
  28 |   class SBase3 (base)
  28 |     int k
  32 |   int i
  36 |   char c
  40 |   float f
  48 |   double d
  56 |   short s
  64 |   class VBase (virtual base)
  64 |     (VBase vtable pointer)
  72 |     int j
     | [sizeof=80, dsize=76, align=8
     |  nvsize=58, nvalign=8]

More on this clang feature can be found on Eli Bendersky's blog:

http://eli.thegreenplace.net/2012/12/17/dumping-a-c-objects-memory-layout-with-clang/

gcc provides a similar tool, `-fdump-class-hierarchy` (renamed `-fdump-lang-class` in GCC 8 and later). For the class given above, it prints (among other things):

Class Class
   size=80 align=8
   base size=58 base align=8
Class (0x0x141f81280) 0
    vptridx=0u vptr=((& Class::_ZTV5Class) + 24u)
  SBase1 (0x0x141f78840) 0
      primary-for Class (0x0x141f81280)
  SBase2 (0x0x141f788a0) 16
      vptr=((& Class::_ZTV5Class) + 56u)
  SBase3 (0x0x141f78900) 28
  VBase (0x0x141f78960) 64 virtual
      vptridx=8u vbaseoffset=-24 vptr=((& Class::_ZTV5Class) + 88u)

It doesn't itemize the member variables (or at least I don't know how to get it to) but you can tell they would have to be between offset 28 and 64, just as in the clang layout.

You can see that one base class is singled out as primary. Its subobject sits at offset 0, so no adjustment of the this pointer is needed when a Class is accessed as an SBase1.
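The effect on pointers can be observed directly. The following is a minimal sketch with made-up types First, Second and Both and a hypothetical helper base_offset; it measures how far a base subobject sits from the start of the complete object, which is exactly the adjustment a derived-to-base conversion applies.

```cpp
#include <cassert>   // assert, for the run-time checks mentioned in the note below
#include <cstddef>   // std::ptrdiff_t

// Hypothetical hierarchy: two polymorphic, non-empty bases.
struct First  { virtual ~First()  = default; int a = 1; };
struct Second { virtual ~Second() = default; int b = 2; };
struct Both : First, Second { int c = 3; };

// Distance in bytes from the start of the complete object to its Base subobject.
template <class Base>
std::ptrdiff_t base_offset(const Both& obj) {
    const char* whole = reinterpret_cast<const char*>(&obj);
    const char* base  = reinterpret_cast<const char*>(static_cast<const Base*>(&obj));
    return base - whole;
}

// dynamic_cast to void* returns the address of the most derived object,
// undoing whatever adjustment produced the base pointer.
const void* most_derived(const Second* p) {
    return dynamic_cast<const void*>(p);
}
```

For a Both obj, base_offset<First>(obj) and base_offset<Second>(obj) must differ because the two subobjects cannot overlap. On the Itanium ABI the first is 0 (the primary base) and the second is non-zero, but only the inequality is guaranteed by the language.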

The equivalent for gcc is:

$ g++ -fdump-class-hierarchy -c layout.cpp

The equivalent for Visual C++ is (Test_A here is the name of the class to report):

cl main.cpp /c /d1reportSingleClassLayoutTest_A

see: https://blogs.msdn.microsoft.com/vcblog/2007/05/17/diagnosing-hidden-odr-violations-in-visual-c-and-fixing-lnk2022/

Solution 2:

Each class lays out its data members in the order of declaration (strictly, the standard only guarantees this among members with the same access control).
The compiler is allowed to place padding between members to make access efficient (but it is not allowed to re-order them).
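A small sketch of the padding point, using a made-up struct Padded and a hypothetical helper padding_bytes. The amount of padding is an implementation choice, so the code only computes it rather than assuming a value.

```cpp
#include <cassert>   // assert, for the run-time checks mentioned in the note below
#include <cstddef>   // std::size_t, offsetof

// Hypothetical type whose members have different alignment needs.
struct Padded {
    char   tag;    // 1 byte
    double value;  // usually needs stricter alignment, so padding tends to follow tag
    char   flag;   // usually followed by trailing padding
};

// Bytes of Padded that are not occupied by any member.
constexpr std::size_t padding_bytes() {
    return sizeof(Padded) - (sizeof(char) + sizeof(double) + sizeof(char));
}
```

On x86-64 sizeof(Padded) is commonly 24, which leaves 14 bytes of padding, though nothing in the standard requires that. What does hold everywhere is that tag comes before value, value before flag, and sizeof(Padded) is a multiple of alignof(Padded).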

How dynamic_cast<> works is a compiler implementation detail and not defined by the standard. It will all depend on the ABI used by the compiler.

reinterpret_cast<> works by just reinterpreting the pointer as a pointer to a different type; the address itself is not adjusted. The only thing you can rely on is that casting a pointer to void* and back to the same pointer-to-class type gives you the original pointer.
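The difference is easiest to see next to static_cast. This is a sketch with made-up types Alpha, Beta and Gamma; the helper names are hypothetical. It only compares addresses and never dereferences the reinterpreted pointer, which would be undefined behaviour.

```cpp
#include <cassert>   // assert, for the run-time checks mentioned in the note below

// Hypothetical hierarchy with two non-empty bases.
struct Alpha { int a = 1; };
struct Beta  { int b = 2; };
struct Gamma : Alpha, Beta { int c = 3; };

// Gamma* -> void* -> Gamma*: the round trip that is guaranteed to work.
Gamma* round_trip(Gamma* p) {
    void* opaque = p;                     // implicit conversion to void*
    return static_cast<Gamma*>(opaque);   // back to the original type
}

// static_cast knows the hierarchy and moves the pointer to the Beta subobject.
const void* beta_via_static(Gamma* p) {
    return static_cast<Beta*>(p);
}

// reinterpret_cast only relabels the pointer; the address stays where it was.
// The result must not be used as a real Beta.
const void* beta_via_reinterpret(Gamma* p) {
    return reinterpret_cast<Beta*>(p);
}
```

For a Gamma g, round_trip(&g) == &g always holds. On common ABIs, where Alpha is placed first, beta_via_static(&g) points past the Alpha subobject while beta_via_reinterpret(&g) still equals the address of g itself.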

Solution 3:

The answer is, "it's complicated". Dynamic cast does not simply adjust pointers with an offset; it may actually retrieve internal pointers inside the object in order to do its work. GCC follows an ABI designed for Itanium but implemented more broadly. You can find the gory details here: Itanium C++ ABI.
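One case where a fixed offset cannot be enough is a cast between two sibling base classes. The sketch below uses made-up types Left, Right and Joined and a hypothetical helper to_right; whether the cast succeeds depends on the dynamic type of the object, so dynamic_cast has to consult run-time type information.

```cpp
#include <cassert>   // assert, for the run-time checks mentioned in the note below

// Hypothetical hierarchy: two unrelated polymorphic bases joined by one class.
struct Left   { virtual ~Left()  = default; };
struct Right  { virtual ~Right() = default; };
struct Joined : Left, Right {};

// Cross-cast from one base to a sibling base.
// Left and Right are unrelated types, so no compile-time offset exists;
// the answer depends on what object p really points into.
Right* to_right(Left* p) {
    return dynamic_cast<Right*>(p);
}
```

Given a Joined j, to_right(&j) yields the same pointer as static_cast<Right*>(&j). Given a plain Left, the same call returns a null pointer, because that object has no Right part to point at.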