Double alignment

Following the discussion in this post, I understand that the main reason for aligning structure members is performance (plus restrictions on some architectures).

If we look at Microsoft (Visual C++), Borland/CodeGear (C++Builder), Digital Mars (DMC) and GNU (GCC) when compiling for 32-bit x86, the alignment for int is 4 bytes, and if an int is not aligned, reading it may require reading 2 rows of memory banks.

My question is: why not make double 4-byte aligned as well? A 4-byte-aligned double would also take 2 rows of memory banks to read...

For example, in the following structure, since double is 8-byte aligned, the actual size will be sizeof(char) + (padding to align the double) + sizeof(double) + sizeof(int) + (trailing padding) = 1 + 7 + 8 + 4 + 4 = 24 bytes.

typedef struct structc_tag{
    char        c;
    double      d;
    int         s;
} structc_t;

Thank you

Solution 1:

An extended comment:

According to GCC documentation about -malign-double:

Aligning double variables on a two-word boundary produces code that runs somewhat faster on a Pentium at the expense of more memory.

On x86-64, -malign-double is enabled by default.

Warning: if you use the -malign-double switch, structures containing the above types are aligned differently than the published application binary interface specifications for the 386 and are not binary compatible with structures in code compiled without that switch.

A word here means i386 word which is 32 bits.

Windows uses 64-bit alignment of double values even in 32-bit mode, while SysV i386 ABI conformant Unices use 32-bit alignment. The 32-bit Windows API, Win32, comes from Windows NT 3.1, which, unlike current-generation Windows versions, targeted Intel i386, Alpha, MIPS and even the obscure Intel i860. As native RISC systems like Alpha and MIPS require double values to be 64-bit aligned (otherwise a hardware fault occurs), portability might have been the rationale behind the 64-bit alignment in the Win32 i386 ABI.

64-bit x86 systems, also known as AMD64, x86-64, or x64, require double values to be 64-bit aligned; otherwise a misalignment fault occurs and the hardware does an expensive "fix-up" which considerably slows down memory access. That's why double values are 64-bit aligned in all modern x86-64 ABIs (SysV and Win32).
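To see which alignment a given compiler and set of flags actually applies, a small probe can help. This is only a sketch: struct align_probe and both function names are made up for illustration. The offset of a double that follows a single char equals the alignment the compiler uses for a double struct member, so building the same file with gcc -m32 and then with gcc -m32 -malign-double should show the difference described above.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

/* Hypothetical probe type: d is placed at the first offset after c
   that satisfies the alignment of a double struct member. */
struct align_probe {
    char   c;
    double d;
};

static_assert(offsetof(struct align_probe, c) == 0,
              "the first member always starts at offset 0");

/* Returns the alignment, in bytes, applied to a double inside a struct. */
size_t double_member_alignment(void)
{
    return offsetof(struct align_probe, d);
}

/* Prints the probe result; call it from main() to inspect a target. */
void report_double_alignment(void)
{
    printf("double member alignment: %zu bytes\n", double_member_alignment());
    printf("sizeof(struct align_probe): %zu bytes\n",
           sizeof(struct align_probe));
}
```

On a SysV i386 target I would expect this to report 4, and 8 with -malign-double, on x86-64, or on 32-bit Windows, in line with the ABIs mentioned above.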

Solution 2:

Most compilers will automatically align data values to the word size of the platform, or to the size of the data type, whichever is smaller. The vast majority of consumer and enterprise processors use a 32-bit word size. (Even 64-bit systems usually use 32 bits as a native word size.)

As such, the ordering of members in your struct could possibly waste some memory. In your specific case, you're fine. I'll add in comments the actual footprint of used memory:

typedef struct structc_tag{
          char        c; // 1 byte
                         // 3 bytes (padding)
          double      d; // 8 bytes
          int         s; // 4 bytes
} structc_t;             // total: 16 bytes

This rule applies to structures too, so even if you rearranged them so the smallest field was last, you would still have a struct of the same size (16 bytes).

typedef struct structc_tag{
          double      d; // 8 bytes
          int         s; // 4 bytes
          char        c; // 1 byte
                         // 3 bytes (padding)
} structc_t;             // total: 16 bytes
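Rather than counting bytes by hand, you can ask the compiler for the layout. The following is a rough sketch using offsetof; the two helper functions are hypothetical names. It assumes a 1-byte char, 4-byte int and 8-byte double.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

typedef struct structc_tag {
    char   c;
    double d;
    int    s;
} structc_t;

static_assert(offsetof(structc_t, c) == 0,
              "the first member always starts at offset 0");

/* Padding bytes added by the compiler: total size minus member sizes. */
size_t structc_padding(void)
{
    return sizeof(structc_t)
         - (sizeof(char) + sizeof(double) + sizeof(int));
}

/* Prints each member offset and the total size; call it from main(). */
void print_structc_layout(void)
{
    printf("c at %zu, d at %zu, s at %zu, sizeof = %zu, padding = %zu\n",
           offsetof(structc_t, c), offsetof(structc_t, d),
           offsetof(structc_t, s), sizeof(structc_t), structc_padding());
}
```

With 4-byte double alignment the offsets should come out as 0, 4 and 12 with a size of 16; where double is 8-byte aligned they should be 0, 8 and 16 with a size of 24.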

If you were to declare more fields that were smaller than 4 bytes, you could see some memory reductions if you grouped them together by size. For example:

typedef struct structc_tag{
          double      d1; // 8 bytes
          double      d2; // 8 bytes
          double      d3; // 8 bytes
          int         s1; // 4 bytes
          int         s2; // 4 bytes
          int         s3; // 4 bytes
          short       s4; // 2 bytes
          short       s5; // 2 bytes
          short       s6; // 2 bytes
          char        c1; // 1 byte
          char        c2; // 1 byte
          char        c3; // 1 byte
                          // 3 bytes (padding)
} structc_t;              // total: 48 bytes

Declaring a poorly ordered struct could waste a lot of memory, unless the compiler reorders your members (which, in general, it won't unless explicitly told to):

typedef struct structc_tag{
          int         s1; // 4 bytes
          char        c1; // 1 byte
                          // 3 bytes (padding)
          int         s2; // 4 bytes
          char        c2; // 1 byte
                          // 3 bytes (padding)
          int         s3; // 4 bytes
          char        c3; // 1 byte
                          // 3 bytes (padding)
} structc_t;              // total: 24 bytes 
                          // (9 bytes wasted, or 38%)
                          // (optimal size: 16 bytes (1 byte wasted))
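The saving from grouping members by size can be checked with sizeof. This is just a sketch: the struct names interleaved and grouped and the helper function are made up here, and the numbers in the comments assume a 4-byte int with 4-byte alignment.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>  /* size_t */

/* Same members as the wasteful struct above: padding after every char. */
struct interleaved {
    int  s1;
    char c1;
    int  s2;
    char c2;
    int  s3;
    char c3;
};

/* Same members grouped by size: only trailing padding remains. */
struct grouped {
    int  s1;
    int  s2;
    int  s3;
    char c1;
    char c2;
    char c3;
};

static_assert(sizeof(struct grouped) <= sizeof(struct interleaved),
              "grouping by size should never make the struct larger here");

/* Bytes saved by grouping; 24 - 16 = 8 under the assumptions above. */
size_t bytes_saved_by_grouping(void)
{
    return sizeof(struct interleaved) - sizeof(struct grouped);
}
```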

Doubles are larger than 32 bits and thus, according to the rule in the first section, are 32-bit aligned. Someone mentioned a compiler option that changes the alignment, and that its default differs between 32-bit and 64-bit systems; this is also valid. So the real answer about doubles is that it depends on the platform and the compiler.

Memory performance is governed by words: loading from memory happens in stages that depend on the placement of data. If the data fits within one word (i.e. is word aligned), only that word needs to be loaded. If it is not aligned correctly (e.g. an int at 0x2), the processor must load 2 words in order to read its value. The same applies to doubles, which normally take up 2 words but, if misaligned, take up 3. On 64-bit systems, where native loading of 64-bit quantities is possible, doubles behave like 32-bit ints on 32-bit systems: if properly aligned, they can be fetched with one load; otherwise they require 2.
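The word counting above can be written down as a small helper. This is a simplified model, not a description of any particular memory controller, and words_touched is a hypothetical name: it counts how many word-sized units a value of a given size overlaps when it starts at a given address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of word_size-byte words overlapped by `size` bytes at `addr`. */
size_t words_touched(uintptr_t addr, size_t size, size_t word_size)
{
    assert(size > 0 && word_size > 0);
    uintptr_t first_word = addr / word_size;               /* word holding the first byte */
    uintptr_t last_word  = (addr + size - 1) / word_size;  /* word holding the last byte */
    return (size_t)(last_word - first_word + 1);
}
```

For example, words_touched(0x2, 4, 4) is 2 (the misaligned int), words_touched(0x4, 8, 4) is 2 (a 4-byte-aligned double on a 32-bit word) and words_touched(0x2, 8, 4) is 3. With 8-byte words, words_touched(0x0, 8, 8) is 1 while words_touched(0x4, 8, 8) is 2.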

Solution 3:

First of all, it's the architecture that imposes the alignment requirement; some will tolerate unaligned memory accesses, others won't.

Let's take the 32-bit x86 Windows platform as an example: on this platform the alignment requirements for int and double are 4 bytes and 8 bytes respectively.

It is clear why the alignment requirement for int is 4 bytes: so that the CPU can read it in a single access.

The alignment requirement for double is 8 bytes rather than 4 because of cache lines. Suppose the requirement were 4 bytes, the double were located at address 60, and the cache line size were 64 bytes: the processor would need to load 2 cache lines from memory into the cache. With 8-byte alignment this cannot happen, since the double always lies within a single cache line and is never split between two.

   ...58 59|60 61 62 63    64 65 66 67|68 69 70 71...
     -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
 ----------+  +  +  +  .  .  +  +  +  +--------------
           |           .  .           |
 ----------+  +  +  +  .  .  +  +  +  +-------------- 
                       .  .         
      Cache Line 1     .  .  Cache Line 2
     -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
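The situation in the diagram can be checked with a few lines of C. This is a minimal sketch: crosses_cache_line is a made-up helper, and the 64-byte line size is an assumption matching the example above, not something queried from the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u  /* assumed line size in bytes */

/* True if `size` bytes starting at `addr` span two or more cache lines. */
bool crosses_cache_line(uintptr_t addr, size_t size)
{
    assert(size > 0);
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line  = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

crosses_cache_line(60, 8) is true (bytes 60..67 straddle the boundary at 64), while 8-byte-aligned addresses such as 56 or 64 give false, because 64 is a multiple of 8 and an aligned double can therefore never straddle a line boundary.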
