Padding in structures in C
I don't think either of these structures has an advantage. There is one(!) constant in this equation: the order of the members of a struct is guaranteed to be as declared.
So in a case like the following, the second structure might have an advantage, since it is probably smaller. That is not so in your example, where both will probably have the same size:
struct {
char a;
int b;
char c;
} X;
Vs.
struct {
char a;
char b;
int c;
} Y;
A little more explanation regarding comments below:
None of the below is 100% guaranteed, but it is the common way these structs are laid out on a 32-bit system where int is 32 bits:
Struct X:
| char | pad | pad | pad | ---------int---------- | char | pad | pad | pad | = 12 bytes
Struct Y:
| char | char | pad | pad | ---------int---------- | = 8 bytes
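The layouts sketched above can be checked on a given compiler with `sizeof` and `offsetof`. This is only a minimal sketch: the tags `struct X` and `struct Y` and the helper `print_layouts` are names introduced here for illustration (the originals are anonymous structs), and the numbers printed depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same member order as the two structs above, given tags so that
 * sizeof and offsetof can refer to them. */
struct X { char a; int b; char c; };
struct Y { char a; char b; int c; };

/* There is never padding before the first member of a struct. */
static_assert(offsetof(struct X, a) == 0, "first member starts at offset 0");

/* Print the size of each struct and the offset of every member. */
void print_layouts(void)
{
    printf("X: size=%zu a@%zu b@%zu c@%zu\n", sizeof(struct X),
           offsetof(struct X, a), offsetof(struct X, b), offsetof(struct X, c));
    printf("Y: size=%zu a@%zu b@%zu c@%zu\n", sizeof(struct Y),
           offsetof(struct Y, a), offsetof(struct Y, b), offsetof(struct Y, c));
}
```

On a typical ABI with a 4-byte, 4-byte-aligned `int`, calling `print_layouts()` would be expected to report 12 bytes for `X` and 8 for `Y`, matching the diagrams.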
Some machines access data more efficiently when values are aligned to some boundary. Some require data to be aligned.
On modern 32-bit machines like the SPARC or the Intel [34]86, or any Motorola chip from the 68020 up, each data item must usually be "self-aligned", beginning on an address that is a multiple of its type size. Thus, 32-bit types must begin on a 32-bit boundary, 16-bit types on a 16-bit boundary, and 8-bit types may begin anywhere; struct/array/union types have the alignment of their most restrictive member.
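The self-alignment rule quoted above can be inspected with C11's `alignof` (from `<stdalign.h>`). A small sketch, assuming a C11 compiler; `struct mixed` and `max_member_align` are hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* A struct mixing 8-bit, 16-bit and (typically) 32-bit members. */
struct mixed { char c; short s; int i; };

/* Largest alignment requirement among the members of struct mixed. */
size_t max_member_align(void)
{
    size_t m = alignof(char);
    if (alignof(short) > m) m = alignof(short);
    if (alignof(int) > m) m = alignof(int);
    return m;
}

/* A type's size is always a multiple of its alignment, so that every
 * element of an array of that type stays aligned. */
static_assert(sizeof(struct mixed) % alignof(struct mixed) == 0,
              "size is a multiple of alignment");
```

The struct's own alignment is at least `max_member_align()`; on common ABIs the two are equal, which is the "most restrictive member" rule in practice.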
So you could have
struct B {
char a;
/* 3 bytes of padding ? More ? */
int* b;
};
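How much padding follows `a` depends on the pointer's alignment, which `offsetof` can report. A minimal sketch; `padding_after_a` is a hypothetical helper, not something from the original answer:

```c
#include <assert.h>
#include <stddef.h>

struct B {
    char a;
    int *b;
};

/* Bytes of padding the compiler inserted between a and b. */
size_t padding_after_a(void)
{
    return offsetof(struct B, b) - sizeof(char);
}
```

With 4-byte pointers this would typically return 3; with 8-byte pointers, typically 7.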
A simple rule that minimizes padding in the "self-aligned" case (and does no harm in most others) is to order your struct members by decreasing size.
Personally, I see no disadvantage to the first struct compared to the second.
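To illustrate the decreasing-size rule, here is a hedged sketch comparing the same four members in two orders. The struct names and the `bytes_saved` helper are made up for this example, and the exact sizes are implementation-defined.

```c
#include <assert.h>
#include <stddef.h>

/* Members in an arbitrary order. */
struct unsorted { char a; double b; short c; int d; };

/* The same members, ordered by decreasing size. */
struct sorted { double b; int d; short c; char a; };

/* Bytes saved by reordering (0 if the two layouts have the same size). */
size_t bytes_saved(void)
{
    return sizeof(struct unsorted) - sizeof(struct sorted);
}
```

Typical figures on x86-64 would be 24 bytes for `struct unsorted` and 16 for `struct sorted`, but treat those as common values rather than guarantees.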
I can't think of a disadvantage of the first structure over the second in this particular case, but it's possible to come up with examples where there are disadvantages to the general rule of putting the largest members first:
struct A {
int* a;
short b;
A(short num) : b(2*num+1), a(new int[b]) {}
// OOPS, `b` is used uninitialized, and a good compiler will warn.
// The only way to get `b` initialized before `a` is to declare
// it first in the class, or of course we could repeat `2*num+1`.
};
I have also heard of a rather complicated case for large structs, where the CPU has fast addressing modes for accessing pointer+offset, for small values of offset (up to 8 bits, for example, or some other limit of an immediate value). You would best micro-optimize a large structure by putting as many of the most commonly used fields as possible within range of the fastest instructions.
The CPU might even have fast addressing for pointer+offset and pointer+4*offset. Then suppose you had 64 char fields and 64 int fields: if you put the char fields first then all fields of both types can be addressed using the best instructions, whereas if you put the int fields first then the char fields that aren't 4-aligned will just have to be accessed differently, perhaps by loading a constant into a register rather than with an immediate value, because they're outside the 256-byte limit.
Never had to do it myself, and for instance x86 allows big immediate values anyway. It's not the sort of optimization that anyone would normally think about unless they spend a lot of time staring at assembly.
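The 64-and-64 example above can be worked through in code. This is a sketch under loud assumptions: the CPU model in `fast_reachable` is hypothetical (an unsigned 8-bit immediate used either as a byte offset or scaled by 4), it is not a description of any real instruction set, and the arrays merely stand in for 64 separate fields of each type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arrays stand in for 64 separate fields of each type. */
struct chars_first { char c[64]; int i[64]; };
struct ints_first  { int i[64]; char c[64]; };

/* Hypothetical CPU model (an assumption, not a real ISA): an unsigned
 * 8-bit immediate, used either as a byte offset or scaled by 4. */
static bool fast_reachable(size_t byte_offset)
{
    if (byte_offset < 256)
        return true;                                  /* pointer + offset   */
    return byte_offset % 4 == 0 && byte_offset / 4 < 256; /* pointer + 4*offset */
}

/* Count the fields that fall outside both fast addressing modes. */
static size_t count_slow(size_t char_base, size_t int_base)
{
    size_t slow = 0;
    for (size_t k = 0; k < 64; k++) {
        if (!fast_reachable(char_base + k))
            slow++;
        if (!fast_reachable(int_base + k * sizeof(int)))
            slow++;
    }
    return slow;
}

size_t slow_chars_first(void)
{
    return count_slow(offsetof(struct chars_first, c),
                      offsetof(struct chars_first, i));
}

size_t slow_ints_first(void)
{
    return count_slow(offsetof(struct ints_first, c),
                      offsetof(struct ints_first, i));
}
```

Assuming a 4-byte `int`, `slow_chars_first()` comes out as 0, while `slow_ints_first()` comes out as 48: the `char` fields start at byte 256, and only the 16 of them that sit on a 4-byte boundary are still reachable through the scaled mode.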
Briefly, there's no advantage in choosing either in the general case. The only situation where the choice would matter in practice is if structure packing is enabled, in which case struct A would be a better choice (since both fields would be aligned in memory, while in struct B the b field would be located at an odd offset). Structure packing means that no padding bytes are inserted inside the structure.
However, this is a rather uncommon scenario: structure packing is generally only enabled in specific situations. It is not a concern in most programs. It is also not controllable through any portable construct in the C standard.
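For reference, this is roughly what packing looks like with `#pragma pack`, a compiler extension accepted by GCC, Clang and MSVC rather than standard C. The struct names and the two offset helpers are hypothetical, chosen only to show a field landing on an odd offset.

```c
#include <assert.h>
#include <stddef.h>

/* Default layout: the compiler may pad so that n is aligned. */
struct natural { char tag; int n; };

/* Compiler extension, not standard C: pack members on 1-byte boundaries. */
#pragma pack(push, 1)
struct packed { char tag; int n; };
#pragma pack(pop)

/* Offset of n in each variant. */
size_t natural_offset(void) { return offsetof(struct natural, n); }
size_t packed_offset(void)  { return offsetof(struct packed, n); }
```

In the packed variant `n` sits at offset 1, so accessing it is a misaligned access on machines that care about alignment.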