Order of local variable allocation on the stack

Solution 1:

I've no idea why GCC organizes its stack the way it does (though I guess you could crack open its source or this paper and find out), but I can tell you how to guarantee the order of specific stack variables if for some reason you need to. Simply put them in a struct:

void function1() {
    struct {
        int x;
        int y;
        int z;
        int *ret;
    } locals;
}

If memory serves, the C standard guarantees that struct members are laid out in declaration order at increasing addresses, so &locals.ret > &locals.z > &locals.y > &locals.x. I left my K&R at work so I can't quote chapter and verse though.
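The ordering claim can be checked at compile time with offsetof. This is a minimal sketch: the struct is given a tag (struct locals) only so it can be named outside the function, and members_in_declaration_order is a hypothetical helper, not anything from GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Same members as the anonymous struct in function1, given a tag
   here (an assumption for illustration) so offsetof can name it. */
struct locals {
    int x;
    int y;
    int z;
    int *ret;
};

/* Compile-time check: each member starts after the one before it. */
static_assert(offsetof(struct locals, x) < offsetof(struct locals, y),
              "x precedes y");
static_assert(offsetof(struct locals, y) < offsetof(struct locals, z),
              "y precedes z");
static_assert(offsetof(struct locals, z) < offsetof(struct locals, ret),
              "z precedes ret");

/* Hypothetical helper: the same check at run time, returns 1 or 0. */
int members_in_declaration_order(void) {
    return offsetof(struct locals, x) < offsetof(struct locals, y)
        && offsetof(struct locals, y) < offsetof(struct locals, z)
        && offsetof(struct locals, z) < offsetof(struct locals, ret);
}
```

Note that the compiler may still insert padding between members; only the relative order is fixed.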

Solution 2:

So, I did some more experimenting and here's what I found. It seems to be based on whether or not each variable is an array. Given this input:

void f5() {
        int w;
        int x[1];
        int *ret;
        int y;
        int z[1];
}

I end up with this in gdb:

(gdb) p &w
$1 = (int *) 0xbffff4c4
(gdb) p &x
$2 = (int (*)[1]) 0xbffff4c0
(gdb) p &ret
$3 = (int **) 0xbffff4c8
(gdb) p &y
$4 = (int *) 0xbffff4cc
(gdb) p &z
$5 = (int (*)[1]) 0xbffff4bc

In this case, the ints and pointers are handled first and placed in declaration order at increasing addresses: the first declared (w) sits lowest and the last declared (y) sits highest. Then the arrays are handled, placed below the scalars and in the opposite direction: the earlier the declaration, the higher the address (x sits above z). I'm sure there's a good reason for this. I wonder what it is.
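If you'd rather not go through gdb, the same experiment can be sketched in plain C. record_layout is a hypothetical helper that declares the same five locals as f5 and prints where each one landed. No expected output is shown because the layout depends on the compiler, its version, the flags, and the target, and taking the addresses can itself influence where the compiler puts things.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: stores the addresses of five locals (declared
   in the same order as in f5) into out[0..4] and prints them. */
void record_layout(uintptr_t out[5]) {
    int w;
    int x[1];
    int *ret;
    int y;
    int z[1];
    static const char *names[5] = { "w", "x", "ret", "y", "z" };

    assert(out != NULL);

    /* Only the addresses are taken; the locals are never read. */
    out[0] = (uintptr_t)(void *)&w;
    out[1] = (uintptr_t)(void *)&x;
    out[2] = (uintptr_t)(void *)&ret;
    out[3] = (uintptr_t)(void *)&y;
    out[4] = (uintptr_t)(void *)&z;

    for (int i = 0; i < 5; i++) {
        printf("%-3s at %p\n", names[i], (void *)out[i]);
    }
}
```

Calling it with a five-element uintptr_t array and comparing the printed addresses across optimization levels (for example -O0 versus -O2) should show whether the grouping of scalars and arrays holds for your toolchain.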

Solution 3:

Not only does ISO C say nothing about the ordering of local variables on the stack, it doesn't even guarantee that a stack exists. The standard just talks about the scope and lifetime of variables inside a block.

Solution 4:

Usually it has to do with alignment issues.

Most processors are slower at fetching data that isn't processor-word aligned. They have to grab it in pieces and splice it together.

Probably what's happening is that it's grouping together all of the objects that are at least as large as the processor's optimal alignment, and then packing the things that don't need that alignment more tightly. It just so happens that in your example all of your char arrays are 4 bytes, but I bet if you made them 3 bytes, they'd still end up in the same places.

But if you had four one-byte arrays, they may end up in one 4-byte range, or aligned in four separate ones.

It's all about what's easiest (translates to "fastest") for the processor to grab.
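Alignment requirements are easiest to observe in a struct, where the layout is visible through offsetof. The sketch below is only an analogy for what the compiler may be doing with stack locals, not a description of GCC's actual algorithm. struct mixed and the two helper functions are hypothetical names, and _Alignof assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct: a 1-byte member followed by an int. */
struct mixed {
    char c;
    int i;
};

/* The int member must start at a multiple of int's alignment, so the
   compiler pads after c whenever that alignment is larger than 1. */
static_assert(offsetof(struct mixed, i) % _Alignof(int) == 0,
              "i is aligned for int");

/* Alignment the implementation requires for int (commonly 4). */
size_t alignment_of_int(void) {
    return _Alignof(int);
}

/* Where i actually landed inside struct mixed. */
size_t offset_of_i(void) {
    return offsetof(struct mixed, i);
}
```

On a typical 32-bit or 64-bit x86 target both functions return 4, meaning three bytes of padding follow c, but the exact values are implementation-defined.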