How do programs know how much space to allocate for local variables on the stack?

The compiler knows because it analyzed the source code (more precisely, its internal representation of the logic after parsing) and added up the total size of everything it had to allocate stack space for. It also has to get RSP 16-byte aligned before any call, given that RSP % 16 == 8 on function entry.

So alignment is one reason compilers may reserve more than the function actually uses. Compiler missed-optimization bugs can also make it waste space: it's common for GCC to reserve an extra 16 bytes, although that's not happening here.
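As a rough illustration of the alignment arithmetic above, here is a minimal sketch in C. The helper name `sysv_frame_reservation` and its parameters are hypothetical, made up for this example; it only models the rule "RSP % 16 == 8 on entry, 16-byte aligned before a call" and is not how any particular compiler computes its frame layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from any compiler): smallest N for `sub rsp, N`
 * that covers locals_bytes and leaves RSP 16-byte aligned before a call.
 * Assumes RSP % 16 == 8 on entry and that each push moves RSP by 8 bytes. */
size_t sysv_frame_reservation(size_t locals_bytes, unsigned pushes)
{
    size_t pushed = 8 * (size_t)pushes;              /* e.g. push rbp */
    size_t misalign = (8 + pushed + locals_bytes) % 16;
    size_t n = locals_bytes + (misalign ? 16 - misalign : 0);

    /* Postcondition: distance from a 16-byte boundary is now zero. */
    assert((8 + pushed + n) % 16 == 0);
    return n;
}
```

For example, 12 bytes of locals after a single `push rbp` gives 16 under this model, and a function with no locals and no pushes still needs 8 bytes before it can make a call.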

Yes, modern compilers parse the entire function (actually the whole source file) before emitting any code for it. That's kind of the point of an ahead-of-time optimizing compiler, so it's designed around doing that, even if you make a debug build. By comparison, TCC, the Tiny C Compiler, is one-pass: it leaves a spot in its function prologue and goes back later to fill in the total size once it has reached the bottom of the function in the source code. See "Tiny C Compiler's generated code emits extra (unnecessary?) NOPs and JMPs": when that number happens to be zero, there's still a sub esp, 0 there. (The example there is 32-bit code; TCC can also target x86-64.)
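The one-pass "leave a spot and fill it in later" idea can be sketched like this. It is a toy model, not TCC's actual source: `CodeBuf`, `emit_sub_esp_placeholder` and `patch_frame_size` are hypothetical names. The only real-world detail it relies on is the 32-bit x86 encoding of `sub esp, imm32` (bytes 81 EC followed by a little-endian 32-bit immediate).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy code buffer; all names here are hypothetical, not TCC internals. */
typedef struct {
    uint8_t bytes[64];
    size_t  len;
} CodeBuf;

static void emit_byte(CodeBuf *cb, uint8_t b)
{
    assert(cb->len < sizeof cb->bytes);
    cb->bytes[cb->len++] = b;
}

/* Emit `sub esp, imm32` (opcode 81 /5, ModRM 0xEC) with a zero placeholder.
 * Returns the offset of the 4-byte immediate so it can be patched later. */
size_t emit_sub_esp_placeholder(CodeBuf *cb)
{
    emit_byte(cb, 0x81);
    emit_byte(cb, 0xEC);
    size_t imm_offset = cb->len;
    for (int i = 0; i < 4; i++)
        emit_byte(cb, 0x00);
    return imm_offset;
}

/* Once the whole function body has been seen, write the real frame size
 * into the placeholder as a little-endian 32-bit value. */
void patch_frame_size(CodeBuf *cb, size_t imm_offset, uint32_t frame_size)
{
    assert(imm_offset + 4 <= cb->len);
    for (int i = 0; i < 4; i++)
        cb->bytes[imm_offset + i] = (uint8_t)(frame_size >> (8 * i));
}
```

If the function turns out to need no locals, the patch writes zero and the six-byte `sub esp, 0` simply stays in the prologue, because a one-pass code generator has already emitted everything after it.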

Related: Function Prologue and Epilogue in C


In leaf functions, compilers can use the red zone (the 128 bytes below RSP) when targeting the x86-64 System V ABI, avoiding the need to reserve as much (or any) stack space even if there are some locals they choose to spill/reload (in unoptimized code, that's all of them). The exception is kernel code, or other code compiled with -mno-red-zone. See also "Why is there no "sub rsp" instruction in this function prologue and why are function parameters stored at negative rbp offsets?"
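A small leaf function makes the red-zone case concrete. The function `sum3` is a made-up example; the comments describe what GCC or Clang typically do for x86-64 System V, but exact output depends on compiler version and flags, so treat it as something to try on your own toolchain rather than guaranteed output.

```c
/* Leaf function: it calls nothing, so no call instruction of its own
 * will overwrite the memory just below RSP while it runs. */
int sum3(int x, int y, int z)
{
    /* volatile forces the locals to live in memory even when optimizing.
     * With the red zone available, compilers typically address them at
     * small negative offsets from RSP (or RBP in a debug build) and emit
     * no `sub rsp, N` at all. Building with -mno-red-zone usually makes
     * the stack-pointer adjustment reappear. */
    volatile int a = x, b = y, c = z;
    return a + b + c;
}
```

Comparing the assembly with and without -mno-red-zone (for example with `gcc -O2 -S`) is a quick way to see the difference.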

On Windows x64, callers need to reserve shadow space for their callee to use, which also gives small functions the chance to not spend any instructions moving RSP around: they can just use the shadow space above their return address. But for non-leaf functions, this means reserving at least 32 bytes of shadow space plus any needed for alignment or locals. See for example "Shadow space example".
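The same kind of arithmetic applies to a non-leaf function on Windows x64, as a minimal sketch. The helper `win64_frame_reservation` is hypothetical, and it assumes the function only calls callees with at most four arguments, so no extra space for stack arguments is needed beyond the 32 bytes of shadow space.

```c
#include <assert.h>
#include <stddef.h>

enum { SHADOW_SPACE = 32 };   /* bytes a caller reserves for its callee */

/* Hypothetical helper: smallest N for `sub rsp, N` in a non-leaf Windows
 * x64 function, covering shadow space plus locals and keeping RSP 16-byte
 * aligned at the call. Assumes RSP % 16 == 8 on entry, 8 bytes per push,
 * and no stack-passed arguments for any callee. */
size_t win64_frame_reservation(size_t locals_bytes, unsigned pushes)
{
    size_t pushed = 8 * (size_t)pushes;
    size_t needed = SHADOW_SPACE + locals_bytes;
    size_t misalign = (8 + pushed + needed) % 16;
    size_t n = needed + (misalign ? 16 - misalign : 0);

    assert((8 + pushed + n) % 16 == 0);
    return n;
}
```

With no locals and no pushes this model gives 40, which matches the `sub rsp, 0x28` commonly seen at the top of small Windows x64 functions that make calls.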

In the standard calling conventions for ISAs other than x86-64, different rules may apply that affect how much stack space gets reserved.


Note that in 64-bit code, leave pops RBP, not EBP, and that ret pops into RIP, not EIP.

Also, mov ecx,DWORD PTR [rbp-0x4] is not variable initialization. That's a load, from uninitialized memory into a register. Probably you did something like int a,b,c; without initializers, then passed them as args to printf.