What exactly are the base pointer and stack pointer? What do they point to?
esp is, as you say, the top of the stack.
ebp is usually set to esp at the start of the function. Function parameters and local variables are accessed by adding and subtracting, respectively, a constant offset from ebp. All x86 calling conventions define ebp as being preserved across function calls. ebp itself actually points to the previous frame's base pointer, which enables stack walking in a debugger and viewing other frames' local variables.
Most function prologs look something like:
push ebp ; Preserve current frame pointer
mov ebp, esp ; Create new frame pointer pointing to current stack top
sub esp, 20 ; allocate 20 bytes worth of locals on stack.
Then later in the function you may have code like (presuming both local variables are 4 bytes)
mov [ebp-4], eax ; Store eax in first local
mov ebx, [ebp - 8] ; Load ebx from second local
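To make the effect of the prolog concrete, here is a small Python sketch that models the stack as a dictionary from address to 4-byte value. The `Stack` class, the starting address 0x1000 and the caller's frame pointer 0x2000 are all made up for illustration; this is a toy model, not how a real CPU or debugger represents memory.

```python
# Toy model of a 32-bit x86 stack: memory is a dict from address to value.
class Stack:
    def __init__(self, top=0x1000):
        self.mem = {}
        self.esp = top      # stack pointer: address of the top item
        self.ebp = 0        # frame pointer

    def push(self, value):
        self.esp -= 4       # the stack grows toward lower addresses
        self.mem[self.esp] = value

s = Stack()
s.ebp = 0x2000              # assumption: the caller's frame pointer is 0x2000

# push ebp / mov ebp, esp / sub esp, 20
s.push(s.ebp)
s.ebp = s.esp
s.esp -= 20

# mov [ebp-4], eax   (store eax in the first local)
eax = 42
s.mem[s.ebp - 4] = eax

# mov ebx, [ebp-8]   (load the second local; never written here, so the model returns 0)
ebx = s.mem.get(s.ebp - 8, 0)

saved_ebp = s.mem[s.ebp]    # [ebp] holds the caller's frame pointer
```

In this model ebp stays fixed for the rest of the function, so the first local is always at ebp-4 no matter what happens to esp afterwards.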
FPO, or frame pointer omission, is an optimization you can enable that eliminates this prolog, frees ebp for use as another general-purpose register, and accesses locals directly off of esp. This makes debugging a bit more difficult, since the debugger can no longer directly access the stack frames of earlier function calls.
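The stack walk that a debugger relies on can be sketched in a few lines of Python. It assumes every function uses the standard prolog, so that [ebp] holds the caller's ebp and [ebp+4] holds the return address. The memory contents, the addresses, and the convention that the outermost frame stores a saved ebp of 0 are assumptions made for this example.

```python
def walk_frames(mem, ebp):
    """Follow the chain of saved frame pointers, collecting (ebp, return_address)."""
    frames = []
    while ebp != 0:                      # assumption: the outermost frame saved ebp = 0
        return_address = mem[ebp + 4]    # pushed by the call instruction
        frames.append((ebp, return_address))
        ebp = mem[ebp]                   # [ebp] is the caller's saved ebp
    return frames

# Three nested frames; all addresses and return addresses are made up.
mem = {
    0x0F00: 0x0F40, 0x0F04: 0x401020,   # innermost frame
    0x0F40: 0x0F80, 0x0F44: 0x401100,   # its caller
    0x0F80: 0x0000, 0x0F84: 0x401200,   # outermost frame
}
frames = walk_frames(mem, 0x0F00)
```

With frame pointer omission there is no such chain to follow: ebp holds an arbitrary value, and the walker would need extra information about each function's frame size.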
EDIT:
For your updated question, the missing two entries in the stack are:
var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = dword ptr -4
*savedFramePointer = dword ptr 0*
*return address = dword ptr 4*
hInstance = dword ptr 8h
PrevInstance = dword ptr 0Ch
hlpCmdLine = dword ptr 10h
nShowCmd = dword ptr 14h
This is because the flow of the function call is:
- Push parameters (hInstance, etc.)
- Call function, which pushes return address
- Push ebp
- Allocate space for locals
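The offsets in the listing follow mechanically from that sequence. As a rough sketch, the hypothetical helper `frame_layout` below rebuilds the table, assuming every parameter and local is exactly 4 bytes and the function uses the standard push ebp / mov ebp, esp prolog.

```python
def frame_layout(params, local_names, word=4):
    """Return {name: offset from ebp} for a standard stack frame."""
    layout = {}
    # Locals sit below ebp; the first local is the one closest to it.
    for i, name in enumerate(local_names, start=1):
        layout[name] = -word * i
    layout["saved frame pointer"] = 0      # stored by `push ebp`
    layout["return address"] = word        # stored by `call`
    # The first declared parameter is pushed last, so it sits lowest.
    for i, name in enumerate(params):
        layout[name] = 2 * word + word * i
    return layout

layout = frame_layout(
    params=["hInstance", "PrevInstance", "hlpCmdLine", "nShowCmd"],
    local_names=["var_4", "var_8", "var_C"],
)
for name, offset in sorted(layout.items(), key=lambda kv: kv[1]):
    print(f"{name:20} = dword ptr {offset:+#x}")
```

The two entries the disassembler leaves out, the saved frame pointer and the return address, are the ones at offsets 0 and 4.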
ESP is the current stack pointer, which changes any time a word or address is pushed onto or popped off of the stack. EBP is a more convenient way for the compiler to keep track of a function's parameters and local variables than using ESP directly.
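A few lines of Python arithmetic show why EBP is the more convenient reference. The addresses are made up, and the model assumes a standard prolog with two 4-byte locals.

```python
esp = 0x1000

# Prolog: push ebp; mov ebp, esp; sub esp, 8  (two 4-byte locals)
esp -= 4
ebp = esp
esp -= 8

first_local = ebp - 4                  # address of the first local

# (offset from ebp, offset from esp) right after the prolog
offsets_before = (first_local - ebp, first_local - esp)

esp -= 4                               # e.g. `push eax` to save a temporary

# The same local after the push: the ebp offset is unchanged, the esp offset moved
offsets_after = (first_local - ebp, first_local - esp)
```

The EBP-relative offset stays at -4 throughout, while the ESP-relative offset changes with every push or pop, so a compiler addressing locals through ESP has to track the current stack depth at each instruction.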
Generally (and this may vary from compiler to compiler), all of the arguments to a function being called are pushed onto the stack by the calling function (usually in the reverse order that they're declared in the function prototype, but this varies). Then the function is called, which pushes the return address (EIP) onto the stack.
Upon entry to the function, the old EBP value is pushed onto the stack and EBP is set to the value of ESP. Then ESP is decremented (because the stack grows downward in memory) to allocate space for the function's local variables and temporaries. From that point on, during the execution of the function, the arguments to the function are located on the stack at positive offsets from EBP (because they were pushed prior to the function call), and the local variables are located at negative offsets from EBP (because they were allocated on the stack after the function entry). That's why EBP is called the Frame Pointer, because it points to the center of the function call frame.
Upon exit, all the function has to do is set ESP to the value of EBP (which deallocates the local variables from the stack, and exposes the entry EBP on the top of the stack), then pop the old EBP value from the stack, and then the function returns (popping the return address into EIP).
Upon returning to the calling function, the caller can then increment ESP in order to remove the function arguments it pushed onto the stack just prior to calling the other function. At this point, the stack is back in the same state it was in prior to invoking the called function.
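The whole cycle described above can be traced with a small Python model. The `Machine` class, the `call_add` and `add_body` functions, and all addresses are hypothetical. The sketch assumes a cdecl-style convention, where the caller removes its own arguments, and 4-byte values throughout.

```python
class Machine:
    def __init__(self):
        self.mem = {}
        self.esp = 0x1000
        self.ebp = 0x2000     # made-up frame pointer of the caller

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

def add_body(m):
    """Callee side: prolog, body, epilog."""
    m.push(m.ebp)             # push ebp
    m.ebp = m.esp             # mov ebp, esp
    m.esp -= 4                # sub esp, 4   (one local)
    m.mem[m.ebp - 4] = m.mem[m.ebp + 8] + m.mem[m.ebp + 12]   # local = a + b
    eax = m.mem[m.ebp - 4]    # return value goes in eax
    m.esp = m.ebp             # mov esp, ebp (frees the local)
    m.ebp = m.pop()           # pop ebp      (restores the caller's EBP)
    m.pop()                   # ret          (pops the return address)
    return eax

def call_add(m, a, b):
    """Caller side: add(a, b) with arguments pushed right to left."""
    m.push(b)
    m.push(a)
    m.push(0x401005)          # `call` pushes the return address (made up)
    result = add_body(m)
    m.esp += 8                # caller removes its two arguments
    return result

m = Machine()
before = (m.esp, m.ebp)
result = call_add(m, 2, 3)
after = (m.esp, m.ebp)
```

After the call, both ESP and EBP hold the values they had before the arguments were pushed, which is the "same state" the paragraph above refers to.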
You have it right. The stack pointer points to the top item on the stack and the base pointer points to the "previous" top of the stack before the function was called.
When you call a function, its local variables are stored on the stack and the stack pointer is moved to make room for them (on x86 it is decremented, because the stack grows downward). When you return from the function, all the local variables on the stack go out of scope. This is done by setting the stack pointer back to the base pointer (which was the "previous" top before the function call).
Doing memory allocation this way is very, very fast and efficient.
EDIT: For a better description, see x86 Disassembly/Functions and Stack Frames in a WikiBook about x86 assembly.
I will try to add some information you might find useful if you are using Visual Studio.
Storing the caller's EBP as the first local variable is called a standard stack frame, and this may be used for nearly all calling conventions on Windows. The conventions differ in whether the caller or the callee deallocates the passed parameters, and in which parameters are passed in registers, but these differences are orthogonal to the standard stack frame.
Speaking of Windows programs, you probably use Visual Studio to compile your C++ code. Be aware that Microsoft uses an optimization called Frame Pointer Omission, which makes it nearly impossible to walk the stack without using the dbghelp library and the PDB file for the executable.
Frame Pointer Omission means that the compiler does not store the old EBP in a standard place and uses the EBP register for something else, so you will have a hard time finding the caller's EIP without knowing how much space the local variables need for a given function. Of course Microsoft provides an API that allows you to do stack walks even in this case, but looking up the symbol table database in PDB files takes too long for some use cases.
To avoid FPO in your compilation units, you need to either avoid using /O2 or explicitly add /Oy- to the C++ compilation flags in your projects. You probably link against the C or C++ runtime, which uses FPO in the Release configuration, so you will have a hard time doing stack walks without dbghelp.dll.