ARM: link register and frame pointer

I'm trying to understand how the link register and the frame pointer work in ARM. I've been to a couple of sites, and I wanted to confirm my understanding.

Suppose I had the following code:

int foo(void)
{
    // ..
    bar();
    // (A)
    // ..
}

int bar(void)
{
    // (B)
    int b1;
    // ..
    // (C)
    baz();
    // (D)
}

int baz(void)
{
    // (E)
    int a;
    int b;
    // (F)
}

and I call foo(). Would the link register contain the address for the code at point (A) and the frame pointer contain the address at the code at point (B)? And the stack pointer would could be any where inside bar(), after all the locals have been declared?


Solution 1:

Some register calling conventions are dependent on the ABI (Application Binary Interface). The FP is required in the APCS standard and not in the newer AAPCS (2003). For the AAPCS (GCC 5.0+) the FP does not have to be used but certainly can be; debug info is annotated with stack and frame pointer use for stack tracing and unwinding code with the AAPCS. If a function is static, a compiler really doesn't have to adhere to any conventions.

Generally all ARM registers are general purpose. The lr (link register, also R14) and pc (program counter also R15) are special and enshrine in the instruction set. You are correct that the lr would point to A. The pc and lr are related. One is "where you are" and the other is "where you were". They are the code aspect of a function.

Typically, we have the sp (stack pointer, R13) and the fp (frame pointer, R11). These two are also related. This Microsoft layout does a good job describing things. The stack is used to store temporary data or locals in your function. Any variables in foo() and bar(), are stored here, on the stack or in available registers. The fp keeps track of the variables from function to function. It is a frame or picture window on the stack for that function. The ABI defines a layout of this frame. Typically the lr and other registers are saved here behind the scenes by the compiler as well as the previous value of fp. This makes a linked list of stack frames and if you want you can trace it all the way back to main(). The root is fp, which points to one stack frame (like a struct) with one variable in the struct being the previous fp. You can go along the list until the final fp which is normally NULL.

So the sp is where the stack is and the fp is where the stack was, a lot like the pc and lr. Each old lr (link register) is stored in the old fp (frame pointer). The sp and fp are a data aspect of functions.

Your point B is the active pc and sp. Point A is actually the fp and lr; unless you call yet another function and then the compiler might get ready to setup the fp to point to the data in B.

Following is some ARM assembler that might demonstrate how this all works. This will be different depending on how the compiler optimizes, but it should give an idea,

; Prologue - setup
mov     ip, sp                 ; get a copy of sp.
stmdb   sp!, {fp, ip, lr, pc}  ; Save the frame on the stack. See Addendum
sub     fp, ip, #4             ; Set the new frame pointer.
    ...
; Maybe other functions called here.
; Older caller return lr stored in stack frame. bl baz ... ; Epilogue - return ldm sp, {fp, sp, lr} ; restore stack, frame pointer and old link. ... ; maybe more stuff here. bx lr ; return.
This is what foo() would look like. If you don't call bar(), then the compiler does a leaf optimization and doesn't need to save the frame; only the bx lr is needed. Most likely this maybe why you are confused by web examples. It is not always the same.

The take-away should be,

  1. pc and lr are related code registers. One is "Where you are", the other is "Where you were".
  2. sp and fp are related local data registers.
    One is "Where local data is", the other is "Where the last local data is".
  3. The work together along with parameter passing to create function machinery.
  4. It is hard to describe a general case because we want compilers to be as fast as possible, so they use every trick they can.

These concepts are generic to all CPUs and compiled languages, although the details can vary. The use of the link register, frame pointer are part of the function prologue and epilogue, and if you understood everything, you know how a stack overflow works on an ARM.

See also: ARM calling convention.
                MSDN ARM stack article
                University of Cambridge APCS overview
                ARM stack trace blog
                Apple ABI link

The basic frame layout is,

  • fp[-0] saved pc, where we stored this frame.
  • fp[-1] saved lr, the return address for this function.
  • fp[-2] previous sp, before this function eats stack.
  • fp[-3] previous fp, the last stack frame.
  • many optional registers...

An ABI may use other values, but the above are typical for most setups. The indexes above are for 32 bit values as all ARM registers are 32 bits. If you are byte-centric, multiply by four. The frame is also aligned to at least four bytes.

Addendum: This is not an error in the assembler; it is normal. An explanation is in the ARM generated prologs question.