What does KEEP mean in a linker script?

AFAIK, KEEP makes ld retain the matched input sections even if none of their symbols are referenced, which matters when linking with --gc-sections.

It is usually used for sections that have a special meaning in the binary startup process, more or less to mark the roots of the dependency tree.


(For Sabuncu below)

Dependency tree:

If you eliminate unused code, you analyze the code and mark all reachable sections (code+global variables + constants).

So you pick a section, mark it as "used" and see what other (unused) sections it references, then you mark those sections as "used", check what they reference, and so on.

The sections that are not marked "used" are then redundant, and can be eliminated.

Since a section can reference multiple other sections (e.g. one procedure calling three different other ones), if you drew the result you would get a tree (strictly speaking a graph, since several sections can reference the same one).

Roots:

The above principle, however, leaves us with a problem: which section is the "first" one that is always used, the first node (root) of the tree so to speak? This is what KEEP() does: it tells the linker which sections (if available) are the first ones to look at. As a consequence, these are always linked in.

Typically these are sections that the program loader uses to perform tasks related to dynamic linking (these can be optional and depend on the OS and file format), plus the section containing the entry point of the program.
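The marking procedure described above can be sketched in a few lines of Python. This is only a model of the idea, not how ld is implemented: the section names and the reference graph are made up for illustration, and `live_sections` is a hypothetical helper.

```python
from collections import deque

def live_sections(references, roots):
    """Return the set of sections reachable from the given roots."""
    live = set()
    # Roots that do not exist in the input are simply skipped ("if available").
    worklist = deque(r for r in roots if r in references)
    while worklist:
        section = worklist.popleft()
        if section in live:
            continue
        live.add(section)  # mark as "used"
        worklist.extend(references.get(section, ()))  # then visit what it references
    return live

# Hypothetical reference graph: section -> sections it references.
references = {
    ".text._start": [".text.main"],
    ".text.main": [".text.helper", ".rodata.msg"],
    ".text.helper": [],
    ".rodata.msg": [],
    ".text.unused": [".rodata.msg"],
    ".init_array": [".text.ctor"],
    ".text.ctor": [],
}

# Only the entry point is a root: .init_array and what it references are discarded.
entry_only = live_sections(references, roots=[".text._start"])
# Treating .init_array as an extra root models the effect of KEEP(*(.init_array)).
with_keep = live_sections(references, roots=[".text._start", ".init_array"])

print(sorted(set(references) - entry_only))  # ['.init_array', '.text.ctor', '.text.unused']
print(sorted(set(references) - with_keep))   # ['.text.unused']
```

Nothing in the graph references .init_array, so it only survives when it is listed as a root, and keeping it also keeps everything it references.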


Minimal Linux IA-32 example that illustrates its usage

main.S

.section .text
.global _start
_start:
    /* Dummy access so that after will be referenced and kept. */
    mov after, %eax
    /*mov keep, %eax*/

    /* Exit system call. */
    mov $1, %eax

    /* Take the exit status 4 bytes after before. */
    mov $4, %ebx
    mov before(%ebx), %ebx

    int $0x80

.section .before
    before: .long 0
/* TODO why is the `"a"` required? */
.section .keep, "a"
    keep: .long 1
.section .after
    after: .long 2

link.ld

ENTRY(_start)
SECTIONS
{
    . = 0x400000;
    .text :
    {
        *(.text)
        *(.before)
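        /* Always retain .keep, even if unreferenced under --gc-sections. */
        KEEP(*(.keep));
        /* Still place .keep here when the KEEP line above is commented out. */
        *(.keep)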
        *(.after)
    }
}

Compile and run:

as --32 -o main.o main.S
ld --gc-sections -m elf_i386 -o main.out -T link.ld main.o
./main.out
echo $?

Output:

1

If we comment out the KEEP line the output is:

2

If we either:

  • add a dummy mov keep, %eax
  • remove --gc-sections

The output goes back to 1.

Tested on Ubuntu 14.04, Binutils 2.25.

Explanation

There is no reference to the symbol keep, and therefore none to its containing section .keep.

Therefore if garbage collection is enabled and we don't use KEEP to make an exception, that section will not be put in the executable.

Since we read the word 4 bytes past the address of before, the exit status depends on what follows .before. If the .keep section is present, that word is 1. If it is not present, .after immediately follows .before and the exit status is 2.
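The offset arithmetic can be checked with a small Python model. It is a sketch under simplifying assumptions: it only models the three 4-byte data words in the order given in link.ld, it assumes .before is always retained, and `exit_status` is a hypothetical helper rather than anything ld provides.

```python
import struct

def exit_status(kept_sections):
    """Lay out the retained 4-byte words in link.ld order and read the one at before + 4."""
    words = {".before": 0, ".keep": 1, ".after": 2}
    image = b"".join(
        struct.pack("<I", words[name])  # little-endian 32-bit word, as on IA-32
        for name in (".before", ".keep", ".after")
        if name in kept_sections
    )
    before_offset = 0  # assumption: .before is retained and comes first
    (value,) = struct.unpack_from("<I", image, before_offset + 4)
    return value

print(exit_status({".before", ".keep", ".after"}))  # .keep retained  -> 1
print(exit_status({".before", ".after"}))           # .keep discarded -> 2
```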

TODO: nothing happens if we remove the "a", which makes the section allocatable, from .keep. I don't understand why that is so: that section will be put inside the .text output section, which because of its magic name will be allocatable.


KEEP forces the linker to keep some specific sections, for example:

SECTIONS 
{
....
....

*(.rodata .rodata.*)

KEEP(*(SORT(.scattered_array*)));
}