What is the -fPIE option for position-independent executables in gcc and ld?

How will it change the code, e.g. function calls?


Solution 1:

PIE exists to support address space layout randomization (ASLR) in executable files.

Before PIE mode was created, a program's executable could not be placed at a random address in memory; only position-independent code (PIC) in dynamic libraries could be relocated to a random offset. PIE works very much like PIC does for dynamic libraries. The difference is that symbols defined in the executable itself cannot be overridden by another module, so references to them do not need to go through the Procedure Linkage Table (PLT); PC-relative addressing is used instead.

After PIE support is enabled in gcc and the linker, the body of the program is compiled and linked as position-independent code. The dynamic linker does full relocation processing on the program module, just as it does for dynamic libraries. Accesses to external global data go through the Global Offset Table (GOT), and GOT relocations are added.

PIE is well described in this OpenBSD PIE presentation.

Changes to functions are shown in this slide (PIE vs PIC).

x86 pic vs pie

Local global variables and functions are optimized in pie

External global variables and functions are same as pic

and in this slide (PIE vs old-style linking)

x86 pie vs no-flags (fixed)

Local global variables and functions are similar to fixed

External global variables and functions are same as pic

Note that PIE may be incompatible with -static (GCC 8 and later provide a separate -static-pie option for that combination).

Solution 2:

Minimal runnable example: GDB the executable twice

For those that want to see some action, let's see ASLR work on the PIE executable and change addresses across runs:

main.c

#include <stdio.h>

int main(void) {
    puts("hello");
}

main.sh

#!/usr/bin/env bash
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
for pie in no-pie pie; do
  exe="${pie}.out"
  gcc -O0 -std=c99 "-${pie}" "-f${pie}" -ggdb3 -o "$exe" main.c
  gdb -batch -nh \
    -ex 'set disable-randomization off' \
    -ex 'break main' \
    -ex 'run' \
    -ex 'printf "pc = 0x%llx\n", (unsigned long long)$pc' \
    -ex 'run' \
    -ex 'printf "pc = 0x%llx\n", (unsigned long long)$pc' \
    "./$exe" \
  ;
  echo
  echo
done

For the one with -no-pie, everything is boring:

Breakpoint 1 at 0x401126: file main.c, line 4.

Breakpoint 1, main () at main.c:4
4           puts("hello");
pc = 0x401126

Breakpoint 1, main () at main.c:4
4           puts("hello");
pc = 0x401126

Before starting execution, break main sets a breakpoint at 0x401126.

Then, during both executions, run stops at address 0x401126.

The one with -pie however is much more interesting:

Breakpoint 1 at 0x1139: file main.c, line 4.

Breakpoint 1, main () at main.c:4
4           puts("hello");
pc = 0x5630df2d6139

Breakpoint 1, main () at main.c:4
4           puts("hello");
pc = 0x55763ab2e139

Before starting execution, GDB only knows the address recorded in the executable file, 0x1139, which for a PIE is really just an offset from a load address that has not been chosen yet.

After the program starts, however, GDB notices that the dynamic loader placed the program at a different location, and the first break stops at 0x5630df2d6139.

On the second run, GDB again notices that the executable has moved, and breaks at 0x55763ab2e139.

echo 2 | sudo tee /proc/sys/kernel/randomize_va_space ensures that ASLR is on (the default in Ubuntu 17.10): How can I temporarily disable ASLR (Address space layout randomization)? | Ask Ubuntu.

set disable-randomization off is needed because otherwise GDB, as the name suggests, turns off ASLR for the process by default, giving fixed addresses across runs to improve the debugging experience: Difference between gdb addresses and "real" addresses? | Stack Overflow.

readelf analysis

Furthermore, we can also observe that:

readelf -s ./no-pie.out | grep main

gives the actual runtime address of main (the pc seen in GDB, 0x401126, is 4 bytes further on, at the first instruction after the function prologue):

64: 0000000000401122    21 FUNC    GLOBAL DEFAULT   13 main

while:

readelf -s ./pie.out | grep main

gives just an offset:

65: 0000000000001135    23 FUNC    GLOBAL DEFAULT   14 main

By turning ASLR off (with either randomize_va_space or set disable-randomization on, which is GDB's default), GDB always gave main the address 0x5555555547a9 (on a different build from the one above, hence the different offset), so we deduce that the -pie address is composed of:

0x555555554000 + random offset + symbol offset (0x7a9)

TODO where is 0x555555554000 hard coded in the Linux kernel / glibc loader / wherever? How is the address of the text section of a PIE executable determined in Linux?

Minimal assembly example

Another cool thing we can do is to play around with some assembly code to understand more concretely what PIE means.

We can do that with a Linux x86_64 freestanding assembly hello world:

main.S

.text
.global _start
_start:
asm_main_after_prologue:
    /* write */
    mov $1, %rax   /* syscall number */
    mov $1, %rdi   /* stdout */
    mov $msg, %rsi  /* buffer */
    mov $len, %rdx /* len */
    syscall

    /* exit */
    mov $60, %rax   /* syscall number */
    mov $0, %rdi    /* exit status */
    syscall
msg:
    .ascii "hello\n"
len = . - msg

GitHub upstream

and it assembles and runs fine with:

as -o main.o main.S
ld -o main.out main.o
./main.out

However, if we try to link it as PIE with (--no-dynamic-linker is required as explained at: How to create a statically linked position independent executable ELF in Linux?):

ld --no-dynamic-linker -pie -o main.out main.o

then the link fails with:

ld: main.o: relocation R_X86_64_32S against `.text' can not be used when making a PIE object; recompile with -fPIC
ld: final link failed: nonrepresentable section on output

Because the line:

mov $msg, %rsi  /* buffer */

hardcodes the message address in the mov operand, and is therefore not position independent.

If we instead write it in a position independent way:

lea msg(%rip), %rsi

then PIE link works fine, and GDB shows us that the executable does get loaded at a different location in memory every time.

The difference here is that lea encoded the address of msg relative to the current PC address due to the rip syntax, see also: How to use RIP Relative Addressing in a 64-bit assembly program?

We can also figure that out by disassembling both versions with:

objdump -S main.o

which gives, respectively:

e:   48 c7 c6 00 00 00 00    mov    $0x0,%rsi
e:   48 8d 35 19 00 00 00    lea    0x19(%rip),%rsi        # 2e <msg>

000000000000002e <msg>:
  2e:   68 65 6c 6c 6f          pushq  $0x6f6c6c65

So we see clearly that lea already has the full correct address of msg encoded relative to RIP, which during execution holds the address of the next instruction: 0x15 + 0x19 = 0x2e.

The mov version however has set the address to 00 00 00 00, which means that a relocation will be performed there: What do linkers do? The cryptic R_X86_64_32S in the ld error message is the actual type of relocation that was required and which cannot happen in PIE executables.

Another fun thing that we can do is to put the msg in the data section instead of .text with:

.data
msg:
    .ascii "hello\n"
len = . - msg

Now the .o assembles to:

e:   48 8d 35 00 00 00 00    lea    0x0(%rip),%rsi        # 15 <_start+0x15>

so the RIP offset is now 0, and we guess that a relocation has been requested by the assembler. We confirm that with:

readelf -r main.o

which gives:

Relocation section '.rela.text' at offset 0x160 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000011  000200000002 R_X86_64_PC32     0000000000000000 .data - 4

so clearly R_X86_64_PC32 is a PC relative relocation that ld can handle for PIE executables.

This experiment taught us that the linker itself checks the program can be PIE and marks it as such.

Then, when compiling with GCC, -fpie (or -fPIE) tells GCC to generate position-independent assembly, and -pie tells it to link the result as a PIE.

But if we write assembly ourselves, we must manually ensure that we have achieved position independence.

In ARMv8 aarch64, the position independent hello world can be achieved with the ADR instruction.

How to determine if an ELF is position independent?

Besides just running it through GDB, some static methods are mentioned at:

  • executable: https://unix.stackexchange.com/questions/89211/how-to-test-whether-a-linux-binary-was-compiled-as-position-independent-code/435038#435038
  • library: How can I tell, with something like objdump, if an object file has been built with -fPIC?

Tested in Ubuntu 18.10.