Why can't ld ignore an unused unresolved symbol?
Consider the following source files:
a.c:
extern int baz();
int foo() { return 123; }
int bar() { return baz() + 1; }
b.c:
extern int foo();
int main() { return foo(); }
Now, when I try to build a program using these sources, here's what happens:
$ gcc -c -o a.o a.c
$ gcc -c -o b.o b.c
$ gcc -o prog a.o b.o
/usr/bin/ld: a.o: in function `bar':
a.c:(.text+0x15): undefined reference to `baz'
collect2: error: ld returned 1 exit status
This is on Devuan GNU/Linux Chimaera, with GNU ld 2.35.2, GCC 10.2.1.
Why does this happen? I mean, one does not need any complex optimization to know that baz() is not really needed in bar() - ld naturally notices this at some point - e.g. when finishing its traversal of bar() without noticing a location where baz() is used.
Now, you could say "einpoklum, you didn't ask the compiler to go to any trouble for you" - and that's fair, I guess, but even if I use -O3 with these instructions, I get the same error.
Note: with LTO and optimization enabled, we can circumvent this issue:
$ gcc -c -flto -O1 -o b.o b.c
$ gcc -c -flto -O1 -o a.o a.c
$ gcc -o prog -O1 -flto a.o b.o
$ ./prog ; echo $?;
123
Solution 1:
In a “plain” traditional compilation of this code:
extern int baz();
int foo() { return 123; }
int bar() { return baz() + 1; }
the compiler creates one object module that contains the code of both routines along with definitions for symbols foo and bar and a reference to baz. There is nothing to tell the linker where the code belonging to foo begins and ends, where the code belonging to bar begins and ends, or even that any given piece of code (or any given byte in the object module) belongs only to one of foo or bar. Had I written in assembly and assembled to make an object module, I could have included code in foo that jumped into bar (using only hard-coded offsets calculated by the assembler and not revealed in any symbols visible to the linker) or vice-versa.
So the linker has no way of knowing that foo and bar can be separated.
Later, a protocol was created for the compiler to keep functions separated, to provide sufficient information in the object modules that the linker could determine where they were separated, and to tell the linker it was okay to separate functions. When the options for that are enabled, the linker may be able to include foo in the program without including bar.
That this feature is not yet the default in the tools is a matter of legacy in various build systems and projects, inertia, and current practice.
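To see the difference in the object module itself, one can compare the section headers produced with and without per-function sections. This is only a sketch, assuming GCC and binutils objdump on an ELF target; the file names a_plain.o and a_split.o are made up for illustration, and -ffunction-sections is GCC's spelling of the option.

```shell
# Work in a scratch directory and recreate a.c from the question.
tmp=$(mktemp -d) && cd "$tmp"
cat > a.c <<'EOF'
extern int baz();
int foo() { return 123; }
int bar() { return baz() + 1; }
EOF

# Plain compilation: foo and bar are expected to share one .text section.
gcc -c -o a_plain.o a.c
# Per-function sections: each function should get its own .text.<name> section.
gcc -c -ffunction-sections -o a_split.o a.c

echo "== plain =="
objdump -h a_plain.o | grep '\.text'
echo "== -ffunction-sections =="
objdump -h a_split.o | grep '\.text'
```

In the plain object the linker sees a single block of code with two symbols pointing into it; in the split object it sees separate sections that it can keep or drop independently.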
Solution 2:
If you use gcc and binutils ld to build your programs, you need to place functions in separate sections. This is achieved with the -ffunction-sections command-line option; -fdata-sections does the same for data. Then, if you do not want dead code to be included in your executable, you need to tell the linker to discard unreferenced sections with the --gc-sections ld option.
Putting this all together:
$ gcc -fdata-sections -ffunction-sections -c -o a.o a.c
$ gcc -c -o b.o b.c
$ gcc -Wl,--gc-sections -o prog a.o b.o
$ ./prog ; echo $?
123
If you want this behaviour by default, simply build GCC with those options enabled.