.bss vs COMMON: what goes where?
// file a.c
// file-scope
int a = 0; // goes into BSS
After a.c is compiled into the object file a.o, the symbol a goes into the BSS section.
// file b.c
// file-scope
int b; // goes into COMMON section
After b.c is compiled into the object file b.o, the symbol b goes into the COMMON section.
After a.o and b.o are linked, both symbols a and b go into the BSS section. Common symbols exist only in object files, not in executables. The idea of COMMON symbols in Unix is to allow multiple external definitions of the same variable (in different compilation units) to be merged under a single common symbol, under certain conditions.
Commons only appear before the linking stage. Commons are what later goes into the bss or data, but it's up to the linker to decide where they go. This allows you to have the same variable defined in different compilation units. As far as I know, this is mostly to allow some ancient header files that had int foo; in them instead of extern int foo;.
Here's how it works:
$ cat > a.c
int foo;
$ cat > b.c
int foo;
$ cat > main.c
extern int foo;
int main(int argc, char **argv) { return foo; }
$ cc -c a.c && cc -c b.c && cc -c main.c && cc -o x a.o b.o main.o
$ objdump -t a.o | grep foo
0000000000000004 O *COM* 0000000000000004 foo
$ objdump -t b.o | grep foo
0000000000000004 O *COM* 0000000000000004 foo
$ objdump -t x | grep foo
0000000000600828 g O .bss 0000000000000004 foo
$
Notice that this only works when at most one of the variables in the different compilation units is initialized.
$ echo "int foo = 0;" > a.c
$ cc -c a.c && cc -c b.c && cc -c main.c && cc -o x a.o b.o main.o
$ echo "int foo = 0;" > b.c
$ cc -c a.c && cc -c b.c && cc -c main.c && cc -o x a.o b.o main.o
b.o:(.bss+0x0): multiple definition of `foo'
a.o:(.bss+0x0): first defined here
collect2: ld returned 1 exit status
$
This is scary stuff kept for compatibility with ancient systems, and you should never rely on it. Do things properly: define each global variable in exactly one compilation unit and declare it extern everywhere else through a header.
If you allow common during linking, different units can declare the same variable and the linker will place them at the same location. The types don't even need to be the same, so it is a kind of link-time union. This is the COMMON feature from Fortran. If you don't allow common when linking C, such a situation results in a link-time error. Common linking is only possible for uninitialized globals, because otherwise it is unclear which initialization should be taken.
The globals going to bss are just uninitialized globals, which C defines as being initialized to 0. Most object formats support sections where only the size is given, and the loader fills the whole section with zeros.
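P.S.: If you use gcc, you can use the -fno-common option to force common symbols into the bss section, which, as Art argues, is good and advisable practice. GCC 10 and later enable -fno-common by default, so there the duplicate-definition examples above only link with -fcommon.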