Header file included only once in entire program?
I know this is a common question but I still can't fully get my head around it.
In a C or C++ program generated from multiple different source and header files, will each header file be only included once in the entire code when the header guards are used?
Someone told me previously that a header file (with include guards) will get included only once in one translation unit but multiple times in the entire code. Is this true?
If it gets included only once throughout the entire code, then when another file wants to include it and the preprocessor detects that it has already been included, how does that file know where in the code it was previously included?
This is the process:
    source header     source header header
       \    /            \     |     /
        \  /              \    |    /
    PREPROCESSOR         PREPROCESSOR
         |                    |
         V                    V
  preprocessed code    preprocessed code
         |                    |
      COMPILER             COMPILER
         |                    |
         V                    V
    object code          object code
            \               /
             \             /
              \           /
                 LINKER
                   |
                   V
               executable
Preprocessing
#include is for this first step. It instructs the preprocessor to process the specified file and insert the result into the output.
If A includes B and C, and B includes C, the preprocessor's output for A will include the processed text of C twice.
This is a problem, since it will result in duplicate definitions (for example, of a struct or class). A remedy is to use a preprocessor macro to track whether the header has already been included (aka header guards).
#ifndef EXAMPLE_H
#define EXAMPLE_H
// header contents
#endif
The first time, EXAMPLE_H is undefined, and the preprocessor will evaluate the contents within the #ifndef/#endif block. The second time, it will skip that block. So the processed output changes, and the definitions are included only once.
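To make the skipping concrete, here is a minimal single-file sketch that simulates what the preprocessor sees when the same guarded header is pulled into one translation unit twice. The names Point and manhattan are made up for illustration; only EXAMPLE_H comes from the snippet above.

```cpp
// Single-file simulation of one translation unit that includes a guarded
// header twice. Point and manhattan are hypothetical names for illustration.

// ---- text pasted in by the first #include "example.h" ----
#ifndef EXAMPLE_H
#define EXAMPLE_H
struct Point { int x; int y; };   // a definition that must appear only once
inline int manhattan(Point p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
#endif

// ---- text pasted in by a second #include (e.g. via another header) ----
#ifndef EXAMPLE_H                 // EXAMPLE_H is defined now, so this is skipped
#define EXAMPLE_H
struct Point { int x; int y; };   // would be a redefinition error without the guard
inline int manhattan(Point p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
#endif
```

Removing the two #ifndef/#endif pairs should make the compiler reject the second struct Point, which is exactly the failure the guard prevents.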
This is so common that there is a non-standard directive implemented by some compilers that is shorter and does not require choosing a unique preprocessor variable:
#pragma once
// header contents
You decide how portable your C/C++ code needs to be, and choose the kind of header guard accordingly.
Header guards will ensure the contents of each header file are present at most once in the preprocessed code for a translation unit.
Compiling
The compiler generates machine code from your preprocessed C/C++.
Generally, header files contain only declarations, not the actual definitions (aka implementations). The object file the compiler produces includes a symbol table that records anything still missing a definition.
Linking
The linker combines the object files. It matches up the definitions (aka implementations) with the references to the symbol table.
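It may be that two object files provide the same definition, in which case the linker keeps one. This happens if you've put inline function or template definitions in your headers (an ordinary non-inline definition duplicated this way is a "multiple definition" link error instead). This generally does not happen in C, but it happens very frequently in C++, because of templates.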
The header "code", whether declarations or definitions, is included multiple times across all object files but the linker merges all of that together, so that it is only present once in the executable. (I'm excluding inlined functions, which are present multiple times.)
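As a rough illustration of the kind of header "code" that can legitimately end up in several object files, here is a sketch of a header containing a template and an inline function. The guard name MATH_UTIL_H and the functions square and twice are hypothetical.

```cpp
// Sketch of a hypothetical header, math_util.h.
#ifndef MATH_UTIL_H
#define MATH_UTIL_H

// Template: each translation unit that calls square<int> instantiates it,
// and the linker keeps a single copy of each instantiation.
template <typename T>
T square(T value) { return value * value; }

// inline function: may be defined in every translation unit that includes
// this header, as long as all the definitions are identical.
inline int twice(int value) { return 2 * value; }

#endif
```

If twice were not marked inline and two source files included this header, each object file would carry its own definition and the link step would typically fail with a duplicate symbol error.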
A "header file" is actually inserted by the preprocessor before compilation starts. Just think of it as "replacing" its #include directive.
The guard ...
#ifndef MY_HEADER_H
#define MY_HEADER_H
....
#endif
... is executed after the replacement. So, the header may actually be included multiple times, but the "guarded" part of the text is only passed to the compiler once by the preprocessor.
So, if there are any code-generating definitions in the header, they will, of course, be included in the object file of the compilation unit (aka "module"). If the same header is #included in multiple modules, these will appear multiple times.
For static definitions, this is no problem at all, as these will not be visible beyond the module (aka file scope). For program-global definitions, that is different and will result in a "multiple definitions" error.
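The difference can be sketched roughly as follows. The file names counter.h and counter.cpp and all identifiers are hypothetical, and the extern declaration is one common way around the error rather than something prescribed above; the static/extern distinction shown is shared by C and C++.

```cpp
// ---- hypothetical header counter.h ----
#ifndef COUNTER_H
#define COUNTER_H

// Internal linkage: every module that includes this header gets its own
// private copy, so the linker never sees a clash.
static int call_count = 0;
static int bump() { return ++call_count; }

// Declaration only: promises that some module defines shared_total.
extern int shared_total;

// int shared_total = 0;  // a definition here would be emitted by every module
                          // that includes the header: "multiple definition"
#endif

// ---- hypothetical source file counter.cpp: the one program-global definition ----
int shared_total = 0;
```

Note that with the static version, two modules calling bump() count independently, which may or may not be what you want.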
Note: this is mostly for C. For C++, there are significant differences, as classes, etc. add additional complexity to what/when multiple global objects are allowed.
A header file with appropriate include guards will be included only once per translation unit. Strictly speaking, it may be included multiple times, but the parts between the preprocessor #ifndef and #endif will be skipped on subsequent inclusions. If done correctly, this should be all (or most) of the file.
A translation unit usually corresponds to a "source file", although some obscure implementations may use a different definition. If a separately compiled source file includes the same header, the preprocessor has no way of knowing that another file had already included it, or that any other file was part of the same project.
Note that when you come to link together multiple source files (translation units) into a single binary, you may encounter problems with multiple definitions if the header does not consist only of declarations, templates, function definitions that are marked inline, or static variable definitions. To avoid this, you should declare functions in the header and define them in a separate source file, which you link together with your other source files.
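A minimal sketch of that declare-in-header, define-in-source split, shown as one block with the file boundaries marked in comments. The file names greeter.h and greeter.cpp and the function greet are assumptions for illustration.

```cpp
// ---- hypothetical header greeter.h: declaration only ----
#ifndef GREETER_H
#define GREETER_H
#include <string>

std::string greet(const std::string& name);

#endif

// ---- hypothetical source file greeter.cpp: the single definition ----
// (in a real project this file would start with #include "greeter.h")
std::string greet(const std::string& name) {
    return "Hello, " + name + "!";
}
```

Any number of other source files can then include greeter.h and call greet; only greeter.cpp contributes the definition to the link.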
The header file will be included once per translation unit, yes. It can be included multiple times per program, as each translation unit is handled separately for the compile process. They are brought together during the linking process to form a complete program.