What does "#define _GNU_SOURCE" imply?
Today I had to use the basename() function, and man 3 basename gave me this strange message:
Notes
There are two different versions of basename() - the POSIX version described above, and the GNU version, which one gets after
#define _GNU_SOURCE
#include <string.h>
I'm wondering what this #define _GNU_SOURCE means: is it tainting the code I write with a GNU-related license? Or is it simply used to tell the compiler something like "Well, I know, this set of functions is not POSIX, thus not portable, but I'd like to use it anyway"?
If so, why not give people different headers, instead of having to define some obscure macro to get one function implementation or the other?
Something also bugs me: how does the compiler know which function implementation to link with the executable? Does it use this #define as well?
Anybody have some pointers to give me?
Solution 1:
Defining _GNU_SOURCE has nothing to do with licensing and everything to do with writing (non-)portable code. If you define _GNU_SOURCE, you will get:
- access to lots of nonstandard GNU/Linux extension functions
- access to traditional functions which were omitted from the POSIX standard (often for good reason, such as being replaced with better alternatives, or being tied to particular legacy implementations)
- access to low-level functions that cannot be portable, but that you sometimes need for implementing system utilities like mount, ifconfig, etc.
- broken behavior for lots of POSIX-specified functions, where the GNU folks disagreed with the standards committee on how the functions should behave and decided to do their own thing.
As long as you're aware of these things, it should not be a problem to define _GNU_SOURCE, but you should avoid defining it and instead define _POSIX_C_SOURCE=200809L or _XOPEN_SOURCE=700 when possible to ensure that your programs are portable.
In particular, the things from _GNU_SOURCE that you should never use are the second and fourth items in the list above.
Solution 2:
For exact details on everything that _GNU_SOURCE enables, the documentation can help.
From the GNU documentation:
Macro: _GNU_SOURCE
If you define this macro, everything is included: ISO C89, ISO C99, POSIX.1, POSIX.2, BSD, SVID, X/Open, LFS, and GNU extensions. In the cases where POSIX.1 conflicts with BSD, the POSIX definitions take precedence.
From the Linux man page on feature test macros:
_GNU_SOURCE
Defining this macro (with any value) implicitly defines _ATFILE_SOURCE, _LARGEFILE64_SOURCE, _ISOC99_SOURCE, _XOPEN_SOURCE_EXTENDED, _POSIX_SOURCE, _POSIX_C_SOURCE with the value 200809L (200112L in glibc versions before 2.10; 199506L in glibc versions before 2.5; 199309L in glibc versions before 2.1) and _XOPEN_SOURCE with the value 700 (600 in glibc versions before 2.10; 500 in glibc versions before 2.2). In addition, various GNU-specific extensions are also exposed.
Since glibc 2.19, defining _GNU_SOURCE also has the effect of implicitly defining _DEFAULT_SOURCE. In glibc versions before 2.20, defining _GNU_SOURCE also had the effect of implicitly defining _BSD_SOURCE and _SVID_SOURCE.
Note: _GNU_SOURCE needs to be defined before including header files so that the respective headers enable the features. For example:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
...
_GNU_SOURCE can also be enabled per compilation using the -D flag:
$ gcc -D_GNU_SOURCE file.c
(-D is not specific to _GNU_SOURCE; any macro can be defined this way.)
Solution 3:
Let me answer two further points:
Something also bugs me: how does the compiler know which function implementation to link with the executable? Does it use this #define as well?
A common approach is to conditionally #define the identifier basename to different names, depending on whether _GNU_SOURCE is defined. For instance:
#ifdef _GNU_SOURCE
# define basename __basename_gnu
#else
# define basename __basename_nongnu
#endif
Now the library simply needs to provide both behaviors under those names.
If so, why not give people different headers, instead of having to define some obscure macro to get one function implementation or the other?
Often the same header had slightly different contents in different Unix versions, so there is no single right content for, say, <string.h>: there are many standards (xkcd).
There's a whole set of macros to pick your favorite one, so that if your program expects one standard, the library will conform to that.