Is &errno legal C?
Per 7.5,
[errno] expands to a modifiable lvalue(175) that has type int, the value of which is set to a positive error number by several library functions. It is unspecified whether errno is a macro or an identifier declared with external linkage. If a macro definition is suppressed in order to access an actual object, or a program defines an identifier with the name errno, the behavior is undefined.
(175) The macro errno need not be the identifier of an object. It might expand to a modifiable lvalue resulting from a function call (for example, *errno()).
It's not clear to me whether this is sufficient to require that &errno not be a constraint violation. The C language has lvalues (such as register-storage-class variables; however these can only be automatic so errno could not be defined as such) for which the & operator is a constraint violation.
If &errno is legal C, is it required to be constant?
So §6.5.3.2p1 specifies
The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
Which I think can be taken to mean that &lvalue is fine for any lvalue that is not in those two categories. And as you mentioned, errno cannot be declared with the register storage-class specifier, and I think (although am not chasing references to check right now) that you cannot have a bit-field that has type of plain int.
So I believe that the spec requires &(errno) to be legal C.
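As a minimal sketch of what that conclusion permits (the function name write_through_errno_pointer is made up for illustration, and the result assumes an ordinary hosted implementation such as glibc or musl), you can take the address once and then read and write the same object through either the pointer or the errno lvalue:

```c
#include <errno.h>

/* Hypothetical helper: stores `code` through a pointer obtained with
   &errno, then reads it back through the errno lvalue itself.
   Returns 1 if both views agree, 0 otherwise. */
int write_through_errno_pointer(int code)
{
    int saved = errno;   /* preserve the caller's errno */
    int *p = &errno;     /* errno is an lvalue, not register, not a bit-field */

    *p = code;           /* write through the pointer */
    int agrees = (errno == code);

    errno = saved;       /* restore before returning */
    return agrees;
}
```

Calling write_through_errno_pointer(ERANGE) should return 1 whether errno is a plain object or a macro such as (*__errno_location()).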
If &errno is legal C, is it required to be constant?
As I understand it, part of the point of allowing errno to be a macro (and the reason it is in e.g. glibc) is to allow it to be a reference to thread-local storage, in which case it will certainly not be constant across threads. And I don't see any reason to expect it must be constant. As long as the value of errno retains the semantics specified, I see no reason a perverse C library could not change &errno to refer to different memory addresses over the course of a program -- e.g. by freeing and reallocating the backing store every time you set errno.
You could imagine maintaining a ring buffer of the last N errno values set by the library, and having &errno always point to the latest. I don't think it would be particularly useful, but I can't see any way it violates the spec.
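To make the "not constant across threads" part concrete, here is a rough sketch using POSIX threads (an assumption on my part; the standard text quoted in this thread does not mention pthreads, and both function names are invented for the example). The second thread records the address as an integer while it is still running, so nothing is dereferenced or compared as a pointer after that thread has exited:

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Thread entry point: records the address that &errno yields in this
   thread, converted to an integer while the thread is still alive. */
static void *report_errno_address(void *arg)
{
    uintptr_t *slot = arg;
    *slot = (uintptr_t)&errno;
    return NULL;
}

/* Hypothetical helper: returns 1 if a second thread saw a different
   &errno than the caller, 0 if the addresses matched, and -1 if the
   thread could not be started or joined. */
int errno_address_differs_across_threads(void)
{
    uintptr_t other = 0;
    pthread_t tid;

    if (pthread_create(&tid, NULL, report_errno_address, &other) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    /* Both threads were alive at the same time, so two distinct
       thread-local objects must have had distinct addresses. */
    return other != (uintptr_t)&errno;
}
```

On a typical glibc or musl system I would expect this to return 1; older toolchains may need -pthread at compile and link time.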
I am surprised nobody has cited the C11 spec yet. Apologies for the long quote, but I believe it is relevant.
7.5 Errors
The header <errno.h> defines several macros...
...and
errno
which expands to a modifiable lvalue(201) that has type int and thread local storage duration, the value of which is set to a positive error number by several library functions. If a macro definition is suppressed in order to access an actual object, or a program defines an identifier with the name errno, the behavior is undefined.
The value of errno in the initial thread is zero at program startup (the initial value of errno in other threads is an indeterminate value), but is never set to zero by any library function.(202) The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use of errno is not documented in the description of the function in this International Standard.
(201) The macro errno need not be the identifier of an object. It might expand to a modifiable lvalue resulting from a function call (for example, *errno()).
(202) Thus, a program that uses errno for error checking should set it to zero before a library function call, then inspect it before a subsequent library function call. Of course, a library function can save the value of errno on entry and then set it to zero, as long as the original value is restored if errno's value is still zero just before the return.
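The idiom from footnote 202 can be sketched roughly as follows. The wrapper name parse_long is my own invention; strtol and ERANGE are standard, and strtol is documented to set errno to ERANGE on overflow:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: parses a base-10 long. Returns 0 on success and
   stores the result in *out, or returns the errno value reported by
   strtol (for example ERANGE on overflow). */
int parse_long(const char *text, long *out)
{
    errno = 0;                             /* clear before the library call */
    long value = strtol(text, NULL, 10);
    int err = errno;                       /* inspect before any other call */

    if (err == 0)
        *out = value;
    return err;
}
```

Note that the function copies errno into a local right away, so later library calls cannot disturb the result.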
"Thread local" means register
is out. Type int means bitfields are out (IMO). So &errno looks legal to me.
Persistent use of words like "it" and "the value" suggests the authors of the standard did not contemplate &errno being non-constant. I suppose one could imagine an implementation where &errno was not constant within a particular thread, but to be used the way the footnotes say (set to zero, then check after calling library function), it would have to be deliberately adversarial, and possibly require specialized compiler support just to be adversarial.
In short, if the spec does permit a non-constant &errno, I do not think it was deliberate.
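For what it is worth, here is a small sketch of the behavior everyday implementations exhibit within one thread: a pointer taken before a failing library call still designates errno afterwards. The function name is hypothetical, and this shows what glibc-style implementations do, not something I can prove the standard guarantees:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical check: returns 1 if &errno is unchanged across a failing
   strtol call and the pointer taken beforehand observes the error code,
   0 otherwise. */
int errno_address_stable_across_call(void)
{
    int saved = errno;        /* preserve the caller's errno */
    int *before = &errno;     /* address taken before the library call */

    errno = 0;
    /* This value overflows long, so strtol reports ERANGE. */
    (void)strtol("99999999999999999999999999999999", NULL, 10);

    int stable = (before == &errno) && (*before == ERANGE);

    errno = saved;
    return stable;
}
```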
[update]
R. asks an excellent question in the comments. After thinking about it, I believe I now know the correct answer to his question, and to the original question. Let me see if I can convince you, dear reader.
R. points out that GCC allows something like this at the top level:
register int errno asm ("r37"); // line R
This would declare errno as a global value held in register r37. Obviously, it would be a thread-local modifiable lvalue. So, could a conforming C implementation declare errno like this?
The answer is no. When you or I use the word "declaration", we usually have a colloquial and intuitive concept in mind. But the standard does not speak colloquially or intuitively; it speaks precisely, and it aims only to use terms that are well-defined. In the case of "declaration", the standard itself defines the term; and when it uses the term, it is using its own definition.
By reading the spec, you can learn precisely what a "declaration" is and precisely what it is not. Put another way, the standard describes the language "C". It does not describe "some language that is not C". As far as the standard is concerned, "C with extensions" is just "some language that is not C".
Thus, from the standard's point of view, line R is not a declaration at all. It does not even parse! It might as well read:
long long long __Foo_e!r!r!n!o()blurfl??/**
As far as the spec is concerned, this is just as much a "declaration" as line R; i.e., not at all.
So, when the C11 spec says, in section 6.5.3.2:
The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
...it means something very precise that does not refer to anything like Line R.
Now, consider the declaration of the int object to which errno refers. (Note: I do not mean the declaration of the errno name, since of course there might be no such declaration if errno is, say, a macro. I mean the declaration of the underlying int object.)
The above language says you can take the address of an lvalue unless it designates a bit-field or it designates an object "declared" register. And the spec for the underlying errno object says it is a modifiable int lvalue with thread-local duration.
Now, it is true that the spec does not say that the underlying errno object must be declared at all. Maybe it just appears via some implementation-defined compiler magic. But again, when the spec says "declared with the register storage-class specifier", it is using its own terminology.
So either the underlying errno object is "declared" in the standard sense, in which case it cannot be both register and thread-local; or it is not declared at all, in which case it is not declared register. Either way, since it is an lvalue, you may take its address.
(Unless it is a bit-field, but I think we agree that a bit-field is not an object of type int.)
The original implementation of errno was as a global int variable that various Standard C Library components used to indicate an error value if they ran into an error. However, even in those days one had to be careful about reentrant code or with library function calls that could set errno to a different value as you were handling an error. Normally one would save the value in a temporary variable if the error code was needed for any length of time, due to the possibility of some other function or piece of code setting the value of errno either explicitly or through a library function call.
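A sketch of that save-it-first habit, assuming a POSIX-style system where a failed fopen sets errno (ISO C itself does not promise that), with a function name invented for the example:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: tries to open `path` for reading. On failure it
   copies errno into a local immediately, formats a message into `msg`,
   and returns the saved error code. Returns 0 on success. */
int open_or_report(const char *path, char *msg, size_t msg_size)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        int saved = errno;   /* copy before any other library call */
        /* snprintf and strerror are library calls that might change
           errno, so only the saved copy is used from here on. */
        snprintf(msg, msg_size, "cannot open %s: %s", path, strerror(saved));
        return saved;
    }
    fclose(fp);
    return 0;
}
```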
So with this original implementation of a global int, using the address-of operator and depending on the address to remain constant was pretty much built into the fabric of the library.
However, with multi-threading a single global no longer worked, because a single global is not thread safe. So the idea arose of having thread local storage, perhaps reached through a function that returns a pointer to an allocated area. You might see a construct something like the following entirely imaginary example:
#define errno (*myErrno())
typedef struct {
// various memory areas for thread local stuff
int myErrNo;
// more memory areas for thread local stuff
} ThreadLocalData;
ThreadLocalData *getMyThreadData (void) {
    // placeholder for the real thing: one statically allocated area per thread
    static _Thread_local ThreadLocalData threadData;
    // a real C run time would locate the thread local data for the
    // current thread through some means, then return a pointer to it
    return &threadData;
}
int *myErrno (void) {
    return &(getMyThreadData()->myErrNo);
}
Then errno would be used as if it were a single global rather than a thread-safe int variable: by assigning errno = 0; or checking it like if (errno == 22) { /* handle the error */ } and even something like int *pErrno = &errno;. This all works because in the end the thread local data area is allocated and stays put, and the macro definition, which makes errno look like an extern int, hides the plumbing of its actual implementation.
The one thing that we do not want is to have the address of errno suddenly shift between time slices of a thread with some kind of a dynamic allocate, clone, delete sequence while we are accessing the value. When your time slice is up, it is up, and unless you have some kind of synchronization involved or some way to keep the CPU after your time slice expires, having the thread local area move about seems a very dicey proposition to me.
This in turn implies that you can depend on the address-of operator giving you a constant value for a particular thread, though the constant value will differ between threads. I can well see the library using the address of errno in order to reduce the overhead of doing some kind of thread local lookup every time a library function is called.
Having the address of errno as constant within a thread also provides backwards compatibility with older source code which used the errno.h include file as it should have done (see this man page from Linux for errno, which explicitly warns not to use extern int errno; as was common in the old days).
The way I read the standard is that it allows for this kind of thread local storage while keeping the semantics and syntax similar to the old extern int errno; when errno is used, and it also allows the old implementation for, say, a cross compiler targeting an embedded device that does not support multi-threading. However, the syntax is similar only because of the macro definition, so the old-style shortcut declaration should not be used: that declaration is not what the actual errno really is.