Is it possible to initialize a C pointer to NULL?
Solution 1:
Is it possible to initialize a C pointer to NULL?
TL;DR Yes, very much.
The actual claim made in the guide reads:
On the other hand, if you use just the single initial assignment,
int *my_int_ptr = 2;
, the program will try to fill the contents of the memory location pointed to by my_int_ptr
with the value 2. Since my_int_ptr
is filled with garbage, it can be any address. [...]
Well, they are wrong, you are right.
For the statement, (ignoring, for now, the fact that integer-to-pointer conversion is implementation-defined behaviour)
int * my_int_ptr = 2;
my_int_ptr
is a variable (of type pointer to int
), so it has an address of its own (type: pointer to pointer to int). You are storing the value 2
at that address, i.e. in the pointer variable itself, not in whatever it points to.
Now, my_int_ptr
, being of pointer type, points to a value of the pointed-to type (here int) at the memory location given by the value held in my_int_ptr
. So you are essentially setting the value of the pointer variable itself, not the value at the memory location the pointer points to.
So, in conclusion,
char *x=NULL;
initializes the pointer variable x
to NULL
, not the value at the memory address pointed to by the pointer.
This is the same as
char *x;
x = NULL;
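To make the equivalence concrete, here is a minimal sketch. The function name init_and_assign_agree is made up for illustration and is not from the guide or the question. It only compares the two pointers and never dereferences them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original answer): returns 1 when the
   initialization form and the assignment form leave the pointer in the
   same state. */
int init_and_assign_agree(void)
{
    char *x = NULL;  /* initialization: x itself becomes a null pointer */

    char *y;         /* y starts out with an indeterminate value */
    y = NULL;        /* assignment: y itself becomes a null pointer */

    /* Only the pointer values are inspected; nothing is dereferenced. */
    assert(x == NULL);
    return x == y;
}
```

Calling init_and_assign_agree() from main should return 1, since in both cases it is the pointer variable that receives NULL.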
Expansion:
Now, strictly speaking, a statement like
int * my_int_ptr = 2;
is illegal, as it involves a constraint violation. To be clear,
- my_int_ptr is a pointer variable, of type int *
- an integer constant, 2, has type int, by definition.
These are not "compatible" types, so the initialization is invalid: it violates the constraints of simple assignment in chapter §6.5.16.1/P1, quoted in Lundin's answer.
In case anyone's interested in how initialization is linked to simple assignment constraints, quoting C11
, chapter §6.7.9, P11
The initializer for a scalar shall be a single expression, optionally enclosed in braces. The initial value of the object is that of the expression (after conversion); the same type constraints and conversions as for simple assignment apply, taking the type of the scalar to be the unqualified version of its declared type.
Solution 2:
The tutorial is wrong. In ISO C, int *my_int_ptr = 2;
is an error. In GNU C, it means the same as int *my_int_ptr = (int *)2;
. This converts the integer 2
to a memory address, in some fashion as determined by the compiler.
It does not attempt to store anything in the location addressed by that address (if any). If you went on to write *my_int_ptr = 5;
, then it would try to store the number 5
in the location addressed by that address.
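The difference between setting the pointer and storing through it can be sketched with a valid address instead of 2. The variable target and both function names are assumptions added for illustration; they do not appear in the tutorial.

```c
#include <assert.h>

/* Hypothetical helper: initialize a pointer to target, but never
   dereference it. Returns the value of target afterwards. */
int value_after_pointer_init(void)
{
    int target = 0;
    int *my_int_ptr = &target;  /* sets the pointer; target is untouched */
    (void)my_int_ptr;           /* silence "unused variable" warnings */
    return target;
}

/* Hypothetical helper: same setup, then store through the pointer. */
int value_after_store_through_pointer(void)
{
    int target = 0;
    int *my_int_ptr = &target;
    *my_int_ptr = 5;            /* writes 5 into target via the pointer */
    return target;
}
```

The first function leaves target at 0 because the initializer only affects my_int_ptr; only the explicit *my_int_ptr = 5; in the second function changes what is stored at the pointed-to location.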
Solution 3:
To clarify why the tutorial is wrong, int *my_int_ptr = 2;
is a "constraint violation": it is not valid C, and the compiler must give you a diagnostic upon encountering it.
As per 6.5.16.1 Simple assignment:
Constraints
One of the following shall hold:
- the left operand has atomic, qualified, or unqualified arithmetic type, and the right has arithmetic type;
- the left operand has an atomic, qualified, or unqualified version of a structure or union type compatible with the type of the right;
- the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
- the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) one operand is a pointer to an object type, and the other is a pointer to a qualified or unqualified version of void, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
- the left operand is an atomic, qualified, or unqualified pointer, and the right is a null pointer constant; or
- the left operand has type atomic, qualified, or unqualified _Bool, and the right is a pointer.
In this case the left operand is an unqualified pointer. Nowhere does it mention that the right operand is allowed to be an integer (arithmetic type). So the code violates the C standard.
GCC is known to behave poorly unless you explicitly tell it to be a standard C compiler. If you compile the code with -std=c11 -pedantic-errors
, it will correctly give a diagnostic as it must do.
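As a rough sketch of what the constraints above do allow, the following contrasts null pointer constants with the rejected form. The function name is hypothetical, and the cast through uintptr_t assumes an implementation that provides that optional type (mainstream hosted compilers do). The result of that conversion is implementation-defined, so the pointer is never dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if both null pointer constants yield
   the same null pointer. */
int null_initializers_agree(void)
{
    int *a = NULL;  /* null pointer constant: permitted by the constraints */
    int *b = 0;     /* the integer constant 0 is also a null pointer constant */

    /* int *bad = 2;   constraint violation: 2 is an int, not a pointer and
                       not a null pointer constant, so a diagnostic is required */

    /* With an explicit cast the conversion is requested on purpose. The
       resulting pointer value is implementation-defined and must not be
       dereferenced here. */
    int *c = (int *)(uintptr_t)2;
    (void)c;

    return a == NULL && b == NULL && a == b;
}
```

Uncommenting the int *bad = 2; line and building with -std=c11 -pedantic-errors is one way to see the diagnostic mentioned above.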
Solution 4:
int *my_int_ptr = 2
stores the integer value 2 to whatever random address is in my_int_ptr when it is allocated.
This is completely wrong. If this is actually written then please get a better book or tutorial.
int *my_int_ptr = 2
defines an integer pointer which points to address 2. You will most likely get a crash if you try to access address 2
.
*my_int_ptr = 2
, i.e. without the int
in the line, stores the value two to whatever random address my_int_ptr
is pointing to. Having said this, you can assign NULL
to a pointer when it is defined. char *x=NULL;
is perfectly valid C.
Edit: While writing this I didn't know that integer-to-pointer conversion is implementation-defined behavior. Please see the good answers by @M.M and @SouravGhosh for details.
Solution 5:
A lot of confusion about C pointers comes from a very bad choice that was originally made regarding coding style, compounded by a bad little choice in the syntax of the language.
int *x = NULL;
is correct C, but it is very misleading, I would even say nonsensical, and it has hindered the understanding of the language for many a novice. It makes one think that later on we could do *x = NULL;
which is of course impossible. You see, the type of the variable is not int
, and the name of the variable is not *x
, nor does the *
in the declaration play any functional role in collaboration with the =
. It is purely declarative. So, what makes a lot more sense is this:
int* x = NULL;
which is also correct C, albeit it does not adhere to the original K&R coding style. It makes it perfectly clear that the type is int*
, and the pointer variable is x
, so it becomes plainly evident even to the uninitiated that the value NULL
is being stored into x
, which is a pointer to int
.
Furthermore, it makes it easier to derive a rule: when the star is separated from the variable name it is a declaration, while a star attached to the name is a pointer dereference.
So, now it becomes a lot more understandable that further down we can either do x = NULL;
or *x = 2;
in other words it makes it easier for a novice to see how variable = expression
leads to pointer-type variable = pointer-expression
and dereferenced-pointer-variable = expression
. (For the initiated, by 'expression' I mean 'rvalue'.)
The unfortunate choice in the syntax of the language is that when declaring local variables you can say int i, *p;
which declares an integer and a pointer to an integer, so it leads one to believe that the *
is a useful part of the name. But it is not. This syntax is just a quirky special case, added for convenience, and in my opinion it should never have existed, because it invalidates the rule that I proposed above. As far as I know, nowhere else in the language is this syntax meaningful, but even if it is, it points to a discrepancy in the way pointer types are defined in C. Everywhere else (in single-variable declarations, in parameter lists, in struct members, etc.) you can declare your pointers as type* pointer-variable
instead of type *pointer-variable
; it is perfectly legal and makes more sense.
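For anyone who wants to check how the star binds in a multi-variable declaration, here is a small sketch using C11 _Generic. The macro IS_INT_PTR and the function name are invented for this illustration. It shows the special case discussed above: the star belongs to each declarator, whichever side of the space it is written on.

```c
#include <assert.h>

/* Hypothetical macro: evaluates to 1 if the expression has type int *,
   and to 0 otherwise (requires C11 _Generic). */
#define IS_INT_PTR(e) _Generic((e), int *: 1, default: 0)

/* Hypothetical helper: returns 1 if the declared types are as commented. */
int star_binds_to_declarator(void)
{
    int i = 0, *p = &i;  /* i is an int, p is a pointer to int */
    int* q = &i, r = 0;  /* q is a pointer to int, but r is a plain int */

    (void)p; (void)q; (void)r;  /* only the types are inspected */

    return !IS_INT_PTR(i) && IS_INT_PTR(p) && IS_INT_PTR(q) && !IS_INT_PTR(r);
}
```

This is why the int* x style is usually paired with the habit of declaring one pointer per line.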