What exactly is a C pointer if not a memory address?

In a reputable source about C, the following information is given after discussing the & operator:

... It's a bit unfortunate that the terminology [address of] remains, because it confuses those who don't know what addresses are about, and misleads those who do: thinking about pointers as if they were addresses usually leads to grief...

Other materials I have read (from equally reputable sources, I would say) have always unabashedly referred to pointers and the & operator as giving memory addresses. I would love to keep searching for the actuality of the matter, but it is kind of difficult when reputable sources KIND OF disagree.

Now I am slightly confused--what exactly is a pointer, then, if not a memory address?

P.S.

The author later says: ...I will continue to use the term 'address of' though, because to invent a different one [term] would be even worse.


Solution 1:

The C standard does not define what a pointer is internally and how it works internally. This is intentional so as not to limit the number of platforms, where C can be implemented as a compiled or interpreted language.

A pointer value can be some kind of ID or handle or a combination of several IDs (say hello to x86 segments and offsets) and not necessarily a real memory address. This ID could be anything, even a fixed-size text string. Non-address representations may be especially useful for a C interpreter.

Solution 2:

I'm not sure about your source, but the type of language you're describing comes from the C standard:

6.5.3.2 Address and indirection operators
[...]
3. The unary & operator yields the address of its operand. [...]

So... yeah, pointers point to memory addresses. At least that's how the C standard suggests it to mean.

To say it a bit more clearly, a pointer is a variable holding the value of some address. The address of an object (which may be stored in a pointer) is returned with the unary & operator.

I can store the address "42 Wallaby Way, Sydney" in a variable (and that variable would be a "pointer" of sorts, but since that's not a memory address it's not something we'd properly call a "pointer"). Your computer has addresses for its buckets of memory. Pointers store the value of an address (i.e. a pointer stores the value "42 Wallaby Way, Sydney", which is an address).

Edit: I want to expand on Alexey Frunze's comment.

What exactly is a pointer? Let's look at the C standard:

6.2.5 Types
[...]
20. [...]
A pointer type may be derived from a function type or an object type, called the referenced type. A pointer type describes an object whose value provides a reference to an entity of the referenced type. A pointer type derived from the referenced type T is sometimes called ‘‘pointer to T’’. The construction of a pointer type from a referenced type is called ‘‘pointer type derivation’’. A pointer type is a complete object type.

Essentially, pointers store a value that provides a reference to some object or function. Kind of. Pointers are intended to store a value that provides a reference to some object or function, but that's not always the case:

6.3.2.3 Pointers
[...]
5. An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

The above quote says that we can turn an integer into a pointer. If we do that (that is, if we stuff an integer value into a pointer instead of a specific reference to an object or function), then the pointer "might not point to an entity of reference type" (i.e. it may not provide a reference to an object or function). It might provide us with something else. And this is one place where you might stick some kind of handle or ID in a pointer (i.e. the pointer isn't pointing to an object; it's storing a value that represents something, but that value may not be an address).

So yes, as Alexey Frunze says, it's possible a pointer isn't storing an address to an object or function. It's possible a pointer is instead storing some kind of "handle" or ID, and you can do this by assigning some arbitrary integer value to a pointer. What this handle or ID represents depends on the system/environment/context. So long as your system/implementation can make sense of the value, you're in good shape (but that depends on the specific value and the specific system/implemenation).

Normally, a pointer stores an address to an object or function. If it isn't storing an actual address (to an object or function), the result is implementation defined (meaning that exactly what happens and what the pointer now represents depends on your system and implementation, so it might be a handle or ID on a particular system, but using the same code/value on another system might crash your program).

That ended up being longer than I thought it would be...

Solution 3:

Pointer vs Variable

In this picture,

pointer_p is a pointer which is located at 0x12345, and is pointing to a variable variable_v at 0x34567.

Solution 4:

To think of a pointer as an address is an approximation. Like all approximations, it's good enough to be useful sometimes, but it's also not exact which means that relying on it causes trouble.

A pointer is like an address in that it indicates where to find an object. One immediate limitation of this analogy is that not all pointers actually contain an address. NULL is a pointer which is not an address. The content of a pointer variable can in fact be of one of three kinds:

  • the address of an object, which can be dereferenced (if p contains the address of x then the expression *p has the same value as x);
  • a null pointer, of which NULL is an example;
  • invalid content, which doesn't point to an object (if p doesn't hold a valid value, then *p could do anything (“undefined behavior”), with crashing the program a fairly common possibility).

Furthermore, it would be more accurate to say that a pointer (if valid and non-null) contains an address: a pointer indicates where to find an object, but there is more information tied to it.

In particular, a pointer has a type. On most platforms, the type of the pointer has no influence at runtime, but it has an influence that goes beyond the type at compile time. If p is a pointer to int (int *p;), then p + 1 points to an integer which is sizeof(int) bytes after p (assuming p + 1 is still a valid pointer). If q is a pointer to char that points to the same address as p (char *q = p;), then q + 1 is not the same address as p + 1. If you think of pointer as addresses, it is not very intuitive that the “next address” is different for different pointers to the same location.

It is possible in some environments to have multiple pointer values with different representations (different bit patterns in memory) that point to the same location in memory. You can think of these as different pointers holding the same address, or as different addresses for the same location — the metaphor isn't clear in this case. The == operator always tells you whether the two operands are pointing to the same location, so on these environments you can have p == q even though p and q have different bit patterns.

There are even environments where pointers carry other information beyond the address, such as type or permission information. You can easily go through your life as a programmer without encountering these.

There are environments where different kinds of pointers have different representations. You can think of it as different kinds of addresses having different representations. For example, some architectures have byte pointers and word pointers, or object pointers and function pointers.

All in all, thinking of pointers as addresses isn't too bad as long as you keep in mind that

  • it's only valid, non-null pointers that are addresses;
  • you can have multiple addresses for the same location;
  • you can't do arithmetic on addresses, and there's no order on them;
  • the pointer also carries type information.

Going the other way round is far more troublesome. Not everything that looks like an address can be a pointer. Somewhere deep down any pointer is represented as a bit pattern that can be read as an integer, and you can say that this integer is an address. But going the other way, not every integer is a pointer.

There are first some well-known limitations; for example, an integer that designates a location outside your program's address space can't be a valid pointer. A misaligned address doesn't make a valid pointer for a data type that requires alignment; for example, on a platform where int requires 4-byte alignment, 0x7654321 cannot be a valid int* value.

However, it goes well beyond that, because when you make a pointer into an integer, you're in for a world of trouble. A big part of this trouble is that optimizing compilers are far better at microoptimization than most programmers expect, so that their mental model of how a program works is deeply wrong. Just because you have pointers with the same address doesn't mean that they are equivalent. For example, consider the following snippet:

unsigned int x = 0;
unsigned short *p = (unsigned short*)&x;
p[0] = 1;
printf("%u = %u\n", x, *p);

You might expect that on a run-of-the-mill machine where sizeof(int)==4 and sizeof(short)==2, this either prints 1 = 1? (little-endian) or 65536 = 1? (big-endian). But on my 64-bit Linux PC with GCC 4.4:

$ c99 -O2 -Wall a.c && ./a.out 
a.c: In function ‘main’:
a.c:6: warning: dereferencing pointer ‘p’ does break strict-aliasing rules
a.c:5: note: initialized from here
0 = 1?

GCC is kind enough to warn us what's going wrong in this simple example — in more complex examples, the compiler might not notice. Since p has a different type from &x, changing what p points to cannot affect what &x points to (outside some well-defined exceptions). Therefore the compiler is at liberty to keep the value of x in a register and not update this register as *p changes. The program dereferences two pointers to the same address and obtains two different values!

The moral of this example is that thinking of a (non-null valid) pointer as an address is fine, as long as you stay within the precise rules of the C language. The flip side of the coin is that the rules of the C language are intricate, and difficult to get an intuitive feeling for unless you know what happens under the hood. And what happens under the hood is that the tie between pointers and addresses is somewhat loose, both to support “exotic” processor architectures and to support optimizing compilers.

So think of pointers being addresses as a first step in your understanding, but don't follow that intuition too far.

Solution 5:

A pointer is a variable that HOLDS memory address, not the address itself. However, you can dereference a pointer - and get access to the memory location.

For example:

int q = 10; /*say q is at address 0x10203040*/
int *p = &q; /*means let p contain the address of q, which is 0x10203040*/
*p = 20; /*set whatever is at the address pointed by "p" as 20*/

That's it. It's that simple.

enter image description here

A program to demonstrate what I am saying and its output is here:

http://ideone.com/rcSUsb

The program:

#include <stdio.h>

int main(int argc, char *argv[])
{
  /* POINTER AS AN ADDRESS */
  int q = 10;
  int *p = &q;

  printf("address of q is %p\n", (void *)&q);
  printf("p contains %p\n", (void *)p);

  p = NULL;
  printf("NULL p now contains %p\n", (void *)p);
  return 0;
}