How to explain C pointers (declaration vs. unary operators) to a beginner?

I recently had the pleasure of explaining pointers to a C programming beginner and stumbled upon the following difficulty. It might not seem like an issue at all if you already know how to use pointers, but try to look at the following example with a clear mind:

int foo = 1;
int *bar = &foo;
printf("%p\n", (void *)&foo);
printf("%i\n", *bar);

To the absolute beginner, the output might be surprising. In line 2 they had just declared *bar to be &foo, but in line 4 it turns out *bar is actually foo instead of &foo!

The confusion, you might say, stems from the ambiguity of the * symbol: in line 2 it is used to declare a pointer. In line 4 it is used as a unary operator which fetches the value the pointer points at. Two different things, right?

However, this "explanation" doesn't help a beginner at all. It introduces a new concept by pointing out a subtle discrepancy. This can't be the right way to teach it.

So, how did Kernighan and Ritchie explain it?

The unary operator * is the indirection or dereferencing operator; when applied to a pointer, it accesses the object the pointer points to. […]

The declaration of the pointer ip, int *ip is intended as a mnemonic; it says that the expression *ip is an int. The syntax of the declaration for a variable mimics the syntax of expressions in which the variable might appear.
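To make the quoted rule concrete, here is a minimal sketch of "declaration mimics use". The names declaration_mimics_use_demo and get_seven are hypothetical and only serve the illustration. Each declaration is written without an initializer, so that it can be compared directly with the expression that uses the variable.

```c
#include <assert.h>

/* Hypothetical helper: the declaration says "get_seven() is an int". */
static int get_seven(void) { return 7; }

/* Hypothetical demo: each declaration has the same shape as the
   expression that later yields an int. */
int declaration_mimics_use_demo(void)
{
    int n = 5;

    int *ip;        /* declaration says: the expression *ip is an int */
    int arr[3];     /* declaration says: the expression arr[i] is an int */

    ip = &n;        /* give ip something to point at */
    arr[2] = 3;     /* give the array element a value */

    assert(*ip == 5);           /* *ip is an int, as declared */
    assert(arr[2] == 3);        /* arr[2] is an int, as declared */
    assert(get_seven() == 7);   /* get_seven() is an int, as declared */

    return *ip + arr[2] + get_seven();
}
```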

So int *ip should be read as "*ip will return an int"? But then why doesn't an initialization in the declaration follow that pattern? What if a beginner wants to initialize the variable? int *ip = 1 (read: *ip will return an int and the int is 1) won't work as expected. The conceptual model just doesn't seem coherent. Am I missing something here?


Edit: I tried to summarize the answers here.


Solution 1:

The reason why the shorthand:

int *bar = &foo;

in your example can be confusing is that it's easy to misread it as being equivalent to:

int *bar;
*bar = &foo;    // error: use of uninitialized pointer bar!

when it actually means:

int *bar;
bar = &foo;

Written out like this, with the variable declaration and assignment separated, there is no such potential for confusion, and the use ↔ declaration parallelism described in your K&R quote works perfectly:

  • The first line declares a variable bar, such that *bar is an int.

  • The second line assigns the address of foo to bar, making *bar (an int) an alias for foo (also an int).

When introducing C pointer syntax to beginners, it may be helpful to initially stick to this style of separating pointer declarations from assignments, and only introduce the combined shorthand syntax (with appropriate warnings about its potential for confusion) once the basic concepts of pointer use in C have been adequately internalized.
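The separated style described above could be shown to a beginner roughly like this. It is only a sketch: the function name separate_declaration_demo is hypothetical, and the value 42 is an arbitrary choice for the illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical demo: declare the pointer first, then assign to it
   in a separate statement. */
int separate_declaration_demo(void)
{
    int foo = 1;

    int *bar;      /* declaration: *bar is an int, so bar is a pointer to int */
    bar = &foo;    /* assignment to bar itself (no star): bar holds foo's address */

    assert(bar == &foo);   /* bar is the address of foo */
    assert(*bar == foo);   /* *bar is the int stored at that address */

    *bar = 42;             /* writing through *bar changes foo */
    printf("foo = %d, *bar = %d\n", foo, *bar);

    return foo;
}
```

Calling separate_declaration_demo() from main prints the same value twice, which makes the "alias" idea visible: *bar and foo name the same int.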

Solution 2:

For your student to understand the meaning of the * symbol in different contexts, they must first understand that the contexts are indeed different. Once they understand that the contexts are different (i.e. the difference between a variable declaration and a general expression), it isn't too much of a cognitive leap to understand what the differences are.

First, explain that the declaration of a variable cannot contain operators (demonstrate this by showing that putting a - or + symbol in a variable declaration simply causes an error). Then go on to show that an expression (e.g. on the right hand side of an assignment) can contain operators. Make sure the student understands that an expression and a variable declaration are two completely different contexts.

When they understand that the contexts are different, you can go on to explain that when the * symbol is in a variable declaration in front of the variable identifier, it means 'declare this variable as a pointer'. Then you can explain that when used in an expression (as a unary operator), the * symbol is the 'dereference operator' and it means 'the value at the address stored in' rather than its earlier meaning.

To truly convince your student, explain that the creators of C could have used any symbol for the dereference operator (e.g. they could have used @ instead), but for whatever reason they made the design decision to use *.
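A short sketch of the two contexts, placed side by side, might help here. The function name star_contexts_demo and the variable names are hypothetical. The line with the multiplication is an extra assumption of this sketch, added only to show that the same symbol already has another meaning in expressions.

```c
#include <assert.h>

/* Hypothetical demo of the * symbol in a declaration and in expressions. */
int star_contexts_demo(void)
{
    int value = 7;

    /* Declaration context: the * marks ptr as a pointer to int.
       The initializer sets ptr itself, not *ptr. */
    int *ptr = &value;

    /* Expression context: unary * dereferences ptr and yields the int it points to. */
    int copy = *ptr;

    /* Expression context: the first * is binary multiplication,
       the second * is the unary dereference operator. */
    int doubled = 2 * *ptr;

    assert(copy == 7);
    assert(doubled == 14);

    return copy + doubled;
}
```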

All in all, there's no way around explaining that the contexts are different. If the student doesn't understand the contexts are different, they can't understand why the * symbol can mean different things.