Are pointer variables just integers with some operators or are they "symbolic"?

EDIT: The original word choice was confusing. The term "symbolic" is much better than the original ("mystical").

In the discussion about my previous C++ question, I have been told that pointers are

  • "a simple value type much like an integer"
  • not "mystical"
  • "The Bit pattern (object representation) contains the value (value representation) (§3.9/4) for trivially copyable types, which a pointer is."

This does not sound right! If nothing is symbolic and a pointer is its representation, then I can do the following. Can I?

#include <stdio.h>
#include <string.h>

int main() {
    int a[1] = { 0 }, *pa1 = &a[0] + 1, b = 1, *pb = &b;
    if (memcmp (&pa1, &pb, sizeof pa1) == 0) {
        printf ("pa1 == pb\n");
        *pa1 = 2;
    }
    else {
        printf ("pa1 != pb\n");
        pa1 = &a[0]; // ensure well defined behaviour in printf
    }
    printf ("b = %d *pa1 = %d\n", b, *pa1);
    return 0;
}

This is a C and C++ question.

Testing with Compile and Execute C Online with GNU GCC v4.8.3: gcc -O2 -Wall gives

pa1 == pb
b = 1 *pa1 = 2

Testing with Compile and Execute C++ Online with GNU GCC v4.8.3: g++ -O2 -Wall gives

pa1 == pb
b = 1 *pa1 = 2

So the modification of b through pa1 (one past the end of a) fails with GCC in C and C++.

Of course, I would like an answer based on standard quotes.

EDIT: To respond to criticism about UB on &a + 1, now a is an array of 1 element.

Related: Dereferencing an out of bound pointer that contains the address of an object (array of array)

Additional note: the term "mystical" was first used, I think, by Tony Delroy here. I was wrong to borrow it.


The first thing to say is that a single test on one compiler generating code for one architecture is not a basis for drawing conclusions about the behaviour of the language.

C++ (and C) are general-purpose languages created with the intention of being portable, i.e. a well-formed program written in C++ on one system should run on any other (barring calls to system-specific services).

Once upon a time, for various reasons including backward-compatibility and cost, memory maps were not contiguous on all processors.

For example, I used to write code on a 6809 system where half the memory was paged in via a PIA addressed in the non-paged part of the memory map. My C compiler was able to cope with this because pointers were, for that compiler, a 'mystical' type which knew how to write to the PIA.

The x86 family (starting with the 8086) has a segmented addressing mode in which segments start every 16 bytes. Look up FAR pointers and you'll see different pointer arithmetic.
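To make the FAR pointer remark concrete, here is a minimal sketch in C that models a real-mode segment:offset pointer as a plain struct. The names far_ptr, far_linear, far_same_bits and far_same_address are made up for illustration (compilers of that era used a far keyword extension instead). It assumes the usual real-mode rule that the linear address is segment * 16 + offset, and it ignores the 20-bit wraparound.

```c
#include <stdint.h>

/* Hypothetical model of a real-mode far pointer: 16-bit segment, 16-bit offset. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Linear address designated by the pointer: segment * 16 + offset. */
static uint32_t far_linear(struct far_ptr p) {
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* Compare the two representations field by field (roughly what memcmp sees). */
static int far_same_bits(struct far_ptr p, struct far_ptr q) {
    return p.segment == q.segment && p.offset == q.offset;
}

/* Compare the addresses the two pointers designate. */
static int far_same_address(struct far_ptr p, struct far_ptr q) {
    return far_linear(p) == far_linear(q);
}
```

With p = {0x1234, 0x0005} and q = {0x1000, 0x2345}, both designate linear address 0x12345, yet their representations differ. On such a machine a byte-wise comparison and an address comparison can disagree, which is one reason the language does not define a pointer as "just an integer".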

This is the history of pointer development in C++. Not all chip manufacturers have been "well behaved" and the language accommodates them all (usually) without needing to rewrite source code.


Stealing the quote from TartanLlama:

[expr.add]/5 "[for pointer addition, ] if both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."

So the compiler can assume that your pointer points into the array a, or one past its end. If it points one past the end, you cannot dereference it. But since you do dereference it, it surely can't be one past the end, so it can only point inside the array.

So now you have your code (reduced)

b = 1;
*pa1 = 2;

where pa1 points inside an array a and b is a separate variable. And when you print them, you get exactly 1 and 2, the values you have assigned them.

An optimizing compiler can figure that out without even storing a 1 or a 2 to memory. It can just print the final result.
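As a rough sketch of that reasoning, here is the program the optimizer is allowed to treat yours as being. It is not what GCC literally does internally. The function name reduced_program and the struct result are made up for illustration, and the sketch assumes pa1 has been narrowed down to the only element of a that may be dereferenced.

```c
/* Hypothetical holder for the two values the original program prints. */
struct result {
    int b;
    int pointee;
};

static struct result reduced_program(void) {
    int a[1] = { 0 };
    int b = 1;
    int *pa1 = &a[0];   /* assumption: the only dereferenceable value pa1 can have */

    *pa1 = 2;           /* stores into a[0]; b is not an element of a, so b is untouched */

    struct result r = { b, *pa1 };
    return r;
}
```

Calling reduced_program() yields b == 1 and pointee == 2, which matches the "b = 1 *pa1 = 2" line in the question's output.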