What are the differences between a pointer variable and a reference variable in C++?

Solution 1:

A pointer can be re-assigned:

int x = 5;
int y = 6;
int *p;
p = &x;
p = &y;
*p = 10;
assert(x == 5);
assert(y == 10);

A reference cannot be re-bound, and must be bound at initialization:

int x = 5;
int y = 6;
int &q; // error
int &r = x;

A pointer variable has its own identity: a distinct, visible memory address that can be taken with the unary & operator and a certain amount of space that can be measured with the sizeof operator. Using those operators on a reference returns a value corresponding to whatever the reference is bound to; the reference’s own address and size are invisible. Since the reference assumes the identity of the original variable in this way, it is convenient to think of a reference as another name for the same variable.
```
int x = 0;
int &r = x;
int *p = &x;
int *p2 = &r;

assert(p == p2); // &x == &r
assert(&p != &p2);
```

You can have arbitrarily nested pointers to pointers offering extra levels of indirection. References only offer one level of indirection.

int x = 0;
int y = 0;
int *p = &x;
int *q = &y;
int **pp = &p;

**pp = 2;
pp = &q; // *pp is now q
**pp = 4;

assert(y == 4);
assert(x == 2);

A pointer can be assigned nullptr, whereas a reference must be bound to an existing object. If you try hard enough, you can bind a reference to nullptr, but this is undefined and will not behave consistently.

/* the code below is undefined; your compiler may optimise it
 * differently, emit warnings, or outright refuse to compile it */

int &r = *static_cast<int *>(nullptr);

// prints "null" under GCC 10
std::cout
    << (&r != nullptr
        ? "not null" : "null")
    << std::endl;

bool f(int &r) { return &r != nullptr; }

// prints "not null" under GCC 10
std::cout
    << (f(*static_cast<int *>(nullptr))
        ? "not null" : "null")
    << std::endl;

You can, however, have a reference to a pointer whose value is nullptr.

Pointers can iterate over an array; you can use ++ to go to the next item that a pointer is pointing to, and + 4 to go to the 5th element. This is no matter what size the object is that the pointer points to.
A pointer needs to be dereferenced with * to access the memory location it points to, whereas a reference can be used directly. A pointer to a class/struct uses -> to access its members whereas a reference uses a ..
References cannot be put into an array, whereas pointers can be (Mentioned by user @litb)
Const references can be bound to temporaries. Pointers cannot (not without some indirection):
```
const int &x = int(12); // legal C++
int *y = &int(12); // illegal to take the address of a temporary.
```
This makes const & more convenient to use in argument lists and so forth.

Solution 2:

What's a C++ reference (for C programmers)

A reference can be thought of as a constant pointer (not to be confused with a pointer to a constant value!) with automatic indirection, ie the compiler will apply the * operator for you.

All references must be initialized with a non-null value or compilation will fail. It's neither possible to get the address of a reference - the address operator will return the address of the referenced value instead - nor is it possible to do arithmetics on references.

C programmers might dislike C++ references as it will no longer be obvious when indirection happens or if an argument gets passed by value or by pointer without looking at function signatures.

C++ programmers might dislike using pointers as they are considered unsafe - although references aren't really any safer than constant pointers except in the most trivial cases - lack the convenience of automatic indirection and carry a different semantic connotation.

Consider the following statement from the C++ FAQ:

Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object. It is not a pointer to the object, nor a copy of the object. It is the object.

But if a reference really were the object, how could there be dangling references? In unmanaged languages, it's impossible for references to be any 'safer' than pointers - there generally just isn't a way to reliably alias values across scope boundaries!

Why I consider C++ references useful

Coming from a C background, C++ references may look like a somewhat silly concept, but one should still use them instead of pointers where possible: Automatic indirection is convenient, and references become especially useful when dealing with RAII - but not because of any perceived safety advantage, but rather because they make writing idiomatic code less awkward.

RAII is one of the central concepts of C++, but it interacts non-trivially with copying semantics. Passing objects by reference avoids these issues as no copying is involved. If references were not present in the language, you'd have to use pointers instead, which are more cumbersome to use, thus violating the language design principle that the best-practice solution should be easier than the alternatives.

Solution 3:

If you want to be really pedantic, there is one thing you can do with a reference that you can't do with a pointer: extend the lifetime of a temporary object. In C++ if you bind a const reference to a temporary object, the lifetime of that object becomes the lifetime of the reference.

std::string s1 = "123";
std::string s2 = "456";

std::string s3_copy = s1 + s2;
const std::string& s3_reference = s1 + s2;

In this example s3_copy copies the temporary object that is a result of the concatenation. Whereas s3_reference in essence becomes the temporary object. It's really a reference to a temporary object that now has the same lifetime as the reference.

If you try this without the const it should fail to compile. You cannot bind a non-const reference to a temporary object, nor can you take its address for that matter.

Solution 4:

Apart from syntactic sugar, a reference is a const pointer (not pointer to a const). You must establish what it refers to when you declare the reference variable, and you cannot change it later.

Update: now that I think about it some more, there is an important difference.

A const pointer's target can be replaced by taking its address and using a const cast.

A reference's target cannot be replaced in any way short of UB.

This should permit the compiler to do more optimization on a reference.

Solution 5:

Contrary to popular opinion, it is possible to have a reference that is NULL.

int * p = NULL;
int & r = *p;
r = 1;  // crash! (if you're lucky)

Granted, it is much harder to do with a reference - but if you manage it, you'll tear your hair out trying to find it. References are not inherently safe in C++!

Technically this is an invalid reference, not a null reference. C++ doesn't support null references as a concept as you might find in other languages. There are other kinds of invalid references as well. Any invalid reference raises the spectre of undefined behavior, just as using an invalid pointer would.

The actual error is in the dereferencing of the NULL pointer, prior to the assignment to a reference. But I'm not aware of any compilers that will generate any errors on that condition - the error propagates to a point further along in the code. That's what makes this problem so insidious. Most of the time, if you dereference a NULL pointer, you crash right at that spot and it doesn't take much debugging to figure it out.

My example above is short and contrived. Here's a more real-world example.

class MyClass
{
    ...
    virtual void DoSomething(int,int,int,int,int);
};

void Foo(const MyClass & bar)
{
    ...
    bar.DoSomething(i1,i2,i3,i4,i5);  // crash occurs here due to memory access violation - obvious why?
}

MyClass * GetInstance()
{
    if (somecondition)
        return NULL;
    ...
}

MyClass * p = GetInstance();
Foo(*p);

I want to reiterate that the only way to get a null reference is through malformed code, and once you have it you're getting undefined behavior. It never makes sense to check for a null reference; for example you can try if(&bar==NULL)... but the compiler might optimize the statement out of existence! A valid reference can never be NULL so from the compiler's view the comparison is always false, and it is free to eliminate the if clause as dead code - this is the essence of undefined behavior.

The proper way to stay out of trouble is to avoid dereferencing a NULL pointer to create a reference. Here's an automated way to accomplish this.

template<typename T>
T& deref(T* p)
{
    if (p == NULL)
        throw std::invalid_argument(std::string("NULL reference"));
    return *p;
}

MyClass * p = GetInstance();
Foo(deref(p));

For an older look at this problem from someone with better writing skills, see Null References from Jim Hyslop and Herb Sutter.

For another example of the dangers of dereferencing a null pointer see Exposing undefined behavior when trying to port code to another platform by Raymond Chen.