C++ references - are they just syntactic sugar?

Is a C++ reference just syntactic sugar, or does it offer any speed ups in certain cases?

For example, a call-by-pointer involves a copy anyway, and that seems to be true about a call-by-reference as well. The underlying mechanism appears to be the same.

Edit: After about six answers and many comments, I am still of the opinion that references are just syntactic sugar. I would appreciate a straight yes or no, and an answer that I can accept.


Think of a reference as a pointer that:

  1. Can't be NULL
  2. Once initialized, can't be re-pointed to another object
  3. Is implicitly dereferenced on any attempt to use it:

    int a = 5;
    int &ra = a;
    int *pa = &a;
    
    ra = 6;
    
    (*pa) = 6;
    

Here is how it looks in disassembly:

    int a = 5;
00ED534E  mov         dword ptr [a],5  
    int &ra = a;
00ED5355  lea         eax,[a]  
00ED5358  mov         dword ptr [ra],eax  
    int *pa = &a;
00ED535B  lea         eax,[a]  
00ED535E  mov         dword ptr [pa],eax  

    ra = 6;
00ED5361  mov         eax,dword ptr [ra]  
00ED5364  mov         dword ptr [eax],6  

    (*pa) = 6;
00ED536A  mov         eax,dword ptr [pa]  
00ED536D  mov         dword ptr [eax],6  

Assigning to the reference is, from the compiler's perspective, the same as assigning through a dereferenced pointer. As you can see, there is no difference between them (we are not talking about compiler optimization right now). However, as mentioned above, references can't be null and carry stronger guarantees about what they refer to.

As for me, I prefer using references as long as I don't need nullptr as a valid value, a value that can be re-pointed, or values of different types to be passed in (e.g. a pointer to an interface type).


References have stronger guarantees than pointers, so the compiler can optimize more aggressively. I've recently seen GCC inline multiple nested calls through function references perfectly, but not a single one through function pointers (because it couldn't prove that the pointer was always pointing at the same function).
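To make the distinction concrete, here is a minimal sketch of the two parameter styles. The names twice, apply_ref and apply_ptr are hypothetical, and whether a given compiler inlines either call depends on the compiler and its flags; the sketch only shows what each signature promises.

```cpp
#include <cassert>

int twice(int x) { return 2 * x; }

// Reference to function: always bound to some function and can never be
// re-seated, so inside apply_ref the callee is fixed for the whole call.
int apply_ref(int (&f)(int), int x) { return f(x); }

// Pointer to function: may be null and may be reassigned, so the code
// (and the optimizer) has to account for both possibilities.
int apply_ptr(int (*f)(int), int x) { return f ? f(x) : 0; }
```

With these definitions, apply_ref(twice, 21) and apply_ptr(twice, 21) both yield 42, while apply_ptr(nullptr, 21) takes the fallback branch and apply_ref has no equivalent case to handle.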

If the reference ends up stored somewhere, it typically takes the same space as a pointer. That is not to say, again, that it will be used like a pointer: the compiler may well cut through it if it knows which object the reference was bound to.


The compiler cannot assume a pointer is non-null; when optimizing code, it has to either prove the pointer is non-null, or emit a program that accounts for the possibility that it is null (in a context where that would be well-defined).

Similarly, the compiler cannot assume the pointer never changes value. (Nor can it assume the pointer points to a valid object, although I'm having trouble imagining a case where that would matter in a well-defined context.)

On the other hand, even assuming that references are implemented as pointers, the compiler is still allowed to assume that a reference is non-null, never changes what it refers to, and refers to a valid object.
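A small sketch of the null case, using the hypothetical names value_or_zero and value_of: the pointer version has to carry a check because null is a legal argument, while the reference version needs none because a well-defined program can never bind the reference to "no object".

```cpp
#include <cassert>

// Pointer parameter: nullptr is a valid argument, so it must be handled.
int value_or_zero(const int* p) { return p ? *p : 0; }

// Reference parameter: there is no null state to check for.
int value_of(const int& r) { return r; }
```

For an int a = 7, both value_or_zero(&a) and value_of(a) return 7; only the pointer version has a meaningful answer for "nothing was passed".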


References differ from pointers in that there are things you cannot do to a reference and have it be defined behavior.

You cannot take the address of a reference itself, only of what it refers to. You cannot modify a reference once it is created.

A T& and a T* const (note that const applies to the pointer there, not to the pointed-to object) are relatively similar. Taking the address of an actual const value and modifying it is undefined behavior, as is modifying the storage that a reference directly uses.

Now, in practice, you can get at the storage of a reference:

    struct foo {
      int& x;
    };

sizeof(foo) will almost certainly equal sizeof(int*). But the compiler is free to neglect the possibility that someone directly accessing the bytes of foo could actually change what the reference refers to. This permits the compiler to read the reference's "pointer" implementation once, and then never read it again. If we had struct foo{ int* x; }, the compiler would have to prove each time it did a *f.x that the pointer value had not changed.

If you had struct foo{ int*const x; }, it again starts behaving reference-like in its immutability (modifying something that was declared const is UB).
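The immutability described above is visible in the type system too. In this sketch (ref_holder, const_ptr_holder, ptr_holder and read_twice are hypothetical names), the reference member and the const pointer member both make the struct non-assignable, while the plain pointer member does not. The sketch deliberately asserts nothing about sizeof, since the layout is up to the implementation.

```cpp
#include <cassert>
#include <type_traits>

struct ref_holder       { int& x; };        // like foo in the text
struct const_ptr_holder { int* const x; };  // the "reference-like" pointer
struct ptr_holder       { int* x; };        // an ordinary, re-pointable pointer

// A reference or const member cannot be re-bound, so the implicit copy
// assignment operator is deleted for the first two structs.
static_assert(!std::is_copy_assignable<ref_holder>::value, "reference member");
static_assert(!std::is_copy_assignable<const_ptr_holder>::value, "const pointer member");
static_assert(std::is_copy_assignable<ptr_holder>::value, "plain pointer member");

// Both reads go to the same int; h.x cannot be re-bound to another object
// between them.
int read_twice(const ref_holder& h) { return h.x + h.x; }
```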


A trick that I'm not aware of any compiler writers using is to compress reference-capture in a lambda.

If you have a lambda that captures data by reference, instead of capturing each value via a pointer, it could capture only the stack frame pointer. The offsets to each local variable are compile-time constants off the stack frame pointer.

The exception is references captured by reference, which under a defect report to C++ must remain valid even if the reference variable goes out of scope. So those have to be captured by pseudo-pointer.

For a concrete example (if a toy one):

    void part( std::vector<int>& v, int left, int right ) {
      std::function<bool(int)> op = [&](int y){return y<left && y>right;};
      std::partition( begin(v), end(v), op );
    }

The lambda above could capture only the stack frame pointer and know where left and right are relative to it, reducing its size, instead of capturing two ints by (basically pointer) reference.

Here we have references implied by [&] whose existence is eliminated more easily than if they were pointers captured by value:

    void part( std::vector<int>& v, int left, int right ) {
      int* pleft=&left;
      int* pright=&right;
      std::function<bool(int)> op = [=](int y){return y<*pleft && y>*pright;};
      std::partition( begin(v), end(v), op );
    }
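The exception mentioned earlier, a reference that is itself captured by reference, can be sketched as follows. The name make_incrementer is hypothetical. The point is that the closure stays usable after the reference variable r has gone out of scope, as long as the int it was bound to is still alive, which is why such a capture cannot be expressed as an offset from the stack frame.

```cpp
#include <cassert>
#include <functional>

// r is a reference variable local to this call. The lambda captures r by
// reference, yet it remains valid after the function returns, provided the
// object r was bound to still exists.
std::function<void()> make_incrementer(int& r) {
  return [&r] { ++r; };
}
```

Calling the returned closure twice on an int n that started at 0 leaves n equal to 2, even though the frame of make_incrementer is long gone.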

There are a few other differences between references and pointers.

A reference can extend the lifetime of a temporary.

This is used heavily in for(:) loops. The definition of the for(:) loop itself relies on reference lifetime extension to avoid needless copies, and users of for(:) loops can use auto&& to automatically deduce the lightest-weight way to wrap the iterated objects.

    #include <array>

    struct big { int data[1<<10]; };

    // arr must name a type, not a variable, to be usable as a return type
    using arr = std::array<big, 100>;

    arr get_arr();

    for (auto&& b : get_arr()) {
    }

Here reference lifetime extension carefully prevents needless copies from ever occurring. If we change get_arr to return an arr const& it continues to work without any copies. If we change get_arr to return a container that returns big elements by value (say, an input iterator range), again no needless copies are done.

This is in a sense syntactic sugar, but it allows the same construct to be optimal in many cases without having to micro-optimize based on how things are returned or iterated over.
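One way to check the no-copies claim is to count copy constructor calls. This is a sketch with hypothetical names (tracked, get_tracked and the two counting functions), using a tiny element type in place of big so the counter is easy to follow.

```cpp
#include <array>
#include <cassert>

// Element type that counts how often it is copied.
struct tracked {
  static int copies;
  tracked() = default;
  tracked(const tracked&) { ++copies; }
};
int tracked::copies = 0;

// Returns the container by value, like get_arr in the text.
std::array<tracked, 3> get_tracked() { return {}; }

// auto&& binds each element by reference; the temporary array itself is
// kept alive for the whole loop by reference lifetime extension.
int copies_with_auto_ref() {
  tracked::copies = 0;
  for (auto&& t : get_tracked()) { (void)t; }
  return tracked::copies;
}

// Plain auto copies every element into the loop variable.
int copies_with_auto_value() {
  tracked::copies = 0;
  for (auto t : get_tracked()) { (void)t; }
  return tracked::copies;
}
```

The first function reports zero copies and the second reports one copy per element, which is the difference the auto&& idiom is meant to remove.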


Similarly, forwarding references allow data to be treated as const, non-const, lvalue or rvalue intelligently. Temporaries are marked as temporaries, data that users have no further need for is marked as temporary, and data that will persist is marked as being an lvalue reference.

The advantage references have over non-references here is that you can form an rvalue reference to a temporary, and you cannot form a pointer to that temporary without passing it through an rvalue reference-to-lvalue reference conversion.
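A minimal sketch of what a forwarding reference can tell about its argument. The helper names is_lvalue_arg and is_const_arg are hypothetical; they only report what template deduction has already worked out for T.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// For a forwarding reference T&&, T is deduced as an lvalue reference type
// exactly when the argument is an lvalue.
template <class T>
constexpr bool is_lvalue_arg(T&&) {
  return std::is_lvalue_reference<T>::value;
}

// Constness of the argument is carried through deduction as well.
template <class T>
constexpr bool is_const_arg(T&&) {
  return std::is_const<typename std::remove_reference<T>::type>::value;
}
```

A named variable is reported as an lvalue, while a literal or the result of std::move is reported as an rvalue. A pointer parameter cannot make this distinction, because taking an address requires an lvalue in the first place.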


No


References are not just a syntactic difference; they also have different semantics:

  • A reference always aliases an existing object, unlike a pointer which may be nullptr (a sentinel value).
  • A reference cannot be re-seated; it always refers to the same object throughout its lifetime.
  • A reference can extend the lifetime of an object, see binding to auto const& or auto&&.

Thus, at the language level, a reference is an entity of its own. The rest are implementation details.