How does a C++ reference look, memory-wise?

Solution 1:

Everywhere the reference j is encountered, it is replaced with the address of i. So basically the address behind the reference is resolved at compile time, and there is no need to dereference it like a pointer at run time.

Just to clarify what I mean by the address of i:

void function(int& x)
{
    x = 10;
}

int main()
{
    int i = 5;
    int& j = i;

    function(j);
}

In the above code, j should not take space on main's stack, but the reference x of function will take a place on its stack. That means that when function is called with j as an argument, it is the address of i that is pushed onto function's stack. The compiler can, and should, avoid reserving space on main's stack for j.
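A small sketch of the point above. The helper names (set_to_ten, address_seen_by_callee, reference_aliases_original) are made up for illustration and are not part of the original code. It only shows what standard C++ lets you observe: i, j and the callee's parameter all name one object. Whether j gets its own stack slot is up to the compiler and cannot be checked this way.

```cpp
#include <cassert>

// Hypothetical helper: writes through the reference parameter.
void set_to_ten(int& x)
{
    x = 10;
}

// Hypothetical helper: reports the address of the object x is bound to.
const int* address_seen_by_callee(const int& x)
{
    return &x;
}

// Returns true when i, j and the callee's parameter all name the same object.
bool reference_aliases_original()
{
    int i = 5;
    int& j = i;

    set_to_ten(j);                      // modifies i through j and x
    bool value_changed = (i == 10);
    bool same_address = (&j == &i) && (address_seen_by_callee(j) == &i);
    return value_changed && same_address;
}
```

Calling reference_aliases_original() from main should return true on any conforming compiler, since taking the address of a reference always yields the address of the referenced object.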

For the array part, the standard says:

C++ Standard 8.3.2/4:

There shall be no references to references, no arrays of references, and no pointers to references.

Why are arrays of references illegal?

Solution 2:

How does a C++ reference look, memory-wise?

It doesn't. The C++ standard only says how it should behave, not how it should be implemented.

In the general case, compilers usually implement references as pointers. But they generally have more information about what a reference may point to, and use that for optimization.

Remember that the only requirement for a reference is that it behaves as an alias for the referenced object. So if the compiler encounters this code:

int i = 42;
int& j = i;
int k = 44;

what it sees is not "create a pointer to the variable i" (although that is how the compiler may choose to implement it in some cases), but rather "make a note in the symbol table that j is now an alias for i."

The compiler doesn't have to create a new variable for j; it simply has to remember that whenever j is referenced from now on, it should really swap it out and use i instead.
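To make the alias behaviour concrete, here is a minimal sketch that reuses i, j and k from the snippet above. The function name write_through_alias is an assumption for illustration. It shows only the observable semantics, not how any particular compiler stores j.

```cpp
#include <cassert>

// Returns the value of i after writing through its alias j.
int write_through_alias()
{
    int i = 42;
    int& j = i;
    int k = 44;

    // sizeof applied to a reference yields the size of the referenced type.
    static_assert(sizeof(j) == sizeof(int), "j reports the size of an int");

    j = k;              // assigns 44 to i; it does not rebind j to k
    k = 0;              // j still aliases i, so i is unaffected
    assert(&j == &i);   // the address of j is the address of i
    return i;
}
```

Assigning to j never reseats it; it always writes to i, which is why the function returns the value copied from k rather than the later value of k.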

As for creating an array of references, you can't do it because it'd be useless and meaningless.

When you create an array, all elements are default-constructed. What does it mean to default-construct a reference? What does it point to? The entire point of references is that they are initialized to reference another object, after which they cannot be reseated.

So if it could be done, you would end up with an array of references to nothing. And you'd be unable to change them to reference something because they'd been initialized already.
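As a hedged illustration: the type trait below confirms that a reference type cannot be default-constructed, and std::reference_wrapper (from the standard header <functional>) is one common substitute when an array-like collection of references is wanted. This answer does not mention the wrapper; it is shown only as one possible workaround, and the function name is made up.

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <type_traits>

// A reference has no default state, which is what array elements would need.
static_assert(!std::is_default_constructible<int&>::value,
              "int& cannot be default-constructed");

// Returns a + b + c after incrementing each one through the array.
int increment_all_through_wrappers()
{
    int a = 1, b = 2, c = 3;

    // int& refs[3] = {a, b, c};   // ill-formed: no arrays of references

    // std::reference_wrapper is an object type, so it can be an array element.
    std::array<std::reference_wrapper<int>, 3> refs = {
        std::ref(a), std::ref(b), std::ref(c)
    };
    for (int& r : refs) {   // each wrapper converts implicitly to int&
        ++r;
    }
    return a + b + c;       // a, b and c were each incremented once
}
```

Unlike a real reference, a reference_wrapper can be reassigned to refer to a different object, which is part of what makes it usable as a container element.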

Solution 3:

Sorry for using assembly to explain this, but I think this is the best way to understand references.

#include <iostream>

using namespace std;

int main()
{
    int i = 10;
    int *ptrToI = &i;
    int &refToI = i;

    cout << "i = " << i << "\n";
    cout << "&i = " << &i << "\n";

    cout << "ptrToI = " << ptrToI << "\n";
    cout << "*ptrToI = " << *ptrToI << "\n";
    cout << "&ptrToI = " << &ptrToI << "\n";

    cout << "refToI = " << refToI << "\n";
    //cout << "*refToI = " << *refToI << "\n";
    cout << "&refToI = " << &refToI << "\n";

    return 0;
}

The output of this code looks like this:

i = 10
&i = 0xbf9e52f8
ptrToI = 0xbf9e52f8
*ptrToI = 10
&ptrToI = 0xbf9e52f4
refToI = 10
&refToI = 0xbf9e52f8

Let's look at the disassembly (I used GDB for this; 8, 9, and 10 here are line numbers in the source file).

8           int i = 10;
0x08048698 <main()+18>: movl   $0xa,-0x10(%ebp)

Here $0xa is the 10 (decimal) that we are assigning to i. -0x10(%ebp) means the address held in the ebp register minus 16 (decimal). That location is where i lives on the stack.

9           int *ptrToI = &i;
0x0804869f <main()+25>: lea    -0x10(%ebp),%eax
0x080486a2 <main()+28>: mov    %eax,-0x14(%ebp)

This assigns the address of i to ptrToI. ptrToI is again on the stack, located at -0x14(%ebp), that is, ebp minus 20 (decimal).

10          int &refToI = i;
0x080486a5 <main()+31>: lea    -0x10(%ebp),%eax
0x080486a8 <main()+34>: mov    %eax,-0xc(%ebp)

Now here is the catch! Compare the disassembly of lines 9 and 10 and you will observe that -0x14(%ebp) is replaced by -0xc(%ebp) in line 10. -0xc(%ebp) is the address of refToI. It is allocated on the stack. But you will never be able to get this address from your code, because you are not required to know it.

So, a reference does occupy memory. In this case, it is stack memory, since we have declared it as a local variable.

How much memory does it occupy? As much as a pointer occupies.
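One way to observe this from code, as a sketch: sizeof applied directly to a reference reports the size of the referenced type, so the reference is wrapped in a struct instead. The struct and function names are hypothetical. The standard does not guarantee the result; the comparison holds on mainstream compilers, where a reference member is stored as a pointer.

```cpp
#include <cassert>

// Hypothetical types used only to measure storage.
struct HoldsReference { int& ref; };
struct HoldsPointer   { int* ptr; };

// sizeof(int&) would just be sizeof(int), so compare the wrapping structs
// to see how much storage the implementation gives the reference itself.
bool reference_member_is_pointer_sized()
{
    return sizeof(HoldsReference) == sizeof(HoldsPointer);
}
```

A reference member cannot be optimized away the way a local reference can, because the object's layout has to be fixed, which is why it is a convenient place to measure.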

Now let's see how we access the reference and the pointer. For simplicity, I have shown only part of the assembly.

16          cout << "*ptrToI = " << *ptrToI << "\n";
0x08048746 <main()+192>:        mov    -0x14(%ebp),%eax
0x08048749 <main()+195>:        mov    (%eax),%ebx
19          cout << "refToI = " << refToI << "\n";
0x080487b0 <main()+298>:        mov    -0xc(%ebp),%eax
0x080487b3 <main()+301>:        mov    (%eax),%ebx

Now compare the two snippets above and you will see a striking similarity. -0xc(%ebp) is the actual address of refToI, which is never accessible to you.

In simple terms, if you think of a reference as a normal pointer, then accessing a reference is like fetching the value at the address it holds. This means the two lines of code below give you the same result:

cout << "Value of i = " << *ptrToI << "\n";
cout << "Value of i = " << refToI << "\n";

Now compare these:

15          cout << "ptrToI = " << ptrToI << "\n";
0x08048713 <main()+141>:        mov    -0x14(%ebp),%ebx
21          cout << "&refToI = " << &refToI << "\n";
0x080487fb <main()+373>:        mov    -0xc(%ebp),%eax

I guess you can spot what is happening here. If you ask for &refToI:

  1. The contents of the location -0xc(%ebp) are returned.
  2. -0xc(%ebp) is where refToI resides, and its contents are nothing but the address of i.

One last thing. Why is this line commented?

// cout << "*refToI = " << *refToI << "\n";

Because refToI behaves as an int, *refToI would try to dereference an int. That is not permitted, and it gives you a compile-time error.
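A brief sketch of why the dereference fails for refToI but would work for a reference bound to a pointer. The name refToPtr and the function name are assumptions for illustration; the type checks use std::is_same from <type_traits>.

```cpp
#include <cassert>
#include <type_traits>

// Returns the value reached by dereferencing a reference to a pointer.
int dereference_through_reference_to_pointer()
{
    int i = 10;
    int& refToI = i;
    int* ptrToI = &i;
    int*& refToPtr = ptrToI;   // hypothetical: a reference bound to a pointer

    // refToI is declared as int&, but in an expression it is simply an int,
    // so unary * has nothing to dereference.
    static_assert(std::is_same<decltype(refToI), int&>::value,
                  "refToI is declared as int&");
    static_assert(std::is_same<decltype(refToI + 0), int>::value,
                  "used in an expression, refToI is an int");

    // *refToPtr is fine: the referenced object is itself a pointer.
    return *refToPtr;
}
```

In other words, operators applied to a reference act on the referenced object, so whether * is allowed depends only on the type of that object.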

Solution 4:

In practice, a reference is equivalent to a pointer, except that the extra constraints on how references may be used can allow a compiler to "optimize it away" in more cases (depending, of course, on how smart the compiler is, its optimization settings, and so on).