C++: Why pass-by-value is generally more efficient than pass-by-reference for built-in (i.e., C-like) types

Just as the title says.


Solution 1:

A compiler vendor would typically implement a reference as a pointer. Pointers tend to be the same size as or larger than many of the built-in types. For these built-in types the same amount of data would be passed whether you passed by value or by reference. In the function, in order to get the actual data, you would however need to dereference this internal pointer. This can add an instruction to the generated code, and you will also have two memory locations that may not be in cache. The difference won't be much - but it could be measured in tight loops.

A compiler vendor could choose to disregard const references (and sometimes also non-const references) when they're used on built-in types - all depending on the information available to the compiler when it deals with the function and its callers.

Solution 2:

For POD types like int, char, short, and float, the data is the same size as (or smaller than) the address that would be passed to reference it. Looking up the value at the referenced address is an unnecessary step that adds cost.

For example, take the following functions foo and bar

void foo(char& c) { /* ... */ }
void bar(char c)  { /* ... */ }

When foo is called, an address is passed by value (32 or 64 bits, depending on your platform). When you use c within foo, you pay the cost of looking up the value held at the passed-in address.

When calling bar, a value the size of a char is passed, and there is no address-lookup overhead.

Solution 3:

In practice, C++ implementations generally implement pass-by-reference by passing a pointer under the hood (assuming the call isn't inlined).

So there's no clever mechanism that will allow pass-by-reference to be faster, since it's no faster to pass a pointer than to pass a small value. And pass-by-value can also benefit from better optimization once you're in the function. For example:

int foo(const int &a, int *b) {
    int c = a;
    *b = 2;
    return c + a;
}

For all the compiler knows, b might point to a; this is called "aliasing". Had a been passed by value, this function could be optimized to the equivalent of *b = 2; return 2 * a;. In a modern CPU's instruction pipeline, that version looks more like "start loading a, start storing to b, wait for a to load, multiply by 2, wait for the store, return", whereas the conservative version is "start loading a, start storing to b, wait for a to load, wait for the store, load a again, wait for it to load, add it to c, return". The extra dependent load shows why the mere potential for aliasing can have a significant effect on performance in some cases, even if the effect isn't huge in this one.

Of course, aliasing only impedes optimization when it could change the function's result for some possible input. But just because your intention is that aliasing should never affect the results, it doesn't follow that the compiler can assume it doesn't: sometimes no aliasing actually occurs in your program, but the compiler can't prove that. And there doesn't have to be a second pointer parameter: any time your function calls code the optimizer "can't see", it has to assume that anything reachable through a reference could change.