When I pass a variable to a function, why does the function only get a copy of the variable?

There are basically two schools of thought on this matter.

The first is pass-by-value where a copy of the value is created for the called function.

The second is pass-by-reference where the parameter that appears in the called function is an "alias" of the original. That means changes you make to it are reflected in the original.

C is generally a pass-by-value language. You can emulate pass-by-reference by passing the address of a variable and then using that to modify the original:

void setTo42 (int *x) { *x = 42; }
:
int y;
setTo42 (&y);
// y is now 42

but that's really passing a pointer to the variable by value, rather than passing the variable itself by reference.
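To make that distinction concrete, here is a minimal sketch (the helper `tryToRedirect` is a made-up name for illustration, not something from the question). It shows that the pointer itself is copied: writing through it changes the caller's variable, but reassigning it inside the function does not change where the caller's pointer points.

```cpp
// Writes through the pointer: the caller's int is modified.
void setTo42(int *x) { *x = 42; }

// Hypothetical helper: reassigns the pointer parameter itself.
// Only the function's own copy of the pointer is changed.
void tryToRedirect(int *x) {
    static int other = 99;
    x = &other;  // the caller's pointer still points at its original target
    *x = 7;      // modifies 'other', not the caller's variable
}
```

Calling `setTo42(&y)` and then `tryToRedirect(&y)` should leave `y` at 42, since the second function only ever touches its private copy of the address.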

C++ has true reference types, possibly because so many people have trouble with C pointers :-) They're done as follows:

void setTo42 (int &x) { x = 42; }

:
int y;
setTo42 (y);
// y is now 42
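For contrast, a small sketch that puts the two mechanisms side by side. The function names here are invented for the example; the only difference between them is the `&` in the parameter type.

```cpp
// Pass-by-value: x is a copy, so the assignment is lost on return.
void setTo42ByValue(int x) { x = 42; }

// Pass-by-reference: x is an alias for the caller's variable.
void setTo42ByReference(int &x) { x = 42; }
```

With `int y = 1;`, calling `setTo42ByValue(y)` leaves `y` unchanged, while `setTo42ByReference(y)` sets it to 42. Note that the two call sites look identical, which is part of why reference parameters make it harder to tell what a call can change.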

Pass-by-value is usually preferable since it limits the effects that a function can have on the "outside world" - encapsulation, modularity and localisation of effect are usually good things.

Being able to arbitrarily modify any parameters passed in would be nearly as bad as global variables in terms of modularity and code management.

However, sometimes you need pass-by-reference since it might make sense to change one of the variables passed in.


Most modern languages are defined to use pass by value. The reason is simple: it significantly simplifies reasoning about the code if you know that a function cannot change your local state. You can always pass by non-const reference if you want a function to be able to modify local state, but such cases should be extremely rare.

EDITED to respond to updated question:

No, you're not right. Pass by value is the simplest mechanism for passing parameters. Pass by reference or copy-in/copy-out are more complex (and of course, Algol's expression replacement is the most complicated).

Think about it for a while. Consider f(10). With call by value, the compiler just pushes 10 on the stack (or puts it in a register), and the function just accesses the value in situ. With call by reference, the compiler must create a temporary, initialize it with 10, and then pass a pointer to it to the function. Inside the function, the compiler must generate an indirection each time it accesses the value.
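As a rough illustration of that f(10) case in C++ terms (all names below are made up for the sketch), the by-reference call behaves much like a hand-written version that creates a temporary and passes its address. A real compiler is free to optimize most of this away, so treat it as a model of the semantics rather than of the generated code.

```cpp
// Call by value: the argument 10 is copied directly into x.
int byValue(int x) { return x + 1; }

// Call by const reference: x refers to an int object that lives elsewhere.
int byConstRef(const int &x) { return x + 1; }

// Hand-written approximation of the by-reference version:
// the indirection is explicit every time the value is read.
int byPointer(const int *x) { return *x + 1; }

int callAll() {
    int a = byValue(10);
    int b = byConstRef(10);   // a temporary int holding 10 is created and bound to x
    const int tmp = 10;       // the same temporary, written out by hand
    int c = byPointer(&tmp);
    return a + b + c;
}
```

A non-const `int &` parameter would not accept the literal 10 at all, which is another sign that a reference needs an actual object to refer to.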

Also, protecting from modification inside the function doesn't really help readability. If the function takes no reference parameters, you know without looking inside the function that it cannot modify any variables you pass as arguments, regardless of how anyone modifies the function in the future. (One could even argue that functions should not be allowed to modify global state, which would make the implementation of rand() rather difficult but would certainly help the optimizer.)