What's the difference between passing by reference vs. passing by value?

Solution 1:

First and foremost, the "pass by value vs. pass by reference" distinction as defined in CS theory is now obsolete, because the technique originally defined as "pass by reference" has since fallen out of favor and is seldom used now.[1]

Newer languages[2] tend to use a different (but similar) pair of techniques to achieve the same effects (see below), which is the primary source of confusion.

A secondary source of confusion is the fact that in "pass by reference", "reference" has a narrower meaning than the general term "reference" (because the phrase predates it).


Now, the authentic definition is:

  • When a parameter is passed by reference, the caller and the callee use the same variable for the parameter. If the callee modifies the parameter variable, the effect is visible to the caller's variable.

  • When a parameter is passed by value, the caller and callee have two independent variables with the same value. If the callee modifies the parameter variable, the effect is not visible to the caller.

Things to note in this definition are:

  • "Variable" here means the caller's (local or global) variable itself -- i.e. if I pass a local variable by reference and assign to it, I'll change the caller's variable itself, not e.g. whatever it is pointing to if it's a pointer.

    • This is now considered bad practice (as an implicit dependency). As such, virtually all newer languages are exclusively, or almost exclusively, pass-by-value. Pass-by-reference is now chiefly used in the form of "output/inout arguments" in languages where a function cannot return more than one value.
  • The meaning of "reference" in "pass by reference". The difference with the general "reference" term is that this "reference" is temporary and implicit. What the callee basically gets is a "variable" that is somehow "the same" as the original one. How specifically this effect is achieved is irrelevant (e.g. the language may also expose some implementation details -- addresses, pointers, dereferencing -- this is all irrelevant; if the net effect is this, it's pass-by-reference).
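
To make the "output argument" use concrete, here is a minimal C++ sketch. The names parse_digits_out and parse_digits_pair are hypothetical, invented for this illustration, and the parsing is deliberately simplistic (digits only, no overflow handling). It contrasts an out-parameter passed by reference with returning both results in one value.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: reports success via the return value and hands the
// parsed number back through a by-reference "out" parameter.
// Assumption: input is short enough that overflow is not a concern.
bool parse_digits_out(const std::string& text, int& out) {
    if (text.empty()) return false;
    int result = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        result = result * 10 + (c - '0');
    }
    out = result;  // assigns to the caller's variable itself
    return true;
}

// The same job without an out-parameter: both pieces of information
// travel in a single returned value (success flag, parsed number).
std::pair<bool, int> parse_digits_pair(const std::string& text) {
    int value = 0;
    bool ok = parse_digits_out(text, value);
    return std::make_pair(ok, value);
}
```

On failure the out-parameter version leaves the caller's variable untouched, which is exactly the kind of implicit dependency the note above calls out.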


Now, in modern languages, variables tend to be of "reference types" (another concept invented later than "pass by reference" and inspired by it), i.e. the actual object data is stored separately somewhere (usually, on the heap), and only "references" to it are ever held in variables and passed as parameters.[3]

Passing such a reference falls under pass-by-value because a variable's value is technically the reference itself, not the referred object. However, the net effect on the program can be the same as either pass-by-value or pass-by-reference:

  • If a reference is just taken from a caller's variable and passed as an argument, this has the same effect as pass-by-reference: if the referred object is mutated in the callee, the caller will see the change.
    • However, if a variable holding this reference is reassigned, it will stop pointing to that object, so any further operations on this variable will instead affect whatever it is pointing to now.
  • To have the same effect as pass-by-value, a copy of the object is made at some point. Options include:
    • The caller can just make a private copy before the call and give the callee a reference to that instead.
    • In some languages, some object types are "immutable": any operation on them that seems to alter the value actually creates a completely new object without affecting the original one. So, passing an object of such a type as an argument always has the effect of pass-by-value: a copy for the callee will be made automatically if and when it needs a change, and the caller's object will never be affected.
      • In purely functional languages, all values are immutable.

As you may see, this pair of techniques is almost the same as those in the definition, only with a level of indirection: just replace "variable" with "referenced object".
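
A rough C++ sketch of these effects, using std::shared_ptr as a stand-in for the kind of "reference" a variable holds in such languages. This is an analogy rather than how any particular language implements it, and the names IntList, append_one, reassign and private_copy are made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// A "reference" held in a variable; the vector itself lives on the heap.
using IntList = std::shared_ptr<std::vector<int>>;

// Mutating the referred object: the caller sees the change.
void append_one(IntList list) {
    list->push_back(1);
}

// Reassigning the parameter: only the callee's copy of the reference
// changes, so later operations affect the new object, not the caller's.
void reassign(IntList list) {
    list = std::make_shared<std::vector<int>>();
    list->push_back(99);
}

// Caller-side private copy: the callee would receive a reference to a
// separate object, giving the effect of pass-by-value.
IntList private_copy(const IntList& original) {
    return std::make_shared<std::vector<int>>(*original);
}
```

Calling append_one(a) grows the caller's list, reassign(a) leaves it alone, and append_one(private_copy(a)) only touches the copy.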

There's no agreed-upon name for them, which leads to contorted explanations like "call by value where the value is a reference". In 1975, Barbara Liskov suggested the term "call-by-object-sharing" (or sometimes just "call-by-sharing") though it never quite caught on. Moreover, neither of these phrases draws a parallel with the original pair. No wonder the old terms ended up being reused in the absence of anything better, leading to confusion.[4]


NOTE: For a long time, this answer used to say:

Say I want to share a web page with you. If I tell you the URL, I'm passing by reference. You can use that URL to see the same web page I can see. If that page is changed, we both see the changes. If you delete the URL, all you're doing is destroying your reference to that page - you're not deleting the actual page itself.

If I print out the page and give you the printout, I'm passing by value. Your page is a disconnected copy of the original. You won't see any subsequent changes, and any changes that you make (e.g. scribbling on your printout) will not show up on the original page. If you destroy the printout, you have actually destroyed your copy of the object - but the original web page remains intact.

This is mostly correct except for the narrower meaning of "reference" -- it being both temporary and implicit (it doesn't have to be, but being explicit and/or persistent are additional features, not a part of the pass-by-reference semantics, as explained above). A closer analogy would be giving you a copy of a document vs. inviting you to work on the original.


[1] Unless you are programming in Fortran or Visual Basic, it's not the default behavior, and in most languages in modern use, true call-by-reference is not even possible.

[2] A fair number of older ones support it, too.

[3] In several modern languages, all types are reference types. This approach was pioneered by the language CLU in 1975 and has since been adopted by many other languages, including Python and Ruby. And many more languages use a hybrid approach, where some types are "value types" and others are "reference types" -- among them are C#, Java, and JavaScript.

[4] There's nothing wrong with recycling a fitting old term per se, but one has to somehow make it clear which meaning is used each time. Not doing that is exactly what keeps causing confusion.

Solution 2:

It's a way of passing arguments to functions. Passing by reference means the called function's parameter will be the same as the caller's passed argument (not the value, but the identity - the variable itself). Pass by value means the called function's parameter will be a copy of the caller's passed argument. The value will be the same, but the identity - the variable - is different. Thus, changes made to a parameter by the called function change the argument passed in one case, and in the other case change only the value of the parameter in the called function (which is only a copy). In a quick hurry:

  • Java only supports pass by value. It always copies arguments, even though, when copying a reference to an object, the parameter in the called function will point to the same object and changes to that object will be seen in the caller. Since this can be confusing, here is what Jon Skeet has to say about this.
  • C# supports pass by value and pass by reference (keyword ref used at caller and called function). Jon Skeet also has a nice explanation of this here.
  • C++ supports pass by value and pass by reference (reference parameter type used at called function). You will find an explanation of this below.

Code

Since my language is C++, I will use that here

#include <cassert>  // assert
#include <cstddef>  // NULL

// passes a pointer (called reference in Java) to an integer
void call_by_value(int *p) { // :1
    p = NULL;
}

// passes an integer
void call_by_value(int p) { // :2
    p = 42;
}

// passes an integer by reference
void call_by_reference(int & p) { // :3
    p = 42;
}

// this is the Java style of passing references. NULL is called "null" there.
void call_by_value_special(int *p) { // :4
    *p = 10; // changes what p points to ("what p references" in Java)
    // only changes the value of the parameter, but *not* of
    // the argument passed by the caller. thus it's pass-by-value:
    p = NULL;
}

int main() {
    int value = 10;
    int * pointer = &value;

    call_by_value(pointer); // :1
    assert(pointer == &value); // pointer was copied

    call_by_value(value); // :2
    assert(value == 10); // value was copied

    call_by_reference(value); // :3
    assert(value == 42); // value was passed by reference

    call_by_value_special(pointer); // :4
    // pointer was copied but what pointer references was changed.
    assert(value == 10 && pointer == &value);
}

And an example in Java won't hurt:

class Example {
    int value = 0;

    // similar to the :4 case in the C++ example
    static void accept_reference(Example e) { // :1
        e.value++; // will change the referenced object
        e = null; // will only change the parameter
    }

    // similar to the :2 case in the C++ example
    static void accept_primitive(int v) { // :2
        v++; // will only change the parameter
    }

    public static void main(String... args) {
        int value = 0;
        Example ref = new Example(); // reference

        // note what we pass is the reference, not the object. we can't 
        // pass objects. The reference is copied (pass-by-value).
        accept_reference(ref); // :1
        assert ref != null && ref.value == 1;

        // the primitive int variable is copied
        accept_primitive(value); // :2
        assert value == 0;
    }
}

Wikipedia

http://en.wikipedia.org/wiki/Pass_by_reference#Call_by_value

http://en.wikipedia.org/wiki/Pass_by_reference#Call_by_reference

This guy pretty much nails it:

http://javadude.com/articles/passbyvalue.htm

Solution 3:

Many answers here (and in particular the most highly upvoted answer) are factually incorrect, since they misunderstand what "call by reference" really means. Here's my attempt to set matters straight.

TL;DR

In simplest terms:

  • call by value means that you pass values as function arguments
  • call by reference means that you pass variables as function arguments

In metaphoric terms:

  • Call by value is where I write down something on a piece of paper and hand it to you. Maybe it's a URL, maybe it's a complete copy of War and Peace. No matter what it is, it's on a piece of paper which I've given to you, and so now it is effectively your piece of paper. You are now free to scribble on that piece of paper, or use that piece of paper to find something somewhere else and fiddle with it, whatever.
  • Call by reference is when I give you my notebook which has something written down in it. You may scribble in my notebook (maybe I want you to, maybe I don't), and afterwards I keep my notebook, with whatever scribbles you've put there. Also, if what either you or I wrote there is information about how to find something somewhere else, either you or I can go there and fiddle with that information.

What "call by value" and "call by reference" don't mean

Note that both of these concepts are completely independent of, and orthogonal to, the concept of reference types (which in Java is all types that are subtypes of Object, and in C# all class types), or the concept of pointer types like in C (which are semantically equivalent to Java's "reference types", simply with different syntax).

The notion of reference type corresponds to a URL: it is both itself a piece of information, and it is a reference (a pointer, if you will) to other information. You can have many copies of a URL in different places, and they don't change what website they all link to; if the website is updated then every URL copy will still lead to the updated information. Conversely, changing the URL in any one place won't affect any other written copy of the URL.

Note that C++ has a notion of "references" (e.g. int&) that is not like Java and C#'s "reference types", but is like "call by reference". Java and C#'s "reference types", and all types in Python, are like what C and C++ call "pointer types" (e.g. int*).


OK, here's the longer and more formal explanation.

Terminology

To start with, I want to highlight some important bits of terminology, to help clarify my answer and to ensure we're all referring to the same ideas when we are using words. (In practice, I believe the vast majority of confusion about topics such as these stems from using words in ways that do not fully communicate the meaning that was intended.)

To start, here's an example in some C-like language of a function declaration:

void foo(int param) {  // line 1
  param += 1;
}

And here's an example of calling this function:

void bar() {
  int arg = 1;  // line 2
  foo(arg);     // line 3
}

Using this example, I want to define some important bits of terminology:

  • foo is a function declared on line 1 (Java insists on making all functions methods, but the concept is the same without loss of generality; C and C++ make a distinction between declaration and definition which I won't go into here)
  • param is a formal parameter to foo, also declared on line 1
  • arg is a variable, specifically a local variable of the function bar, declared and initialized on line 2
  • arg is also an argument to a specific invocation of foo on line 3

There are two very important sets of concepts to distinguish here. The first is value versus variable:

  • A value is the result of evaluating an expression in the language. For example, in the bar function above, after the line int arg = 1;, the expression arg has the value 1.
  • A variable is a container for values. A variable can be mutable (this is the default in most C-like languages), read-only (e.g. declared using Java's final or C#'s readonly) or deeply immutable (e.g. using C++'s const).
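
As a loose illustration of the difference between a read-only variable and a deeply immutable one, here is a C++ sketch using pointers. The mapping onto Java's final or C#'s readonly is only an approximation, and Counter and both function names are hypothetical.

```cpp
#include <cassert>

struct Counter { int count = 0; };

// The variable `c` is read-only (it cannot be re-pointed), but the object
// it points to can still be changed through it.
void bump_through_readonly_pointer(Counter* const c) {
    c->count += 1;
    // c = nullptr;      // would not compile: c itself is read-only
}

// Here the pointed-to object is const as well, so neither the variable nor
// the object can be modified through `c`.
int read_through_const_pointer(const Counter* const c) {
    // c->count += 1;    // would not compile: *c is const
    return c->count;
}
```

The first function leaves the caller's Counter incremented even though its parameter variable can never be reassigned.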

The other important pair of concepts to distinguish is parameter versus argument:

  • A parameter (also called a formal parameter) is a variable which must be supplied by the caller when calling a function.
  • An argument is a value that is supplied by the caller of a function to satisfy a specific formal parameter of that function.

Call by value

In call by value, the function's formal parameters are variables that are newly created for the function invocation, and which are initialized with the values of their arguments.

This works exactly the same way that any other kinds of variables are initialized with values. For example:

int arg = 1;
int another_variable = arg;

Here arg and another_variable are completely independent variables -- their values can change independently of each other. However, at the point where another_variable is declared, it is initialized to hold the same value that arg holds -- which is 1.

Since they are independent variables, changes to another_variable do not affect arg:

int arg = 1;
int another_variable = arg;
another_variable = 2;

assert arg == 1; // true
assert another_variable == 2; // true

This is exactly the same as the relationship between arg and param in our example above, which I'll repeat here for symmetry:

void foo(int param) {
  param += 1;
}

void bar() {
  int arg = 1;
  foo(arg);
}

It is exactly as if we had written the code this way:

// entering function "bar" here
int arg = 1;
// entering function "foo" here
int param = arg;
param += 1;
// exiting function "foo" here
// exiting function "bar" here

That is, the defining characteristic of what call by value means is that the callee (foo in this case) receives values as arguments, but has its own separate variables for those values from the variables of the caller (bar in this case).
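
The same holds when the value is large. In C++ specifically (this is an assumption about the language chosen for the sketch; Java or Python would copy only a reference here), passing a std::vector by value initializes the parameter with a copy of the whole container. The function name append_to_copy is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// `numbers` is a new variable initialized from the argument, so the whole
// vector is copied; the push_back below touches only that copy.
std::size_t append_to_copy(std::vector<int> numbers) {
    numbers.push_back(4);
    return numbers.size();
}
```

If a caller holds a three-element vector v, append_to_copy(v) returns 4 while v.size() is still 3 afterwards.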

Going back to my metaphor above, if I'm bar and you're foo, when I call you, I hand you a piece of paper with a value written on it. You call that piece of paper param. That value is a copy of the value I have written in my notebook (my local variables), in a variable I call arg.

(As an aside: depending on hardware and operating system, there are various calling conventions about how you call one function from another. The calling convention is like us deciding whether I write the value on a piece of my paper and then hand it to you, or if you have a piece of paper that I write it on, or if I write it on the wall in front of both of us. This is an interesting subject as well, but far beyond the scope of this already long answer.)

Call by reference

In call by reference, the function's formal parameters are simply new names for the same variables that the caller supplies as arguments.

Going back to our example above, it's equivalent to:

// entering function "bar" here
int arg = 1;
// entering function "foo" here
// aha! I note that "param" is just another name for "arg"
arg /* param */ += 1;
// exiting function "foo" here
// exiting function "bar" here

Since param is just another name for arg -- that is, they are the same variable -- changes to param are reflected in arg. This is the fundamental way in which call by reference differs from call by value.

Very few languages support call by reference, but C++ can do it like this:

void foo(int& param) {
  param += 1;
}

void bar() {
  int arg = 1;
  foo(arg);
}

In this case, param doesn't just have the same value as arg, it actually is arg (just by a different name) and so bar can observe that arg has been incremented.

Note that this is not how any of Java, JavaScript, C, Objective-C, Python, or nearly any other popular language today works. This means that those languages are not call by reference, they are call by value.
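
One commonly used litmus test is whether a swap function can exchange the caller's variables. A minimal C++ sketch follows (the function names are hypothetical): the reference version swaps the caller's variables, while the pointer version, which mirrors what a call-by-value language does with references, only swaps the callee's local copies.

```cpp
#include <cassert>

// Call by reference: a and b are the caller's variables under new names.
void swap_by_reference(int& a, int& b) {
    int tmp = a;
    a = b;
    b = tmp;
}

// Call by value with pointer values: exchanging the parameters only
// exchanges the callee's local copies of the two addresses.
void swap_pointer_copies(int* a, int* b) {
    int* tmp = a;
    a = b;
    b = tmp;
}
```

After swap_pointer_copies(&x, &y) the caller's x and y are unchanged; after swap_by_reference(x, y) they are exchanged.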

Addendum: call by object sharing

If what you have is call by value, but the actual value is a reference type or pointer type, then the "value" itself isn't very interesting (e.g. in C it's just an integer of a platform-specific size) -- what's interesting is what that value points to.

If what that reference type (that is, pointer) points to is mutable then an interesting effect is possible: you can modify the pointed-to value, and the caller can observe changes to the pointed-to value, even though the caller cannot observe changes to the pointer itself.

To borrow the analogy of the URL again, the fact that I gave you a copy of the URL to a website is not particularly interesting if the thing we both care about is the website, not the URL. The fact that your scribbling over your copy of the URL doesn't affect my copy of the URL isn't a thing we care about (and in fact, in languages like Java and Python the "URL", or reference type value, can't be modified at all, only the thing pointed to by it can).

Barbara Liskov, when she invented the CLU programming language (which had these semantics), realized that the existing terms "call by value" and "call by reference" weren't particularly useful for describing the semantics of this new language. So she invented a new term: call by object sharing.

When discussing languages that are technically call by value, but where common types in use are reference or pointer types (that is: nearly every modern imperative, object-oriented, or multi-paradigm programming language), I find it's a lot less confusing to simply avoid talking about call by value or call by reference. Stick to call by object sharing (or simply call by object) and nobody will be confused. :-)

Solution 4:

Before understanding the 2 terms, you MUST understand the following. Every object has 2 things that distinguish it.

  • Its value.
  • Its address.

So if you say employee.name = "John"

know that there are 2 things about name. Its value which is "John" and also its location in the memory which is some hexadecimal number maybe like this: 0x7fd5d258dd00.

Depending on the language's architecture or the type (class, struct, etc.) of your object, you would be transferring either "John" or 0x7fd5d258dd00.

Passing "John" is known as passing by value. Passing 0x7fd5d258dd00 is known as passing by reference. Anyone who is pointing to this memory location will have access to the value of "John".
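
A small C++ sketch of the value/address distinction, comparing addresses to show when the callee is working on the caller's object. Employee and the function names are hypothetical. Note that in C++ an address is handed over as a pointer, and the pointer itself is copied like any other value.

```cpp
#include <cassert>
#include <string>

struct Employee { std::string name; };

// Receives a copy: the parameter lives at a different address than the
// caller's object.
bool same_object_by_value(Employee e, const Employee* original) {
    return &e == original;
}

// Receives the caller's variable itself: same address, same object.
bool same_object_by_reference(Employee& e, const Employee* original) {
    return &e == original;
}

// Receives the address (copied as a value); writing through it changes
// the caller's object.
void rename_through_pointer(Employee* e, const std::string& new_name) {
    e->name = new_name;
}
```

For an Employee emp, same_object_by_value(emp, &emp) is false, same_object_by_reference(emp, &emp) is true, and rename_through_pointer(&emp, "Jane") changes emp.name for the caller.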

For more on this, I recommend you to read about dereferencing a pointer and also why choose struct (value type) over class (reference type)

Solution 5:

Here is an example:

#include <iostream>

void by_val(int arg) { arg += 2; }
void by_ref(int& arg) { arg += 2; }

int main()
{
    int x = 0;
    by_val(x); std::cout << x << std::endl;  // prints 0
    by_ref(x); std::cout << x << std::endl;  // prints 2

    int y = 0;
    by_ref(y); std::cout << y << std::endl;  // prints 2
    by_val(y); std::cout << y << std::endl;  // prints 2
}