Solution 1:

I'm also new to d3 and was struggling with the key function. I didn't find Tristan Reid's answer illuminating, because it doesn't really talk much about the key function.

Let's work through an example, first without a key function, and then with.

Here's our initial html before applying javascript. We've got two divs, and there is no data attached to anything.

<body>
    <div>** First div **</div>
    <div>** Second div **</div>
</body>

Calling data() with no key function

Let's add a couple lines of javascript.

var arr1 = [35, 70, 24, 86, 59];
d3.select("body")
    .selectAll("div")
    .data(arr1)
    .enter()
    .append("div")
    .html(function(d) { return d });

What does our html look like now? Here is the html along with the values of the associated data (comments added).

<body>
    <div>** First div ** </div>   <!-- data:  35 -->
    <div>** Second div ** </div>  <!-- data:  70 -->
    <div>24</div>                 <!-- data:  24 -->
    <div>86</div>                 <!-- data:  86 -->
    <div>59</div>                 <!-- data:  59 -->
</body>

The data() call matched an array of divs with an array of values by key. By default, the key for each item is its index in its own array, so these are the keys that were used.

selected divs (by text)  key       data elements  key
-----------------------  ---       -------------  ---
** First div **          0         35             0
** Second div **         1         70             1
                                   24             2
                                   86             3
                                   59             4

Going by the keys, two of the data elements have matches in the selected divs -- those with keys 0 and 1. Those matching divs get bound to data, but nothing else happens to them.

All the data elements without a matching key get passed to enter(). In this case, there is no match for the keys 2, 3, and 4. So those data elements get passed to enter(), and a new div is appended for each of them. The appended divs are also bound to their respective data values.
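To make the index matching concrete, here's a rough plain-JavaScript model of what the join does when there's no key function. It's only a sketch under my own assumptions: joinByIndex and the datum property are names I made up for illustration (d3 itself stores bound data on the element's __data__ property), and the real implementation does more than this.

```javascript
// Simplified model of a join with no key function: pair nodes and data by index.
// Plain objects stand in for DOM elements; this is not d3's actual code.
function joinByIndex(nodes, data) {
  const update = [];
  const enter = [];
  const exit = [];
  const shared = Math.min(nodes.length, data.length);

  for (let i = 0; i < shared; i++) {
    nodes[i].datum = data[i]; // existing node at index i gets bound to data[i]
    update.push(nodes[i]);
  }
  for (let i = shared; i < data.length; i++) {
    enter.push(data[i]);      // leftover data elements have no node yet
  }
  for (let i = shared; i < nodes.length; i++) {
    exit.push(nodes[i]);      // leftover nodes have no data element
  }
  return { update, enter, exit };
}

const divs = [{ text: "** First div **" }, { text: "** Second div **" }];
const result = joinByIndex(divs, [35, 70, 24, 86, 59]);

console.log(divs.map(d => d.datum)); // [35, 70]
console.log(result.enter);           // [24, 86, 59]
```

The two existing divs pick up 35 and 70, and the three leftover values are what enter() hands to append().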

Calling data() with a key function

Let's change our javascript, keeping what we have but adding a couple more lines. We'll perform the same selects with a data call (with a different array), but this time using a key function. Notice the partial overlap between arr1 and arr2.

var arr1 = [35, 70, 24, 86, 59];
d3.select("body")
    .selectAll("div")
    .data(arr1)                            // no key function
    .enter()
    .append("div")
    .html(function(d) { return d });

var arr2 = [35, 7, 24, 2];
d3.select("body")
    .selectAll("div")
    .data(arr2, function(d) { return d })  // key function used
    .enter()
    .append("div")
    .html(function(d) { return "new: " + d});

The resulting html looks like this (with comments added):

<body>
    <div>** First div ** </div>   <!-- data:  35 -->
    <div>** Second div ** </div>  <!-- data:  70 -->
    <div>24</div>                 <!-- data:  24 -->
    <div>86</div>                 <!-- data:  86 -->
    <div>59</div>                 <!-- data:  59 -->
    <div>new: 7</div>             <!-- data:  7 -->
    <div>new: 2</div>             <!-- data:  2 -->
</body>

The second call to data() used the value returned by the function for the keys. For the selected elements, the function returns a value derived from the data that had already been bound to them by the first call to data(). That is, their key is based on their bound data.

For the second data() call, the keys used for matching look like this. (The two lists are independent; items on the same row are not paired with each other.)

selected divs (by text) key       data elements  key
----------------------- ---       -------------  ---
** First div **         35        35             35
** Second div **        70        7              7
24                      24        24             24
86                      86        2              2
59                      59

The data elements without matching keys are 7 and 2. Those data elements are passed to enter(). So we get two new divs appended to the body.
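Here's the same walk-through as a rough plain-JavaScript model of a keyed join. Again this is a sketch, not d3's code: joinByKey and the datum property are hypothetical names, and I'm assuming keys are compared as strings. It also reports the leftover divs (the exit group), which the example above never uses.

```javascript
// Simplified model of a keyed join. Plain objects stand in for DOM elements.
function joinByKey(nodes, data, key) {
  const nodeByKey = new Map();
  const update = [];
  const enter = [];
  const exit = [];

  // Key each existing node by the datum already bound to it.
  nodes.forEach((node, i) => {
    const k = String(key(node.datum, i));
    if (nodeByKey.has(k)) exit.push(node); // duplicate key: cannot be matched
    else nodeByKey.set(k, node);
  });

  // Key each new data element and look for a node with the same key.
  data.forEach((d, i) => {
    const k = String(key(d, i));
    const node = nodeByKey.get(k);
    if (node) {
      node.datum = d;        // match: rebind the existing node
      update.push(node);
      nodeByKey.delete(k);
    } else {
      enter.push(d);         // no match: this element goes to enter()
    }
  });

  // Nodes whose keys were never claimed by any data element.
  for (const node of nodeByKey.values()) exit.push(node);
  return { update, enter, exit };
}

// The five divs as they stand after the first data() call.
const boundDivs = [35, 70, 24, 86, 59].map(d => ({ datum: d }));
const joined = joinByKey(boundDivs, [35, 7, 24, 2], d => d);

console.log(joined.enter);                    // [7, 2]
console.log(joined.update.map(n => n.datum)); // [35, 24]
console.log(joined.exit.map(n => n.datum));   // [70, 86, 59]
```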

Okay, so now let's look back at the original post. The OP said that there was no difference between the data() call with a function and without. That's probably because -- as Tristan Reid suggests -- the key function was being used on html elements that had no bound data. When there's no bound data, there will be no matching keys, so all of the data elements will get passed to enter().
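A quick way to convince yourself of this, in plain JavaScript with no d3 involved: an element with no bound data has an undefined datum, so an identity key function returns undefined for it. The snippet below is just a sketch. The datum property and variable names are made up, and the String() coercion reflects my assumption about how keys get compared.

```javascript
const identity = d => d;

// Two divs that have never been through data(), so they have no bound datum.
const unboundDivs = [{ text: "** First div **" }, { text: "** Second div **" }];

// Keys computed for the existing elements: the datum is undefined for both.
const existingKeys = new Set(unboundDivs.map(n => String(identity(n.datum))));

// Any data element whose key is not among the existing keys goes to enter().
const incoming = [35, 70, 24, 86, 59];
const entering = incoming.filter(d => !existingKeys.has(String(identity(d))));

console.log([...existingKeys]); // ["undefined"]
console.log(entering);          // [35, 70, 24, 86, 59]
```

None of the incoming values can match the key "undefined", so every one of them ends up in the enter group.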

Working through this example helped illuminate for me the relationships between selections, keys, and bound data. Hopefully it will be helpful to someone else.

Solution 2:

The key function tells d3 how to join the data to the elements. If you don't supply one, the join defaults to matching by index.

To understand all this, consider that a d3 selectAll/data is performing two phases of matching. The first is the selector, e.g. d3.selectAll('div'), which will match all divs. The second is the data join, data([1,2,3]), which looks for elements whose bound data matches the data you pass in. I stress that last part because I think understanding it is fundamental to getting the full benefit from d3.

Here's an example (also a fiddle) that demonstrates the difference.

function id (d) { return d; }

d3.select('.no-key').selectAll('div')
    .data([1,2,3])
    .enter().append('div')
    .text(id);

d3.select('.yes-key').selectAll('div')
    .data([1,2,3], id)
    .enter().append('div')
    .text(id);

<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.4.11/d3.min.js"></script>
<div class='no-key'>
    <div class='a'>already here</div>
</div>
<br/>
<div class='yes-key'>
    <div>already here</div>
</div>

I applaud the efforts of the other answer, but this answer doesn't require parsing console output; it shows the actual difference in functionality.

Why does this difference happen? Here are the gory details:

If you do a d3.selectAll('div'), you are selecting all divs. If you then do a .data([1,2,3]), you are joining that data to those divs. But the join doesn't have a key function, so it isn't checking whether the divs have [1,2,3] as data elements; it just uses the first 3 divs that it finds.

If you instead do .data([1,2,3], function(d){return d;}), your key function says to match [1,2,3] against the data in the divs, so unless you have existing divs that have data elements, you won't match any existing divs.

The illustration of all this is in the .enter().append('div'), which of course adds any necessary divs that weren't found in the above matches. That's the bottom line of all this enter().append business: the number of divs it adds is (number of data elements) - (number of existing elements that match the key function).
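That arithmetic can be sketched in a few lines of plain JavaScript. This is my own simplified model, not d3's code: enterCount is a hypothetical helper, existing elements are represented only by their bound data (undefined when nothing is bound), and keys are assumed to be compared as strings.

```javascript
// Count how many new elements enter().append() would add.
// existingData: bound datum of each existing element (undefined if none).
// key: optional key function; when omitted, elements match by position.
function enterCount(existingData, newData, key) {
  if (!key) return Math.max(0, newData.length - existingData.length);

  const existingKeys = new Set(existingData.map((d, i) => String(key(d, i))));
  let matched = 0;
  newData.forEach((d, i) => {
    // Set.delete returns true only if the key was present, so each
    // existing element can be matched at most once.
    if (existingKeys.delete(String(key(d, i)))) matched++;
  });
  return newData.length - matched;
}

const id = d => d;

// Each container in the example starts with one div that has no bound data.
console.log(enterCount([undefined], [1, 2, 3]));     // 2
console.log(enterCount([undefined], [1, 2, 3], id)); // 3
```

So the no-key container gains two divs (the existing one is matched by index and picks up the datum 1), while the yes-key container gains all three.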

Hope this helps!