Why is <= slower than < using this code snippet in V8?

Solution 1:

Other answers and comments mention that the difference between the two loops is that the first one executes one more iteration than the second one. This is true, but in an array that grows to 25,000 elements, one iteration more or less would only make a minuscule difference. As a ballpark guess, if we assume the average length as it grows is 12,500, then the difference we might expect should be around 1/12,500, or only 0.008%.

The performance difference here is much larger than would be explained by that one extra iteration, and the problem is explained near the end of the presentation.

this.primes is a contiguous array (every element holds a value) and the elements are all numbers.

A JavaScript engine may optimize such an array to be a simple array of actual numbers, instead of an array of objects which happen to contain numbers but could contain other values or no value. The first format is much faster to access: it takes less code, and the array is much smaller so it fits better in cache. But there are some conditions that can prevent this optimized format from being used.
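
For example, as a quick illustrative sketch (not specific to any one engine), these two arrays look similar, but only the first can be stored as plain numbers:

let fast = [1, 2, 3];      // every element is a number - can be stored as raw numbers
let slow = [1, 'two', 3];  // mixed types - each element must be stored as a generic value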

One condition would be if some of the array elements are missing. For example:

let a = [];
a[0] = 10;
a[2] = 20;

Now what is the value of a[1]? It has no value. (It isn't even correct to say it has the value undefined - an array element containing the undefined value is different from an array element that is missing entirely.)
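
You can observe the difference with the in operator, which reports whether an index actually exists. Continuing the example above:

console.log(a[1]);    // undefined - but only because the element is missing
console.log(1 in a);  // false: index 1 was never assigned
a[1] = undefined;
console.log(1 in a);  // true: the element now exists and holds undefined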

There isn't a way to represent this with numbers only, so the JavaScript engine is forced to use the less optimized format. If a[1] contained a numeric value like the other two elements, the array could potentially be optimized into an array of numbers only.

Another reason for an array to be forced into the deoptimized format can be if you attempt to access an element outside the bounds of the array, as discussed in the presentation.

The first loop with <= attempts to read an element past the end of the array. The algorithm still works correctly, because in the last extra iteration:

  • this.primes[i] evaluates to undefined because i is past the array end.
  • candidate % undefined (for any value of candidate) evaluates to NaN.
  • NaN == 0 evaluates to false.
  • Therefore, the return true is not executed.

So it's as if the extra iteration never happened - it has no effect on the rest of the logic. The code produces the same result as it would without the extra iteration.
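
You can verify each of these steps in a JavaScript console:

let primes = [2, 3, 5];
let candidate = 7;
console.log(primes[3]);              // undefined - reading past the end doesn't throw
console.log(candidate % primes[3]);  // NaN
console.log(NaN == 0);               // false, so the return true is never reached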

But to get there, it tried to read a nonexistent element past the end of the array. This forces the array out of optimization - or at least did at the time of this talk.

The second loop with < reads only elements that exist within the array, so it allows an optimized array and code.

The problem is described in pages 90-91 of the talk, with related discussion in the pages before and after that.

I happened to attend this very Google I/O presentation and talked with the speaker (one of the V8 authors) afterward. I had been using a technique in my own code that involved reading past the end of an array as a misguided (in hindsight) attempt to optimize one particular situation. He confirmed that even just reading past the end of an array would prevent the simple optimized format from being used.

If what he said is still true, then reading past the end of the array forces it out of the optimized format, and it has to fall back to the slower one.

Now it's possible that V8 has been improved in the meantime to efficiently handle this case, or that other JavaScript engines handle it differently. I don't know one way or the other on that, but this deoptimization is what the presentation was talking about.

Solution 2:

I work on V8 at Google, and wanted to provide some additional insight on top of the existing answers and comments.

For reference, here's the full code example from the slides:

var iterations = 25000;

function Primes() {
  this.prime_count = 0;
  this.primes = new Array(iterations);
  this.getPrimeCount = function() { return this.prime_count; }
  this.getPrime = function(i) { return this.primes[i]; }
  this.addPrime = function(i) {
    this.primes[this.prime_count++] = i;
  }
  this.isPrimeDivisible = function(candidate) {
    for (var i = 1; i <= this.prime_count; ++i) {
      if ((candidate % this.primes[i]) == 0) return true;
    }
    return false;
  }
};

function main() {
  var p = new Primes();
  var c = 1;
  while (p.getPrimeCount() < iterations) {
    if (!p.isPrimeDivisible(c)) {
      p.addPrime(c);
    }
    c++;
  }
  console.log(p.getPrime(p.getPrimeCount() - 1));
}

main();

First and foremost, the performance difference has nothing to do with the < and <= operators directly. So please don't jump through hoops just to avoid <= in your code because you read on Stack Overflow that it's slow. It isn't!


Second, folks pointed out that the array is "holey". This was not clear from the code snippet in OP's post, but it is clear when you look at the code that initializes this.primes:

this.primes = new Array(iterations);

This results in an array with a HOLEY elements kind in V8, even if the array ends up completely filled/packed/contiguous. In general, operations on holey arrays are slower than operations on packed arrays, but in this case the difference is negligible: it amounts to 1 additional Smi (small integer) check (to guard against holes) each time we hit this.primes[i] in the loop within isPrimeDivisible. No big deal!
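
You can check the elements kind yourself via the V8 internals exposed behind the --allow-natives-syntax flag. Here's a small sketch, assuming a recent Node.js; the %-functions are unsupported V8 test hooks, so their names and availability may change between versions:

// Run with: node --allow-natives-syntax elements.js
// %HasHoleyElements is a V8-internal test hook, not a stable API.
const holey = new Array(3);                // preallocated length, no elements yet
holey[0] = 2; holey[1] = 3; holey[2] = 5;  // fully filled, but the elements kind stays HOLEY
const packed = [2, 3, 5];                  // created fully filled: PACKED

console.log(%HasHoleyElements(holey));   // true - V8 never transitions from HOLEY back to PACKED
console.log(%HasHoleyElements(packed));  // false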

TL;DR The array being HOLEY is not the problem here.


Others pointed out that the code reads out of bounds. It's generally recommended to avoid reading beyond the length of an array, and in this case doing so would indeed have avoided the massive drop in performance. But why? V8 can handle some of these out-of-bounds scenarios with only a minor performance impact. What's so special about this particular case, then?

The out-of-bounds read results in this.primes[i] being undefined on this line:

if ((candidate % this.primes[i]) == 0) return true;

And that brings us to the real issue: the % operator is now being used with non-integer operands!

  • integer % someOtherInteger can be computed very efficiently; JavaScript engines can produce highly-optimized machine code for this case.

  • integer % undefined on the other hand amounts to a way less efficient Float64Mod, since undefined is represented as a double.
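
The coercion behind this is plain JavaScript semantics: % converts its operands to numbers, and undefined converts to NaN, which has no integer representation:

console.log(Number(undefined));  // NaN
console.log(7 % undefined);      // NaN - no integer fast path possible here
console.log(7 % 3);              // 1   - both operands are small integers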

The code snippet can indeed be improved by changing the <= into < on this line:

for (var i = 1; i <= this.prime_count; ++i) {

...not because < is somehow a superior operator to <=, but just because this avoids the out-of-bounds read in this particular case.
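
For clarity, here's the corrected method in full (taken from the slide code above, with only the operator changed):

this.isPrimeDivisible = function(candidate) {
  for (var i = 1; i < this.prime_count; ++i) {
    if ((candidate % this.primes[i]) == 0) return true;
  }
  return false;
}

With <, the last iteration reads this.primes[this.prime_count - 1], the final element that actually exists, so candidate % this.primes[i] always operates on two small integers and stays on the fast path.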