400x Sorting Speedup by Switching a.localeCompare(b) to (a<b?-1:(a>b?1:0))

By switching a javascript sort function from

myArray.sort(function (a, b) {
  return a.name.localeCompare(b.name);
});

to

myArray.sort(function (a, b) {
  return (a.name < b.name ? -1 : (a.name > b.name ? 1 : 0));
});

I was able to cut the time to sort a ~1700 element array in Chrome from 1993 milliseconds to 5 milliseconds. Almost a 400x speedup. Unfortunately this is at the expense of correctly sorting non-english strings.

Obviously I can't have my UI blocking for 2 seconds when I try to do a sort. Is there anything I can do to avoid the horribly slow localeCompare but still maintain support for localized strings?


Solution 1:

A great performance improvement can be obtained by declaring the collator object beforehand and using it's compare method. EG:

const collator = new Intl.Collator('en', { numeric: true, sensitivity: 'base' });
arrayOfObjects.sort((a, b) => {
  return collator.compare(a.name, b.name);
});

NOTE: This doesn't work ok if the elements are floats. See explanation here: Intl.Collator and natural sort with numeric option sorts incorrectly with decimal numbers

Here's a benchmark script comparing the 3 methods:

const arr = [];
for (let i = 0; i < 2000; i++) {
  arr.push(`test-${Math.random()}`);
}

const arr1 = arr.slice();
const arr2 = arr.slice();
const arr3 = arr.slice();

console.time('#1 - localeCompare');
arr1.sort((a, b) => a.localeCompare(
  b,
  undefined, {
    numeric: true,
    sensitivity: 'base'
  }
));
console.timeEnd('#1 - localeCompare');

console.time('#2 - collator');
const collator = new Intl.Collator('en', {
  numeric: true,
  sensitivity: 'base'
});
arr2.sort((a, b) => collator.compare(a, b));
console.timeEnd('#2 - collator');

console.time('#3 - non-locale');
arr3.sort((a, b) => (a < b ? -1 : (a > b ? 1 : 0)));
console.timeEnd('#3 - non-locale');

Solution 2:

An effective Approach that I found when dealing with /mostly/ latin characters is to use the operator whenever both strings match a specific regex. EG: /^[\w-.\s,]*$/

It's much faster if both strings match the expression, and at worst it seems to be slightly slower than blindly calling localeCompare.

Example here: http://jsperf.com/operator-vs-localecompage/11

Update: it seems like Intl.Collator is currently the best option for performance across the board: https://jsperf.com/operator-vs-localecompage/22

Solution 3:

It's difficult to know the fastest sort without seeing the data you are sorting. But jsperf has lots of good tests showing the performance differences between types of sorting: http://jsperf.com/javascript-sort/45 http://jsperf.com/sort-algorithms/31

However none of these account for localised strings, and i'd imagine there is no easy way to sort localised strings and localeCompare is probably the best solution for this.

Looking at mozilla reference is says: "When comparing large numbers of strings, such as in sorting large arrays, it is better to create an Intl.Collator object and use the function provided by its compare property." https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare

But going to the Intl.Collator reference it shows that is not support for firefox/safari https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Collator

you could try using some of the options on localCompare to speed up the performance. But I've just done a quick test changing the sensitivity level and it seems like it won't improve the performance:

list.sort(function(a, b) {
  return a.localeCompare(b, {sensitivity:'base'});
});

http://jsperf.com/sort-locale-strings

Solution 4:

Try sorting it in 2 steps:

  1. With the operator: as you said, it will be 400 times faster
  2. Then with localCompare(): this has now less comparisons to do because the array is mostly sorted.

Note: I think that localCompare() will mostly be called with at least 1 string that is not english. So the number of calls to localCompare() with 2 english strings should be greatly reduced.

Here is the code:

myArray.sort(function(a, b) {
  return (a.name < b.name ? -1 : (a.name > b.name ? 1 : 0));
});

myArray.sort(function(a, b) {
  return a.name.localeCompare(b.name);
});

This solution has the advantage of being short and easy to use. It will be efficient if the array contains mostly english strings. The more non-english strings you have, the less useful the first sort will be. But as it is easy to add in your scripts, it is also easy to see if this approach is worthwile.

Now if I were you, I would also use an Intl.Collator, as it is said to be much faster than localCompare() when you have many comparisons to do.