Javascript Set vs. Array performance
It maybe because Sets are relatively new to Javascript but I haven't been able to find an article, on StackO or anywhere else, that talks about the performance difference between the two in Javascript. So, what is the difference, in terms of performance, between the two? Specifically, when it comes to removing, adding and iterating.
Solution 1:
Ok, I have tested adding, iterating and removing elements from both an array and a set. I ran a "small" test, using 10 000 elements and a "big" test, using 100 000 elements. Here are the results.
Adding elements to a collection
It would seem that the .push
array method is about 4 times faster than the .add
set method, no matter the number of elements being added.
Iterating over and modifying elements in a collection
For this part of the test I used a for
loop to iterate over the array and a for of
loop to iterate over the set. Again, iterating over the array was faster. This time it would seem that it is exponentially so as it took twice as long during the "small" tests and almost four times longer during the "big" tests.
Removing elements from a collection
Now this is where it gets interesting. I used a combination of a for
loop and .splice
to remove some elements from the array and I used for of
and .delete
to remove some elements from the set. For the "small" tests, it was about three times faster to remove items from the set (2.6 ms vs 7.1 ms) but things changed drastically for the "big" test where it took 1955.1 ms to remove items from the array while it only took 83.6 ms to remove them from the set, 23 times faster.
Conclusions
At 10k elements, both tests ran comparable times (array: 16.6 ms, set: 20.7 ms) but when dealing with 100k elements, the set was the clear winner (array: 1974.8 ms, set: 83.6 ms) but only because of the removing operation. Otherwise the array was faster. I couldn't say exactly why that is.
I played around with some hybrid scenarios where an array was created and populated and then converted into a set where some elements would be removed, the set would then be reconverted into an array. Although doing this will give much better performance than removing elements in the array, the additional processing time needed to transfer to and from a set outweighs the gains of populating an array instead of a set. In the end, it is faster to only deal with a set. Still, it is an interesting idea, that if one chooses to use an array as a data collection for some big data that doesn't have duplicates, it could be advantageous performance wise, if there is ever a need to remove many elements in one operation, to convert the array to a set, perform the removal operation, and convert the set back to an array.
Array code:
var timer = function(name) {
var start = new Date();
return {
stop: function() {
var end = new Date();
var time = end.getTime() - start.getTime();
console.log('Timer:', name, 'finished in', time, 'ms');
}
}
};
var getRandom = function(min, max) {
return Math.random() * (max - min) + min;
};
var lastNames = ['SMITH', 'JOHNSON', 'WILLIAMS', 'JONES', 'BROWN', 'DAVIS', 'MILLER', 'WILSON', 'MOORE', 'TAYLOR', 'ANDERSON', 'THOMAS'];
var genLastName = function() {
var index = Math.round(getRandom(0, lastNames.length - 1));
return lastNames[index];
};
var sex = ["Male", "Female"];
var genSex = function() {
var index = Math.round(getRandom(0, sex.length - 1));
return sex[index];
};
var Person = function() {
this.name = genLastName();
this.age = Math.round(getRandom(0, 100))
this.sex = "Male"
};
var genPersons = function() {
for (var i = 0; i < 100000; i++)
personArray.push(new Person());
};
var changeSex = function() {
for (var i = 0; i < personArray.length; i++) {
personArray[i].sex = genSex();
}
};
var deleteMale = function() {
for (var i = 0; i < personArray.length; i++) {
if (personArray[i].sex === "Male") {
personArray.splice(i, 1)
i--
}
}
};
var t = timer("Array");
var personArray = [];
genPersons();
changeSex();
deleteMale();
t.stop();
console.log("Done! There are " + personArray.length + " persons.")
Set code:
var timer = function(name) {
var start = new Date();
return {
stop: function() {
var end = new Date();
var time = end.getTime() - start.getTime();
console.log('Timer:', name, 'finished in', time, 'ms');
}
}
};
var getRandom = function (min, max) {
return Math.random() * (max - min) + min;
};
var lastNames = ['SMITH','JOHNSON','WILLIAMS','JONES','BROWN','DAVIS','MILLER','WILSON','MOORE','TAYLOR','ANDERSON','THOMAS'];
var genLastName = function() {
var index = Math.round(getRandom(0, lastNames.length - 1));
return lastNames[index];
};
var sex = ["Male", "Female"];
var genSex = function() {
var index = Math.round(getRandom(0, sex.length - 1));
return sex[index];
};
var Person = function() {
this.name = genLastName();
this.age = Math.round(getRandom(0,100))
this.sex = "Male"
};
var genPersons = function() {
for (var i = 0; i < 100000; i++)
personSet.add(new Person());
};
var changeSex = function() {
for (var key of personSet) {
key.sex = genSex();
}
};
var deleteMale = function() {
for (var key of personSet) {
if (key.sex === "Male") {
personSet.delete(key)
}
}
};
var t = timer("Set");
var personSet = new Set();
genPersons();
changeSex();
deleteMale();
t.stop();
console.log("Done! There are " + personSet.size + " persons.")
Solution 2:
OBSERVATIONS:
- Set operations can be understood as snapshots within the execution stream.
- We are not before a definitive substitute.
- The elements of a Set class have no accessible indexes.
- Set class is an Array class complement, useful in those scenarios where we need to store a collection on which to apply basic addition, Deletion, checking and iteration operations.
I share some test of performance. Try to open your console and copypaste the code below.
Creating an array (125000)
var n = 125000;
var arr = Array.apply( null, Array( n ) ).map( ( x, i ) => i );
console.info( arr.length ); // 125000
1. Locating an Index
We compared the has method of Set with Array indexOf:
Array/indexOf (0.281ms) | Set/has (0.053ms)
// Helpers
var checkArr = ( arr, item ) => arr.indexOf( item ) !== -1;
var checkSet = ( set, item ) => set.has( item );
// Vars
var set, result;
console.time( 'timeTest' );
result = checkArr( arr, 123123 );
console.timeEnd( 'timeTest' );
set = new Set( arr );
console.time( 'timeTest' );
checkSet( set, 123123 );
console.timeEnd( 'timeTest' );
2. Adding a new element
We compare the add and push methods of the Set and Array objects respectively:
Array/push (1.612ms) | Set/add (0.006ms)
console.time( 'timeTest' );
arr.push( n + 1 );
console.timeEnd( 'timeTest' );
set = new Set( arr );
console.time( 'timeTest' );
set.add( n + 1 );
console.timeEnd( 'timeTest' );
console.info( arr.length ); // 125001
console.info( set.size ); // 125001
3. Deleting an element
When deleting elements, we have to keep in mind that Array and Set do not start under equal conditions. Array does not have a native method, so an external function is necessary.
Array/deleteFromArr (0.356ms) | Set/remove (0.019ms)
var deleteFromArr = ( arr, item ) => {
var i = arr.indexOf( item );
i !== -1 && arr.splice( i, 1 );
};
console.time( 'timeTest' );
deleteFromArr( arr, 123123 );
console.timeEnd( 'timeTest' );
set = new Set( arr );
console.time( 'timeTest' );
set.delete( 123123 );
console.timeEnd( 'timeTest' );
Read the full article here
Solution 3:
My observation is that a Set is always better with two pitfalls for large arrays in mind :
a) The creation of Sets from Arrays must be done in a for
loop with a precached length.
slow (e.g. 18ms) new Set(largeArray)
fast (e.g. 6ms)
const SET = new Set();
const L = largeArray.length;
for(var i = 0; i<L; i++) { SET.add(largeArray[i]) }
b) Iterating could be done in the same way because it is also faster than a for of
loop ...
See https://jsfiddle.net/0j2gkae7/5/
for a real life comparison to
difference()
, intersection()
, union()
and uniq()
( + their iteratee companions etc.) with 40.000 elements
Solution 4:
For the iteration part of your question, I recently ran this test and found that Set much outperformed an Array of 10,000 items (around 10x the operations could happen in the same timeframe). And depending on the browser either beat or lost to Object.hasOwnProperty in a like for like test.
Both Set and Object have their "has" method performing in what seems to be amortized to O(1), but depending on the browser's implementation a single operation could take longer or faster. It seems that most browsers implement key in Object faster than Set.has(). Even Object.hasOwnProperty which includes an additional check on the key is about 5% faster than Set.has() at least for me on Chrome v86.
https://jsperf.com/set-has-vs-object-hasownproperty-vs-array-includes/1
Update: 11/11/2020: https://jsbench.me/irkhdxnoqa/2
In case you want to run your own tests with different browsers/environments.
Similarly I'll add a benchmark for adding items to an array vs set and removing.