Algorithm to tell if two arrays have identical members
What's the best algorithm for comparing two arrays to see if they have the same members?
Assume there are no duplicates, the members can be in any order, and that neither is sorted.
compare(
[a, b, c, d],
[b, a, d, c]
) ==> true
compare(
[a, b, e],
[a, b, c]
) ==> false
compare(
[a, b, c],
[a, b]
) ==> false
Obvious answers would be:
- Sort both lists, then check each element to see if they're identical
- Add the items from one array to a hashtable, then iterate through the other array, checking that each item is in the hash
- nickf's iterative search algorithm
Which one you'd use would depend on whether you can sort the lists first, and whether you have a good hash algorithm handy.
You could load one into a hash table, keeping track of how many elements it has. Then, loop over the second one checking to see if every one of its elements is in the hash table, and counting how many elements it has. If every element in the second array is in the hash table, and the two lengths match, they are the same, otherwise they are not. This should be O(N).
To make this work in the presence of duplicates, track how many of each element has been seen. Increment while looping over the first array, and decrement while looping over the second array. During the loop over the second array, if you can't find something in the hash table, or if the counter is already at zero, they are unequal. Also compare total counts.
Another method that would work in the presence of duplicates is to sort both arrays and do a linear compare. This should be O(N*log(N)).