Hash Set and Array List performances
My experiment shows that HashSet
is faster than an ArrayList
starting at collections of 3 elements inclusively.
A complete results table
| Boost | Collection Size |
| 2x | 3 elements |
| 3x | 10 elements |
| 6x | 50 elements |
| 12x | 200 elements | <= proportion 532-12 vs 10.000-200 elements
| 532x | 10.000 elements | <= shows linear lookup growth for the ArrayList
They're completely different classes, so the question is: what kind of behaviour do you want?
HashSet
ensures there are no duplicates, gives you an O(1) contains()
method but doesn't preserve order.ArrayList
doesn't ensure there are no duplicates, contains()
is O(n) but you can control the order of the entries.
I believe using the hash set has a better performance than an array list. Am I correct in stating that?
With many (whatever it means) entries, yes. With small data sizes, raw linear search could be faster than hashing, though. Where exactly the break-even is, you have to just measure. My gut feeling is that with fewer than 10 elements, linear look-up is probably faster; with more than 100 elements hashing is probably faster, but that's just my feeling...
Lookup from a HashSet is constant time, O(1), provided that the hashCode implementation of the elements is sane. Linear look-up from a list is linear time, O(n).