LINQ: Determine if two sequences contain exactly the same elements

I need to determine whether or not two sets contain exactly the same elements. The ordering does not matter.

For instance, these two arrays should be considered equal:

IEnumerable<int> data = new []{3, 5, 6, 9};
IEnumerable<int> otherData = new []{6, 5, 9, 3};

Neither set may contain any element that is not in the other.

Can this be done using the built-in query operators? And what would be the most efficient way to implement it, considering that the number of elements could range from a few to hundreds?


Solution 1:

If you want to treat the arrays as "sets" and ignore order and duplicate items, you can use the HashSet<T>.SetEquals method:

var isEqual = new HashSet<int>(first).SetEquals(second);

Otherwise, your best bet is probably sorting both sequences in the same way and using SequenceEqual to compare them.
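
To make the difference between the two approaches concrete, here is a minimal sketch. The class and method names (SetVersusSequence, EqualAsSets, EqualAsMultisets) are made up for illustration and are not part of the framework; the element type is fixed to int only to keep the example short.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SetVersusSequence
{
    // Ignores order and duplicates: {1, 1, 2} and {2, 1} compare equal.
    public static bool EqualAsSets(IEnumerable<int> first, IEnumerable<int> second)
    {
        return new HashSet<int>(first).SetEquals(second);
    }

    // Ignores order but respects duplicates: {1, 1, 2} and {2, 1} differ.
    public static bool EqualAsMultisets(IEnumerable<int> first, IEnumerable<int> second)
    {
        return first.OrderBy(x => x).SequenceEqual(second.OrderBy(x => x));
    }
}
```

With the arrays from the question, both methods return true. They only disagree when one side repeats a value the other side has fewer times, so which one is right depends on whether duplicates matter to you.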

Solution 2:

I suggest sorting both, and doing an element-by-element comparison.

data.OrderBy(x => x).SequenceEqual(otherData.OrderBy(x => x))

I'm not sure how fast the implementation of OrderBy is, but if it's an O(n log n) sort, as you'd expect, the total algorithm is O(n log n) as well.

For some kinds of data, you can improve on this by using a custom implementation of OrderBy that uses, for example, a counting sort, which runs in O(n+k), where k is the size of the range in which the values lie.
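
As a rough sketch of the counting idea: rather than writing a drop-in OrderBy, the version below skips the sort entirely and just compares per-value counts, which is still O(n+k). It assumes the values are ints known to lie in [min, max] and that k = max - min + 1 is small enough to allocate as an array; the names CountingComparison and SameElements are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CountingComparison
{
    // Assumption: every value in 'first' lies in [min, max].
    public static bool SameElements(IEnumerable<int> first, IEnumerable<int> second,
                                    int min, int max)
    {
        // One counter per possible value; k = max - min + 1.
        var counts = new int[max - min + 1];

        // Count up for every element of the first sequence.
        foreach (int value in first)
        {
            if (value < min || value > max)
                throw new ArgumentOutOfRangeException(nameof(first));
            counts[value - min]++;
        }

        // Count down for every element of the second sequence.
        foreach (int value in second)
        {
            // A value outside the range cannot occur in 'first'.
            if (value < min || value > max)
                return false;
            // Going negative means 'second' has more copies than 'first'.
            if (--counts[value - min] < 0)
                return false;
        }

        // Any counter still above zero is an element missing from 'second'.
        foreach (int count in counts)
        {
            if (count != 0)
                return false;
        }
        return true;
    }
}
```

For the arrays in the question, CountingComparison.SameElements(data, otherData, 0, 9) returns true. Duplicates are respected here, so this matches the sorted comparison rather than the HashSet one. For a few hundred elements the difference from OrderBy is unlikely to be noticeable, so this is only worth it if the range is small and the sequences are long.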

Solution 3:

If you might have duplicates (or if you want a solution which performs better for longer lists), I'd try something like this:

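static bool IsSame<T>(IEnumerable<T> set1, IEnumerable<T> set2)
{
    // Two null sequences count as equal; a null and a non-null one do not.
    if (set1 == null && set2 == null)
        return true;
    if (set1 == null || set2 == null)
        return false;

    // Copy into lists so the callers' collections are not reordered by Sort().
    List<T> list1 = set1.ToList();
    List<T> list2 = set2.ToList();

    // Different lengths can never match, so skip the sort entirely.
    if (list1.Count != list2.Count)
        return false;

    // Sort() uses Comparer<T>.Default, so T must implement IComparable<T> or IComparable.
    list1.Sort();
    list2.Sort();

    return list1.SequenceEqual(list2);
}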

UPDATE: oops, you guys are right: the Except() solution below needs to look both ways before crossing the street. And it has lousy perf for longer lists. Ignore the suggestion below! :-)

Here's one easy way to do it. Note that this assumes the lists have no duplicates.

bool same = data.Except(otherData).Count() == 0;
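
To illustrate what "look both ways" means in the update above, here is a small sketch comparing the one-directional check with a two-directional one. The names ExceptComparison, OneWay and BothWays are hypothetical. Note that Except is a set operation, so even the two-directional version ignores duplicates, and it enumerates each sequence twice.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ExceptComparison
{
    // Only checks that nothing in 'data' is missing from 'otherData'.
    // Extra elements in 'otherData' go unnoticed.
    public static bool OneWay(IEnumerable<int> data, IEnumerable<int> otherData)
    {
        return !data.Except(otherData).Any();
    }

    // Checks both directions, so neither side may contain an element
    // the other lacks. Duplicates are still ignored.
    public static bool BothWays(IEnumerable<int> data, IEnumerable<int> otherData)
    {
        return !data.Except(otherData).Any()
            && !otherData.Except(data).Any();
    }
}
```

With data = {3, 5} and otherData = {3, 5, 6}, OneWay reports the sequences as the same even though otherData has an extra 6, while BothWays correctly reports a difference. If duplicates are out of the picture anyway, HashSet<T>.SetEquals expresses the same check more directly.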