Test if all values in a list are unique
I have a small list of bytes and I want to test that they're all different values. For instance, I have this:
List<byte> theList = new List<byte> { 1,4,3,6,1 };
What's the best way to check if all values are distinct or not?
bool isUnique = theList.Distinct().Count() == theList.Count();
Here's another approach which is more efficient than Enumerable.Distinct + Enumerable.Count (all the more if the sequence is not a collection type). It uses a HashSet<T>, which eliminates duplicates, is very efficient in lookups, and has a Count property:
var distinctBytes = new HashSet<byte>(theList);
bool allDifferent = distinctBytes.Count == theList.Count;
Or another, more subtle and more efficient, approach:
var diffChecker = new HashSet<byte>();
bool allDifferent = theList.All(diffChecker.Add);
HashSet<T>.Add returns false if the element could not be added because it was already in the HashSet. Enumerable.All stops on the first "false".
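To see the short-circuiting in action, here is a minimal sketch that counts how many elements All actually inspects. The ShortCircuitDemo class, the Check helper, and the counter are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShortCircuitDemo
{
    // Hypothetical helper: returns the result of the uniqueness check
    // together with the number of elements the predicate was called on.
    public static (bool AllDifferent, int Inspected) Check(IEnumerable<byte> source)
    {
        var diffChecker = new HashSet<byte>();
        int inspected = 0; // illustrative counter, not needed in real code

        bool allDifferent = source.All(b =>
        {
            inspected++;
            return diffChecker.Add(b); // false once b is already in the set
        });

        return (allDifferent, inspected);
    }

    public static void Main()
    {
        var theList = new List<byte> { 1, 4, 3, 6, 1, 7, 8 };
        var result = Check(theList);

        Console.WriteLine(result.AllDifferent); // False
        Console.WriteLine(result.Inspected);    // 5: stops at the second 1
    }
}
```

The trailing 7 and 8 are never looked at, which is where the saving over the HashSet constructor version comes from when a duplicate appears early.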
Okay, here is the most efficient method I can think of using standard .NET:
using System;
using System.Collections.Generic;

public static class Extension
{
    public static bool HasDuplicate<T>(
        this IEnumerable<T> source,
        out T firstDuplicate)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        var checkBuffer = new HashSet<T>();
        foreach (var t in source)
        {
            // Add returns true when t was not yet in the set.
            if (checkBuffer.Add(t))
            {
                continue;
            }

            // First repeated value found: report it and stop enumerating.
            firstDuplicate = t;
            return true;
        }

        firstDuplicate = default(T);
        return false;
    }
}
Essentially, what is the point of enumerating the whole sequence twice if all you want to do is find the first duplicate?
I could optimise this further by special-casing empty and single-element sequences, but that would detract from readability and maintainability for minimal gain.
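To make the double enumeration concrete, here is a rough sketch with a hypothetical lazy sequence that counts how often it gets enumerated. EnumerationCountDemo, LazyBytes, and CountEnumerations are invented for this illustration. The single-pass check uses the same HashSet<T>.Add idea as the method above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCountDemo
{
    private static int enumerations;

    // Hypothetical lazy source: the body restarts from the top on every
    // enumeration, so the counter goes up once per pass over the sequence.
    private static IEnumerable<byte> LazyBytes()
    {
        enumerations++;
        yield return 1;
        yield return 4;
        yield return 1;
        yield return 6;
    }

    // Runs a uniqueness check and reports how many times it enumerated the source.
    public static int CountEnumerations(Func<IEnumerable<byte>, bool> check)
    {
        enumerations = 0;
        check(LazyBytes());
        return enumerations;
    }

    public static void Main()
    {
        int twoPass = CountEnumerations(s => s.Distinct().Count() == s.Count());

        int onePass = CountEnumerations(s =>
        {
            var seen = new HashSet<byte>();
            return s.All(seen.Add); // single pass, stops at the first duplicate
        });

        Console.WriteLine(twoPass); // 2
        Console.WriteLine(onePass); // 1
    }
}
```

For a List<byte> the second pass is cheap, because Count() can read the list's Count property, but for a sequence backed by a query or a generator each extra pass repeats the underlying work.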
Similar logic to Distinct, using GroupBy:
var isUnique = theList.GroupBy(i => i).Count() == theList.Count;