How to find all duplicates in a List<string>?
In .NET Framework 3.5 and above you can use Enumerable.GroupBy, which returns an enumerable of groups, one per distinct key. Filter out any group that has a Count of <= 1, then select the keys of the remaining groups to get back down to a single enumerable:
var duplicateKeys = list.GroupBy(x => x)
                        .Where(group => group.Count() > 1)
                        .Select(group => group.Key);
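To see the pipeline end to end, here is a minimal sketch as a small console program. The sample list, the class name `DuplicateKeysDemo`, and the helper `FindDuplicateKeys` are made up for illustration and are not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateKeysDemo
{
    // Hypothetical helper that wraps the GroupBy/Where/Select chain.
    public static List<string> FindDuplicateKeys(IEnumerable<string> list)
    {
        return list.GroupBy(x => x)
                   .Where(group => group.Count() > 1)
                   .Select(group => group.Key)
                   .ToList();
    }

    public static void Main()
    {
        // Illustrative data: "a" appears three times, "b" twice, "c" once.
        var list = new List<string> { "a", "b", "a", "c", "b", "a" };
        List<string> duplicateKeys = FindDuplicateKeys(list);

        // Each duplicated value is reported once, however often it repeats.
        Console.WriteLine(string.Join(", ", duplicateKeys.ToArray())); // prints "a, b"
    }
}
```

GroupBy yields its groups in the order in which each key first appears in the source, which is why "a" comes before "b" here.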
If you are using LINQ, you can use the following query:
var duplicateItems = from x in list
                     group x by x into grouped
                     where grouped.Count() > 1
                     select grouped.Key;
or, if you prefer it without the syntactic sugar:
var duplicateItems = list.GroupBy(x => x).Where(x => x.Count() > 1).Select(x => x.Key);
This groups all elements that are equal, then keeps only the groups with more than one element. Finally, it selects just the key from each remaining group, since you don't need the count.
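If you do want to know how often each value repeats, the same grouping can project the count alongside the key. This is only a sketch: the sample list, the class name `DuplicateCountsDemo`, and the helper `CountDuplicates` are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateCountsDemo
{
    // Hypothetical helper: returns each duplicated value with its occurrence count.
    public static List<KeyValuePair<string, int>> CountDuplicates(IEnumerable<string> source)
    {
        return source.GroupBy(x => x)
                     .Where(grouped => grouped.Count() > 1)
                     .Select(grouped => new KeyValuePair<string, int>(grouped.Key, grouped.Count()))
                     .ToList();
    }

    public static void Main()
    {
        // Illustrative data: "x" appears three times, "y" twice, "z" once.
        var list = new List<string> { "x", "y", "x", "x", "z", "y" };

        foreach (KeyValuePair<string, int> pair in CountDuplicates(list))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value); // prints "x: 3", then "y: 2"
        }
    }
}
```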
If you prefer not to use LINQ, you can use this extension method:
public void SomeMethod() {
    var duplicateItems = list.GetDuplicates();
    …
}

// Extension methods must be declared in a non-generic static class.
public static IEnumerable<T> GetDuplicates<T>(this IEnumerable<T> source) {
    HashSet<T> itemsSeen = new HashSet<T>();
    HashSet<T> itemsYielded = new HashSet<T>();
    foreach (T item in source) {
        // Add returns false if the item is already in the set.
        if (!itemsSeen.Add(item)) {
            if (itemsYielded.Add(item)) {
                yield return item;
            }
        }
    }
}
This keeps track of the items it has seen and the items it has yielded. The first time an item appears, it is added to the set of seen items and nothing is yielded. When an item appears again, it is yielded only if it has not been yielded before, so each duplicate is returned exactly once.
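As a rough illustration of that behaviour, the sketch below places the method in a static class and runs it on a made-up list. The class names `EnumerableExtensions` and `GetDuplicatesDemo` and the sample data are assumptions, and the two nested checks are folded into a single condition.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    public static IEnumerable<T> GetDuplicates<T>(this IEnumerable<T> source)
    {
        var itemsSeen = new HashSet<T>();
        var itemsYielded = new HashSet<T>();
        foreach (T item in source)
        {
            // HashSet<T>.Add returns false when the item is already present,
            // so this is true only on the first repeat of a given item.
            if (!itemsSeen.Add(item) && itemsYielded.Add(item))
            {
                yield return item;
            }
        }
    }
}

public static class GetDuplicatesDemo
{
    public static void Main()
    {
        // Illustrative data: "a" appears three times, "b" twice, "c" once.
        var list = new List<string> { "a", "b", "a", "a", "b", "c" };

        foreach (string item in list.GetDuplicates())
        {
            Console.WriteLine(item); // prints "a", then "b"
        }
    }
}
```

Because the method is an iterator, nothing is enumerated until the result is consumed, and duplicates come out in the order of their second occurrence.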
And without LINQ:
string[] ss = {"1","1","1"};
var myList = new List<string>();
var duplicates = new List<string>();
foreach (var s in ss)
{
    if (!myList.Contains(s))
        myList.Add(s);        // first occurrence
    else
        duplicates.Add(s);    // repeat: added once per extra occurrence
}
// show list without duplicates
foreach (var s in myList)
    Console.WriteLine(s);
// show duplicates list
foreach (var s in duplicates)
    Console.WriteLine(s);
If you're looking for a more generic method:
public static List<U> FindDuplicates<T, U>(this List<T> list, Func<T, U> keySelector)
{
    return list.GroupBy(keySelector)
               .Where(group => group.Count() > 1)
               .Select(group => group.Key).ToList();
}
EDIT: Here's an example:
public class Person {
    public string Name {get;set;}
    public int Age {get;set;}
}
List<Person> list = new List<Person>() {
    new Person() { Name = "John", Age = 22 },
    new Person() { Name = "John", Age = 30 },
    new Person() { Name = "Jack", Age = 30 }
};
var duplicateNames = list.FindDuplicates(p => p.Name);
var duplicateAges = list.FindDuplicates(p => p.Age);
foreach(var dupName in duplicateNames) {
    Console.WriteLine(dupName); // Will print out John
}
foreach(var dupAge in duplicateAges) {
    Console.WriteLine(dupAge); // Will print out 30
}