Can you create a simple 'EqualityComparer<T>' using a lambda expression

Short question:

Is there a simple way in LINQ to objects to get a distinct list of objects from a list based on a key property on the objects.

Long question:

I am trying to do a Distinct() operation on a list of objects that have a key as one of their properties.

class GalleryImage {
   public int Key { get;set; }
   public string Caption { get;set; }
   public string Filename { get; set; }
   public string[] Tags {g et; set; }
}

I have a list of Gallery objects that contain GalleryImage[].

Because of the way the webservice works [sic] I have duplicates of the GalleryImage object. i thought it would be a simple matter to use Distinct() to get a distinct list.

This is the LINQ query I want to use :

var allImages = Galleries.SelectMany(x => x.Images);
var distinctImages = allImages.Distinct<GalleryImage>(new 
                     EqualityComparer<GalleryImage>((a, b) => a.id == b.id));

The problem is that EqualityComparer is an abstract class.

I dont want to :

  • implement IEquatable on GalleryImage because it is generated
  • have to write a separate class to implement IEqualityComparer as shown here

Is there a concrete implementation of EqualityComparer somewhere that I'm missing?

I would have thought there would be an easy way to get 'distinct' objects from a set based on a key.


Solution 1:

(There are two solutions here - see the end for the second one):

My MiscUtil library has a ProjectionEqualityComparer class (and two supporting classes to make use of type inference).

Here's an example of using it:

EqualityComparer<GalleryImage> comparer = 
    ProjectionEqualityComparer<GalleryImage>.Create(x => x.id);

Here's the code (comments removed)

// Helper class for construction
public static class ProjectionEqualityComparer
{
    public static ProjectionEqualityComparer<TSource, TKey>
        Create<TSource, TKey>(Func<TSource, TKey> projection)
    {
        return new ProjectionEqualityComparer<TSource, TKey>(projection);
    }

    public static ProjectionEqualityComparer<TSource, TKey>
        Create<TSource, TKey> (TSource ignored,
                               Func<TSource, TKey> projection)
    {
        return new ProjectionEqualityComparer<TSource, TKey>(projection);
    }
}

public static class ProjectionEqualityComparer<TSource>
{
    public static ProjectionEqualityComparer<TSource, TKey>
        Create<TKey>(Func<TSource, TKey> projection)
    {
        return new ProjectionEqualityComparer<TSource, TKey>(projection);
    }
}

public class ProjectionEqualityComparer<TSource, TKey>
    : IEqualityComparer<TSource>
{
    readonly Func<TSource, TKey> projection;
    readonly IEqualityComparer<TKey> comparer;

    public ProjectionEqualityComparer(Func<TSource, TKey> projection)
        : this(projection, null)
    {
    }

    public ProjectionEqualityComparer(
        Func<TSource, TKey> projection,
        IEqualityComparer<TKey> comparer)
    {
        projection.ThrowIfNull("projection");
        this.comparer = comparer ?? EqualityComparer<TKey>.Default;
        this.projection = projection;
    }

    public bool Equals(TSource x, TSource y)
    {
        if (x == null && y == null)
        {
            return true;
        }
        if (x == null || y == null)
        {
            return false;
        }
        return comparer.Equals(projection(x), projection(y));
    }

    public int GetHashCode(TSource obj)
    {
        if (obj == null)
        {
            throw new ArgumentNullException("obj");
        }
        return comparer.GetHashCode(projection(obj));
    }
}

Second solution

To do this just for Distinct, you can use the DistinctBy extension in MoreLINQ:

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>
        (this IEnumerable<TSource> source,
         Func<TSource, TKey> keySelector)
    {
        return source.DistinctBy(keySelector, null);
    }

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>
        (this IEnumerable<TSource> source,
         Func<TSource, TKey> keySelector,
         IEqualityComparer<TKey> comparer)
    {
        source.ThrowIfNull("source");
        keySelector.ThrowIfNull("keySelector");
        return DistinctByImpl(source, keySelector, comparer);
    }

    private static IEnumerable<TSource> DistinctByImpl<TSource, TKey>
        (IEnumerable<TSource> source,
         Func<TSource, TKey> keySelector,
         IEqualityComparer<TKey> comparer)
    {
        HashSet<TKey> knownKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            if (knownKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }

In both cases, ThrowIfNull looks like this:

public static void ThrowIfNull<T>(this T data, string name) where T : class
{
    if (data == null)
    {
        throw new ArgumentNullException(name);
    }
}

Solution 2:

Building on Charlie Flowers' answer, you can create your own extension method to do what you want which internally uses grouping:

    public static IEnumerable<T> Distinct<T, U>(
        this IEnumerable<T> seq, Func<T, U> getKey)
    {
        return
            from item in seq
            group item by getKey(item) into gp
            select gp.First();
    }

You could also create a generic class deriving from EqualityComparer, but it sounds like you'd like to avoid this:

    public class KeyEqualityComparer<T,U> : IEqualityComparer<T>
    {
        private Func<T,U> GetKey { get; set; }

        public KeyEqualityComparer(Func<T,U> getKey) {
            GetKey = getKey;
        }

        public bool Equals(T x, T y)
        {
            return GetKey(x).Equals(GetKey(y));
        }

        public int GetHashCode(T obj)
        {
            return GetKey(obj).GetHashCode();
        }
    }

Solution 3:

This is the best i can come up with for the problem in hand. Still curious whether theres a nice way to create a EqualityComparer on the fly though.

Galleries.SelectMany(x => x.Images).ToLookup(x => x.id).Select(x => x.First());

Create lookup table and take 'top' from each one

Note: this is the same as @charlie suggested but using ILookup - which i think is what a group must be anyway.