Optimal LINQ query to get a random sub collection - Shuffle
Please suggest an easiest way to get a random shuffled collection of count 'n' from a collection having 'N' items. where n <= N
Further to mquander's answer and Dan Blanchard's comment, here's a LINQ-friendly extension method that performs a Fisher-Yates-Durstenfeld shuffle:
// take n random items from yourCollection
var randomItems = yourCollection.Shuffle().Take(n);
// ...
public static class EnumerableExtensions
{
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
{
return source.Shuffle(new Random());
}
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source, Random rng)
{
if (source == null) throw new ArgumentNullException("source");
if (rng == null) throw new ArgumentNullException("rng");
return source.ShuffleIterator(rng);
}
private static IEnumerable<T> ShuffleIterator<T>(
this IEnumerable<T> source, Random rng)
{
var buffer = source.ToList();
for (int i = 0; i < buffer.Count; i++)
{
int j = rng.Next(i, buffer.Count);
yield return buffer[j];
buffer[j] = buffer[i];
}
}
}
Another option is to use OrderBy and to sort on a GUID value, which you can do so using:
var result = sequence.OrderBy(elem => Guid.NewGuid());
I did some empirical tests to convince myself that the above actually generates a random distribution (which it appears to do). You can see my results at Techniques for Randomly Reordering an Array.
This has some issues with "random bias" and I am sure it's not optimal, this is another possibility:
var r = new Random();
l.OrderBy(x => r.NextDouble()).Take(n);