Create batches in linq

Solution 1:

You don't need to write any code. Use MoreLINQ Batch method, which batches the source sequence into sized buckets (MoreLINQ is available as a NuGet package you can install):

int size = 10;
var batches = sequence.Batch(size);

Which is implemented as:

public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
                  this IEnumerable<TSource> source, int size)
{
    TSource[] bucket = null;
    var count = 0;

    foreach (var item in source)
    {
        if (bucket == null)
            bucket = new TSource[size];

        bucket[count++] = item;
        if (count != size)
            continue;

        yield return bucket;

        bucket = null;
        count = 0;
    }

    if (bucket != null && count > 0)
        yield return bucket.Take(count).ToArray();
}

Solution 2:

public static class MyExtensions
{
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> items,
                                                       int maxItems)
    {
        return items.Select((item, inx) => new { item, inx })
                    .GroupBy(x => x.inx / maxItems)
                    .Select(g => g.Select(x => x.item));
    }
}

and the usage would be:

List<int> list = new List<int>() { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

foreach(var batch in list.Batch(3))
{
    Console.WriteLine(String.Join(",",batch));
}

OUTPUT:

0,1,2
3,4,5
6,7,8
9

Solution 3:

If you start with sequence defined as an IEnumerable<T>, and you know that it can safely be enumerated multiple times (e.g. because it is an array or a list), you can just use this simple pattern to process the elements in batches:

while (sequence.Any())
{
    var batch = sequence.Take(10);
    sequence = sequence.Skip(10);

    // do whatever you need to do with each batch here
}

Solution 4:

This is a fully lazy, low overhead, one-function implementation of Batch that doesn't do any accumulation. Based on (and fixes issues in) Nick Whaley's solution with help from EricRoller.

Iteration comes directly from the underlying IEnumerable, so elements must be enumerated in strict order, and accessed no more than once. If some elements aren't consumed in an inner loop, they are discarded (and trying to access them again via a saved iterator will throw InvalidOperationException: Enumeration already finished.).

You can test a complete sample at .NET Fiddle.

public static class BatchLinq
{
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException("size", "Must be greater than zero.");
        using (var enumerator = source.GetEnumerator())
            while (enumerator.MoveNext())
            {
                int i = 0;
                // Batch is a local function closing over `i` and `enumerator` that
                // executes the inner batch enumeration
                IEnumerable<T> Batch()
                {
                    do yield return enumerator.Current;
                    while (++i < size && enumerator.MoveNext());
                }

                yield return Batch();
                while (++i < size && enumerator.MoveNext()); // discard skipped items
            }
    }
}