ILearnable .Net

August 23, 2012

Brilliant Chunking of IEnumerables

Filed under: Uncategorized — andreakn @ 08:58

I recently needed to split a *huge* IEnumerable into manageable chunks, because SQL Server only allows _so_ many parameters in a single query (think: update x where y in *huge ienumerable*).

So I came across this little gem on StackOverflow: http://stackoverflow.com/questions/1349491/how-can-i-split-an-ienumerablestring-into-groups-of-ienumerablestring/3935352#9136291

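    using System.Collections.Generic;
    using System.Linq;

    public static class IEnumerableExtension
    {
        // Splits source into consecutive chunks of at most chunkSize items.
        public static IEnumerable<IEnumerable<T>> Chunks<T>(this IEnumerable<T> source, int chunkSize)
        {
            var chunk = new List<T>(chunkSize);
            foreach (var x in source)
            {
                chunk.Add(x);
                if (chunk.Count < chunkSize)
                {
                    continue;
                }
                // The chunk is full: hand it out and start a new one.
                yield return chunk;
                chunk = new List<T>(chunkSize);
            }
            // Emit the last, partially filled chunk, if there is one.
            if (chunk.Any())
            {
                yield return chunk;
            }
        }
    }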
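
As a rough sketch of how this might be used for the SQL case above: the snippet below splits a list of ids into batches and builds one parameterized IN clause per batch. The batch size of 4, the `set done = 1` part, the `@p0, @p1, ...` parameter names and the `BuildUpdateSql` helper are all made up for illustration; a real batch size would sit somewhere under the server's parameter limit (2100 for SQL Server, as far as I know). The Chunks method is repeated with its generic type parameters so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IEnumerableExtension
{
    // Same Chunks method as in the post, with the generic parameters spelled out.
    public static IEnumerable<IEnumerable<T>> Chunks<T>(this IEnumerable<T> source, int chunkSize)
    {
        var chunk = new List<T>(chunkSize);
        foreach (var x in source)
        {
            chunk.Add(x);
            if (chunk.Count < chunkSize)
            {
                continue;
            }
            yield return chunk;
            chunk = new List<T>(chunkSize);
        }
        if (chunk.Any())
        {
            yield return chunk;
        }
    }
}

public static class ChunkDemo
{
    // Hypothetical helper: builds an update statement with one placeholder
    // (@p0, @p1, ...) per id in the batch. Table and column names are placeholders.
    public static string BuildUpdateSql(int parameterCount)
    {
        var names = Enumerable.Range(0, parameterCount).Select(i => "@p" + i);
        return "update x set done = 1 where y in (" + string.Join(", ", names) + ")";
    }

    public static void Main()
    {
        var ids = Enumerable.Range(1, 10);
        foreach (var batch in ids.Chunks(4))
        {
            // Materialize the batch once so we can count it and print it.
            var list = batch.ToList();
            Console.WriteLine(BuildUpdateSql(list.Count) + "   -- ids: " + string.Join(",", list));
        }
    }
}
```

With ten ids and a chunk size of 4 this produces three batches (1 to 4, 5 to 8, and 9 to 10), so three statements instead of one oversized one. In real code each batch would of course go to a command with actual parameter values attached.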

3 Comments »

  1. It would be even a little better if you replace

    if (chunk.Any())
    {
        yield return chunk;
    }

    with

    if (chunk.Count != 0)
    {
        yield return chunk;
    }

    Any() is an extension method on IEnumerable<T> that allocates an enumerator, tries to advance it once, and returns true if that succeeds and false otherwise. It wastes CPU and puts unnecessary pressure on the GC. Since chunk is a List<T>, getting Count just returns the current list size, a plain int.

    Source of Enumerable.Any()

    public static bool Any<TSource>(this IEnumerable<TSource> source)
    {
        if (source == null)
            throw Error.ArgumentNull("source");
        using (IEnumerator<TSource> enumerator = source.GetEnumerator())
        {
            if (enumerator.MoveNext())
                return true;
        }
        return false;
    }

    Source of List<T>.Count

    public int Count
    {
        get
        {
            return this._size;
        }
    }

    Cheers!

    Erik

    Comment by Erik Bergman — December 31, 2015 @ 16:24

  2. And of course, use a generic enumerable and return lists instead of enumerables.

    Something like this:

    public static IEnumerable<List<T>> Chunks<T>(this IEnumerable<T> source, int chunkSize)
    {
        var chunk = new List<T>(chunkSize);
        foreach (var x in source)
        {
            chunk.Add(x);
            if (chunk.Count < chunkSize)
            {
                continue;
            }
            yield return chunk;
            chunk = new List<T>(chunkSize);
        }
        if (chunk.Count != 0)
        {
            yield return chunk;
        }
    }

    Comment by Erik Bergman — December 31, 2015 @ 16:29

    • Somehow the site removes the generics. Perhaps it does that for you too. It should be List<T>.

      Comment by Erik Bergman — December 31, 2015 @ 16:31
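
To make the two suggestions from the comments concrete, here is a small sketch. `CountingSequence` is a made-up wrapper that counts how often GetEnumerator() is called, so you can see that Any() on a plain IEnumerable<T> has to ask for an enumerator, while List<T>.Count is just a property read. Newer .NET versions may short-circuit Any() for collections, which is why the wrapper deliberately does not implement ICollection<T>. `ChunksAsLists` is a hypothetical name for the generic, List-returning variant; the argument checks are my own addition and not part of the original answer.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper that records how many enumerators were requested.
public sealed class CountingSequence<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> _inner;
    public int EnumeratorsCreated { get; private set; }

    public CountingSequence(IEnumerable<T> inner) { _inner = inner; }

    public IEnumerator<T> GetEnumerator()
    {
        EnumeratorsCreated++;
        return _inner.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class ChunkingSketch
{
    // Validates eagerly, then defers to the iterator so bad arguments
    // fail at the call site rather than on first enumeration.
    public static IEnumerable<List<T>> ChunksAsLists<T>(this IEnumerable<T> source, int chunkSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (chunkSize < 1) throw new ArgumentOutOfRangeException("chunkSize");
        return ChunksIterator(source, chunkSize);
    }

    private static IEnumerable<List<T>> ChunksIterator<T>(IEnumerable<T> source, int chunkSize)
    {
        var chunk = new List<T>(chunkSize);
        foreach (var x in source)
        {
            chunk.Add(x);
            if (chunk.Count < chunkSize)
            {
                continue;
            }
            yield return chunk;
            chunk = new List<T>(chunkSize);
        }
        // Count is a plain property read on List<T>; no enumerator needed.
        if (chunk.Count != 0)
        {
            yield return chunk;
        }
    }

    public static void Main()
    {
        var seq = new CountingSequence<int>(new[] { 1, 2, 3 });
        bool any = seq.Any();
        Console.WriteLine("Any() = " + any + ", enumerators created: " + seq.EnumeratorsCreated);

        foreach (List<int> chunk in Enumerable.Range(1, 7).ChunksAsLists(3))
        {
            Console.WriteLine(string.Join(",", chunk));
        }
    }
}
```

Returning List<T> also means callers get Count and indexing on each chunk without another ToList() call. For one Any() call per sequence the saving is tiny, so treat this as a tidy-up rather than a performance fix.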

