foreach vs someList.ForEach(){}

There are apparently many ways to iterate over a collection. Curious if there are any differences, or why you'd use one way over the other.

First type:

List<string> someList = <some way to init>
foreach(string s in someList) {
   <process the string>
}

Other Way:

List<string> someList = <some way to init>
someList.ForEach(delegate(string s) {
    <process the string>
});

Off the top of my head, I suppose that instead of the anonymous delegate used above, you could specify a reusable delegate...


Solution 1:

There is one important, and useful, distinction between the two.

Because .ForEach uses a for loop to iterate the collection, this is valid (edit: prior to .NET 4.5; the implementation changed, and now both versions throw):

someList.ForEach(x => { if(x.RemoveMe) someList.Remove(x); });

whereas foreach uses an enumerator, so this is not valid:

foreach(var item in someList)
  if(item.RemoveMe) someList.Remove(item);

tl;dr: Do NOT copypaste this code into your application!

These examples aren't best practice; they only demonstrate the differences between ForEach() and foreach.

Removing items from a list within a for loop can have side effects. The most common one is described in the comments to this question.

Generally, if you are looking to remove multiple items from a list, you would want to separate the determination of which items to remove from the actual removal. It doesn't keep your code compact, but it guarantees that you do not miss any items.
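A minimal sketch of that two-step approach. The Item class and the RemoveFlagged helper are made up for illustration; only the RemoveMe flag comes from the examples above. List<T>.RemoveAll is a real method that performs both steps in one call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type, invented for this example.
class Item
{
    public string Name;
    public bool RemoveMe;
}

static class RemovalDemo
{
    // Pass 1 decides what to remove; pass 2 removes it.
    public static void RemoveFlagged(List<Item> items)
    {
        List<Item> toRemove = new List<Item>();
        foreach (Item item in items)
        {
            if (item.RemoveMe) toRemove.Add(item);
        }

        // We iterate toRemove, not items, so changing items here is safe.
        foreach (Item item in toRemove)
        {
            items.Remove(item);
        }
    }

    static void Main()
    {
        List<Item> items = new List<Item>
        {
            new Item { Name = "a", RemoveMe = true },
            new Item { Name = "b", RemoveMe = true },
            new Item { Name = "c", RemoveMe = false },
        };

        RemoveFlagged(items);
        Console.WriteLine(items.Count); // 1

        // The built-in one-call equivalent:
        // items.RemoveAll(item => item.RemoveMe);
    }
}
```

Note that two adjacent flagged items ("a" and "b") are both removed, which is exactly the case an in-place index-based loop tends to get wrong.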

Solution 2:

We had some code here (in VS2005 and C# 2.0) where the previous engineers went out of their way to use list.ForEach( delegate(item) { foo;}); instead of foreach(item in list) {foo; }; for all the code that they wrote, e.g. a block of code for reading rows from a dataReader.

I still don't know exactly why they did this.

The drawbacks of list.ForEach() are:

  • It is more verbose in C# 2.0. However, in C# 3 onwards, you can use the "=>" syntax to make some nicely terse expressions.

  • It is less familiar. People who have to maintain this code will wonder why you did it that way. It took me a while to decide that there wasn't any reason, except maybe to make the writer seem clever (the quality of the rest of the code undermined that). It was also less readable, with the "})" at the end of the delegate code block.

  • See also Bill Wagner's book "Effective C#: 50 Specific Ways to Improve Your C#" where he talks about why foreach is preferred to other loops like for or while loops - the main point is that you are letting the compiler decide the best way to construct the loop. If a future version of the compiler manages to use a faster technique, then you will get this for free by using foreach and rebuilding, rather than changing your code.

  • A foreach(item in list) construct lets you use break or continue to exit the loop or skip to the next iteration. But you cannot alter the list inside a foreach loop.
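To illustrate the last point, here is a small sketch. The class and method names are invented for the example. Inside the lambda passed to ForEach, return only ends the current call, so it behaves like continue; there is no direct equivalent of break.

```csharp
using System;
using System.Collections.Generic;

static class LoopControlDemo
{
    // foreach: continue skips an item, break leaves the loop.
    public static List<string> WithForeach(List<string> words)
    {
        List<string> seen = new List<string>();
        foreach (string w in words)
        {
            if (w.Length == 0) continue; // skip empty strings
            if (w == "stop") break;      // stop iterating entirely
            seen.Add(w);
        }
        return seen;
    }

    // ForEach: return exits only the current invocation of the lambda.
    public static List<string> WithForEachMethod(List<string> words)
    {
        List<string> seen = new List<string>();
        words.ForEach(w =>
        {
            if (w.Length == 0) return; // acts like continue
            if (w == "stop") return;   // does NOT stop the iteration
            seen.Add(w);
        });
        return seen;
    }

    static void Main()
    {
        List<string> words = new List<string> { "a", "", "b", "stop", "c" };
        Console.WriteLine(string.Join(",", WithForeach(words)));       // a,b
        Console.WriteLine(string.Join(",", WithForEachMethod(words))); // a,b,c
    }
}
```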

I'm surprised to see that list.ForEach is slightly faster. But that's probably not a valid reason to use it throughout; that would be premature optimisation. If your application uses a database or web service, that, not loop control, is almost always going to be where the time goes. And have you benchmarked it against a for loop too? list.ForEach could be faster because it uses a for loop internally, in which case a plain for loop without the wrapper would be faster still.

I disagree that the list.ForEach(delegate) version is "more functional" in any significant way. It does pass a function to a function, but there's no big difference in the outcome or program organisation.

I don't think that foreach(item in list) "says exactly how you want it done" - a for(int i = 0; i < count; i++) loop does that; a foreach loop leaves the choice of control up to the compiler.

My feeling is, on a new project, to use foreach(item in list) for most loops in order to adhere to the common usage and for readability, and use list.ForEach() only for short blocks, when you can do something more elegantly or compactly with the C# 3 "=>" operator. In cases like that, there may already be a LINQ extension method that is more specific than ForEach(). See if Where(), Select(), Any(), All(), Max() or one of the many other LINQ methods doesn't already do what you want from the loop.
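As a rough illustration of that last suggestion, here are a few loops expressed with the LINQ methods named above. The class name, method name, and sample data are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LinqDemo
{
    // Replaces a loop that filters, transforms, and collects.
    public static List<int> EvenSquares(List<int> numbers)
    {
        return numbers.Where(n => n % 2 == 0)
                      .Select(n => n * n)
                      .ToList();
    }

    static void Main()
    {
        List<int> numbers = new List<int> { 1, 2, 3, 4, 5 };

        Console.WriteLine(string.Join(",", EvenSquares(numbers))); // 4,16
        Console.WriteLine(numbers.Any(n => n > 4)); // True
        Console.WriteLine(numbers.All(n => n > 0)); // True
        Console.WriteLine(numbers.Max());           // 5
    }
}
```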

Solution 3:

For fun, I popped List into reflector and this is the resulting C#:

public void ForEach(Action<T> action)
{
    if (action == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
    }
    for (int i = 0; i < this._size; i++)
    {
        action(this._items[i]);
    }
}

Similarly, the MoveNext in Enumerator which is what is used by foreach is this:

public bool MoveNext()
{
    if (this.version != this.list._version)
    {
        ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);
    }
    if (this.index < this.list._size)
    {
        this.current = this.list._items[this.index];
        this.index++;
        return true;
    }
    this.index = this.list._size + 1;
    this.current = default(T);
    return false;
}

List.ForEach is much more trimmed down than MoveNext: it does far less processing and will more likely JIT into something efficient.

In addition, foreach creates a new Enumerator every time. For a variable typed as List<T> that enumerator is a struct, so it costs no heap allocation; iterating the same list through IEnumerable<T> boxes it, and repeating that foreach produces more throwaway objects for the GC than reusing the same delegate would. BUT this is really a fringe case. In typical usage you will see little or no difference.
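The version check at the top of MoveNext is what makes foreach throw when the list changes mid-iteration. A small sketch of that behaviour, with an invented class and method name:

```csharp
using System;
using System.Collections.Generic;

static class VersionCheckDemo
{
    // Returns true if modifying the list inside foreach raised
    // InvalidOperationException (the "EnumFailedVersion" case).
    public static bool ThrowsWhenModified()
    {
        List<int> numbers = new List<int> { 1, 2, 3 };
        try
        {
            foreach (int n in numbers)
            {
                // Remove bumps the list's version; the next MoveNext notices.
                if (n == 1) numbers.Remove(n);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(ThrowsWhenModified()); // True
    }
}
```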

Solution 4:

I know two obscure-ish things that make them different. Go me!

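Firstly, there's the classic bug of making a delegate for each item in the list. If you use the foreach keyword, all your delegates can end up referring to the last item of the list (this applies to C# 4 and earlier; C# 5 gives each foreach iteration its own copy of the loop variable):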

    // A list of actions to execute later
    List<Action> actions = new List<Action>();

    // Numbers 0 to 9
    List<int> numbers = Enumerable.Range(0, 10).ToList();

    // Store an action that prints each number (WRONG!)
    foreach (int number in numbers)
        actions.Add(() => Console.WriteLine(number));

    // Run the actions; before C# 5 this prints 10 copies of "9"
    foreach (Action action in actions)
        action();

    // So try again
    actions.Clear();

    // Store an action that prints each number (RIGHT!)
    numbers.ForEach(number =>
        actions.Add(() => Console.WriteLine(number)));

    // Run the actions
    foreach (Action action in actions)
        action();

The List.ForEach method doesn't have this problem. The current item of the iteration is passed by value as an argument to the outer lambda, and then the inner lambda correctly captures that argument in its own closure. Problem solved.

(Sadly I believe ForEach is a member of List, rather than an extension method, though it's easy to define it yourself so you have this facility on any enumerable type.)
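A minimal sketch of such a do-it-yourself extension method. The class name EnumerableExtensions is arbitrary, and this is one possible implementation rather than anything shipped in the framework. On a List<T> the built-in instance method still takes precedence over the extension.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EnumerableExtensions
{
    // Hand-written ForEach for any IEnumerable<T>.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");

        foreach (T item in source)
        {
            action(item);
        }
    }
}

static class ExtensionDemo
{
    static void Main()
    {
        int sum = 0;
        // Enumerable.Range returns an IEnumerable<int>, not a List<int>.
        Enumerable.Range(1, 4).ForEach(n => sum += n);
        Console.WriteLine(sum); // 10
    }
}
```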

Secondly, the ForEach method approach has a limitation. If you are implementing IEnumerable by using yield return, you can't do a yield return inside the lambda. So looping through the items in a collection in order to yield return things is not possible by this method. You'll have to use the foreach keyword and work around the closure problem by manually making a copy of the current loop value inside the loop.
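A sketch of that workaround, with invented names. The local copy is what fixes the closure problem on C# 4 and earlier; on C# 5 and later it is redundant but harmless.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class YieldDemo
{
    // Lazily yields one deferred getter per number.
    public static IEnumerable<Func<int>> MakeGetters(List<int> numbers)
    {
        foreach (int number in numbers)
        {
            int copy = number; // per-iteration copy captured by the lambda
            yield return () => copy;
        }

        // Not possible: yield return cannot appear inside a lambda.
        // numbers.ForEach(n => { yield return () => n; });
    }

    static void Main()
    {
        List<Func<int>> getters = MakeGetters(new List<int> { 1, 2, 3 }).ToList();
        foreach (Func<int> getter in getters)
        {
            Console.WriteLine(getter()); // 1, then 2, then 3
        }
    }
}
```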


Solution 5:

I guess the someList.ForEach() call could be easily parallelized, whereas the normal foreach is not that easy to run in parallel. You could easily run several different delegates on different cores, which is not that easy to do with a normal foreach.
Just my 2 cents
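For what it's worth, List.ForEach itself runs sequentially. If you do want the delegate-per-item style to run across cores, .NET 4 and later provide Parallel.ForEach in System.Threading.Tasks. A rough sketch, with made-up names and sample data; the shared counter is updated with Interlocked because the delegate may run on several threads at once:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class ParallelDemo
{
    // Sums the lengths of all words, processing items in parallel.
    public static int TotalLength(List<string> words)
    {
        int total = 0;
        Parallel.ForEach(words, w =>
        {
            // Thread-safe add; a plain "total += w.Length" would race.
            Interlocked.Add(ref total, w.Length);
        });
        return total;
    }

    static void Main()
    {
        List<string> words = new List<string> { "a", "bb", "ccc" };
        Console.WriteLine(TotalLength(words)); // 6
    }
}
```

The order in which items are processed is not guaranteed, so this only suits work where the items are independent of each other.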