Performance of nested yield in a tree

I've got a tree-like structure. Each element in this structure should be able to return a Enumerable of all elements it is root to. Let's call this method IEnumerable<Foo> GetAll(). So if we have

     A  <-- topmost root
    / \ 
   /   \
  B     C
 / \   / \
D   E F   G

a call to GetAll on element C returns {C, F, G} (fixed order of elements would be nice, but is not needed). I guess everybody knew that already.

The current implementation of GetAll looks like this:

public IEnumerable<Foo> GetAll ()
{
    yield return this;

    foreach (Foo foo in MyChildren) {
        foreach (Foo f in foo.GetAll ()) {
            yield return f;
        }
    }
}

In an earlier implementation, I returned a List and added the child-foos using List.AddRange().

My question is if the version using yield is correcly implemented or if it should be improved (esp. in terms of performance). Or is this just bad and I should stick to Lists (or ReadOnlyCollections) instead?

Solution 1:

You can improve performance if you unroll recurse to stack, so you will have only one iterator:

public IEnumerable<Foo> GetAll()
{
    Stack<Foo> FooStack = new Stack<Foo>();
    FooStack.Push(this);

    while (FooStack.Count > 0)
    {
        Foo Result = FooStack.Pop();
        yield return Result;
        foreach (Foo NextFoo in Result.MyChildren)
            FooStack.Push(NextFoo);
    }
}

Solution 2:

It's certainly not ideal in terms of performance - you end up creating a lot of iterators for large trees, instead of a single iterator which knows how to traverse efficiently.

Some blog entries concerning this:

Wes Dyer: All about iterators
Eric Lippert: Immutability in C#, part 6
Eric again: Immutability in C#, part 7

It's worth noting that F# has the equivalent of the proposed "yield foreach" with "yield!"

Solution 3:

A better solution might be to create a visit method that recursively traverses the tree, and use that to collect items up.

Something like this (assuming a binary tree):

public class Node<T>
{
    public void Visit(Action<T> action)
    {
        action(this);
        left.Visit(action);
        right.Visit(action);
    }

    public IEnumerable<Foo> GetAll ()
    {
        var result = new List<T>();
        Visit( n => result.Add(n));
        return result;
    }
}

Taking this approach

Avoids creating large numbers of nested iterators
Avoids creating any more lists than necessary
Is relatively efficient
Falls down if you only need part of the list regularly

Solution 4:

No, that looks fine.

Have a look at my blog entry, it may be of some use :)

Performance of nested yield in a tree

Solution 1:

Solution 2:

Solution 3:

Solution 4:

Related

Recent Posts