Performance of nested yield in a tree
I've got a tree-like structure. Each element in this structure should be able to return a Enumerable of all elements it is root to. Let's call this method IEnumerable<Foo> GetAll()
. So if we have
A <-- topmost root
/ \
/ \
B C
/ \ / \
D E F G
a call to GetAll
on element C
returns {C, F, G}
(fixed order of elements would be nice, but is not needed). I guess everybody knew that already.
The current implementation of GetAll
looks like this:
public IEnumerable<Foo> GetAll ()
{
yield return this;
foreach (Foo foo in MyChildren) {
foreach (Foo f in foo.GetAll ()) {
yield return f;
}
}
}
In an earlier implementation, I returned a List and added the child-foos using List.AddRange()
.
My question is if the version using yield is correcly implemented or if it should be improved (esp. in terms of performance). Or is this just bad and I should stick to List
s (or ReadOnlyCollections
) instead?
Solution 1:
You can improve performance if you unroll recurse to stack, so you will have only one iterator:
public IEnumerable<Foo> GetAll()
{
Stack<Foo> FooStack = new Stack<Foo>();
FooStack.Push(this);
while (FooStack.Count > 0)
{
Foo Result = FooStack.Pop();
yield return Result;
foreach (Foo NextFoo in Result.MyChildren)
FooStack.Push(NextFoo);
}
}
Solution 2:
It's certainly not ideal in terms of performance - you end up creating a lot of iterators for large trees, instead of a single iterator which knows how to traverse efficiently.
Some blog entries concerning this:
- Wes Dyer: All about iterators
- Eric Lippert: Immutability in C#, part 6
- Eric again: Immutability in C#, part 7
It's worth noting that F# has the equivalent of the proposed "yield foreach
" with "yield!
"
Solution 3:
A better solution might be to create a visit method that recursively traverses the tree, and use that to collect items up.
Something like this (assuming a binary tree):
public class Node<T>
{
public void Visit(Action<T> action)
{
action(this);
left.Visit(action);
right.Visit(action);
}
public IEnumerable<Foo> GetAll ()
{
var result = new List<T>();
Visit( n => result.Add(n));
return result;
}
}
Taking this approach
- Avoids creating large numbers of nested iterators
- Avoids creating any more lists than necessary
- Is relatively efficient
- Falls down if you only need part of the list regularly
Solution 4:
No, that looks fine.
Have a look at my blog entry, it may be of some use :)