Order of LINQ extension methods does not affect performance?
I'm surprised that it apparently doesn't matter whether I prepend or append LINQ extension methods.
Tested with Enumerable.FirstOrDefault:
hugeList.Where(x => x.Text.Contains("10000")).FirstOrDefault();
hugeList.FirstOrDefault(x => x.Text.Contains("10000"));
    var hugeList = Enumerable.Range(1, 50000000)
        .Select(i => new { ID = i, Text = "Item" + i });

    var sw1 = new System.Diagnostics.Stopwatch();
    var sw2 = new System.Diagnostics.Stopwatch();

    sw1.Start();
    for (int i = 0; i < 1000; i++)
        hugeList.Where(x => x.Text.Contains("10000")).FirstOrDefault();
    sw1.Stop();

    sw2.Start();
    for (int i = 0; i < 1000; i++)
        hugeList.FirstOrDefault(x => x.Text.Contains("10000"));
    sw2.Stop();

    var result1 = String.Format("FirstOrDefault after: {0} FirstOrDefault before: {1}", sw1.Elapsed, sw2.Elapsed);
    //result1: FirstOrDefault after: 00:00:03.3169683 FirstOrDefault before: 00:00:03.0463219

    sw2.Restart();
    for (int i = 0; i < 1000; i++)
        hugeList.FirstOrDefault(x => x.Text.Contains("10000"));
    sw2.Stop();

    sw1.Restart();
    for (int i = 0; i < 1000; i++)
        hugeList.Where(x => x.Text.Contains("10000")).FirstOrDefault();
    sw1.Stop();

    var result2 = String.Format("FirstOrDefault before: {0} FirstOrDefault after: {1}", sw2.Elapsed, sw1.Elapsed);
    //result2: FirstOrDefault before: 00:00:03.6833079 FirstOrDefault after: 00:00:03.1675611

    //average after:3.2422647 before: 3.3648149 (all seconds)
I would have guessed that it would be slower to prepend Where, since it must find all matching items and then take the first, whereas FirstOrDefault with a predicate could yield the first item it finds.
Q: Can somebody explain why I'm on the wrong track?
I would have guessed that it would be slower to prepend Where, since it must find all matching items and then take the first, whereas FirstOrDefault with a predicate could yield the first item it finds. Can somebody explain why I'm on the wrong track?
You are on the wrong track because your first statement is simply incorrect. Where is not required to find all matching items before fetching the first matching item. Where fetches matching items "on demand"; if you only ask for the first one, it only fetches the first one. If you only ask for the first two, it only fetches the first two.
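One way to check the "on demand" behaviour for yourself is to count how often the predicate actually runs. This is only a sketch: LazyWhereDemo and CountPredicateCalls are names made up for this example, and the source mirrors the question's sequence using plain strings instead of anonymous objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyWhereDemo
{
    // Hypothetical helper: reports how many times the predicate ran for
    // Where(...).FirstOrDefault() and for FirstOrDefault(predicate).
    public static (int viaWhere, int viaPredicate) CountPredicateCalls(int size, string needle)
    {
        // Lazy source, like the question's hugeList: nothing is materialized.
        IEnumerable<string> source = Enumerable.Range(1, size).Select(i => "Item" + i);

        int viaWhere = 0;
        source.Where(s => { viaWhere++; return s.Contains(needle); }).FirstOrDefault();

        int viaPredicate = 0;
        source.FirstOrDefault(s => { viaPredicate++; return s.Contains(needle); });

        return (viaWhere, viaPredicate);
    }

    public static void Main()
    {
        var (viaWhere, viaPredicate) = CountPredicateCalls(50_000_000, "10000");
        Console.WriteLine($"Where + FirstOrDefault:    {viaWhere} predicate calls");
        Console.WriteLine($"FirstOrDefault(predicate): {viaPredicate} predicate calls");
    }
}
```

Both counters should stop at the first element whose text contains the needle ("Item10000"), far short of the 50 million elements in the source, which is consistent with the near-identical timings in the question.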
Jon Skeet does a nice bit on stage. Imagine you have three people. The first person has a shuffled pack of cards. The second person has a t-shirt that says "where card is red". The third person pokes the second person and says "give me the first card". The second person pokes the first person over and over again until the first person hands over a red card, which the second person then hands to the third person. The second person has no reason to keep poking the first person; the task is done!
Now, if the second person's t-shirt says "order by rank ascending" then we have a very different situation. Now the second person really does need to get every card from the first person, in order to find the lowest card in the deck, before handing the first card to the third person.
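The difference between the two t-shirts can be made visible by counting how many cards the first person has to hand over before the third person gets an answer. A rough sketch, assuming cards are plain ints and that an even number stands in for "red"; Deal and CountDealtCards are hypothetical names for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CardDealerDemo
{
    // The "first person": hands out one card at a time and reports each hand-over.
    private static IEnumerable<int> Deal(int[] cards, Action onDeal)
    {
        foreach (int card in cards)
        {
            onDeal();
            yield return card;
        }
    }

    // Returns how many cards were dealt before the first result came back.
    public static (int forWhere, int forOrderBy) CountDealtCards(int[] cards)
    {
        int forWhere = 0;
        // Assumption for this sketch: even cards play the role of "red" cards.
        Deal(cards, () => forWhere++).Where(c => c % 2 == 0).FirstOrDefault();

        int forOrderBy = 0;
        Deal(cards, () => forOrderBy++).OrderBy(c => c).FirstOrDefault();

        return (forWhere, forOrderBy);
    }

    public static void Main()
    {
        int[] cards = { 7, 3, 8, 5, 2, 9 };
        var (forWhere, forOrderBy) = CountDealtCards(cards);
        Console.WriteLine($"Where dealt {forWhere} cards, OrderBy dealt {forOrderBy} cards");
    }
}
```

With this hand, Where stops as soon as it sees the 8 (the third card), while OrderBy has to look at all six cards before it can know which one comes first.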
This should now give you the necessary intuition to tell when order does matter for performance reasons. The net result of "give me the red cards and then sort them" is exactly the same as "sort all the cards then give me the red ones", but the former is much faster because you do not have to spend any time sorting the black cards that you are going to discard.
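To see the cost difference rather than take it on faith, one option is to pass OrderBy a comparer that counts its own calls. The sketch below makes several assumptions: cards are ints 1 to 52, even numbers count as "red", and CountingComparer, FilterThenSort and SortThenFilter are names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer that records how many comparisons the sort performs.
public sealed class CountingComparer : IComparer<int>
{
    public int Calls { get; private set; }

    public int Compare(int x, int y)
    {
        Calls++;
        return x.CompareTo(y);
    }
}

public static class FilterSortOrderDemo
{
    // Assumption for this sketch: even-numbered cards count as "red".
    public static bool IsRed(int card) => card % 2 == 0;

    // "Give me the red cards and then sort them."
    public static List<int> FilterThenSort(IEnumerable<int> cards, IComparer<int> comparer) =>
        cards.Where(IsRed).OrderBy(c => c, comparer).ToList();

    // "Sort all the cards, then give me the red ones."
    public static List<int> SortThenFilter(IEnumerable<int> cards, IComparer<int> comparer) =>
        cards.OrderBy(c => c, comparer).Where(IsRed).ToList();

    public static void Main()
    {
        var rng = new Random(42);
        int[] deck = Enumerable.Range(1, 52).OrderBy(_ => rng.Next()).ToArray();

        var filterFirst = new CountingComparer();
        var sortFirst = new CountingComparer();

        List<int> a = FilterThenSort(deck, filterFirst);
        List<int> b = SortThenFilter(deck, sortFirst);

        Console.WriteLine($"Same result: {a.SequenceEqual(b)}");
        Console.WriteLine($"Comparisons, filter then sort: {filterFirst.Calls}");
        Console.WriteLine($"Comparisons, sort then filter: {sortFirst.Calls}");
    }
}
```

The two lists should be identical; the exact comparison counts depend on the runtime's sort implementation, but filtering first only ever sorts half the deck.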
The Where() method uses deferred execution and will provide the next matching item as it is requested. That is, Where() does not evaluate and immediately return a sequence of all candidate objects; it provides them one at a time as they are iterated over.
Since FirstOrDefault() stops after the first item, this will cause the Where() to stop iterating as well.
Think of FirstOrDefault() as halting the execution of the Where() as if it performed a break. It's not that simple, of course, but in essence, since FirstOrDefault() stops iterating once it finds an item, the Where() does not need to proceed any further.
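The "break" intuition is easier to see with hand-written stand-ins. The following is a simplified sketch, not the actual framework source: MyWhere and MyFirstOrDefault are hypothetical names, and the real implementations contain extra argument checks and optimizations.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MiniLinq
{
    // Simplified stand-in for Where: an iterator that pauses at each
    // yield return until the consumer asks for the next item.
    public static IEnumerable<T> MyWhere<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
                yield return item;
        }
    }

    // Simplified stand-in for FirstOrDefault: returns on the first item,
    // so the iterator above is never resumed past its first match.
    public static T MyFirstOrDefault<T>(IEnumerable<T> source)
    {
        foreach (T item in source)
            return item;

        return default(T);
    }

    public static void Main()
    {
        int calls = 0;
        IEnumerable<int> numbers = Enumerable.Range(1, 1_000_000);

        int first = MyFirstOrDefault(MyWhere(numbers, n => { calls++; return n % 7 == 0; }));

        Console.WriteLine($"First match: {first}, predicate calls: {calls}");
    }
}
```

Returning from inside the foreach in MyFirstOrDefault disposes the enumerator, which is what plays the role of the break: MyWhere is simply never asked for a second item.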
Of course, this is the simple case of applying FirstOrDefault() to a Where() clause. If you have other clauses that imply the need to consider all items, this could have an effect, but that would be true whether you use the Where().FirstOrDefault() combo or just FirstOrDefault() with a predicate.