Does the order of LINQ functions matter?
Basically, as the question states... does the order of LINQ functions matter in terms of performance? Obviously the results would have to be identical still...
Example:
myCollection.OrderBy(item => item.CreatedDate).Where(item => item.Code > 3);
myCollection.Where(item => item.Code > 3).OrderBy(item => item.CreatedDate);
Both return the same results, but with the LINQ calls in a different order. I realize that reordering some calls will produce different results, and I'm not concerned about those cases. My main concern is whether, when the results are the same, the ordering can impact performance. And not just for the 2 LINQ calls I made (OrderBy, Where), but for any LINQ calls.
It will depend on the LINQ provider in use. For LINQ to Objects, that could certainly make a huge difference. Assume we've actually got:
var query = myCollection.OrderBy(item => item.CreatedDate)
.Where(item => item.Code > 3);
var result = query.Last();
That requires the whole collection to be sorted and then filtered. If we had a million items, only one of which had a code greater than 3, we'd be wasting a lot of time ordering results which would be thrown away.
Compare that with the reversed operation, filtering first:
var query = myCollection.Where(item => item.Code > 3)
.OrderBy(item => item.CreatedDate);
var result = query.Last();
This time we're only ordering the filtered results, which in the sample case of "just a single item matching the filter" will be a lot more efficient - both in time and space.
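To make the difference visible, here is a minimal sketch that counts how often the OrderBy key selector runs in each version. The data set (a million value tuples, exactly one with a Code greater than 3) and the counter names are made up for illustration, and the sketch assumes LINQ to Objects and a compiler that supports top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;

// Hypothetical data: a million items, only one of which has Code > 3.
var myCollection = Enumerable.Range(0, 1_000_000)
    .Select(i => (Code: i == 500_000 ? 4 : 1,
                  CreatedDate: new DateTime(2020, 1, 1).AddSeconds(i)))
    .ToList();

// Order first, then filter: the key selector runs for every item.
int keysSortFirst = 0;
var sortFirst = myCollection
    .OrderBy(item => { keysSortFirst++; return item.CreatedDate; })
    .Where(item => item.Code > 3)
    .Last();

// Filter first, then order: the key selector only sees the matching items.
int keysFilterFirst = 0;
var filterFirst = myCollection
    .Where(item => item.Code > 3)
    .OrderBy(item => { keysFilterFirst++; return item.CreatedDate; })
    .Last();

Console.WriteLine($"OrderBy then Where: {keysSortFirst} keys computed");
Console.WriteLine($"Where then OrderBy: {keysFilterFirst} keys computed");
Console.WriteLine($"Same result: {sortFirst == filterFirst}");
```

The first counter should end up on the order of the collection size and the second on the order of the number of matches, though the exact numbers are an implementation detail of the runtime.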
It also could make a difference in whether the query executes correctly or not. Consider:
var query = myCollection.Where(item => item.Code != 0)
.OrderBy(item => 10 / item.Code);
var result = query.Last();
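That's fine - we know we'll never be dividing by 0. But if we perform the ordering before the filtering, the query will throw an exception as soon as the key selector is applied to an item whose Code is 0 (a DivideByZeroException, assuming Code is an integer).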
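A small sketch of that failure mode, assuming LINQ to Objects and an integer Code. The three sample items and the variable names are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical data: integer codes, one of which is 0.
var myCollection = new[] { 5, 0, 2 }
    .Select(code => (Code: code, Name: $"item{code}"))
    .ToList();

// Filtering first: the key selector never sees the item with Code == 0.
var safe = myCollection.Where(item => item.Code != 0)
                       .OrderBy(item => 10 / item.Code)
                       .Last();

// Ordering first: the key selector runs on every item, including Code == 0.
bool threw = false;
try
{
    _ = myCollection.OrderBy(item => 10 / item.Code)
                    .Where(item => item.Code != 0)
                    .Last();
}
catch (DivideByZeroException)
{
    threw = true;
}

Console.WriteLine($"Filter first picked Code {safe.Code}; order first threw: {threw}");
```

Note that the exception only surfaces when the query is enumerated (here by Last()), not when the query is built.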
Yes.
But exactly what that performance difference is depends on how the underlying expression tree is evaluated by the LINQ provider.
For instance, the second form of your query (with the Where clause first) may well execute faster for LINQ-to-XML, while the first form may be faster for LINQ-to-SQL.
To find out precisely what the performance difference is, you'll most likely want to profile your application. As ever with such things, though, premature optimisation is not usually worth the effort -- you may well find issues other than LINQ performance are more important.
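As a starting point for measuring, here is a rough timing sketch using Stopwatch against LINQ to Objects. The random data, the collection size and the variable names are assumptions for illustration; a single Stopwatch run is noisy (JIT warm-up, GC), so treat the numbers as a hint and use a proper profiler or benchmarking harness for anything you intend to act on.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical data: 200,000 items with random codes and dates (fixed seed).
var rng = new Random(42);
var myCollection = Enumerable.Range(0, 200_000)
    .Select(i => (Code: rng.Next(0, 10),
                  CreatedDate: new DateTime(2020, 1, 1).AddSeconds(rng.Next(0, 1_000_000))))
    .ToList();

// ToList() forces the query to execute inside the timed region.
var sw = Stopwatch.StartNew();
var sortFirst = myCollection.OrderBy(item => item.CreatedDate)
                            .Where(item => item.Code > 3)
                            .ToList();
sw.Stop();
double sortFirstMs = sw.Elapsed.TotalMilliseconds;

sw.Restart();
var filterFirst = myCollection.Where(item => item.Code > 3)
                              .OrderBy(item => item.CreatedDate)
                              .ToList();
sw.Stop();
double filterFirstMs = sw.Elapsed.TotalMilliseconds;

Console.WriteLine($"OrderBy then Where: {sortFirstMs:F1} ms");
Console.WriteLine($"Where then OrderBy: {filterFirstMs:F1} ms");
Console.WriteLine($"Same results: {sortFirst.SequenceEqual(filterFirst)}");
```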
In your particular example it can make a difference to the performance.
First query: Your OrderBy call needs to iterate through the entire source sequence, including those items where Code is 3 or less. The Where clause then also needs to iterate the entire ordered sequence.
Second query: The Where call limits the sequence to only those items where Code is greater than 3. The OrderBy call then only needs to traverse the reduced sequence returned by the Where call.