Filtering on Include in EF Core
Entity Framework core 5 is the first EF version to support filtered Include
.
How it works
Supported operations:
Where
OrderBy(Descending)/ThenBy(Descending)
Skip
Take
Some usage examples (from the original feature request and the github commmit) :
Only one filter allowed per navigation, so for cases where the same navigation needs to be included multiple times (e.g. multiple ThenInclude on the same navigation) apply the filter only once, or apply exactly the same filter for that navigation.
context.Customers
.Include(c => c.Orders.Where(o => o.Name != "Foo")).ThenInclude(o => o.OrderDetails)
.Include(c => c.Orders).ThenInclude(o => o.Customer)
or
context.Customers
.Include(c => c.Orders.Where(o => o.Name != "Foo")).ThenInclude(o => o.OrderDetails)
.Include(c => c.Orders.Where(o => o.Name != "Foo")).ThenInclude(o => o.Customer)
Another important note:
Collections included using new filter operations are considered to be loaded.
That means that if lazy loading is enabled, addressing one customer's Orders
collection from the last example won't trigger a reload of the entire Orders
collection.
Also, two subsequent filtered Include
s in the same context will accumulate the results. For example...
context.Customers.Include(c => c.Orders.Where(o => !o.IsDeleted))
...followed by...
context.Customers.Include(c => c.Orders.Where(o => o.IsDeleted))
...will result in customers
with Orders
collections containing all orders.
Filtered Include and relationship fixup
If other Order
s are loaded into the same context, more of them may get added to a customers.Orders
collection because of relationship fixup. This is inevitable because of how EF's change tracker works.
context.Customers.Include(c => c.Orders.Where(o => !o.IsDeleted))
...followed by...
context.Orders.Where(o => o.IsDeleted).Load();
...will again result in customers
with Orders
collections containing all orders.
The filter expression
The filter expression should contain predicates that can be used as a stand-alone predicate for the collection. An example will make this clear. Suppose we want to include orders filtered by some property of Customer
:
context.Customers.Include(c => c.Orders.Where(o => o.Classification == c.Classification))
It compiles, but it'll throw a very technical runtime exception, basically telling that o.Classification == c.Classification
can't be translated because c.Classification
can't be found. The query has to be rewritten using a back-reference from Order
to Customer
:
context.Customers.Include(c => c.Orders.Where(o => o.Classification == o.Customer.Classification))
The predicate o => o.Classification == o.Customer.Classification)
is "stand alone" in the sense that it can be used to filter Orders
independently:
context.Orders.Where(o => o.Classification == o.Customer.Classification) // No one would try 'c.Classification' here
This restriction may change in later EF versions than the current stable version (EF core 5.0.7).
What can (not) be filtered
Since Where
is an extension method on IEnumerable
it's clear that only collections can be filtered. It's not possible to filter reference navigation properties. If we want to get orders and only populate their Customer
property when the customer is active, we can't use Include
:
context.Orders.Include(c => c.Customer.Where( ... // obviously doesn't compile
Filtered Include vs filtering the query
Filtered Include
has given rise to some confusion on how it affects filtering a query as a whole. The rule of the thumb is: it doesn't.
The statement...
context.Customers.Include(c => c.Orders.Where(o => !o.IsDeleted))
...returns all customers from the context, not only the ones with undeleted orders. The filter in the Include
doesn't affect the number of items returned by the main query.
On the other hand, the statement...
context.Customers
.Where(c => c.Orders.Any(o => !o.IsDeleted))
.Include(c => c.Orders)
...only returns customers having at least one undeleted order, but having all of their orders in the Orders
collections. The filter on the main query doesn't affect the orders per customer returned by Include
.
To get customers with undeleted orders and only loading their undeleted orders, both filters are required:
context.Customers
.Where(c => c.Orders.Any(o => !o.IsDeleted))
.Include(c => c.Orders.Where(o => !o.IsDeleted))
Filtered Include and projections
Another area of confusion is how filtered Include
and projections (select new { ... }
) are related. The simple rule is: projections ignore Include
s, filtered or not. A query like...
context.Customers
.Include(c => c.Orders)
.Select(c => new { c.Name, c.RegistrationDate })
...will generate SQL without a join to Orders
. As for EF, it's the same as...
context.Customers
.Select(c => new { c.Name, c.RegistrationDate })
It gets confusing when the Include
is filtered, but Orders
are also used in the projection:
context.Customers
.Include(c => c.Orders.Where(o => !o.IsDeleted))
.Select(c => new
{
c.Name,
c.RegistrationDate,
OrderDates = c.Orders.Select(o => o.DateSent)
})
One might expect that OrderDates
only contains dates from undeleted orders, but they contain the dates from all Orders
. Again, the projection completely ignores the Include
. Projection and Include
are separate worlds.
How strictly they lead their own lives is amusingly demonstrated by this query:
context.Customers
.Include(c => c.Orders.Where(o => !o.IsDeleted))
.Select(c => new
{
Customer = c,
OrderDates = c.Orders.Select(o => o.DateSent)
})
Now pause for a moment and predict the outcome...
The not so simple rule is: projections don't always ignore Include
. When there is an entity in the projection to which the Include
can be applied, it is applied. That means that Customer
in the projection contains its undeleted Orders
, whereas OrderDates
still contains all dates. Did you get it right?
Not doable.
There is an on-going discussion about this topic: https://github.com/aspnet/EntityFramework/issues/1833
I'd suggest to look around for any of the 3rd party libraries listed there, ex.: https://github.com/jbogard/EntityFramework.Filters