EF 4.1 loading filtered child collections not working for many-to-many

I've been looking at "Applying filters when explicitly loading related entities" and could not get it to work for a many-to-many relationship.

I created a simple model: [image: Model]

Brief description:
A Student can take many Courses and a Course can have many Students.
A Student can make many Presentations, but a Presentation can be made by only one Student.
So what we have is a many-to-many relationship between Students and Courses, as well as a one-to-many relationship between Student and Presentations.

I've also added one Student, one Course and one Presentation related to each other.

Here is the code I am running:

using System.Data.Entity; // Load() and Include() extension methods
using System.Linq;        // Where()

class Program
{
    static void Main()
    {
        using (var context = new SportsModelContainer())
        {
            context.Configuration.LazyLoadingEnabled = false;
            context.Configuration.ProxyCreationEnabled = false;

            Student student = context.Students.Find(1);

            context.
                Entry(student).
                Collection(s => s.Presentations).
                Query().
                Where(p => p.Id == 1).
                Load(); 

            context.
                Entry(student).
                Collection(s => s.Courses).
                Query().
                Where(c => c.Id == 1).
                Load();

            // Trying to run Load without calling Query() first
            context.Entry(student).Collection(s => s.Courses).Load();
        }
    }
}

After loading the presentations, I see that the count for Presentations changed from 0 to 1 [image: After loading presentations]. However, after doing the same with Courses, nothing changes [image: After attempting to load courses].

So I try to load the courses without calling Query() and it works as expected [image: Courses loaded].

(I removed the Where clause to further highlight the point - the last two loading attempts only differ by the "Query()" call)

Now, the only difference I see is that one relationship is one-to-many while the other one is many-to-many. Is this an EF bug, or am I missing something?

And btw, I checked the SQL calls for the last two Course-loading attempts, and they are 100% identical, so it seems that it's EF that fails to populate the collection.


I could reproduce exactly the behaviour you describe. What I got working is this:

context.Entry(student)
       .Collection(s => s.Courses)
       .Query()
       .Include(c => c.Students)
       .Where(c => c.Id == 1)
       .Load();

I don't know why we should be forced to also load the other side of the many-to-many relationship (Include(...)) when we only want to load one collection. To me it does feel like a bug, unless I missed some hidden reason for this requirement, which may or may not be documented somewhere.

Edit

Another result: Your original query (without Include) ...

context.Entry(student)
       .Collection(s => s.Courses)
       .Query()
       .Where(c => c.Id == 1)
       .Load();

... actually loads the courses into the DbContext as ...

var localCollection = context.Courses.Local;

... shows. The course with Id 1 is indeed in this collection which means: loaded into the context. But it's not in the child collection of the student object.

Edit 2

Perhaps it is not a bug.

First of all, we are using two different versions of Load here:

DbCollectionEntry<TEntity, TElement>.Load()

Intellisense says:

Loads the collection of entities from the database. Note that entities that already exist in the context are not overwritten with values from the database.

For the other version (extension method of IQueryable) ...

DbExtensions.Load(this IQueryable source);

... Intellisense says:

Enumerates the query such that for server queries such as those of System.Data.Entity.DbSet, System.Data.Objects.ObjectSet, System.Data.Objects.ObjectQuery, and others the results of the query will be loaded into the associated System.Data.Entity.DbContext, System.Data.Objects.ObjectContext or other cache on the client. This is equivalent to calling ToList and then throwing away the list without the overhead of actually creating the list.

So, in this version it is not guaranteed that the child collection is populated, only that the objects are loaded into the context.

The question remains: why does the Presentations collection get populated but not the Courses collection? I think the answer is: because of Relationship Span.

Relationship Span is a feature in EF which automatically fixes up relationships between objects that are already in the context or that have just been loaded into it. But this doesn't happen for all types of relationships. It happens only if the multiplicity is 0..1 or 1 on one end.

In our example this means: when we load the Presentations into the context (by our filtered explicit query), EF also loads the foreign key from the Presentation entities to the Student entity. It does so "transparently", that is, no matter whether the FK is exposed as a property in the model or not. This loaded FK allows EF to recognize that the loaded Presentations belong to the Student entity which is already in the context.

But this is not the case for the Courses collection. A course does not have a foreign key to the Student entity. There is the many-to-many join-table in between. So, when we load the Courses EF does not recognize that those courses belong to the Student which is in the context, and therefore doesn't fix the navigation collection in the Student entity.
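To make the difference concrete, here is a rough sketch of what the entity classes might look like. The class shapes and the StudentId property are assumptions made for illustration (the real model may not expose the FK at all); the point is only that a Presentation row carries its Student's key, while a Course row carries nothing that points back to a Student.

```csharp
using System.Collections.Generic;

// Hypothetical POCO shapes for the model described in the question.
// Property names, StudentId in particular, are assumptions.
public class Student
{
    public Student()
    {
        Presentations = new List<Presentation>();
        Courses = new List<Course>();
    }

    public int Id { get; set; }
    public ICollection<Presentation> Presentations { get; set; }
    public ICollection<Course> Courses { get; set; }
}

public class Presentation
{
    public int Id { get; set; }

    // One-to-many: every Presentation row holds the key of its Student,
    // so a loaded Presentation can be matched to a Student in the context.
    public int StudentId { get; set; }
    public Student Student { get; set; }
}

public class Course
{
    public Course()
    {
        Students = new List<Student>();
    }

    public int Id { get; set; }

    // Many-to-many: there is no Student key on Course. The link lives in a
    // separate join table, so a loaded Course row alone says nothing about
    // which Students it belongs to.
    public ICollection<Student> Students { get; set; }
}
```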

EF does this automatic fixup only for references (not collections) for performance reasons:

To fix relationship, EF transparently rewrites the query to bring relationship info for all relations which has multiplicity of 0..1 or 1 on the other end; in other words navigation properties that are entity reference. If an entity has relationship with multiplicity of greater then 1, EF will not bring back the relationship info because it could be performance hit and as compared to bringing a single foreign along with rest of the record. Bringing relationship info means retrieving all the foreign keys the records has.

Quote from page 128 of Zeeshan Hirani's in depth guide to EF.

It is based on EF 4 and ObjectContext but I think this is still valid in EF 4.1 as DbContext is mainly a wrapper around ObjectContext.

Unfortunately rather complex stuff to keep in mind when using Load.

And another Edit

So, what can we do when we want to explicitly load one filtered side of a many-to-many relationship? Perhaps only this:

student.Courses = context.Entry(student)
       .Collection(s => s.Courses)
       .Query()
       .Where(c => c.Id == 1)
       .ToList();