What does Include() do in LINQ?

I tried to do a lot of research but I'm more of a db guy - so even the explanation in the MSDN doesn't make any sense to me. Can anyone please explain, and provide some examples on what Include() statement does in the term of SQL query?


Solution 1:

Let's say for instance you want to get a list of all your customers:

var customers = context.Customers.ToList();

And let's assume that each Customer object has a reference to its set of Orders, and that each Order has references to LineItems which may also reference a Product.

As you can see, selecting a top-level object with many related entities could result in a query that needs to pull in data from many sources. As a performance measure, Include() allows you to indicate which related entities should be read from the database as part of the same query.

Using the same example, this might bring in all of the related order headers, but none of the other records:

var customersWithOrderDetail = context.Customers.Include("Orders").ToList();

As a final point since you asked for SQL, the first statement without Include() could generate a simple statement:

SELECT * FROM Customers;

The final statement which calls Include("Orders") may look like this:

SELECT *
FROM Customers JOIN Orders ON Customers.Id = Orders.CustomerId;

Solution 2:

I just wanted to add that "Include" is part of eager loading. It is described in Entity Framework 6 tutorial by Microsoft. Here is the link: https://docs.microsoft.com/en-us/aspnet/mvc/overview/getting-started/getting-started-with-ef-using-mvc/reading-related-data-with-the-entity-framework-in-an-asp-net-mvc-application


Excerpt from the linked page:

Here are several ways that the Entity Framework can load related data into the navigation properties of an entity:

Lazy loading. When the entity is first read, related data isn't retrieved. However, the first time you attempt to access a navigation property, the data required for that navigation property is automatically retrieved. This results in multiple queries sent to the database — one for the entity itself and one each time that related data for the entity must be retrieved. The DbContext class enables lazy loading by default.

Eager loading. When the entity is read, related data is retrieved along with it. This typically results in a single join query that retrieves all of the data that's needed. You specify eager loading by using the Include method.

Explicit loading. This is similar to lazy loading, except that you explicitly retrieve the related data in code; it doesn't happen automatically when you access a navigation property. You load related data manually by getting the object state manager entry for an entity and calling the Collection.Load method for collections or the Reference.Load method for properties that hold a single entity. (In the following example, if you wanted to load the Administrator navigation property, you'd replace Collection(x => x.Courses) with Reference(x => x.Administrator).) Typically you'd use explicit loading only when you've turned lazy loading off.

Because they don't immediately retrieve the property values, lazy loading and explicit loading are also both known as deferred loading.

Solution 3:

Think of it as enforcing Eager-Loading in a scenario where you sub-items would otherwise be lazy-loading.

The Query EF is sending to the database will yield a larger result at first, but on access no follow-up queries will be made when accessing the included items.

On the other hand, without it, EF would execute separte queries later, when you first access the sub-items.