LINQ Include vs Join. Are they equivalent?
I have used join in linq to join 2 tables. What is the difference between a join and Include. From what I see, they both behave the same.
Include vs. Join
An Included is intended to retain the original object structures and graphs. A Join is needed to project a flattened representation of the object graph or to join types which are not naturally related through the graph (ie. join the customer's city with a shipping facility's city).
Compare the following:
db.Customers.Include("Orders")
Generates an IEnumerable each of which may contain their corresponding list of Orders in an object graph like this:
Customer 1
Order
Order
Order
Customer 2
Order
Order
In contrast, if you do the same with a join projecting into an anonymous type you could get the following:
from c in db.Customers
join o in db.Orders on c.CustomerId equals o.CustomerId
select new {c, o}
This produces a new IEnumerable<Anonymous<Customer, Order>>
where the customer is repeated for each order.
{ Customer1, orderA }
{ Customer1, orderB }
{ Customer1, orderC }
{ Customer2, orderD }
{ Customer2, orderE }
{ Customer2, orderF }
While both may issue the same request to the database, the resulting type may be quite different.
In a sense, yes. Include is implemented as a join. Depending on the nullability of the included link it is an inner or left join.
You can always build an include yourself by using a join, like this:
db.Users.Select(u => new { u, u.City })
This is an "include" for the user's city. It manifests itself as a SQL join.
If you simply need all Orders
for some Customers
. Good example here for blog application is displaying all Comments
below Articles
always. Then Include
is your way of work.
Join
in opposition is more helpful if you need some Customers
and filters out them using some data contained in Orders
entity. For example you want to sort out Articles
to send to the Police Articles
with Comments
containing vulgar words.
Also if your Orders
entity contains a lot of data (many columns) taking a lot of memory and you don't need all of them then join
can be much more efficient but here always is a question what "lot of data" or "many columns" means so test first will be the best choice.