LINQ Join with Multiple From Clauses
When writing LINQ queries in C#, I know I can perform a join using the join
keyword. But what does the following do?
from c in Companies
from e in c.Employees
select e;
A LINQ book I have says it's a type of join, but not a proper join (which uses the join
keyword). So exactly what type of join is it then?
Solution 1:
Multiple "from" clauses form a compound LINQ query. They behave like nested foreach loops. MSDN gives a great example:
var scoreQuery = from student in students
from score in student.Scores
where score > 90
select new { Last = student.LastName, score };
This query could be rewritten as:
// Scores is assumed to be a collection of int, as in the MSDN sample.
// A list of tuples is used because one student can have several scores above 90.
var nameScore = new List<(string Last, int Score)>();
foreach (Student student in students)
{
    foreach (int score in student.Scores)
    {
        if (score > 90)
        {
            nameScore.Add((student.LastName, score));
        }
    }
}
Solution 2:
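This gets translated into a SelectMany() call. When the second sequence does not depend on the first, it is essentially a cross join; in your query, where the second sequence is c.Employees, it flattens each company's employees into a single sequence.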
Jon Skeet talks about it on his blog, as part of the Edulinq series. (Scroll down to Secondary "from" clauses.)
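To make the SelectMany() translation concrete, here is a minimal sketch. The company and employee names are made-up sample data, and anonymous types stand in for whatever Company class you actually have.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: each company carries its own employee list.
var companies = new[]
{
    new { Name = "Acme",   Employees = new[] { "Ann", "Bob" } },
    new { Name = "Globex", Employees = new[] { "Cid" } },
};

// Query syntax with two from clauses.
var viaQuery = from c in companies
               from e in c.Employees
               select e;

// Method syntax: the two from clauses become one SelectMany call,
// with the select expressed as the result selector.
var viaMethod = companies.SelectMany(c => c.Employees, (c, e) => e);

Console.WriteLine(string.Join(", ", viaQuery));        // Ann, Bob, Cid
Console.WriteLine(viaQuery.SequenceEqual(viaMethod));  // True
```

Since the result selector here just returns e, the one-argument form companies.SelectMany(c => c.Employees) gives the same sequence.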
Solution 3:
The code that you listed:
from c in company
from e in c.Employees
select e;
... will produce a list of every employee for every company in the company
variable. If an employee works for two companies, they will be included in the list twice.
The only "join" that might occur here is when you say c.Employees. In an SQL-backed provider, this would translate to an inner join from the Company table to the Employee table.
However, the double-from construct is often used to perform "joins" manually, like so:
from c in companies
from e in employees
where c.CompanyId == e.CompanyId
select e;
This would have a similar effect to the code you posted, with potential subtle differences depending on what the employees variable contains. This would also be equivalent to the following join:
from c in companies
join e in employees
on c.CompanyId equals e.CompanyId
select e;
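A quick way to convince yourself that the two forms match is to run both over the same in-memory data. This is only a sketch: the records below are invented, and anonymous types stand in for real Company and Employee classes.

```csharp
using System;
using System.Linq;

// Hypothetical flat data, related only through CompanyId.
var companies = new[]
{
    new { CompanyId = 1, Name = "Acme" },
    new { CompanyId = 2, Name = "Globex" },
};
var employees = new[]
{
    new { CompanyId = 1, Name = "Ann" },
    new { CompanyId = 2, Name = "Bob" },
    new { CompanyId = 1, Name = "Cid" },
    new { CompanyId = 3, Name = "Dee" },  // no matching company, so never returned
};

// Manual "join": pair every company with every employee, then filter.
var viaWhere = (from c in companies
                from e in employees
                where c.CompanyId == e.CompanyId
                select e.Name).ToList();

// Same result using the join keyword.
var viaJoin = (from c in companies
               join e in employees on c.CompanyId equals e.CompanyId
               select e.Name).ToList();

Console.WriteLine(string.Join(", ", viaWhere)); // Ann, Cid, Bob
Console.WriteLine(string.Join(", ", viaJoin));  // Ann, Cid, Bob
```

In LINQ to Objects the results are the same, but the work done is not: the from/where version tests every company/employee pair, while join builds a lookup on the inner sequence first, so join tends to scale better on large collections.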
If you wanted a Cartesian product, however, you could just remove the where clause. (To make it worth anything, you'd probably want to change the select slightly, too, though.)
from c in companies
from e in employees
select new {c, e};
This last query would give you every possible combination of company and employee.
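As a small illustration of that Cartesian product, assuming two made-up string arrays in place of real company and employee collections:

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var companies = new[] { "Acme", "Globex" };
var employees = new[] { "Ann", "Bob", "Cid" };

// No where clause: every company is paired with every employee.
var pairs = (from c in companies
             from e in employees
             select new { c, e }).ToList();

Console.WriteLine(pairs.Count); // 6 (2 companies x 3 employees)
foreach (var p in pairs)
{
    Console.WriteLine($"{p.c} - {p.e}");
}
```

The result size is the product of the two input sizes, which is why this form is rarely what you want against large tables.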