LINQ Join with Multiple From Clauses

When writing LINQ queries in C#, I know I can perform a join using the join keyword. But what does the following do?

from c in Companies
from e in c.Employees
select e;

A LINQ book I have says it's a type of join, but not a proper join (which uses the join keyword). So exactly what type of join is it, then?


Solution 1:

Multiple "from" clauses form a compound LINQ query. They behave like nested foreach loops. The MSDN documentation gives a good example:

var scoreQuery = from student in students
                 from score in student.Scores
                 where score > 90
                 select new { Last = student.LastName, score };

This statement could be rewritten as:

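// A list of tuples is used because the same last name can appear more than once.
// Assumes Scores is a collection of int, as the "score > 90" comparison implies.
var nameScore = new List<(string Last, int Score)>();
foreach (Student curStudent in students)
{
   foreach (int curScore in curStudent.Scores)
   {
      if (curScore > 90)
      {
         nameScore.Add((curStudent.LastName, curScore));
      }
   }
}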

Solution 2:

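This gets translated into a SelectMany() call. When the second sequence does not depend on the first, it is essentially a cross join; when it does, as with c.Employees, it flattens each company's employees into a single sequence.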

Jon Skeet talks about it on his blog, as part of the Edulinq series. (Scroll down to Secondary "from" clauses.)
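
To make the SelectMany() translation concrete, here is a minimal sketch. The sample data (company names, employee names, and the anonymous-type shape) is invented for illustration; only the query itself comes from the question.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: each company carries its own list of employee names.
var companies = new[]
{
    new { Name = "Acme",   Employees = new[] { "Ann", "Bob" } },
    new { Name = "Globex", Employees = new[] { "Cid" } },
};

// Query syntax with two from clauses.
var viaQuery = from c in companies
               from e in c.Employees
               select e;

// Method syntax: the two-argument form mirrors what the query translates to.
// The first lambda picks the inner sequence, the second builds the result.
var viaMethod = companies.SelectMany(c => c.Employees, (c, e) => e);

// Since only e is selected, the one-argument overload gives the same sequence.
var viaShortForm = companies.SelectMany(c => c.Employees);

Console.WriteLine(string.Join(", ", viaQuery));      // Ann, Bob, Cid
Console.WriteLine(string.Join(", ", viaMethod));     // Ann, Bob, Cid
Console.WriteLine(string.Join(", ", viaShortForm));  // Ann, Bob, Cid
```

All three queries are lazy, so nothing is enumerated until string.Join walks the results.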

Solution 3:

The code that you listed:

from c in company
from e in c.Employees
select e;

... will produce a list of every employee for every company in the company variable. If an employee works for two companies, they will be included in the list twice.

The only "join" that might occur here is when you say c.Employees. In an SQL-backed provider, this would translate to an inner join from the Company table to the Employee table.

However, the double-from construct is often used to perform "joins" manually, like so:

from c in companies
from e in employees
where c.CompanyId == e.CompanyId
select e;

This would have a similar effect to the code you posted, with potential subtle differences depending on what the employees variable contains. It would also be equivalent to the following join:

from c in companies
join e in employees
   on c.CompanyId equals e.CompanyId
select e;
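
As a quick check that the two forms above return the same rows, here is a small LINQ to Objects sketch. The sample data is made up; only the CompanyId property name is taken from the queries, and the Name properties are added for readable output.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; CompanyId matches the property used in the queries above.
var companies = new[]
{
    new { CompanyId = 1, Name = "Acme" },
    new { CompanyId = 2, Name = "Globex" },
};
var employees = new[]
{
    new { Name = "Ann", CompanyId = 1 },
    new { Name = "Bob", CompanyId = 2 },
    new { Name = "Cid", CompanyId = 1 },
    new { Name = "Dee", CompanyId = 3 },   // no matching company, so excluded by both queries
};

// Manual join: pair every company with every employee, then filter the pairs.
var viaWhere = (from c in companies
                from e in employees
                where c.CompanyId == e.CompanyId
                select e.Name).ToList();

// Explicit join: matches on the key instead of testing every pair.
var viaJoin = (from c in companies
               join e in employees
                  on c.CompanyId equals e.CompanyId
               select e.Name).ToList();

Console.WriteLine(string.Join(", ", viaWhere));  // Ann, Cid, Bob
Console.WriteLine(string.Join(", ", viaJoin));   // Ann, Cid, Bob
```

In LINQ to Objects the results match, but the work done differs: the from/where version evaluates the condition for every company/employee pair, while Enumerable.Join first builds a lookup of the inner sequence by key. A SQL-backed provider may well generate similar SQL for both, so the difference matters mostly for in-memory collections.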

If you wanted a Cartesian product, however, you could just remove the where clause. (To make it worth anything, you'd probably want to change the select slightly, too, though.)

from c in companies
from e in employees
select new {c, e};

This last query would give you every possible combination of company and employee.
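
A minimal sketch of that Cartesian product, using invented string data in place of real company and employee objects:

```csharp
using System;
using System.Linq;

// Hypothetical data: plain strings stand in for company and employee objects.
var companies = new[] { "Acme", "Globex" };
var employees = new[] { "Ann", "Bob", "Cid" };

// No where clause, so every company is paired with every employee.
var pairs = (from c in companies
             from e in employees
             select new { c, e }).ToList();

Console.WriteLine(pairs.Count);  // 6 (2 companies x 3 employees)
foreach (var p in pairs)
{
    Console.WriteLine($"{p.c} / {p.e}");
}
```

The result size is the product of the two input sizes, which is why this form is rarely what you want on large collections.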