Why and when a LEFT JOIN with condition in WHERE clause is not equivalent to the same LEFT JOIN in ON? [duplicate]

Solution 1:

The on clause is used when the join is looking for matching rows. The where clause is used to filter rows after all the joining is done.

An example with Disney toons voting for president:

declare @candidates table (name varchar(50));
insert @candidates values 
    ('Obama'), 
    ('Romney');
declare @votes table (voter varchar(50), voted_for varchar(50));
insert @votes values 
    ('Mickey Mouse', 'Romney'),
    ('Donald Duck', 'Obama');

select  *
from    @candidates c
left join    
        @votes v
on      c.name = v.voted_for
        and v.voter = 'Donald Duck'

This still returns Romney even though Donald didn't vote for him. If you move the condition from the on to the where clause:

select  *
from    @candidates c
left join    
        @votes v
on      c.name = v.voted_for
where   v.voter = 'Donald Duck'

Romney will no longer be in the result set.

Solution 2:

Both are literally different.

The first query does the filtering of table t2 before the joining of tables take place. So the results will then be join on table t1 resulting all the records of t1 will be shown on the list.

The second one filters from the total result after the joining the tables is done.


Here's an example

Table1

ID   Name
1    Stack
2    Over 
3    Flow

Table2

T1_ID   Score
1       10
2       20
3       30

In your first query, it looks like this,

SELECT  a.*, b.Score
FROM    Table1 a
        LEFT JOIN Table2 b
           ON a.ID = b.T1_ID AND
              b.Score >= 20

What it does is before joining the tables, the records of table2 are filtered first by the score. So the only records that will be joined on table1 are

T1_ID   Score
2       20
3       30
  • SQLFiddle Demo

because the Score of T1_ID is only 10. The result of the query is

ID   Name    Score
1    Stack   NULL
2    Over    20
3    Flow    30
  • SQLFiddle Demo

While the second query is different.

SELECT  a.*, b.Score
FROM    Table1 a
        LEFT JOIN Table2 b
           ON a.ID = b.T1_ID
WHERE   b.Score >= 20

It joins the records first whether it has a matching record on the other table or not. So the result will be

ID   Name    Score
1    Stack   10
2    Over    20
3    Flow    30
  • SQLFiddle Demo

and the filtering takes place b.Score >= 20. So the final result will be

ID   Name    Score
2    Over    20
3    Flow    30
  • SQLFiddle Demo

Solution 3:

CREATE TABLE Company
(
CompanyId TinyInt Identity Primary Key,
CompanyName Nvarchar(50) NULL
)
GO

INSERT Company VALUES('DELL')
INSERT Company VALUES('HP')
INSERT Company VALUES('IBM')
INSERT Company VALUES('Microsoft')
GO

CREATE TABLE Candidate
(
CandidateId tinyint identity primary key,
FullName nvarchar(50) NULL,
CompanyId tinyint REFERENCES Company(CompanyId)
)
GO

INSERT Candidate VALUES('Ron',1)
INSERT Candidate VALUES('Pete',2)
INSERT Candidate VALUES('Steve',3)
INSERT Candidate VALUES('Steve',NULL)
INSERT Candidate VALUES('Ravi',1)
INSERT Candidate VALUES('Raj',3)
INSERT Candidate VALUES('Kiran',NULL)
GO

SELECT * from Company c
SELECT * from Candidate c

-- A simple left outer Join
SELECT * FROM Company c LEFT OUTER JOIN Candidate c2
ON c.CompanyId = c2.CompanyId

--Left Outer Join ON and AND condition fetches 5 rows wtih NULL value from right side table 
SELECT * FROM Company c LEFT OUTER JOIN Candidate c2
ON c.CompanyId = c2.CompanyId
AND c.CompanyName = 'DELL' 

--Left Outer Join ON and where clause fetches only required rows
SELECT * FROM Company c LEFT OUTER JOIN Candidate c2
ON c.CompanyId = c2.CompanyId
AND c.CompanyName = 'DELL' 
WHERE c.CompanyName='IBM'