Why and when a LEFT JOIN with condition in WHERE clause is not equivalent to the same LEFT JOIN in ON? [duplicate]
Solution 1:
The on
clause is used when the join
is looking for matching rows. The where
clause is used to filter rows after all the joining is done.
An example with Disney toons voting for president:
declare @candidates table (name varchar(50));
insert @candidates values
('Obama'),
('Romney');
declare @votes table (voter varchar(50), voted_for varchar(50));
insert @votes values
('Mickey Mouse', 'Romney'),
('Donald Duck', 'Obama');
select *
from @candidates c
left join
@votes v
on c.name = v.voted_for
and v.voter = 'Donald Duck'
This still returns Romney
even though Donald
didn't vote for him. If you move the condition from the on
to the where
clause:
select *
from @candidates c
left join
@votes v
on c.name = v.voted_for
where v.voter = 'Donald Duck'
Romney
will no longer be in the result set.
Solution 2:
Both are literally different.
The first query does the filtering of table t2
before the joining of tables take place. So the results will then be join on table t1
resulting all the records of t1
will be shown on the list.
The second one filters from the total result after the joining the tables is done.
Here's an example
Table1
ID Name
1 Stack
2 Over
3 Flow
Table2
T1_ID Score
1 10
2 20
3 30
In your first query, it looks like this,
SELECT a.*, b.Score
FROM Table1 a
LEFT JOIN Table2 b
ON a.ID = b.T1_ID AND
b.Score >= 20
What it does is before joining the tables, the records of table2
are filtered first by the score. So the only records that will be joined on table1 are
T1_ID Score
2 20
3 30
- SQLFiddle Demo
because the Score
of T1_ID
is only 10. The result of the query is
ID Name Score
1 Stack NULL
2 Over 20
3 Flow 30
- SQLFiddle Demo
While the second query is different.
SELECT a.*, b.Score
FROM Table1 a
LEFT JOIN Table2 b
ON a.ID = b.T1_ID
WHERE b.Score >= 20
It joins the records first whether it has a matching record on the other table or not. So the result will be
ID Name Score
1 Stack 10
2 Over 20
3 Flow 30
- SQLFiddle Demo
and the filtering takes place b.Score >= 20
. So the final result will be
ID Name Score
2 Over 20
3 Flow 30
- SQLFiddle Demo
Solution 3:
CREATE TABLE Company
(
CompanyId TinyInt Identity Primary Key,
CompanyName Nvarchar(50) NULL
)
GO
INSERT Company VALUES('DELL')
INSERT Company VALUES('HP')
INSERT Company VALUES('IBM')
INSERT Company VALUES('Microsoft')
GO
CREATE TABLE Candidate
(
CandidateId tinyint identity primary key,
FullName nvarchar(50) NULL,
CompanyId tinyint REFERENCES Company(CompanyId)
)
GO
INSERT Candidate VALUES('Ron',1)
INSERT Candidate VALUES('Pete',2)
INSERT Candidate VALUES('Steve',3)
INSERT Candidate VALUES('Steve',NULL)
INSERT Candidate VALUES('Ravi',1)
INSERT Candidate VALUES('Raj',3)
INSERT Candidate VALUES('Kiran',NULL)
GO
SELECT * from Company c
SELECT * from Candidate c
-- A simple left outer Join
SELECT * FROM Company c LEFT OUTER JOIN Candidate c2
ON c.CompanyId = c2.CompanyId
--Left Outer Join ON and AND condition fetches 5 rows wtih NULL value from right side table
SELECT * FROM Company c LEFT OUTER JOIN Candidate c2
ON c.CompanyId = c2.CompanyId
AND c.CompanyName = 'DELL'
--Left Outer Join ON and where clause fetches only required rows
SELECT * FROM Company c LEFT OUTER JOIN Candidate c2
ON c.CompanyId = c2.CompanyId
AND c.CompanyName = 'DELL'
WHERE c.CompanyName='IBM'