SQL JOIN - WHERE clause vs. ON clause

After reading it, this is not a duplicate of Explicit vs Implicit SQL Joins. The answer may be related (or even the same) but the question is different.


What is the difference and what should go in each?

If I understand the theory correctly, the query optimizer should be able to use both interchangeably.


They are not the same thing.

Consider these queries:

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345

and

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID 
    AND Orders.ID = 12345

The first will return an order and its lines, if any, for order number 12345. The second will return all orders, but only order 12345 will have any lines associated with it.

With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.


  • Does not matter for inner joins

  • Matters for outer joins

    a. WHERE clause: After joining. Records will be filtered after join has taken place.

    b. ON clause - Before joining. Records (from right table) will be filtered before joining. This may end up as null in the result (since OUTER join).

Example: Consider the below tables:

  1. documents:

    id name
    1 Document1
    2 Document2
    3 Document3
    4 Document4
    5 Document5
  2. downloads:

    id document_id username
    1 1 sandeep
    2 1 simi
    3 2 sandeep
    4 2 reya
    5 3 simi

a) Inside WHERE clause:

   SELECT documents.name, downloads.id
     FROM documents
     LEFT OUTER JOIN downloads
       ON documents.id = downloads.document_id
     WHERE username = 'sandeep'

For above query the intermediate join table will look like this.

id(from documents) name id (from downloads) document_id username
1 Document1 1 1 sandeep
1 Document1 2 1 simi
2 Document2 3 2 sandeep
2 Document2 4 2 reya
3 Document3 5 3 simi
4 Document4 NULL NULL NULL
5 Document5 NULL NULL NULL

After applying the WHERE clause and selecting the listed attributes, the result will be:

name id
Document1 1
Document2 3

b) Inside JOIN clause

   SELECT documents.name, downloads.id
   FROM documents
     LEFT OUTER JOIN downloads
       ON documents.id = downloads.document_id
         AND username = 'sandeep'

For above query the intermediate join table will look like this.

id(from documents) name id (from downloads) document_id username
1 Document1 1 1 sandeep
2 Document2 3 2 sandeep
3 Document3 NULL NULL NULL
4 Document4 NULL NULL NULL
5 Document5 NULL NULL NULL

Notice how the rows in documents that did not match both the conditions are populated with NULL values.

After Selecting the listed attributes, the result will be:

name id
Document1 1
Document2 3
Document3 NULL
Document4 NULL
Document5 NULL

On INNER JOINs they are interchangeable, and the optimizer will rearrange them at will.

On OUTER JOINs, they are not necessarily interchangeable, depending on which side of the join they depend on.

I put them in either place depending on the readability.


The way I do it is:

  • Always put the join conditions in the ON clause if you are doing an INNER JOIN. So, do not add any WHERE conditions to the ON clause, put them in the WHERE clause.

  • If you are doing a LEFT JOIN, add any WHERE conditions to the ON clause for the table in the right side of the join. This is a must, because adding a WHERE clause that references the right side of the join will convert the join to an INNER JOIN.

    The exception is when you are looking for the records that are not in a particular table. You would add the reference to a unique identifier (that is not ever NULL) in the RIGHT JOIN table to the WHERE clause this way: WHERE t2.idfield IS NULL. So, the only time you should reference a table on the right side of the join is to find those records which are not in the table.