NULL values inside NOT IN clause

This issue came up when I got different records counts for what I thought were identical queries one using a not in where constraint and the other a left join. The table in the not in constraint had one null value (bad data) which caused that query to return a count of 0 records. I sort of understand why but I could use some help fully grasping the concept.

To state it simply, why does query A return a result but B doesn't?

A: select 'true' where 3 in (1, 2, 3, null)
B: select 'true' where 3 not in (1, 2, null)

This was on SQL Server 2005. I also found that calling set ansi_nulls off causes B to return a result.


Solution 1:

Query A is the same as:

select 'true' where 3 = 1 or 3 = 2 or 3 = 3 or 3 = null

Since 3 = 3 is true, you get a result.

Query B is the same as:

select 'true' where 3 <> 1 and 3 <> 2 and 3 <> null

When ansi_nulls is on, 3 <> null is UNKNOWN, so the predicate evaluates to UNKNOWN, and you don't get any rows.

When ansi_nulls is off, 3 <> null is true, so the predicate evaluates to true, and you get a row.

Solution 2:

Whenever you use NULL you are really dealing with a Three-Valued logic.

Your first query returns results as the WHERE clause evaluates to:

    3 = 1 or 3 = 2 or 3 = 3 or 3 = null
which is:
    FALSE or FALSE or TRUE or UNKNOWN
which evaluates to 
    TRUE

The second one:

    3 <> 1 and 3 <> 2 and 3 <> null
which evaluates to:
    TRUE and TRUE and UNKNOWN
which evaluates to:
    UNKNOWN

The UNKNOWN is not the same as FALSE you can easily test it by calling:

select 'true' where 3 <> null
select 'true' where not (3 <> null)

Both queries will give you no results

If the UNKNOWN was the same as FALSE then assuming that the first query would give you FALSE the second would have to evaluate to TRUE as it would have been the same as NOT(FALSE).
That is not the case.

There is a very good article on this subject on SqlServerCentral.

The whole issue of NULLs and Three-Valued Logic can be a bit confusing at first but it is essential to understand in order to write correct queries in TSQL

Another article I would recommend is SQL Aggregate Functions and NULL.