Why is EF generating SQL queries with unnecessary null-checks?

I came across an issue with EF creating terrible queries when searching on a string field. It produced a query that wraps the comparison in a redundant null check, which forces the whole index to be scanned.

Consider the following queries.

  1. Query 1

    var p1 = "x";
    var r1 = ctx.Set<E>().FirstOrDefault(
                            subject =>
                                p1.Equals(subject.StringField));
    
  2. Query 2

    const string p2 = "x";
    var r2 = ctx.Set<E>().FirstOrDefault(
                            subject =>
                                p2.Equals(subject.StringField));
    

Query 1 produces

WHERE (('x' = [Extent2].[StringField]) OR (('x' IS NULL) AND ([Extent2].[StringField] IS NULL))) 

and executes in 4 seconds

Query 2 produces

WHERE (N'x' = [Extent2].[StringField]) 

and executes in 2 milliseconds

Does anyone know of any workarounds? (No, the parameter can't be a const: it comes from user input, although it can never be null.)

N.B. When profiled, both queries are prepared with sp_executesql by EF. Of course, if they were just executed directly, the query optimiser would eliminate the OR 'x' IS NULL check.



Solution 1:

Set UseDatabaseNullSemantics = true;

  • When UseDatabaseNullSemantics == true, (operand1 == operand2) will be translated as:

    WHERE operand1 = operand2
    
  • When UseDatabaseNullSemantics == false, (operand1 == operand2) will be translated as:

    WHERE
        (
            (operand1 = operand2)
            AND
            (NOT (operand1 IS NULL OR operand2 IS NULL))
        )
        OR
        (
            (operand1 IS NULL)
            AND
            (operand2 IS NULL)
        )
    

This is documented by Microsoft:

Gets or sets a value indicating whether database null semantics are exhibited when comparing two operands, both of which are potentially nullable. The default value is false.
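
To see why EF adds the extra check by default, and what you give up by turning it off, here is a minimal sketch that emulates both semantics in plain C#. The class and method names are hypothetical and are not part of EF; SQL's UNKNOWN result is modelled as a null `bool?`.

```csharp
using System;

public static class NullSemanticsDemo
{
    // C# semantics: two null strings compare equal.
    public static bool CSharpEquals(string a, string b) => a == b;

    // Emulated SQL semantics: a comparison involving NULL yields UNKNOWN
    // (modelled here as null). A WHERE clause keeps a row only when the
    // predicate is true, so UNKNOWN filters the row out.
    public static bool? SqlEquals(string a, string b)
        => (a == null || b == null) ? (bool?)null : a == b;

    // The expanded predicate EF emits when UseDatabaseNullSemantics == false:
    // the plain comparison, OR both operands are NULL.
    public static bool ExpandedEquals(string a, string b)
        => SqlEquals(a, b) == true || (a == null && b == null);

    public static void Main()
    {
        Console.WriteLine(CSharpEquals(null, null));       // True
        Console.WriteLine(SqlEquals(null, null) == true);  // False
        Console.WriteLine(ExpandedEquals(null, null));     // True
    }
}
```

In other words, with UseDatabaseNullSemantics = true a comparison in which both sides are null no longer matches any rows. That is harmless here because the parameter is never null, but it is worth keeping in mind for other queries on the same context.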

You can set it in your DbContext subclass constructor, like so:

public class MyContext : DbContext
{
    public MyContext()
    {
        this.Configuration.UseDatabaseNullSemantics = true;
    }
}

Alternatively, you can set it on your DbContext instance from the outside, as in the example below. In my view (see @GertArnold's comment) this approach is better, because it does not change the context's default behaviour or configuration:

myDbContext.Configuration.UseDatabaseNullSemantics = true;

Solution 2:

You can fix this by adding [Required] to the StringField property. In the example below, Bar is marked [Required] and Foo is not:

public class Test
{
    [Key]
    public int Id { get; set; }
    [Required]
    public string Bar { get; set; }
    public string Foo { get; set; }

}


 string p1 = "x";
 var query1 = new Context().Tests.Where(F => p1.Equals(F.Bar));

 var query2 = new Context().Tests.Where(F => p1.Equals(F.Foo));

This is query1 (on the [Required] column Bar):

{SELECT [Extent1].[Id] AS [Id], [Extent1].[Bar] AS [Bar], [Extent1].[Foo] AS [Foo] FROM [dbo].[Tests] AS [Extent1] WHERE @p__linq__0 = [Extent1].[Bar]}

and this is query2 (on the nullable column Foo):

{SELECT [Extent1].[Id] AS [Id], [Extent1].[Bar] AS [Bar], [Extent1].[Foo] AS [Foo] FROM [dbo].[Tests] AS [Extent1] WHERE (@p__linq__0 = [Extent1].[Foo]) OR ((@p__linq__0 IS NULL) AND ([Extent1].[Foo] IS NULL))}

Solution 3:

A colleague of mine has just found a really nice solution. Since I had already discovered that using constants produces the correct SQL, we wondered whether we could swap the variables in the expression for constants, and as it turns out you can. I believe this method is less invasive than changing the null settings on the DB context.

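public class Foo_test : EntityContextIntegrationSpec
{
    // Value to match; in real code this would be the user-supplied string.
    private static string _foo = null;

    // Where(...) returns a query over E, so the result is held as IQueryable<E>.
    private static IQueryable<E> _result;

    private Because _of = () => _result = EntityContext.Set<E>().Where(StringMatch<E>(x => x.StringField));

    // Builds x => x.StringField == <constant>, so EF sees a constant rather than a captured variable.
    private static Expression<Func<TSource, bool>> StringMatch<TSource>(Expression<Func<TSource, string>> prop)
    {
        // Passing typeof(string) keeps the constant typed as string even when the value is null.
        var body = Expression.Equal(prop.Body, Expression.Constant(_foo, typeof(string)));
        return Expression.Lambda<Func<TSource, bool>>(body, prop.Parameters[0]);
    }

    [Test] public void Test() => _result.ShouldNotBeNull();
}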