TSQL divide by zero encountered despite no columns containing 0

SQL is a declarative language: you write a query that logically describes the result you want, but it is up to the optimizer to produce a physical plan. This physical plan may not bear much relation to the written form of the query, because the optimizer does not simply reorder 'steps' derived from the textual form of the query; it can apply over 300 different transformations to find an efficient execution strategy.

The optimizer has considerable freedom to reorder expressions, joins, and other logical query constructions. This means that you cannot, in general, rely on any written query form to force one thing to be evaluated before another. In particular, the rewrite given by Lieven does not force the WHERE clause predicate to be evaluated before the expression. The optimizer may, depending on cost estimations, decide to evaluate the expression wherever it seems most efficient to do so. This may even mean, in some cases, that the expression is evaluated more than once.

The original question considered this possibility, but rejected it as 'not making much sense'. Nevertheless, this is the way the product works - if SQL Server estimates that a join will reduce the set size enough to make it cheaper to compute the expression on the result of the join, it is free to do so.

The general rule is to never depend on a particular evaluation order to avoid things like overflow or divide-by-zero errors. In this example, one would employ a CASE expression to check for a zero divisor before dividing - an example of defensive programming.
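As a minimal sketch of that defensive check, the query below reuses the tblComputer, TotalSize, and FreeSpace names from the question. The column aliases PercentFree and PercentFreeAlt are hypothetical, the NULLIF variant is a common shorthand rather than something the answer itself proposes, and the joins and WHERE clause elided in the question are left out here.

```sql
-- Sketch only: table and column names come from the question;
-- the aliases PercentFree and PercentFreeAlt are made up for illustration.
SELECT
    TotalSize,
    FreeSpace,
    -- CASE yields NULL for a zero divisor instead of attempting the division
    CASE
        WHEN TotalSize = 0 THEN NULL
        ELSE FreeSpace / TotalSize * 100
    END AS PercentFree,
    -- Shorthand: NULLIF turns a zero divisor into NULL,
    -- and dividing by NULL returns NULL rather than raising an error
    FreeSpace / NULLIF(TotalSize, 0) * 100 AS PercentFreeAlt
FROM
    tblComputer;
```

Either form keeps the guard inside the expression itself, so it holds no matter where the optimizer chooses to evaluate the expression relative to the joins and filters.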

The optimizer's freedom to reorder things is a fundamental tenet of its design. You can find cases where it leads to counter-intuitive behaviours, but overall the benefits far outweigh the disadvantages.

Paul

The basic steps that SQL Server uses to process a single SELECT statement include the following:

1. The parser scans the SELECT statement and breaks it into logical units such as keywords, expressions, operators, and identifiers.

2. A query tree, sometimes referred to as a sequence tree, is built describing the logical steps needed to transform the source data into the format required by the result set.

3. The query optimizer analyzes different ways the source tables can be accessed. It then selects the series of steps that returns the results fastest while using fewer resources. The query tree is updated to record this exact series of steps. The final, optimized version of the query tree is called the execution plan.

4. The relational engine starts executing the execution plan. As the steps that require data from the base tables are processed, the relational engine requests that the storage engine pass up data from the rowsets requested from the relational engine.

5. The relational engine processes the data returned from the storage engine into the format defined for the result set and returns the result set to the client.
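One way to see what the optimizer actually chose is to ask for the estimated plan instead of running the query. The sketch below assumes a client such as SSMS or sqlcmd, where GO separates batches (GO is a client directive, not T-SQL), and it uses a simplified form of the question's query without the elided joins and predicate.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch
SET SHOWPLAN_TEXT ON;
GO

-- While SHOWPLAN_TEXT is ON, this query is compiled but not executed;
-- the server returns the estimated plan as text instead of result rows
SELECT
    TotalSize,
    FreeSpace,
    (FreeSpace / TotalSize * 100)
FROM
    tblComputer;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

In the plan text, the position of the operator that computes the division relative to the filter and join operators indicates whether the expression is evaluated before or after rows are eliminated.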

My interpretation of things is that there is no guarantee that your WHERE clause gets evaluated before the computed column is evaluated for all rows.

You could verify that assumption by changing your query as shown below, attempting to force the WHERE clause to be evaluated before the computation.

SELECT
    TotalSize,
    FreeSpace,
    (FreeSpace / TotalSize * 100)
FROM (
  SELECT
      TotalSize,
      FreeSpace
  FROM
      tblComputer
  ...[ couple of joins ]...
  WHERE
      SomeCondition = SomeValue
  ) t

What rows are returned when you run:

SELECT
   TotalSize
FROM
   tblComputer
   ...[ couple of joins ]...
WHERE
   SomeCondition = SomeValue
   and ((TotalSize * 100) = 0)

This might give you a clue as to how SQL Server is evaluating (TotalSize * 100) to be zero.

Another idea: is there anything in your WHERE clause that might also be the problem?
You're assuming it's the TotalSize, but it might be somewhere else.
