2014-11-12

Logical order of SQL Query execution

First of all we should know that logical and physical order of query execution are different but result must be the same.

Every RDBMS can optimize the order in which parts of query are executed for performance reason. However that rearrangement must not influence the final result. The query must return the same records as if it is executed in this logical order:

  1.     FROM
  2.     WHERE
  3.     GROUP BY
  4.     HAVING
  5.     SELECT
  6.     DISTINCT
  7.     ORDER BY
  8.     TOP X / LIMIT X/ select * from ( subquery ) ROWNUM <= X

The programmers should always be aware of this sequence in order not to fall into a trap.

A common mistake is to try using column alias in GROUP BY, WHERE and HAVING clause but actually it is possible to use it only after SELECT clause is executed (ORDER BY and TOP/LIMIT/that long stupid oracle boilerplate).

Another mistake is to try sorting by column  that is not listed in SELECT clause.

The less common but hard to debug problem could be made if OUTER JOIN is used and join condition is moved from ON into WHERE and vice versa. While it is OK with INNER JOIN these two selects will return different results:
select *
from t1
     left join t2 on (t1.id = t2.t1_id and t2.color = 'blue')

select *
from t1
     left join t2 on (t1.id = t2.t1_id)
where t2.color = 'blue'

Which of those two will return less records?

No comments: