2014-11-12

Logical order of SQL Query execution

First of all we should know that logical and physical order of query execution are different but result must be the same.

Every RDBMS can optimize the order in which parts of query are executed for performance reason. However that rearrangement must not influence the final result. The query must return the same records as if it is executed in this logical order:

  1.     FROM
  2.     WHERE
  3.     GROUP BY
  4.     HAVING
  5.     SELECT
  6.     DISTINCT
  7.     ORDER BY
  8.     TOP X / LIMIT X/ select * from ( subquery ) ROWNUM <= X

The programmers should always be aware of this sequence in order not to fall into a trap.

A common mistake is to try using column alias in GROUP BY, WHERE and HAVING clause but actually it is possible to use it only after SELECT clause is executed (ORDER BY and TOP/LIMIT/that long stupid oracle boilerplate).

Another mistake is to try sorting by column  that is not listed in SELECT clause.

The less common but hard to debug problem could be made if OUTER JOIN is used and join condition is moved from ON into WHERE and vice versa. While it is OK with INNER JOIN these two selects will return different results:
select *
from t1
     left join t2 on (t1.id = t2.t1_id and t2.color = 'blue')

select *
from t1
     left join t2 on (t1.id = t2.t1_id)
where t2.color = 'blue'

Which of those two will return less records?

2014-07-08

SQL Fiddle

If you need fast try and error check against some of the most popular relational database management system SQL Fiddle is the wright address. It is also free as in beer :)

It is an online application in which you can select to work some of supported RDBMS.
There are 2 text area fields: first is used to set your schema and data, the second is used to write some query that will run against your schema and data.

Using SQL Fiddle you can share your script similar to Gist for source code sharing.

It is also useful as online SQL code beautifier and formatter.

Here is an example of SQL Code for Oracle DB: http://sqlfiddle.com/#!4/1897b/1

2014-06-11

XMLUnit


XMLUnit for Java provides a set of supporting classes (e.g. Diff,  DetailedDiff, Transform, SimpleXpathEngine, Validator,NodeTest) that allow assertions to be made about:
  • The differences between two pieces of XML
  • The outcome of transforming a piece of XML using XSLT
  • The evaluation of an XPath expression on a piece of XML
  • The validity of a piece of XML
  • Individual nodes in a piece of XML that are exposed by DOM Traversal
XMLUnit for Java can also treat HTML content (even badly-formed HTML) as valid XML to allow these assertions to be made about the content of web pages too.

 

Experience


I've used XMLUnit in 2 situations: assert that 2 xml files are semantically the same regardless of structural differences (eg. different formatting, different namespace prefixes, usage of opening and closing tag vs. empty tag...) and finding real differences between two xml files regardless of prior mentioned differences in structure.

The equality of two xml files can be compared in two ways. We can check if two documents are identical. Two documents are considered to be "identical" if they contain the same elements and attributes in the same order. But we can check if two documents are similar, where those two documents must contain the same elements and attributes regardless of order. In most cases you will checking similarity and that is what is checked when XMLAssert.assertXMLEqual methdo is called.

Howto


There are only few classes that needs to be known. Not so complicated but not so easy when compared to ordinary Assert implementations (JUnit, TestNG or FEST).
XMLUnit class is used as a static configuration class. If there is need for some general configuration change (eg. to ignore whitespaces, to normalize text nodes, to ignore xml comments...) this is a place to do it before calling asserts. Also it is possible to change parsers and engines used by XMLUnit.

DifferenceEngine is comparing Nodes (just Nodes of type: ELEMENT_NODE, CDATA_SECTION_NODE, COMMENT_NODE, DOCUMENT_TYPE_NODE, PROCESSING_INSTRUCTION_NODE and TEXT_NODE) and when difference is found the responsibility gets delegated to DifferenceListener interface which implementation is the final judge if difference is significant. DifferenceEngine is not used directly but in some not so simple use cases difference listener should be set.
There are several implementations of DifferenceListener, most useful are Diff and DetailedDiff.

Diff is used for simple comparison where only the state of comparison is needed (identical, similar or different) and the comparison is halted as soon as the status is known with certainty.

If you need o collect all the differences and have all details DetaildedDiff should be used. It is just a decorator for other DifferenceListeners.

MatchTracker interface gets notified when a match between test and actual node is found. It is useful to implement listener that logs matches so that you get insight of comparison process. This interface has no implementation.

ElementQualifier Interface used by the DifferenceEngine to determine which elements can be compared within a NodeList of child nodes. It controls which two nodes from test and actual XML document should be compared.

There are few implementations of this interface: ElementNameQualifier which is default, compares element nodes whose tag name is same. ElementNameAndAttributeQualifier uses tag name and attribute, ElementNameAndTextQualifier uses tag name and text content, RecursiveElementNameAndTextQualifier compares all Element and Text nodes in two pieces of XML. Allows elements of complex, deeply nested types that are returned in different orders but have the same content to be recognized as comparable. 

Every single compare is done using ElementNameQualifier. I had a special case where I needed recursive ElementQualifier but which compares single nodes not just by tag name but also by attributes. So I have tweaked RecursiveElementNameAndTextQualifier to be just a recursive decorator around existing ElementQualifier implementation.

Example


I've crated example at https://github.com/aurelije/xmlunit I had an use case where I needed to refactor one complex wsdl file. I wanted to sort all operations, types and messages, so I can easily compare that sorted Old API wsdl file with New API wsdl file and see what is changed and what is added. So I needed some proof that the original and sorted wsdl are semantically the same. And then I could also use this test to list all differences between Old API and New API wsdl.

References