The Structured Query Language (SQL) is everywhere. All major applications, all the systems that deal with the ‘Relational Database Management Systems’ (RDMS) beg for deep knowledge on SQL. So, I set out to learn deep concepts of SQL and document my learning.
SQL execution sequence
SQL statements includes a variety of clauses to select, join, trim and slice data as per our needs. Each of these
clauses have a execution sequence, and order. Let’s take a look at it.
SELECT DISTINCT <TOP_specification> <select_list> FROM <left_table> <join_type> JOIN <right_table> ON <join_condition> WHERE <where_condition> GROUP BY <group_by_list> HAVING <having_condition> ORDER BY <order_by_list>
In SQL statements, the primary
clause that is processed in the
FROM clause. This is the way of statement telling SQL language to look for the table first and then do whatever that is asked to do later. So, FROM clause always executes first.
Here’s the sequence of the 10 clauses.
1. FROM clause 2. ON clause 3. OUTER clause 4. WHERE clause 5. GROUP BY clause 6. HAVING clause 7. SELECT clause 8. DISTINCT clause 9. ORDER BY clause 10. TOP clause
Understanding the execution order helps optimizing the query performance easy.
Imagine you have two tables that have equivalent columns, and you want to perform
set operations on the table outputs.
In SQL the
UNION clause combines the results of two SQL queries into a single table of all matching rows. The two queries must result in the same number of columns and compatible data types in order to unite. Any duplicate records are automatically removed unless
UNION ALL is used.
SELECT * FROM table1 UNION SELECT * FROM table2;
To include the duplicate records,
UNION ALL should be used. Note that the order of the records is not preserved. To do so, use
ORDER BY clause
SELECT * FROM table1 UNION ALL SELECT * FROM table2;
INTERSECT operation takes the output of two queries and returns the records that appear in both the result sets. Consider it as the overlapping records. Just like
INTERSECT operator removes the duplicates. To include duplicates, use
INTERSET ALL operator.
SELECT * FROM Orders WHERE Quantity BETWEEN 1 AND 100 INTERSECT SELECT * FROM Orders WHERE Quantity BETWEEN 50 AND 200;
INTERSECT ALL operation includes the duplicates.
EXCEPT operator takes the distinct rows of one query and returns the rows that do not appear in a second result set. It’s like performing A-B, removing B from A. The
EXCEPT ALL operator does not remove duplicates, but if a row appears X times in the first query and Y times in the second, it will appear
max(X - Y, 0) times in the result set.
SELECT * FROM Orders WHERE Quantity BETWEEN 1 AND 100 EXCEPT SELECT * FROM Orders WHERE Quantity BETWEEN 50 AND 75;
Limiting result set record count
If you are expecting the SQL query to return a large result set and want to limit the records, consider adding below snippets to your code.
NOTE - Remove the snippets during production deployment.
ROW_NUMBER() OVER may be used for a simple table on the returned rows, e.g. to return no more than ten rows:
SELECT * FROM ( SELECT ROW_NUMBER() OVER (ORDER BY sort_key ASC) AS row_number, columns FROM tablename ) AS foo WHERE row_number <= 10;
NOTE - ROW_NUMBER can be non-deterministic: if sort_key is not unique, each time you run the query it is possible to get different row numbers assigned to any rows where sort_key is the same. When sort_key is unique, each row will always get a unique row number.
RANK() OVER window function acts like
ROW_NUMBER, but may return more or less than n rows in case of tie conditions, e.g. to return the top-10 youngest persons:
SELECT * FROM ( SELECT RANK() OVER (ORDER BY age ASC) AS ranking, person_id, person_name, age FROM person ) AS foo WHERE ranking <= 10;
The above code could return more than ten rows, e.g. if there are two people of the same age, it could return eleven rows.
FETCH FIRST clause can be used to limit the output result set. NOTE - This is supported by limited no. of databases.
SELECT * FROM T FETCH FIRST 10 ROWS ONLY;
LIMIT clause can also server a simialr purpose as
FETCH FIRST, but again, it is supported by limited databases.
SELECT * FROM TABLE1 LIMIT 10;