A subquery, an SQL statement specifying a search condition, contains either a subselect or a fullselect and is used inside another SQL statement. A subquery can include search conditions of its own. Each search condition can also contain subqueries. Any SQL statement can contain a hierarchy of subqueries. The SQL query block that contains subqueries is at a higher level than the subqueries it contains. This is important for how columns are referenced between levels.
A subquery can contain search conditions that reference not only columns of the tables identified by its FROM clause but also columns of tables identified at any higher level. The reverse, however, is not true. A query at a higher level cannot reference columns from a subquery, although outer query columns can be compared to the resultant columns from a subquery. Within a subquery, a reference to a column of a table from a higher level is called a correlated reference, or correlation predicate.
When specified in a search condition, many operands can be used. They are generally categorized as single-result, IN list, or existence subqueries.
Quantified predicates are another form and compare one or more values to a collection of values by preceding the comparison operator (=, >, <, and so on.) with ANY, SOME, or ALL. To avoid confusion, quantified predicates are ignored in this text.
The following example demonstrates a single-result subquery to return the test center with the highest number of seats:
SELECT TCID,NAME,NOSEATS FROM TEST_CENTER WHERE NOSEATS = (SELECT MAX(NOSEATS) FROM TEST_CENTER)
The subquery can return only one result: either a null, if no qualifying rows are in the subquery table, or a value representing the maximum value of the column noseats in the TEST_CENTER table. This value is compared to the outer table rows, and only those rows that equal the maximum value for noseats are returned.
This type of subquery is known as a noncorrelated subquery; the subquery and the outer query have no correlation. The subquery is independent of the outer query and returns a single result to the outer query merely for comparison.
To demonstrate an IN list subquery, consider the difficulty of producing a report on the number of DB2 Certification exams if the name and the number of the exams are unknown. The following example produces a report that includes the names and phone numbers of all the test centers that have candidates registered for the DB2 Certification exams. The search string DB2 is used to find the numbers of the DB2 Certification program exams:
SELECT DISTINCT NAME,PHONE FROM TEST_CENTER TC, TEST_TAKEN TT WHERE TC.TCID=TT.TCID AND NUMBER IN (SELECT NUMBER FROM TEST WHERE NAME LIKE 'DB2%')
In this example, the subquery appears as part of the IN clause. In the subquery, we are retrieving all the numbers for those tests that have DB2 in their names. Note that you will never see the subquery output. The subquery merely creates a list of values that will be used later by the outer SELECT statement.
Both previous examples have demonstrated a noncorrelated subquery. The most common forms of noncorrelated subqueries are the single-result and the IN list subqueries.
The third alternative, an EXISTS subquery, is generally not coded for a noncorrelated subquery. Because it returns a trUE or FALSE, an EXISTS subquery will be either true for every row of the outer table or false for every row of the outer table.
In contrast, a query in which the subquery references values of the immediate outer SELECT or any subselect at a higher level is a correlated subquery. The most common forms are the single-result and the EXISTS correlated subquery. Whereas the noncorrelated subquery is evaluated once and a result is returned to the outer query, the correlated subquery is potentially evaluated once for every outer row.
The IN subquery is generally not coded for a correlated subquery, due to the potential for the IN list to be built numerous times.
As with the noncorrelated example, a single-result correlated subquery will return a single result for every execution. The difference is that a noncorrelated subquery is executed once rather than many times.
Consider a query to return the highest score for each test and the associated details. The following correlated subquery returns the desired result:
SELECT C.FNAME, TT.NUMBER, TT.DATE_TAKEN, TT.SCORE FROM TEST_TAKEN TT, CANDIDATE C WHERE C.CID = TT.CID AND TT.SCORE = (SELECT MAX(TT2.SCORE) FROM TEST_TAKEN TT2 WHERE TT2.NUMBER = TT.NUMBER) FNAME NUMBER DATE_TAKEN SCORE ------ -------- ---------- ----- Dan 501 02/02/2004 73 Susan 500 03/01/2004 82 Susan 502 03/02/2004 92 3 record(s) selected.
The test number for each outer row is passed to the subquery, and the maximum score for that test is returned to the outer query for comparison. If the maximum score equals the score of the current row, that row is returned. If not, the row is discarded. It is possible for the outer query to return more than one row for each test if more than one person achieves the maximum score.
The next example demonstrates an EXISTS subquery in which we rewrite the previous noncorrelated IN list SQL example as a correlated subquery. This query produces a report that includes the names and phone numbers of all the test centers that have candidates registered for the DB2 Certification exams. To match the problem description to the query, the problem description can be rewritten as follows: Produce a report that includes the names and phone numbers of all test centers at which a candidate is registered for the DB2 Certification exams:
SELECT NAME,PHONE FROM TEST_CENTER TC WHERE EXISTS (SELECT * FROM TEST T, TEST_TAKEN TT WHERE T.NAME LIKE 'DB2%' AND T.NUMBER = TT.NUMBER AND TT.TCID=TC.TCID)
As with standard WHERE clause predicates, the reverse condition can be achieved by prefixing the condition with the NOT keyword, such as NOT EXISTS or NOT IN. Remember, however, that a null will not qualify against either an IN or a NOT IN condition.
Note that the WHERE clause in the subquery in this example references a table that is listed in the outer FROM clause.
The EXISTS subquery will return a trUE or a FALSE to the outer query. If trUE, the outer row is considered to be a match, and the row is returned to the application. The correlated EXISTS subquery will terminate after the first trUE condition is found or when a FALSE can be determined, as the requirement is to return only a trUE or FALSE condition, not to list how many occurrences match.
DB2 prunes the SELECT list for EXISTS subqueries, so it is not important for access path or performance what columns are listed in the subquery select list of an EXISTS subquery. Therefore, whether you specify SELECT * or SELECT 1, DB2 will still return a trUE or FALSE and will not retrieve all columns as an outer query would if SELECT * were specified.