6.2. Small Result Set, Indirect CriteriaA situation that is superficially similar to the previous one is when you have a small result set that is based on criteria applied to tables other than the data source tables. We want data from one table, and yet our conditions apply to other, related tables from which we don't want any data to be returned. A typical example is the question of "which customers have ordered a particular item" that we amply discussed earlier in Chapter 4. As you saw in Chapter 4, this type of query can be expressed in either of two ways:
If there is some particularly selective criterion to apply to the table (or tables) from which we obtain the result set, there is no need to say much more than what has been said in the previous situation "Small Result Set, Direct Specific Criteria": the query will be driven by the selective criterion. and the same reasoning applies. But if there is no such criterion, then we have to be much more careful. To take a simplified version of the example in Chapter 4, identifying the customers who have ordered a Batmobile, our typical case will be something like the following: select distinct orders.custid from orders join orderdetail on (orderdetail.ordid = orders.ordid) join articles on (articles.artid = orderdetail.artid) where articles.artname = 'BATMOBILE' In my view it is much better, because it is more understandable, to make explicit the test on the presence of the article in a customer's orders by using a subquery. But should that subquery be correlated or uncorrelated? Since we have no other criterion, the answer should be clear: uncorrelated. If not, one would have to scan the orders table and fire the subquery for each rowthe type of big mistake that passes unnoticed when we start with a small orders table but becomes increasingly painful as the business gathers momentum. The uncorrelated subquery can either be written in the classic style as: select distinct orders.custid from orders where ordid in (select orderdetails.ordid from orderdetail join articles on (articles.artid = orderdetail.artid) where articles.artname = 'BATMOBILE') or as a subquery in the from clause: select distinct orders.custid from orders, (select orderdetails.ordid from orderdetail join articles on (articles.artid = orderdetail.artid) where articles.artname = 'BATMOBILE') as sub_q where sub_q.ordid = orders.ordid I find the first query more legible, but it is really a matter of personal taste. Don't forget that an in( ) condition on the result of the subquery implies a distinct and therefore a sort, which takes us to the fringe of the relational model. Where using subqueries, think carefully before choosing either a correlated or uncorrelated subquery. |