Filtering by Subquery


The database tables used in all the chapters in this book are relational tables. (See Appendix B, "The Example Tables," for a description of each of the tables and their relationships.) Order data is stored in two tables. The orders table stores a single row for each order containing order number, customer ID, and order date. The individual order items are stored in the related orderitems table. The orders table does not store customer information. It only stores a customer ID. The actual customer information is stored in the customers table.

Now suppose you wanted a list of all the customers who ordered item TNT2. What would you have to do to retrieve this information? Here are the steps:

1.

Retrieve the order numbers of all orders containing item TNT2.

2.

Retrieve the customer ID of all the customers who have orders listed in the order numbers returned in the previous step.

3.

Retrieve the customer information for all the customer IDs returned in the previous step.

Each of these steps can be executed as a separate query. By doing so, you use the results returned by one SELECT statement to populate the WHERE clause of the next SELECT statement.

You can also use subqueries to combine all three queries into one single statement.

The first SELECT statement should be self-explanatory by now. It retrieves the order_num column for all order items with a prod_id of TNT2. The output lists the two orders containing this item:

Input

SELECT order_num FROM orderitems WHERE prod_id = 'TNT2';

Output

+-----------+ | order_num | +-----------+ |     20005 | |     20007 | +-----------+

The next step is to retrieve the customer IDs associated with orders 20005 and 20007. Using the IN clause described in Chapter 7, "Advanced Data Filtering," you can create a SELECT statement as follows:

Input

SELECT cust_id FROM orders WHERE order_num IN (20005,20007);

Output

+---------+ | cust_id | +---------+ |   10001 | |   10004 | +---------+

Now, combine the two queries by turning the first (the one that returned the order numbers) into a subquery. Look at the following SELECT statement:

Input

SELECT cust_id FROM orders WHERE order_num IN (SELECT order_num                     FROM orderitems                     WHERE prod_id = 'TNT2');

Output

+---------+ | cust_id | +---------+ |   10001 | |   10004 | +---------+

Analysis

Subqueries are always processed starting with the innermost SELECT statement and working outward. When the preceding SELECT statement is processed, MySQL actually performs two operations.

First it runs the subquery

SELECT order_num FROM orderitems WHERE prod_id='TNT2'

That query returns the two order numbers 20005 and 20007. Those two values are then passed to the WHERE clause of the outer query in the comma-delimited format required by the IN operator. The outer query now becomes

SELECT cust_id FROM orders WHERE order_num IN (20005,20007)

As you can see, the output is correct and exactly the same as the output returned by the previous hard-coded WHERE clause.

Tip

Formatting Your SQL SELECT statements containing subqueries can be difficult to read and debug, especially as they grow in complexity. Breaking up the queries over multiple lines and indenting the lines appropriately as shown here can greatly simplify working with subqueries.


You now have the IDs of all the customers who ordered item TNT2. The next step is to retrieve the customer information for each of those customer IDs. The SQL statement to retrieve the two columns is

Input

SELECT cust_name, cust_contact FROM customers WHERE cust_id IN (10001,10004);

Instead of hard-coding those customer IDs, you can turn this WHERE clause into yet another subquery:

Input

SELECT cust_name, cust_contact FROM customers WHERE cust_id IN (SELECT cust_id                   FROM orders                   WHERE order_num IN (SELECT order_num                                       FROM orderitems                                       WHERE prod_id = 'TNT2'));

Output

+----------------+--------------+ | cust_name      | cust_contact | +----------------+--------------+ | Coyote Inc.    | Y Lee        | | Yosemite Place | Y Sam        | +----------------+--------------+

Analysis

To execute this SELECT statement, MySQL had to actually perform three SELECT statements. The innermost subquery returned a list of order numbers that were then used as the WHERE clause for the subquery above it. That subquery returned a list of customer IDs that were used as the WHERE clause for the top-level query. The top-level query actually returned the desired data.

As you can see, using subqueries in a WHERE clause enables you to write extremely powerful and flexible SQL statements. There is no limit imposed on the number of subqueries that can be nested, although in practice you will find that performance tells you when you are nesting too deeply.

Caution

Columns Must Match When using a subquery in a WHERE clause (as seen here), make sure that the SELECT statements have the same number of columns as in the WHERE clause. Usually, a single column will be returned by the subquery and matched against a single column, but multiple columns may be used if needed.


Although usually used in conjunction with the IN operator, subqueries can also be used to test for equality (using =), non-equality (using <>), and so on.

Caution

Subqueries and Performance The code shown here works, and it achieves the desired result. However, using subqueries is not always the most efficient way to perform this type of data retrieval, although it might be. More on this is in Chapter 15, "Joining Tables," where you will revisit this same example.





MySQL Crash Course
MySQL Crash Course
ISBN: 0672327120
EAN: 2147483647
Year: 2004
Pages: 214
Authors: Ben Forta

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net