Hack 11. Convert Aggregate Subqueries to JOINs

You can avoid subqueries using JOIN or OUTER JOIN if they don't use aggregate functions. But what about subqueries that do use aggregation?

Some subqueries are easy to eliminate [Hack #10], but others are a bit trickier. Suppose you have the orders table shown in Table 2-6.

Table 2-6. The orders table
customer	whn	totalitems
Jim	2006-10-10	5
Jim	2006-10-11	3
Jim	2006-10-12	1
Brian	2006-10-10	7

Now suppose you need to show the date on which each customer purchased the most totalitems:

SELECT customer,whn,totalitems
FROM orders o1
WHERE o1.whn = (
 SELECT MAX(whn)
 FROM orders o2
 WHERE o1.customer = o2.customer
);

To do this you need to execute the subquery for every row of orders, so the preceding code may be slow to execute. In addition, older versions of MySQL cannot handle subqueries. To avoid using a subquery, you can use a HAVING clause with a self-join:

SELECT o1.customer,o1.whn,o1.totalitems
 FROM orders o1 JOIN orders o2 on (o1.customer = o2.customer)
 GROUP BY o1.customer,o1.whn,o1.totalitems
 HAVING o1.whn = max(o2.whn)

Here's what you'll get as a result:

+----------+------------+------------+
| customer | whn | totalitems |
+----------+------------+------------+
| Brian | 2006-10-10 | 7 |
| Jim | 2006-10-12 | 1 |
+----------+------------+------------+
2 rows in set (0.00 sec)

This approach works well for all aggregate functions.

SQL Fundamentals

Joins, Unions, and Views

Text Handling

Date Handling

Number Crunching

Online Applications

Organizing Data

Storing Small Amounts of Data

Locking and Performance

Reporting

Users and Administration

Wider Access

Index