MySQL allows you to embed one query within another. The embedded query is known as a subquery. Within a SELECT statement, you put another query inside parentheses. The subquery is evaluated first, and its result is used in the main query.
Filtering with a Subquery
You can use a subquery in an expression in a WHERE clause. Usually this is on the right side of the IN operator. The subquery returns a number of rows, and that result is used as the list of values to compare using IN. The subquery should return just one column.
The following example uses a subquery to find the books that are in good or mint condition, and then uses the result of that query to find the people that have borrowed those books:
SELECT DISTINCT p.name FROM loan l JOIN person p ON p.person_id = l.person_id WHERE l.book_id IN ( SELECT book_id FROM book WHERE cond in ('mint', 'good') );
Essentially, this query says "show me everyone who has borrowed a book in good or mint condition." To follow how this is evaluated, take the subquery and execute it separately. If you run
SELECT book_id FROM book WHERE cond IN ('mint', 'good');
the result looks like this:
+---------+ | book_id | +---------+ | 1 | | 2 | | 4 | +---------+
With this information, you can see that the actual query executed is as follows:
SELECT DISTINCT p.name FROM loan l JOIN person p ON p.person_id = l.person_id WHERE l.book_id IN (1, 2, 4);
If you are absolutely sure that a subquery can only ever return one row, you can use it with an equal operator. Be aware, however, that an error occurs if the subquery does return more than one row.
Selecting the Result of a Subquery
You can use a subquery in the SELECT clause, and its result is returned as a column. The subquery must return only one column or MySQL gives an error.
The following example uses two subqueries in the SELECT statement to return two pieces of summary information in a single query. Each subquery performs a COUNT(*) operation on a different table, with the two results ultimately being returned as a single row of data containing two columns.
SELECT (SELECT COUNT(*) FROM person) AS num_person, (SELECT COUNT(*) FROM book) AS num_book;
If a subquery references a table that also appears in the outer query, it cannot be executed in the way described previouslyit is impossible to evaluate the subquery without knowing each row value for the table in the outer query. This is known as a correlated subquery.
The following example uses a subquery to return the most recent loan date for each book returned by the main query. This is a correlated subquery because the inner query references the book table from the main query. You cannot execute the subquery on its own without knowing the book.book_id value for each row returned by the main query.
SELECT title, (SELECT max(date_lent) FROM loan WHERE loan.book_id = book.book_id) AS subquery FROM book;
As you might have already realized, this query could be rewritten fairly easily using a join and a GROUP BY clause. Frequently, you can perform the same query in several ways, using different joins or subqueries. Sometimes the method used can affect performancein particular, correlated subqueries can be very slow to executeso if a query is running slowly, try another method to see if it improves. As correlated subqueries are slow, you should only use one if there is no other way to perform the required query.