A subquery is a SELECT statement nested inside a SELECT, INSERT, UPDATE, or DELETE statement or inside another subquery. Subqueries can be nested or correlated; nested subqueries execute once when the outer query is executed, and correlated subqueries execute once for every row that is returned when the outer query is executed.
After this lesson, you will be able to:
- Use subqueries to break down and perform complex queries
Estimated lesson time: 50 minutes
Use subqueries to break down a complex query into a series of logical steps and solve a problem with a single statement. Subqueries are useful when your query relies on the results of another query.
Often, a query that contains subqueries can be written as a join. Query performance may be similar using either approach. The difference is that a subquery may require the query optimizer to perform additional steps, such as sorting, which may influence the processing strategy.
Using joins typically allows the query optimizer to retrieve data in the most efficient way. Do not use subqueries unless the query requires multiple steps and you cannot achieve the desired results using a join.
When you decide to use subqueries, consider the following facts and guidelines:
Nested subqueries return either a single value or a list of values. SQL Server evaluates a nested subquery once and then uses the value or list returned by the subquery in the outer query.
Use a nested subquery that returns a single value to compare to a value in the outer query.
When you use nested subqueries to return a single value, consider the following guidelines:
Example 1
This example uses a single-value nested SELECT statement in the WHERE clause to return all customers listed in the orders table who placed orders on the most recently recorded day.
USE Northwind SELECT OrderID, CustomerID FROM Orders WHERE OrderDate = (SELECT MAX(OrderDate) FROM Orders) |
Result
OrderID CustomerID ----------- ---------- 11074 SIMOB 11075 RICSU 11076 BONAP 11077 RATTC (4 row(s) affected) |
To fully understand this example, consider the result if the subquery is executed on its own:
SELECT MAX(OrderDate) FROM Orders |
Result
--------------------------- 1998-05-06 00:00:00.000 (1 row(s) affected) |
Therefore, the outer query is, effectively:
USE Northwind SELECT OrderID, CustomerID FROM Orders WHERE OrderDate = '1998-05-06 00:00:00.000' |
Example 2
This example contains two single-value subqueries. For each sale of a particular product, the query lists the quantity sold, the total sales of all products, and the percentage of total sales for that particular sale.
USE Northwind SELECT ProductID, Quantity AS Number_Sold, (SELECT SUM(Quantity) FROM [Order Details]) AS Total_Sold, ((CONVERT(Money, Quantity) / (SELECT SUM(Quantity) FROM [Order Details])) * 100) AS Percent_Of_Sales FROM [Order Details] |
Result
ProductID Number_Sold Total_Sold Percent_Of_Sales ----------- ----------- ----------- --------------------- 11 12 51317 .0200 42 10 51317 .0100 72 5 51317 .0000 75 4 51317 .0000 77 2 51317 .0000 (2155 row(s) affected) |
In this exercise, you will write and execute a query in the library database that returns member.firstname, member.lastname, loanhist.isbn, and loanhist.fine_paid for members who have paid the highest recorded fines for all books. The C:\SQLDW\Exercise\AppE\highpay.SQL script file contains a completed script for this exercise.
USE library SELECT DISTINCT firstname, lastname, isbn, fine_paid FROM member m JOIN loanhist lh ON m.member_no = lh.member_no WHERE lh.fine_paid = (SELECT MAX(fine_paid) FROM loanhist) |
Result
firstname lastname isbn fine_paid --------------- --------------- ----------- --------------------- Angela Valentine 103 8.0000 Shelly LaBrie 293 8.0000 Teresa Chen 291 8.0000 (3 row(s) affected) Warning: Null value eliminated from aggregate. |
To test for membership in a generated list of values, use a nested subquery that returns a list of values. When you use nested subqueries to return a list of values, consider the following facts:
Example 1
This example generates a list of companies that made purchases after the date 1/1/95.
USE Northwind SELECT CompanyName FROM Customers WHERE CustomerID IN (SELECT CustomerID FROM Orders WHERE OrderDate > '1/1/95') |
Result
CompanyName ---------------------------------------- Alfreds Futterkiste Ana Trujillo Emparedados y helados Antonio Moreno Taquería Wilman Kala Wolski Zajazd (89 row(s) affected) |
To fully understand this example, consider the result if the subquery is executed on its own:
SELECT CustomerID FROM Orders WHERE OrderDate > '1/1/95' |
Result
CustomerID ---------- VINET TOMSP HANAR BONAP RATTC (830 row(s) affected) |
Therefore, the outer query is, effectively:
USE Northwind SELECT CompanyName FROM Customers WHERE CustomerID IN ('VINET', 'TOMSP', 'HANAR', , 'BONAP', 'RATTC') |
In this exercise, you will write and execute a query in the library database on the reservation table that returns two columns, member_no, and DateReserved. The query returns a list of member numbers that have books on reserve with titles that contains the word First. The C:\SQLDW\Exercise\AppE\subquin.SQL script file contains a completed script for this exercise.
USE library SELECT member_no, CONVERT(char(8), log_date, 1) AS DateReserved FROM reservation WHERE isbn IN (SELECT it.isbn FROM item it INNER JOIN title ti ON it.title_no = ti.title_no WHERE ti.title LIKE '%First%') |
Result
The DateReserved will be a different date on your system.
member_no DateReserved --------- ------------ 35 01/12/99 86 01/12/99 137 01/12/99 9929 01/12/99 9980 01/12/99 (196 row(s) affected) |
For correlated subqueries, the inner query uses information from the outer query and executes once for every row in the outer query.
When you use correlated subqueries, consider the following facts and guidelines:
Example 1
This example returns a list of customers who ordered more than 20 items of product number 23 on a single order.
USE Northwind SELECT OrderID, CustomerID FROM Orders o WHERE 20 < (SELECT Quantity FROM [Order Details] od WHERE o.OrderID = od.OrderID AND od.ProductID = 23) |
Result
OrderID CustomerID ----------- ---------- 10337 FRANK 10348 WANDK 10396 FRANK Ö 10869 SEVES 10880 FOLKO (11 row(s) affected) |
Figure E.1 and the following steps describe how the correlated subquery is evaluated in Example 1:
Figure E.1 Evaluating a correlated subquery
Example 2
This example returns a list of products and the largest order ever placed for each product. Notice that this correlated subquery uses a second alias to reference the same table as the outer query. The Products table is joined simply to look up the product name; it is not involved in the subquery.
USE Northwind SELECT DISTINCT ProductName, Quantity FROM [Order Details] od1 JOIN Products p ON od1.ProductID = p.ProductID WHERE Quantity = (SELECT MAX(Quantity) FROM [Order Details] od2 WHERE od1.ProductID = od2.ProductID) |
Result
ProductName Quantity ---------------------------------------- -------- Alice Mutton 100 Aniseed Syrup 60 Boston Crab Meat 91 Ö Wimmers gute Semmelknödel 130 Zaanse koeken 55 (77 row(s) affected) |
In this exercise, you will create a query that uses a correlated subquery to calculate a value based on data from the outer query and then uses that value as part of a comparison. You will query the member and loanhist tables to return a list of library members who have assessed fines that total more than $5.00. A correlated subquery calculates the fines that are assessed for each member. The C:\SQLDW\Exercise\AppE\fineof5.SQL script file contains a completed script for this exercise.
USE library SELECT member_no, lastname FROM member m WHERE (SELECT SUM(fine_assessed) FROM loanhist lh WHERE m.member_no = lh.member_no) > 5 |
Result
member_no lastname --------- --------------- 203 Graff 617 Valentine 778 Chen Ö 9676 Harui 9806 Martin (41 row(s) affected) |
You can use the EXISTS and NOT EXISTS keywords to determine whether a subquery returns rows.
Use the EXISTS and NOT EXISTS keywords with correlated subqueries to restrict the result set of an outer query to rows that satisfy the subquery. The EXISTS and NOT EXISTS keywords return TRUE or FALSE, based on whether rows are returned by the subquery. The subquery is tested to see if it will return rows; it returns no data. Use * (asterisk) in the select list of the subquery as there is no reason to use column names.
Example 1
This example uses a correlated subquery with an EXISTS keyword in the WHERE clause to return a list of employees who received orders on 9/2/94.
USE Northwind SELECT LastName, EmployeeID FROM Employees e WHERE EXISTS (SELECT * FROM Orders WHERE e.EmployeeID = Orders.EmployeeID AND OrderDate = '9/5/97') |
Result
LastName EmployeeID -------------------- ----------- Peacock 4 King 7 (2 row(s) affected) |
Example 2
This example returns the same result set as Example 1 to show that you could use a join operation rather than a correlated subquery.
USE Northwind SELECT LastName, e.EmployeeID FROM Orders INNER JOIN Employees e ON Orders.EmployeeID = e.EmployeeID WHERE OrderDate = '9/5/97' |
Result
LastName EmployeeID -------------------- ----------- Peacock 4 King 7 (2 row(s) affected) |
Subqueries allow you to use queries within other queries anywhere that you can use an expression or in WHERE clause search conditions. Nested subqueries are executed once, when the outer query executes, and can return either a single value or a list of values for use by the outer query. Correlated subqueries are passed a value and then executed for each row in the outer query. The EXISTS and NOT EXISTS keywords are used in WHERE clauses to test for the existence of rows in the result set of a subquery.