Estimated lesson time: 45 minutes
A cursor is a server-side object that is declared on the server hosting SQL Server. It represents the set of rows returned from a SELECT statement. Multiple rows are stored in a result set that can be accessed one row at a time. When used inside of a stored procedure or a user function, cursors allow you to access a specific row, manipulate and interpret the data in that row, and return whatever data is needed. The process of moving backward and forward through the cursor is known as scrolling.
In certain cases, using a cursor to process a set of data and return only a single value or a subset of the data can be an efficient way of handling data. It is not necessary to pass all of the data between SQL Server and the application. Instead, the processing remains in SQL Server, and only a small amount of data is returned across the network. However, it is possible for cursors to be used inefficiently, so care must be used when deciding whether to use cursors. Additionally, developers should consider alternatives to using cursors. This will be discussed further in the next section.
If you decide to use a cursor, you must first declare it. The following is an example of the Transact-SQL code used to declare a cursor:
DECLARE crsrProducts CURSOR FOR SELECT ProductID, [Name], ProductNumber, StandardCost, ListPrice FROM Production.Product WHERE ListPrice > 0 and ListPrice < 50.00
Once the cursor is declared, it can be opened and then results will be fetched until there are no more results. For example, the following Transact-SQL code can be used to loop through the results of the crsrProducts cursor:
OPEN crsrProducts FETCH NEXT FROM crsrProducts WHILE @@FETCH_STATUS = 0 BEGIN --Add logic here that will perform operations --against the data in the cursor FETCH NEXT FROM crsrProducts END
Because cursors require the use of server memory, it is very important that you remember to close and deallocate the cursor. The following is an example of what this code would look like:
CLOSE crsrProducts DEALLOCATE crsrProducts
Before you decide to use a cursor, you should consider whether there are any other alternatives. SQL Server 2005 offers several methods for performing operations to multiple rows. This section highlights some of these alternatives.
In some cases, it might be possible to use a single SELECT statement in place of a cursor. You might have to use operators such as UNION to combine multiple result sets, or JOIN to retrieve data from one or more tables. You might also have to use comparison operators such as ANY, SOME, or ALL.
Very often, you can use a derived table inside a single SELECT statement instead of using a cursor. A derived table refers to a SELECT statement in the FROM clause of the outer SELECT statement. For example, the following Transact-SQL uses a derived table to retrieve the description for a product:
SELECT ProductID, [Name], ProductNumber, c.Description FROM Production.Product a INNER JOIN Production.ProductModelProductDescriptionCulture b ON a.ProductModelID = b.ProductModelID INNER JOIN (SELECT ProductDescriptionID, Description FROM Production.ProductDescription) AS c ON b.ProductDescriptionID = c.ProductDescriptionID ORDER BY [Name]
Built-in functions such as SUM, MAX, and AVG can be used to perform calculations on a set of values and return a single value. In many cases, this is the type of processing that a cursor needs to perform. For example, consider the following cursor, which is used to get a total count and sum of all the products that start with the letter "S":
DECLARE @SumAmt money DECLARE @Price money DECLARE @Recs int SET @Recs = 0 SET @SumAmt = 0 -- Get the ListPrice for all records -- that have a name starting with A DECLARE crsrProducts CURSOR READ_ONLY FOR SELECT ListPrice FROM Production.Product WHERE SUBSTRING([Name], 1, 1) = 'S' OPEN crsrProducts FETCH NEXT FROM crsrProducts INTO @Price WHILE @@fetch_status = 0 BEGIN SET @SumAmt = @SumAmt + @Price SET @Recs = @Recs + 1 FETCH NEXT FROM crsrProducts INTO @Price END SELECT @SumAmt, @Recs CLOSE crsrProducts DEALLOCATE crsrProducts
Alternatively, the same result can be obtained using a single SELECT statement. The following Transact-SQL query can be used to retrieve the total count and sum:
SELECT SUM(ListPrice), Count(ProductID) FROM Production.Product WHERE SUBSTRING([Name], 1, 1) = 'S'
Not only does the single SELECT statement take less code to implement, but it will execute significantly more quickly than the cursor alternative.
Although a WHILE loop is typically used when implementing a cursor, you can use a WHILE loop to perform operations against one or more rows without a cursor. For example, the following example uses a WHILE loop to increase the list price for products in the AdventureWorks database.
WHILE (SELECT AVG(ListPrice) FROM Production.Product) < $300 BEGIN UPDATE Production.Product SET ListPrice = ListPrice * 2 SELECT MAX(ListPrice) FROM Production.Product IF (SELECT MAX(ListPrice) FROM Production.Product) > $800 BREAK ELSE CONTINUE END
The loop begins if the average list price is below $300.00. By using the BREAK and CONTINUE keywords, the loop will continue to double the list price until the largest list price is greater than $800.00. Although a cursor could have been used to accomplish this result, this example demonstrates that a cursor is not always necessary.
Case functions allow you to evaluate one or more conditions and return a result. Instead of creating a cursor that uses an IF statement to test for multiple conditions, it might be possible to use a simple CASE function instead. For example, the following CASE function exists inside of one of the table-valued functions for the AdventureWorks database:
SET @ContactType = CASE -- Check for employee WHEN EXISTS(SELECT * FROM [HumanResources].[Employee] e WHERE e.[ContactID] = @ContactID) THEN 'Employee' -- Check for vendor WHEN EXISTS(SELECT * FROM [Purchasing].[VendorContact] vc INNER JOIN [Person].[ContactType] ct ON vc.[ContactTypeID] = ct.[ContactTypeID] WHERE vc.[ContactID] = @ContactID) THEN 'Vendor Contact' -- Check for store WHEN EXISTS(SELECT * FROM [Sales].[StoreContact] sc INNER JOIN [Person].[ContactType] ct ON sc.[ContactTypeID] = ct.[ContactTypeID] WHERE sc.[ContactID] = @ContactID) THEN 'Store Contact' -- Check for individual consumer WHEN EXISTS(SELECT * FROM [Sales].[Individual] i WHERE i.[ContactID] = @ContactID) THEN 'Consumer' END;
The CASE function is used to determine the type of contact given a certain contact ID. It then returns a string that identifies the contact type.
A common table expression (CTE) is a temporary result set that can be referenced multiple times in the same query. It can also be used to self-reference the same query and therefore can be used in what is known as a recursive query. Recursive queries involve a query that is executed multiple times until the desired result is obtained.
The AdventureWorks database comes with several stored procedures that utilize recursive queries. The uspGetWhereUsedProductID stored procedure can be used to generate a multilevel Bill of Material (BOM) statement. The first part of this stored procedure involves the invocation of the CTE and uses the WITH keyword to define the temporary result set. In this case, the CTE is named BOM_cte, and it contains eight different columns.
-- CTE name and columns WITH [BOM_cte]([ProductAssemblyID], [ComponentID],[ComponentDesc], [PerAssemblyQty], [StandardCost], [ListPrice], [BOMLevel], [RecursionLevel])
The next part of the query uses the AS keyword to specify what data will initially fill the CTE columns. In this case, the data will come from more than one table and will use a UNION ALL statement to reference the CTE itself.
AS ( SELECT b.[ProductAssemblyID], b.[ComponentID], p.[Name], b.[PerAssemblyQty], p.[StandardCost], p.[ListPrice], b.[BOMLevel], 0 -- Get the initial list of components for the bike assembly FROM [Production].[BillOfMaterials] b INNER JOIN [Production].[Product] p ON b.[ProductAssemblyID] = p.[ProductID] WHERE b.[ComponentID] = @StartProductID AND @CheckDate >= b.[StartDate] AND @CheckDate <= ISNULL(b.[EndDate], @CheckDate) UNION ALL SELECT b.[ProductAssemblyID], b.[ComponentID], p.[Name], b.[PerAssemblyQty], p.[StandardCost], p.[ListPrice], b.[BOMLevel], [RecursionLevel] + 1 -- Join recursive member to anchor FROM [BOM_cte] cte INNER JOIN [Production].[BillOfMaterials] b ON cte.[ProductAssemblyID] = b.[ComponentID] INNER JOIN [Production].[Product] p ON b.[ProductAssemblyID] = p.[ProductID] WHERE @CheckDate >= b.[StartDate] AND @CheckDate <= ISNULL(b.[EndDate], @CheckDate) )
The final part of the stored procedure will execute a SELECT statement against the CTE and represents the recursive invocation of the routine. In this case, the CTE routine will not be executed more than 25 times. This is specified using the MAXRECURSION keyword, as in the following example:
-- Outer select from the CTE SELECT b.[ProductAssemblyID], b.[ComponentID], b.[ComponentDesc], SUM(b.[PerAssemblyQty]) AS [TotalQuantity] , b.[StandardCost], b.[ListPrice], b.[BOMLevel], b.[RecursionLevel] FROM [BOM_cte] b GROUP BY b.[ComponentID], b.[ComponentDesc], b.[ProductAssemblyID], b.[BOMLevel], b.[RecursionLevel], b.[StandardCost], b.[ListPrice] ORDER BY b.[BOMLevel], b.[ProductAssemblyID], b.[ComponentID] OPTION (MAXRECURSION 25)
If at all possible, you should try to reduce the use of server-side cursors. For those cases where a cursor is determined to be the best solution, you need to consider ways to maximize cursor performance. This section highlights a few methods you can use to accomplish this goal.
Cursors consume server memory and can be costly in terms of application performance. For this reason, you always want to reduce the amount of data held in the cursor. This applies to both the number of columns and rows. The easiest way to do this is to restrict the number of columns returned from the SELECT statement. You would never want to declare a cursor that returned all of the data from a table, as follows:
DECLARE crsrProducts CURSOR FOR SELECT * FROM Production.Product
This cursor is inefficient because not only does it return all the columns in the table, but it does not use a WHERE clause to restrict the number of rows returned. Limiting the amount of data processed by the cursor can significantly reduce the amount of server memory utilized.
Concurrency control refers to the way SQL Server handles the updating of records by multiple users or processes. When declaring a cursor, SQL Server 2005 enables you to specify one of three different concurrency options. If you do not plan on updating any data values, then it is best to use the READ_ONLY option. This indicates that no locking will occur, because no positioned updating will occur.
It is not necessary to specify the concurrency option, and if one is not included, then the cursor is made updateable by default. Therefore, it is important for you to specify this option if you know that no updating will take place. The following is an example of a READ_ONLY cursor declaration:
DECLARE crsrProducts CURSOR READ_ONLY FOR SELECT ProductID, [Name], ProductNumber, StandardCost, ListPrice FROM Production.Product WHERE ListPrice > 0 and ListPrice < 50.00
If it is necessary to update data with your cursor, then the most efficient concurrency option is OPTIMISTIC. Try to avoid using the SCROLL_LOCK option, because this ensures all updates will succeed, but consumes the greatest amount of resources.
A cursor can be declared as FORWARD_ONLY and/or FAST_FOWARD. The FORWARD_ONLY type indicates that the cursor can only move forward and cannot be moved to a specific position in the cursor. This is the most efficient type because the other options, STATIC, KEYSET, and DYNAMIC, rely on temp tables and additional server memory. The following is an example of a cursor declared as FORWARD_ONLY and READ_ONLY:
DECLARE crsrProducts CURSOR FORWARD_ONLY READ_ONLY FOR SELECT ProductID, [Name], ProductNumber, StandardCost, ListPrice FROM Production.Product WHERE ListPrice > 0 and ListPrice < 50.00
The FAST_FORWARD option is similar to the FORWARD_ONLY type in that it indicates that the cursor can only move in one direction. However, this option also implies that the cursor will be read-only. The FORWARD_ONLY option is not read-only by default. Instead, you have to specifically add the READ_ONLY option. The FAST_FORWARD option does not require you to add the READ_ONLY option. The following is an example of a cursor declared as FAST_FORWARD:
DECLARE crsrProducts CURSOR FAST_FORWARD FOR SELECT ProductID, [Name], ProductNumber, StandardCost, ListPrice FROM Production.Product WHERE ListPrice > 0 and ListPrice < 50.00
A cursor consumes server memory as long as it is left open, but also until it is specifically deallocated using the DEALLOCATE keyword. Closing a cursor only releases any locks but does not remove the cursor from memory. Therefore, you should CLOSE and DEALLOCATE your cursor as soon as possible. Typically, this is done within the stored procedure in which the cursor is created.
Because cursors are frequently used inefficiently, you will want to periodically evaluate whether the use of cursors is warranted. It might be necessary to replace existing cursors with other alternatives or to move the data processing to the applications.
Your company might be utilizing cursors that were designed and implemented with an earlier version of SQL Server. In these cases, new features such as recursive queries were not available, so they would have not been considered. Now that you are using SQL Server 2005, you might need to reconsider whether the cursor is the best alternative.
The easiest way to compare a cursor with another alternative is to execute the Transact-SQL for both methods and compare the execution time. When you execute a query inside of SQL Server Management Studio, the execution time is displayed in the status bar of the query window, as shown in Figure 3-2.
Figure 3-2: Execution time is displayed in the status bar of the new query window
In Figure 3-2, the execution time was 00:00:00, which means it took less than 1 second to execute. If you were evaluating a cursor that took less than 1 second to execute, it would hardly be worth it to investigate much further, because the execution time does not seem to be an issue. Before you waste any time rewriting an existing cursor, you might want to do a quick execution time check and make sure there is a potential savings to recover.
Several dynamic management functions are included with SQL Server 2005. These functions allow you to monitor various SQL Server activities in real time. The function named sys.dm_exec_cursors enables you to monitor any currently allocated cursors. This means a cursor that has been created using the CREATE statement, but no corresponding DEALLOCATE statement has yet been issued.
The sys.dm_exec_cursors function provides a great way for you to determine whether you have any orphaned cursors on your SQL Server. Because cursors that are not released from memory continue to consume memory, it is important for you to know whether they exist. To use the cursors function, you would execute the following Transact-SQL command:
select * from sys.dm_exec_cursors(0)
If there are any allocated cursors, you will see records that contain useful information such as the following:
Session_id References the SQL Server session that was used to create the cursor
Name Refers to the name of the cursor
Properties Lists the options that were assigned to this cursor, such as whether it is FAST_FORWARD and/or READ_ONLY
Creation_Time Shows the DateTime stamp from when the cursor was created
In this lab, you will create a cursor that is used to update certain records in the AdventureWorks database. The first exercise will deal with the basics of creating an updateable cursor. The second exercise will move on to comparing two methods to examine the same task. You will compare the execution of a cursor against a single UPDATE statement that uses a CASE statement.
The completed lab is available in the \Labs\Chapter 03 folder on the companion CD.
Important | Lab requirements You will need to have SQL Server installed before you can complete this lab. Refer to the Introduction for setup instructions. |
Exercise 1: Create a Cursor
In this exercise, you create a cursor that loops through all the records in the Production.Product table for the AdventureWorks database. Within the cursor loop, the data will be printed out to the messages window.
Open Microsoft SQL Server Management Studio.
Connect to the instance of SQL Server 2005 that contains the AdventureWorks database.
Select New Query.
Add the following code to the query window:
USE AdventureWorks GO -- Declare variables that we will store fetch -- results in DECLARE @ProdName nvarchar(50) DECLARE @ProdNum nvarchar(25) DECLARE @StdCost money DECLARE @LstPrice money -- Allocate and define the cursor DECLARE crsrProducts CURSOR FAST_FORWARD FOR SELECT [Name], ProductNumber, StandardCost, ListPrice FROM Production.Product -- Open the cursor and fetch the results -- into the local variables OPEN crsrProducts FETCH NEXT FROM crsrProducts INTO @ProdName, @ProdNum, @StdCost, @LstPrice -- Loop through the cursor while the fetch -- is still successful WHILE @@FETCH_STATUS = 0 BEGIN -- Print out the results to the messages window PRINT 'Product: ' + @ProdName + ' (' + @ProdNum + ') ' + 'Cost: ' + Cast(@StdCost as varchar(8)) + ' Price: ' + Cast(@LstPrice as varchar(8)) -- Get the next set of results FETCH NEXT FROM crsrProducts INTO @ProdName, @ProdNum, @StdCost, @LstPrice END
Select the AdventureWorks database from the Available Databases drop-down list box, and then click Execute.
In the messages window, you should see results similar to the following:
Product: Adjustable Race (AR-5381) Cost: 0.00 Price: 0.00 Product: Bearing Ball (BA-8327) Cost: 0.00 Price: 0.00 Product: BB Ball Bearing (BE-2349) Cost: 0.00 Price: 0.00 Product: Headset Ball Bearings (BE-2908) Cost: 0.00 Price: 0.00 Product: Blade (BL-2036) Cost: 0.00 Price: 0.00 Product: LL Crankarm (CA-5965) Cost: 0.00 Price: 0.00 Product: ML Crankarm (CA-6738) Cost: 0.00 Price: 0.00 Product: HL Crankarm (CA-7457) Cost: 0.00 Price: 0.00 Product: Chainring Bolts (CB-2903) Cost: 0.00 Price: 0.00
At the bottom of the query window, add the following line of code:
select * from sys.dm_exec_cursors(0)
Because the cursor code added in step 4 did not include a DEALLOCATE statement, the cursor is still allocated. You should see a single record with the name crsrProducts. Look at the data in all the fields for this record.
At the bottom of the query window, add the following code to close, and deallocate the cursor:
-- Close and release from memory the cursor CLOSE crsrProducts DEALLOCATE crsrProducts
Exercise 2: Compare Cursor Alternatives
In this exercise, you create a cursor that loops through all the records in the Person.Contact table for the AdventureWorks database. Within the cursor loop, updates will be issued depending on the value of the EmailPromotion column. You then add the code for a replacement and compare the execution time results of both methods. This will allow you to see a situation in which the cursor is not the best alternative.
Open Microsoft SQL Server Management Studio.
Connect to the instance of SQL Server 2005 that contains the AdventureWorks database.
Select New Query.
Add the following code to the query window:
-- This cursor is used to update the EmailAddress in the -- Person.contact table. There are three potential domain -- names that can be used: adventure-works.net or -- adventure-works.net or adventure-works.org. Which domain -- is used depends on a field named EmailPromotion. USE AdventureWorks GO DECLARE @ID int DECLARE @Email nvarchar(50) DECLARE @Promotion int DECLARE crsrEmail CURSOR FORWARD_ONLY FOR SELECT ContactID, EmailAddress, EmailPromotion FROM Person.Contact -- Open the cursor and fetch the results -- into the local variables OPEN crsrEmail FETCH NEXT FROM crsrEmail INTO @ID, @Email, @Promotion -- Loop through the cursor while the fetch -- is still successful WHILE @@FETCH_STATUS = 0 BEGIN IF @Promotion = 0 BEGIN UPDATE Person.Contact SET EmailAddress = REPLACE(@email, 'adventure-works.com', 'adventure-works.net') WHERE ContactID = @ID END IF @Promotion = 1 BEGIN UPDATE Person.Contact SET EmailAddress = REPLACE(@email, 'adventure-works.com', 'adventure-works.biz') WHERE ContactID = @ID END IF @Promotion = 2 BEGIN UPDATE Person.Contact SET EmailAddress = REPLACE(@email, 'adventure-works.com', 'adventure-works.org') WHERE ContactID = @ID END -- Get the next set of results FETCH NEXT FROM crsrEmail INTO @ID, @Email, @Promotion END CLOSE crsrEmail DEALLOCATE crsrEmail
Select the AdventureWorks database from the Available Databases drop-down list box, and then click Execute. On a test laptop, this code took 18 seconds to complete.
Open a New Query window by selecting New Query.
Paste the following code into the query window:
UPDATE Person.Contact SET EmailAddress = CASE WHEN EmailPromotion = 0 THEN REPLACE(EmailAddress, 'adventure-works.com', 'adventure-works.net') WHEN EmailPromotion = 1 THEN REPLACE(EmailAddress, 'adventure-works.com', 'adventure-works.biz') WHEN EmailPromotion = 2 THEN REPLACE(EmailAddress, 'adventure-works.com', 'adventure-works.org') END
Select the AdventureWorks database from the Available Databases drop-down list box, and then click Execute. Note the amount of time it takes for the execution to complete. On a test laptop, it took only one second to complete. You can easily see how the cursor is not the most efficient alternative. Not only does it take more code to implement, but it takes 18 times longer to execute.
To reverse the update results and change the data back to its original value, execute the following query in a new query window:
UPDATE Person.Contact SET EmailAddress = CASE WHEN EmailPromotion = 0 THEN REPLACE(EmailAddress, 'adventure-works.net', 'adventure-works.com') WHEN EmailPromotion = 1 THEN REPLACE(EmailAddress, 'adventure-works.biz', 'adventure-works.com') WHEN EmailPromotion = 2 THEN REPLACE(EmailAddress, 'adventure-works.org', 'adventure-works.com') END