Lesson 2: Designing a Cursor Strategy

Estimated lesson time: 45 minutes

A cursor is a server-side object that is declared on the server hosting SQL Server. It represents the set of rows returned from a SELECT statement. Multiple rows are stored in a result set that can be accessed one row at a time. When used inside of a stored procedure or a user function, cursors allow you to access a specific row, manipulate and interpret the data in that row, and return whatever data is needed. The process of moving backward and forward through the cursor is known as scrolling.

In certain cases, using a cursor to process a set of data and return only a single value or a subset of the data can be an efficient way of handling data. It is not necessary to pass all of the data between SQL Server and the application. Instead, the processing remains in SQL Server, and only a small amount of data is returned across the network. However, it is possible for cursors to be used inefficiently, so care must be used when deciding whether to use cursors. Additionally, developers should consider alternatives to using cursors. This will be discussed further in the next section.

If you decide to use a cursor, you must first declare it. The following is an example of the Transact-SQL code used to declare a cursor:

 DECLARE crsrProducts   CURSOR FOR SELECT ProductID, [Name], ProductNumber,        StandardCost, ListPrice FROM Production.Product WHERE ListPrice > 0 and ListPrice < 50.00

Once the cursor is declared, it can be opened and then results will be fetched until there are no more results. For example, the following Transact-SQL code can be used to loop through the results of the crsrProducts cursor:

 OPEN crsrProducts FETCH NEXT FROM crsrProducts WHILE @@FETCH_STATUS = 0 BEGIN    --Add logic here that will perform operations    --against the data in the cursor    FETCH NEXT FROM crsrProducts END

Because cursors require the use of server memory, it is very important that you remember to close and deallocate the cursor. The following is an example of what this code would look like:

 CLOSE crsrProducts DEALLOCATE crsrProducts

Considering Cursor Alternatives

Before you decide to use a cursor, you should consider whether there are any other alternatives. SQL Server 2005 offers several methods for performing operations to multiple rows. This section highlights some of these alternatives.

Single SELECT Statements

In some cases, it might be possible to use a single SELECT statement in place of a cursor. You might have to use operators such as UNION to combine multiple result sets, or JOIN to retrieve data from one or more tables. You might also have to use comparison operators such as ANY, SOME, or ALL.

Very often, you can use a derived table inside a single SELECT statement instead of using a cursor. A derived table refers to a SELECT statement in the FROM clause of the outer SELECT statement. For example, the following Transact-SQL uses a derived table to retrieve the description for a product:

 SELECT ProductID, [Name], ProductNumber, c.Description FROM Production.Product a INNER JOIN Production.ProductModelProductDescriptionCulture b ON a.ProductModelID = b.ProductModelID INNER JOIN (SELECT ProductDescriptionID, Description FROM Production.ProductDescription) AS c ON b.ProductDescriptionID = c.ProductDescriptionID ORDER BY [Name]

Built-in functions such as SUM, MAX, and AVG can be used to perform calculations on a set of values and return a single value. In many cases, this is the type of processing that a cursor needs to perform. For example, consider the following cursor, which is used to get a total count and sum of all the products that start with the letter "S":

 DECLARE @SumAmt money DECLARE @Price money DECLARE @Recs int SET @Recs = 0 SET @SumAmt = 0 -- Get the ListPrice for all records -- that have a name starting with A DECLARE crsrProducts   CURSOR    READ_ONLY FOR SELECT ListPrice FROM Production.Product WHERE SUBSTRING([Name], 1, 1) = 'S' OPEN crsrProducts FETCH NEXT FROM crsrProducts INTO @Price WHILE @@fetch_status = 0 BEGIN    SET @SumAmt = @SumAmt + @Price    SET @Recs = @Recs + 1    FETCH NEXT FROM crsrProducts INTO @Price END SELECT @SumAmt, @Recs CLOSE crsrProducts DEALLOCATE crsrProducts

Alternatively, the same result can be obtained using a single SELECT statement. The following Transact-SQL query can be used to retrieve the total count and sum:

 SELECT SUM(ListPrice), Count(ProductID) FROM Production.Product WHERE SUBSTRING([Name], 1, 1) = 'S'

Not only does the single SELECT statement take less code to implement, but it will execute significantly more quickly than the cursor alternative.

WHILE Loops

Although a WHILE loop is typically used when implementing a cursor, you can use a WHILE loop to perform operations against one or more rows without a cursor. For example, the following example uses a WHILE loop to increase the list price for products in the AdventureWorks database.

 WHILE (SELECT AVG(ListPrice) FROM Production.Product) < $300 BEGIN    UPDATE Production.Product       SET ListPrice = ListPrice * 2    SELECT MAX(ListPrice) FROM Production.Product    IF (SELECT MAX(ListPrice) FROM Production.Product) > $800       BREAK    ELSE       CONTINUE END

The loop begins if the average list price is below $300.00. By using the BREAK and CONTINUE keywords, the loop will continue to double the list price until the largest list price is greater than $800.00. Although a cursor could have been used to accomplish this result, this example demonstrates that a cursor is not always necessary.

CASE Functions

Case functions allow you to evaluate one or more conditions and return a result. Instead of creating a cursor that uses an IF statement to test for multiple conditions, it might be possible to use a simple CASE function instead. For example, the following CASE function exists inside of one of the table-valued functions for the AdventureWorks database:

 SET @ContactType =    CASE    -- Check for employee       WHEN EXISTS(SELECT * FROM [HumanResources].[Employee] e          WHERE e.[ContactID] = @ContactID)       THEN 'Employee'    -- Check for vendor       WHEN EXISTS(SELECT * FROM [Purchasing].[VendorContact] vc          INNER JOIN [Person].[ContactType] ct       ON vc.[ContactTypeID] = ct.[ContactTypeID]          WHERE vc.[ContactID] = @ContactID)       THEN 'Vendor Contact'    -- Check for store       WHEN EXISTS(SELECT * FROM [Sales].[StoreContact] sc          INNER JOIN [Person].[ContactType] ct       ON sc.[ContactTypeID] = ct.[ContactTypeID]          WHERE sc.[ContactID] = @ContactID)       THEN 'Store Contact'    -- Check for individual consumer       WHEN EXISTS(SELECT * FROM [Sales].[Individual] i          WHERE i.[ContactID] = @ContactID)       THEN 'Consumer' END;

The CASE function is used to determine the type of contact given a certain contact ID. It then returns a string that identifies the contact type.

Recursive Queries

A common table expression (CTE) is a temporary result set that can be referenced multiple times in the same query. It can also be used to self-reference the same query and therefore can be used in what is known as a recursive query. Recursive queries involve a query that is executed multiple times until the desired result is obtained.

The AdventureWorks database comes with several stored procedures that utilize recursive queries. The uspGetWhereUsedProductID stored procedure can be used to generate a multilevel Bill of Material (BOM) statement. The first part of this stored procedure involves the invocation of the CTE and uses the WITH keyword to define the temporary result set. In this case, the CTE is named BOM_cte, and it contains eight different columns.

 -- CTE name and columns WITH [BOM_cte]([ProductAssemblyID], [ComponentID],[ComponentDesc],    [PerAssemblyQty], [StandardCost], [ListPrice], [BOMLevel],    [RecursionLevel])

The next part of the query uses the AS keyword to specify what data will initially fill the CTE columns. In this case, the data will come from more than one table and will use a UNION ALL statement to reference the CTE itself.

 AS (     SELECT b.[ProductAssemblyID], b.[ComponentID], p.[Name],       b.[PerAssemblyQty], p.[StandardCost], p.[ListPrice], b.[BOMLevel], 0     -- Get the initial list of components for the bike assembly        FROM [Production].[BillOfMaterials] b            INNER JOIN [Production].[Product] p            ON b.[ProductAssemblyID] = p.[ProductID]        WHERE b.[ComponentID] = @StartProductID            AND @CheckDate >= b.[StartDate]            AND @CheckDate <= ISNULL(b.[EndDate], @CheckDate)    UNION ALL     SELECT b.[ProductAssemblyID], b.[ComponentID], p.[Name],      b.[PerAssemblyQty], p.[StandardCost], p.[ListPrice], b.[BOMLevel],      [RecursionLevel] + 1     -- Join recursive member to anchor       FROM [BOM_cte] cte           INNER JOIN [Production].[BillOfMaterials] b           ON cte.[ProductAssemblyID] = b.[ComponentID]           INNER JOIN [Production].[Product] p           ON b.[ProductAssemblyID] = p.[ProductID]       WHERE @CheckDate >= b.[StartDate]           AND @CheckDate <= ISNULL(b.[EndDate], @CheckDate)       )

The final part of the stored procedure will execute a SELECT statement against the CTE and represents the recursive invocation of the routine. In this case, the CTE routine will not be executed more than 25 times. This is specified using the MAXRECURSION keyword, as in the following example:

 -- Outer select from the CTE SELECT b.[ProductAssemblyID], b.[ComponentID], b.[ComponentDesc],    SUM(b.[PerAssemblyQty]) AS [TotalQuantity] , b.[StandardCost],    b.[ListPrice], b.[BOMLevel], b.[RecursionLevel]   FROM [BOM_cte] b   GROUP BY b.[ComponentID], b.[ComponentDesc], b.[ProductAssemblyID],    b.[BOMLevel], b.[RecursionLevel], b.[StandardCost], b.[ListPrice]   ORDER BY b.[BOMLevel], b.[ProductAssemblyID], b.[ComponentID]   OPTION (MAXRECURSION 25)

Maximizing Cursor Performance

If at all possible, you should try to reduce the use of server-side cursors. For those cases where a cursor is determined to be the best solution, you need to consider ways to maximize cursor performance. This section highlights a few methods you can use to accomplish this goal.

Reducing the Amount of Data in Your Cursor

Cursors consume server memory and can be costly in terms of application performance. For this reason, you always want to reduce the amount of data held in the cursor. This applies to both the number of columns and rows. The easiest way to do this is to restrict the number of columns returned from the SELECT statement. You would never want to declare a cursor that returned all of the data from a table, as follows:

 DECLARE crsrProducts   CURSOR FOR SELECT * FROM Production.Product

This cursor is inefficient because not only does it return all the columns in the table, but it does not use a WHERE clause to restrict the number of rows returned. Limiting the amount of data processed by the cursor can significantly reduce the amount of server memory utilized.

Using the READ_ONLY Concurrency Option

Concurrency control refers to the way SQL Server handles the updating of records by multiple users or processes. When declaring a cursor, SQL Server 2005 enables you to specify one of three different concurrency options. If you do not plan on updating any data values, then it is best to use the READ_ONLY option. This indicates that no locking will occur, because no positioned updating will occur.

It is not necessary to specify the concurrency option, and if one is not included, then the cursor is made updateable by default. Therefore, it is important for you to specify this option if you know that no updating will take place. The following is an example of a READ_ONLY cursor declaration:

 DECLARE crsrProducts   CURSOR    READ_ONLY FOR SELECT ProductID, [Name], ProductNumber,        StandardCost, ListPrice FROM Production.Product WHERE ListPrice > 0 and ListPrice < 50.00

If it is necessary to update data with your cursor, then the most efficient concurrency option is OPTIMISTIC. Try to avoid using the SCROLL_LOCK option, because this ensures all updates will succeed, but consumes the greatest amount of resources.

Using the FORWARD_ONLY or FAST_FORWARD Cursor Types

A cursor can be declared as FORWARD_ONLY and/or FAST_FOWARD. The FORWARD_ONLY type indicates that the cursor can only move forward and cannot be moved to a specific position in the cursor. This is the most efficient type because the other options, STATIC, KEYSET, and DYNAMIC, rely on temp tables and additional server memory. The following is an example of a cursor declared as FORWARD_ONLY and READ_ONLY:

 DECLARE crsrProducts   CURSOR    FORWARD_ONLY READ_ONLY FOR SELECT ProductID, [Name], ProductNumber,        StandardCost, ListPrice FROM Production.Product WHERE ListPrice > 0 and ListPrice < 50.00

The FAST_FORWARD option is similar to the FORWARD_ONLY type in that it indicates that the cursor can only move in one direction. However, this option also implies that the cursor will be read-only. The FORWARD_ONLY option is not read-only by default. Instead, you have to specifically add the READ_ONLY option. The FAST_FORWARD option does not require you to add the READ_ONLY option. The following is an example of a cursor declared as FAST_FORWARD:

 DECLARE crsrProducts   CURSOR    FAST_FORWARD FOR SELECT ProductID, [Name], ProductNumber,        StandardCost, ListPrice FROM Production.Product WHERE ListPrice > 0 and ListPrice < 50.00

Closing and Deallocating Your Cursors

A cursor consumes server memory as long as it is left open, but also until it is specifically deallocated using the DEALLOCATE keyword. Closing a cursor only releases any locks but does not remove the cursor from memory. Therefore, you should CLOSE and DEALLOCATE your cursor as soon as possible. Typically, this is done within the stored procedure in which the cursor is created.

Evaluating Use of Cursors

Because cursors are frequently used inefficiently, you will want to periodically evaluate whether the use of cursors is warranted. It might be necessary to replace existing cursors with other alternatives or to move the data processing to the applications.

Your company might be utilizing cursors that were designed and implemented with an earlier version of SQL Server. In these cases, new features such as recursive queries were not available, so they would have not been considered. Now that you are using SQL Server 2005, you might need to reconsider whether the cursor is the best alternative.

Comparing Execution Times

The easiest way to compare a cursor with another alternative is to execute the Transact-SQL for both methods and compare the execution time. When you execute a query inside of SQL Server Management Studio, the execution time is displayed in the status bar of the query window, as shown in Figure 3-2.

image from book
Figure 3-2: Execution time is displayed in the status bar of the new query window

In Figure 3-2, the execution time was 00:00:00, which means it took less than 1 second to execute. If you were evaluating a cursor that took less than 1 second to execute, it would hardly be worth it to investigate much further, because the execution time does not seem to be an issue. Before you waste any time rewriting an existing cursor, you might want to do a quick execution time check and make sure there is a potential savings to recover.

Using a Dynamic Management Function

Several dynamic management functions are included with SQL Server 2005. These functions allow you to monitor various SQL Server activities in real time. The function named sys.dm_exec_cursors enables you to monitor any currently allocated cursors. This means a cursor that has been created using the CREATE statement, but no corresponding DEALLOCATE statement has yet been issued.

The sys.dm_exec_cursors function provides a great way for you to determine whether you have any orphaned cursors on your SQL Server. Because cursors that are not released from memory continue to consume memory, it is important for you to know whether they exist. To use the cursors function, you would execute the following Transact-SQL command:

 select * from sys.dm_exec_cursors(0)

If there are any allocated cursors, you will see records that contain useful information such as the following:

Session_id References the SQL Server session that was used to create the cursor
Name Refers to the name of the cursor
Properties Lists the options that were assigned to this cursor, such as whether it is FAST_FORWARD and/or READ_ONLY
Creation_Time Shows the DateTime stamp from when the cursor was created

Lab: Creating a Cursor

In this lab, you will create a cursor that is used to update certain records in the AdventureWorks database. The first exercise will deal with the basics of creating an updateable cursor. The second exercise will move on to comparing two methods to examine the same task. You will compare the execution of a cursor against a single UPDATE statement that uses a CASE statement.

The completed lab is available in the \Labs\Chapter 03 folder on the companion CD.

Important

Lab requirements

You will need to have SQL Server installed before you can complete this lab. Refer to the Introduction for setup instructions.

Exercise 1: Create a Cursor

In this exercise, you create a cursor that loops through all the records in the Production.Product table for the AdventureWorks database. Within the cursor loop, the data will be printed out to the messages window.

Open Microsoft SQL Server Management Studio.
Connect to the instance of SQL Server 2005 that contains the AdventureWorks database.
Select New Query.

Add the following code to the query window:

 USE AdventureWorks GO -- Declare variables that we will store fetch -- results in DECLARE @ProdName nvarchar(50) DECLARE @ProdNum nvarchar(25) DECLARE @StdCost money DECLARE @LstPrice money -- Allocate and define the cursor DECLARE crsrProducts   CURSOR  FAST_FORWARD FOR   SELECT [Name], ProductNumber,        StandardCost, ListPrice   FROM Production.Product -- Open the cursor and fetch the results -- into the local variables OPEN crsrProducts FETCH NEXT FROM crsrProducts   INTO @ProdName, @ProdNum,        @StdCost, @LstPrice -- Loop through the cursor while the fetch -- is still successful WHILE @@FETCH_STATUS = 0 BEGIN    -- Print out the results to the messages window     PRINT 'Product: ' + @ProdName + ' (' + @ProdNum + ') '          + 'Cost: ' + Cast(@StdCost as varchar(8))          + ' Price: ' + Cast(@LstPrice as varchar(8))    -- Get the next set of results    FETCH NEXT FROM crsrProducts      INTO @ProdName, @ProdNum,           @StdCost, @LstPrice END

Select the AdventureWorks database from the Available Databases drop-down list box, and then click Execute.

In the messages window, you should see results similar to the following:

 Product: Adjustable Race (AR-5381) Cost: 0.00 Price: 0.00 Product: Bearing Ball (BA-8327) Cost: 0.00 Price: 0.00 Product: BB Ball Bearing (BE-2349) Cost: 0.00 Price: 0.00 Product: Headset Ball Bearings (BE-2908) Cost: 0.00 Price: 0.00 Product: Blade (BL-2036) Cost: 0.00 Price: 0.00 Product: LL Crankarm (CA-5965) Cost: 0.00 Price: 0.00 Product: ML Crankarm (CA-6738) Cost: 0.00 Price: 0.00 Product: HL Crankarm (CA-7457) Cost: 0.00 Price: 0.00 Product: Chainring Bolts (CB-2903) Cost: 0.00 Price: 0.00

At the bottom of the query window, add the following line of code:
```
 select * from sys.dm_exec_cursors(0) 
```
Because the cursor code added in step 4 did not include a DEALLOCATE statement, the cursor is still allocated. You should see a single record with the name crsrProducts. Look at the data in all the fields for this record.
At the bottom of the query window, add the following code to close, and deallocate the cursor:
```
 -- Close and release from memory the cursor CLOSE crsrProducts DEALLOCATE crsrProducts 
```

Exercise 2: Compare Cursor Alternatives

In this exercise, you create a cursor that loops through all the records in the Person.Contact table for the AdventureWorks database. Within the cursor loop, updates will be issued depending on the value of the EmailPromotion column. You then add the code for a replacement and compare the execution time results of both methods. This will allow you to see a situation in which the cursor is not the best alternative.

Open Microsoft SQL Server Management Studio.
Connect to the instance of SQL Server 2005 that contains the AdventureWorks database.
Select New Query.

Add the following code to the query window:

 -- This cursor is used to update the EmailAddress in the -- Person.contact table. There are three potential domain -- names that can be used: adventure-works.net or -- adventure-works.net or adventure-works.org. Which domain -- is used depends on a field named EmailPromotion. USE AdventureWorks GO DECLARE @ID int DECLARE @Email nvarchar(50) DECLARE @Promotion int DECLARE crsrEmail   CURSOR  FORWARD_ONLY FOR   SELECT ContactID, EmailAddress, EmailPromotion   FROM Person.Contact -- Open the cursor and fetch the results -- into the local variables OPEN crsrEmail FETCH NEXT FROM crsrEmail   INTO @ID, @Email, @Promotion -- Loop through the cursor while the fetch -- is still successful WHILE @@FETCH_STATUS = 0 BEGIN    IF @Promotion = 0     BEGIN     UPDATE Person.Contact SET EmailAddress =      REPLACE(@email, 'adventure-works.com', 'adventure-works.net')      WHERE ContactID = @ID     END    IF @Promotion = 1     BEGIN     UPDATE Person.Contact SET EmailAddress =      REPLACE(@email, 'adventure-works.com', 'adventure-works.biz')      WHERE ContactID = @ID     END    IF @Promotion = 2     BEGIN     UPDATE Person.Contact SET EmailAddress =      REPLACE(@email, 'adventure-works.com', 'adventure-works.org')      WHERE ContactID = @ID     END    -- Get the next set of results    FETCH NEXT FROM crsrEmail      INTO @ID, @Email, @Promotion END CLOSE crsrEmail DEALLOCATE crsrEmail

Select the AdventureWorks database from the Available Databases drop-down list box, and then click Execute. On a test laptop, this code took 18 seconds to complete.
Open a New Query window by selecting New Query.

Paste the following code into the query window:

 UPDATE Person.Contact  SET EmailAddress =   CASE     WHEN EmailPromotion = 0      THEN REPLACE(EmailAddress, 'adventure-works.com', 'adventure-works.net')     WHEN EmailPromotion = 1      THEN REPLACE(EmailAddress, 'adventure-works.com', 'adventure-works.biz')    WHEN EmailPromotion = 2      THEN REPLACE(EmailAddress, 'adventure-works.com', 'adventure-works.org')   END

Select the AdventureWorks database from the Available Databases drop-down list box, and then click Execute. Note the amount of time it takes for the execution to complete. On a test laptop, it took only one second to complete. You can easily see how the cursor is not the most efficient alternative. Not only does it take more code to implement, but it takes 18 times longer to execute.

To reverse the update results and change the data back to its original value, execute the following query in a new query window:

 UPDATE Person.Contact  SET EmailAddress =   CASE     WHEN EmailPromotion = 0      THEN REPLACE(EmailAddress, 'adventure-works.net', 'adventure-works.com')     WHEN EmailPromotion = 1      THEN REPLACE(EmailAddress, 'adventure-works.biz', 'adventure-works.com')    WHEN EmailPromotion = 2      THEN REPLACE(EmailAddress, 'adventure-works.org', 'adventure-works.com')   END