As stated at the beginning of this chapter, using stored procedures can provide a number of benefits to your SQL Server applications. One performance benefit is reduced network traffic by minimizing the number of round trips between client applications and SQL Server. Stored procedures can consist of many individual SQL statements but can be executed with a single statement. This allows you to reduce the number and size of calls from the client to the server. If you have to take different actions based on your data values, you can make these decisions directly in the procedure, avoiding the need to send data back to the application to determine what to do with the data values.
By default, SQL Server sends a message back to the client application after each statement is completed within the stored procedure to indicate the number of rows affected by the statement. To further reduce the amount of "chatter" between the client and server, and as a result, to further improve stored procedure performance, these DONE_IN_PROC messages can be eliminated by issuing the set nocount on command at the beginning of the stored procedure. Be aware that if you turn this option on, the number of rows affected by the commands in the procedure will not be available to the ODBC SQLRowCount function or its OLE DB equivalent. You can still issue the select @@rowcount after a statement executes to determine the number of rows affected.
Another performance benefit of using stored procedures is potentially faster execution due to the caching of stored procedure query plans. Stored procedure query plans are kept in cache memory after the first execution. The code doesn't have to be reparsed and reoptimized on subsequent executions.
Query Plan Caching
When a batch of SQL statements is submitted to SQL Server, SQL Server performs a number of steps before the data can be returned to the client. These steps include the following:
The first time that a stored procedure executes, SQL Server loads the SQL code for the stored procedure from the syscomments table into the procedure code and optimizes and compiles an execution plan. The first step, parsing, is skipped on the first execution because the SQL was already parsed and the query tree built when the stored procedure was created.
The optimization of SQL statements is based on the parameters passed, the index distribution statistics, the number of rows in each table, and other information available at the time of the first execution. The compiled plan is then saved in cache memory. For subsequent executions, all SQL Server has to do is find the plan in cache and execute it, essentially skipping steps 1 and 2. Parsing and compilation always add some overhead, and depending on the complexity of the stored procedure code, they can sometimes be as expensive as the actual execution. Just by skipping these two steps, you can achieve a performance gain by using stored procedures.
In versions of SQL Server prior to 7.0, a separate memory area was reserved for the procedure cache. As you set aside more memory for the procedure cache, less memory was available for the data cache. Tuning the size of the procedure cache appropriately was an inexact science and often required a bit of trial and error before it was tuned properly. Often, the procedure cache was left at the default value (30 percent of available cache memory), which in many large memory configurations was unnecessarily high.
In version 7.0, the data and procedure cache were unified. SQL Server now uses the same cache area for storing data and index pages as well as procedure query plans. Also, SQL Server Versions 7.0 and later have the ability to keep query plans in cache for ad hoc queries as well. This means that even dynamic SQL queries might be able to reuse a cached execution plan and skip recompilation.
If an ad hoc query is cheap to compile, SQL Server will typically not keep the query plan in memoryit is cheaper to recompile the plan than to keep it around and waste valuable cache memory when it might never be needed again. Also, the query plans for ad hoc queries have the lowest priority in the procedure cache. If cache space is needed, they are the first to go.
With the ability to keep query plans for ad hoc queries in memory, it is not as critical in SQL Server 2000 for applications to use stored procedures to achieve performance benefits of using precompiled plans. However, when and how the plans are stored and reused for ad hoc queries is not nearly as predictable as with stored procedures. The query plans for stored procedures will remain in cache memory more persistently. In addition, you have little explicit control over the recompilation of ad hoc queries.
Procedure plans were not re-entrant in versions of SQL Server prior to 7.0. If two users executed a procedure at the same time, two execution plans were created and stored in procedure cache memory. This sometimes led to multiple copies of a query plan for a stored procedure residing in memory. This resulted in suboptimal usage of the cache area, an environment that was harder to tune, and unpredictable execution times (because two plans could have different access strategies).
Because query plans in SQL Server 2000 are re-entrant, it's typical that no more than one copy of an execution plan for a stored procedure is in cache memory. However, sometimes multiple query plans can be created and exist in procedure cache at the same time. One of the more likely causes is when users run the same procedure with different settings for specific session options. The following list shows the options that will result in different query plans if set differently from the settings for a query plan already in memory:
In addition, if the owner name is not specified for a table in a query and SQL Server has to implicitly resolve the owner name , other users cannot reuse the plan. This is because depending on who the user is at the time of execution, the table being referenced could be different. For example, if Joe owns a titles table in addition to one owned by the dbo , when he runs select * from titles , it references his version of the titles table. If Tom executes the same query, it references the dbo 's version of the table. To avoid ambiguity when referencing an object and to help ensure reuse of the query plan, fully qualify the table with the owner name as follows:
select * from dbo.titles
How does SQL Server know what plans are currently in memory and what settings were in effect when they were created? This information is contained in the syscacheobjects table in the master database. syscacheobjects keeps track of all the currently compiled plans in the procedure cache. The key columns to focus on when evaluating stored procedure recompilation are as follows:
For each procedure, you'll typically see a Compiled Plan and at least one Executable Plan listed in the cacheobjtype column. The compiled plan is the actual plan generated and used that can be shared by sessions running the same procedure. The executable plan will be generated for each concurrent execution of a compiled plan. It keeps track of the execution environment in which the plan was run. Each executable plan must be associated with a compiled plan, but not all compiled plans will have an associated executable plan.
If you see multiple compiled plans in cache for a stored procedure ( sort the results by dbid and objid or by sql to have them listed together), look at the other columns to determine the reason. Look for differences in the langid or dateformat columns, which would indicate that the session was running under a different language or using a different date format. If the bitmap values for setopts are different, one of the key session options described previously in this section was set differently.
For more information on the syscacheobjects table and how query plans are cached and used in SQL Server, see Chapter 36, "Query Analysis."
Automatic Query Plan Recompilation
Reusing execution plans for stored procedures provides a performance advantage over ad hoc SQL commands. However, stored procedures will recompile new query plans in the following circumstances:
In addition to these reasons, SQL Server Versions 7.0 and later introduced other events that can cause stored procedures to recompile new query plans:
Monitoring Stored Procedure Recompilation
You can monitor when stored procedures are automatically recompiled using SQL Profiler. The two events you want to monitor are located in the Stored Procedure category and are called SP:StmtStarting and SP:Recompile (see Figure 28.10).
Figure 28.10. Adding events in SQL Profiler to monitor stored procedure recompilation.
If a stored procedure is automatically recompiled during execution, SQL Profiler will display a SP:Recompile event and an SP:StmtStarting event because the statement that caused the recompile will be displayed before and after the recompile event. For example, you can create the following stored procedure to create and populate a temporary table:
create proc recomp_test as create table #titles (title_id varchar(6), title varchar(80), pubdate datetime) insert #titles select title_id, title, pubdate from titles select * from #titles where pubdate > '10/1/1991'
If you turn on SQL Profiler and then execute the procedure, you will capture the events as shown in Figure 28.11.
Figure 28.11. Recompile event captured for a stored procedure in SQL Profiler.
The key statement that causes the recompile is the insertion of rows into the temporary table. You can see this because SQL Profiler displays the statement starting event followed by a recompile event and the insert statement starting again. These statements are listed in Table 28.6.
Table 28.6. SQL Profiler Events for an Automatic Recompile
For more information on using SQL Profiler to monitor SQL Server performance, see Chapter 7, "Using the SQL Server Profiler."
Minimizing Stored Procedure Recompilation
Repeated recompiles during stored procedure execution can lead to less than optimal performance of stored procedures as a result of the overhead of the recompiles. To minimize recompiles due to row modifications in a table, write your stored procedures to meet the following guidelines:
If recompiles are occurring because of interleaved DDL and DML commands, rewrite the stored procedure to move all DML commands to the top of the procedure. When DDL operations (create table, create index, and so on) are performed in a stored procedure, a recompile will occur when the first DML statement (select, insert, update, or delete) is encountered on the table affected by the DDL.
For example, rewrite the previous stored procedure as follows:
create proc recomp_test as create table #titles (title_id varchar(6), title varchar(80), pubdate datetime) insert #titles select title_id, title, pubdate from titles create index idx1 on #titles (pubdate) select * from #titles where pubdate > '10/1/1991' return
Table 28.7 shows the events captured by SQL Profiler, showing two recompiles for the procedure.
Table 28.7. SQL Profiler Events for an Automatic Recompile
These recompiles occur on every execution of the stored procedure. To minimize the recompiles, perform all DDL statements at the beginning of the procedure, if possible, before performing any DML on the affected tables, as shown in the following example:
create proc recomp_test as create table #titles (title_id varchar(6), title varchar(80), pubdate datetime) create index idx1 on #titles (pubdate) insert #titles select title_id, title, pubdate from titles select * from #titles where pubdate > '10/1/1991' return
The first execution of this version of the stored procedure will recompile the insert and select, but the execution plan will be reused for subsequent executions.
To minimize the recompiles due to usage of temporary tables in stored procedures, follow these guidelines:
By default, SQL Server will generate new query plans for a stored procedure whenever a referenced temporary table created in the stored procedure is modified more than six times. In the event that a temporary table is modified frequently, but not in a significant manner that would affect the query plan chosen , consider using the KEEP PLAN option in SELECT statements that reference the temp table to reduce the number of recompilations. This will not prevent recompilation completely, but it will follow a more relaxed recompilation strategy similar to recompilations that occur on regular tables due to row modifications.
Forcing Recompilation of Query Plans
In some situations, a stored procedure might generate different query plans depending on the parameters passed in. At times, depending on the type of query and the parameter values passed in, it can be difficult to predict the best query plan for all executions. Consider the following stored procedure:
create proc advance_range (@low money, @high money) as select * from dbo.titles where advance between @low and @high return
Assume that a nonclustered index exists on the advance column in the titles table. A search where the advance is between 1,000 and 2,000 might be highly selective, and the index statistics might indicate that less than 5 percent of the rows fall within that range, and thus an index would be the best way to find the rows. If those were the values passed on the first execution, the cached query plan would indicate that the index should be used.
Suppose, however, that if on a subsequent execution, search values of 5,000 and 10,0000 were specified. These values match against 90 percent of the rows in the table, and if optimized normally, SQL Server would likely use a table scan because it would have to visit almost all rows in the table anyway. Without recompiling, however, it would use the index as specified in the cached query plan, which would be a suboptimal query plan because it would likely be accessing more pages using the index than a table scan would.
When a lot of variance exists in the distribution of data values in a table or in the range of values passed as parameters, you might want to force the stored procedure to recompile and build a new query plan during execution and not use a previously cached plan. Although you will incur the overhead of compiling a new query plan for each execution, it typically will be much less expensive than executing the wrong query plan.
You can force recompiling the query plan for a stored procedure by specifying the WITH RECOMPILE option when creating or executing a stored procedure. Including the WITH RECOMPILE option in the create procedure command will cause the procedure to generate a new query plan for each execution.
create proc advance_range (@low money, @high money) WITH RECOMPILE as select * from dbo.titles where advance between @low and @high return
If the procedure is not created with the WITH RECOMPILE option, you can generate a new query plan for a specific execution by including the WITH RECOMPILE option in the EXEC statement:
exec advance_range 5000, 10000 WITH RECOMPILE
Because of the performance overhead of recompiling query plans, try to avoid using WITH RECOMPILE whenever possible. One approach is to create different subprocedures and execute the appropriate one based on the passed-in parameters. For example, have a subprocedure to handle small range retrievals that would benefit from an index, and a different subprocedure to handle large range retrievals. The queries in each procedure are identicalthe only difference is in the parameters passed to them. This is controlled in the top level procedure. An example of this approach is demonstrated in Listing 28.25.
Listing 28.25 Using Multiple Stored Procedures As an Alternative to Using WITH RECOMPILE
create proc get_orders_smallrange (@lowdate datetime, @highdate datetime) as select * from orders where saledate between @lowdate and @highdate return go create proc get_orders_bigrange (@lowdate datetime, @highdate datetime) as select * from orders where saledate between @lowdate and @highdate return go create proc range_value (@lowdate datetime, @highdate datetime) as if datediff(hh, @highdate, @lowdate) >= 12 -- if the date range is 12 hours or more, execute the bigrange procedure exec get_orders_bigrange @lowdate, @highdate else -- execute the small range procedure exec get_orders_smallrange @lowdate, @highdate
Obviously, this solution would require substantial knowledge of the distribution of data in the table and where the threshold is on the range of search values that results in different query plans. Another approach that is simpler to implement is to execute the query dynamically in an EXEC statement or by using sp_executesql . The specific query will recompile for each execution, but the main procedure containing the query will likely not need to be recompiled.
Another type of stored procedure that can sometimes generate different query plans based on initial parameters is the multipurpose procedure, usually performing different actions based on conditional branching, as in the following example:
create proc get_order_data (@flag tinyint, @value int) as if @flag = 1 select * from orders where price = @value else select * from orders where qty = @value
At query compile time, the optimizer doesn't know which branch will be followed because the if ... else construct isn't evaluated until runtime. On the first execution of the procedure, the optimizer generates a query plan for all select statements in the stored procedure, regardless of the conditional branching, based on the parameters passed in on the first execution. A value passed into the parameter intended to be used for searches against a specific table or column (in this example, price versus qty ) might not be representative of normal values to search against another table or column.
Again, a better approach would be to break the different select statements into separate subprocedures and execute the appropriate stored procedure for the type of query to be executed:
create proc get_order_data_by_price (@value int) as select * from orders where price = @value create proc get_order_data_by_qty (@value int) as select * from orders where qty = @value create proc get_order_data (@flag tinyint, @value int) as if @flag = 1 exec get_order_data_by_price @value else exec get_order_data_by_qty @value
In versions of SQL Server prior to 7.0, you used the sp_recompile system stored procedure when you wanted to force all stored procedures that referenced a specific table to generate a new query plan upon the next execution. This was necessary if you had added new indexes to a table or had run UPDATE STATISTICS on the table. However, the usefulness of this command in SQL Server 2000 is questionable because new query plans are generated automatically whenever new indexes are created or statistics are updated on a referenced table. It appears that sp_recompile is available primarily for backward compatibility or for those times when you want the recompilations to occur explicitly for all procedures referencing a specific table.