One of the common techniques for troubleshooting a performance issue is to use SQL Profiler to obtain the top n worse-performing queries. You can do this by looking at the Duration, CPU, Reads, and Writes columns in Profiler. After you identify such queries, some of the techniques you can use to tune the workload include generating and analyzing graphical and text query execution plans, obtaining statistics (such as STATISTICS PROFILE, STATISTICS IO, and STATISTICS TIME), changing the queries to effectively use the indexes and produce a better execution plan, providing query hints to the optimizer, rewriting stored procedures to avoid the use of cursors, avoiding recompilations, and so on. You can continue to use these techniques in SQL Server 2005. The following sections describe some new features introduced in this release that you can use for query performance tuning. Plan GuidesAs mentioned earlier in this chapter, one of the techniques you can use to improve query performance is to use the OPTION clause to specify query hints such as MAXDOP, LOOP JOIN, MERGE JOIN, and KEEP PLAN. As a general rule, you should avoid using optimizer hints because they overrule the decision-making process of the optimizer. The SQL Server 2005 query optimizer is smart enough to generate highly efficient execution plans, and therefore it is recommended that hints be used only as a last resort and only by experienced developers and DBAs. Let's assume that you have a third-party vendor application that you are trying to tune and optimize. You used SQL Profiler to obtain some of the queries that are not performing well. After you identified such queries, you tried to tune the queries by using Management Studio, providing query hints and analyzing the query execution plans. Now assume that you are sure that one or more query hints can optimize the query or fix a problem. But what if you cannot change the query? What if the query is hard-coded into the application and there is no way to change the query to provide optimizer hints? Enter plan guides. In the Standard and Enterprise Editions of SQL Server 2005, you can run the sp_create_plan_guide system stored procedure in order to create plan guides to optimize the performance of queries by providing hints at runtime to queries when it is not possible or desirable to change the text of the query directly. In other words, if you can create a plan guide for a query or a type of query and specify the query hints while creating a plan guide, whenever such queries are executed, SQL Server attaches the query hints to them. This way, you don't have to change the original query text, but you can still provide hints to the optimizer whenever such queries are executed. For plan guides to work, the query text executed on the server and specified with sp_create_plan_guide must match exactly character-by-character. If there is a difference of even a single character, the plan guide will not work, and the hints will not be attached with the query. It is important to note that plan guides are not just for tuning third-party applications where you don't have access to source code to change the query. Even in scenarios where you have access to source code and could possibly change the query, it is recommended that you create a plan guide for the query instead of directly specifying the optimizer hints with the query. You should consider plan guides a recommended approach whenever you have to do optimizer hinting. This allows you to change the query behavior without changing the application and also simplifies application maintenance and future database upgrades. Here is the T-SQL syntax for the sp_create_plan_guide stored procedure: sp_create_plan_guide [ @name = ] N'plan_guide_name', [ @stmt = ] N'statement_text', [ @type = ] N' { OBJECT | SQL | TEMPLATE }', [ @module_or_batch = ] { N'[ schema_name.]object_name' | N'batch_text' | NULL }, [ @params = ] { N'@parameter_name data_type [,...n ]' | NULL }, [ @hints = ] { N'OPTION ( query_hint [,...n ] )' | NULL }; The @type parameter can have one of following three values:
Plan guides are database scoped, and you can use the sys.plan_guides catalog view to obtain a list of plan guides created in the current database. You can enable, disable, or drop plan guides by using the sp_control_plan_guide system stored procedure. Note Plan guides cannot be created against stored procedures, functions, or DML triggers that specify the WITH ENCRYPTION clause. Attempting to drop or alter a function, stored procedure, or DML trigger referenced by a plan guide, either enabled or disabled, results in an error. Attempting to drop a table with a trigger defined on it that is referenced by a plan guide also results in an error. Let's look at an example of a plan guide for attaching MAXDOP and LOOP JOIN optimizer hints with a SQL statement at runtime. First, you need to start SQL Server Management Studio and connect to a SQL Server 2005 instance by using the Query Editor. Then you should click the" Include Actual Execution Plan" toolbar button and run the following TSQL statements: SET NOCOUNT ON; USE AdventureWorks; GO SELECT h.SalesOrderID, h.OrderDate, h.DueDate, h.ShipDate, h.Status, h.SalesOrderNumber, c.FirstName, c.LastName, h.TotalDue, a.City, a.PostalCode FROM Sales.SalesOrderHeader h JOIN Person.Contact c ON c.ContactID = h.CustomerID LEFT OUTER JOIN Person.Address AS [a] ON [a].AddressID = (SELECT TOP 1 AddressID FROM Sales.CustomerAddress WHERE CustomerID = h.CustomerID) ORDER BY c.LastName, c.FirstName; GO Figure 9.4 shows the execution plan for above query. It shows that the query utilizes multiple processors on the server to execute the query in parallel. The query processor also uses hash match (inner join) operator. Figure 9.4. The execution plan for a sample query shows that the query is using parallelism and the hash match operator.Let's assume that you want to disable parallelism and that you want the optimizer to use a loop join operator for the preceding query. You run the following T-SQL statement to create a plan guide to provide MAXDOP 1 and LOOP JOIN hints for the query: IF EXISTS(SELECT * FROM sys.plan_guides WHERE name = 'planguideTest') EXEC sp_control_plan_guide N'DROP', N'planguideTest'; GO EXEC sp_create_plan_guide @name = N'planguideTest', @stmt = N' SELECT h.SalesOrderID, h.OrderDate, h.DueDate, h.ShipDate, h.Status, h.SalesOrderNumber, c.FirstName, c.LastName, h.TotalDue, a.City, a.PostalCode FROM Sales.SalesOrderHeader h JOIN Person.Contact c ON c.ContactID = h.CustomerID LEFT OUTER JOIN Person.Address AS [a] ON [a].AddressID = (SELECT TOP 1 AddressID FROM Sales.CustomerAddress WHERE CustomerID = h.CustomerID) ORDER BY c.LastName, c.FirstName;', @type = N'SQL', @module_or_batch = NULL, @params = NULL, @hints = N'OPTION (LOOP JOIN, MAXDOP 1)'; GO SELECT * FROM sys.plan_guides; GO This T-SQL script uses sys.plan_guides to check whether the named plan guide already exists. If it does, the script uses sp_control_plan_guide to drop the plan guide. Next, it creates a plan guide by using the sp_create_plan_guide system stored procedure. Now if you run the same SELECT statement again, you should notice that at runtime, SQL Server attaches MAXDOP 1 and LOOP JOIN operator hints with the query. You can see this by looking at an execution plan generated for exactly the same query (see Figure 9.5). Figure 9.5. After you create a plan guide, the same query generates a different execution plan with no parallelism and also uses a loop join instead of a hash join.In summary, plan guides are a powerful feature that you can use to address queries targeted for improved or stabilized performance. New Query HintsSQL Server 2005 introduces four new query hints that you can directly specify with the queries (not recommended) or specify by using plan guides (the recommended approach, discussed in the preceding section). These are the new query hints:
New Database OptionsThe previous section discusses a new database option, PARAMETERIZATION, that you can use to choose between simple and forced query parameterization behavior. The following sections discuss some other database options that you can set by using the ALTER DATABASE statement and that you can use to tune databases. The AUTO_UPDATE_STATISTICS_ASYNC OptionBy default, the AUTO_UPDATE_STATISTICS_ASYNC database option is turned off, which means that if a plan at compile time sees that statistics are out-of-date, the query will wait until the statistics are updated so that a better query optimization plan can be generated based on up-to-date statistics. This can result in different response times for the same query. If you turn on this option, and then if a plan at compile time sees that statistics are out-of-date, the query will not wait until the statistics are updated. The query will continue to use the existing statistics and start a new thread to update the statistics in parallel. The next time the query is called, it may use the updated statistics. This results in stable and consistent query response times. The AUTO_UPDATE_STATISTICS option must be ON for the AUTO_UPDATE_STATISTICS_ASYNC option to work. In addition, the sampling formula for statistics update has changed; SQL Server 2005 makes sure that the optimizer samples at least 8MB of pages. You can use the is_auto_update_stats_async_on column in the sys.databases catalog view to find out whether the AUTO_UPDATE_STATISTICS_ASYNC option is turned on or off. The DATE_CORRELATION_OPTIMIZATION OptionBy default, the DATE_CORRELATION_OPTIMIZATION database option is turned off, which means that SQL Server does not maintain any correlation statistics for tables in the database that are linked by a foreign key constraint and have datetime columns. You can turn on this database option to improve the performance of queries that perform an equi-join between two tables whose datetime columns are correlated and that specify a date restriction in the query predicate. For instance, the OrderDate column of the Purchasing.PurchaseOrderHeader table can be considered correlated with the DueDate column of the Purchasing.PurchaseOrderDetail table. If the application workload has queries on these tables, using equi-joins, and uses DueDate or OrderDate columns in the WHERE condition, turning on the DATE_CORRELATION_OPTIMIZATION option can increase the performance of such queries. You can use the is_date_correlation_on column in the sys.databases catalog view to find out whether the DATE_CORRELATION_OPTIMIZATION option is turned on or off. The ENABLE_BROKER OptionSQL Server 2005 introduces a new asynchronous messaging platform for creating scalable, queued, distributed database applications. You can set the ENABLE_BROKER database option for a database to allow Service Broker communication from and to that database. Service Broker is discussed in great detail in Chapter 14. Statement-Level RecompilationThe SQL Server 2005 performance-related enhancements discussed so far in this chapter require you to make the right design decisions and use the right techniques to maximize performance. This section presents one example of enhancements that are built into the SQL Server 2005 database engine to improve query performance without you making any changes to your application. As discussed in Chapter 2, "SQL Server 2005 Database Engine Architectural Changes," in order to optimize query execution performance, SQL Server uses a part of the memory buffer pool to cache the query execution plans. This space is called the plan cache. The execution plan consists of two main components:
When any SQL statement is executed in SQL Server 2005, the relational engine first looks through the plan cache to verify that an existing execution plan for the same SQL statement exists. SQL Server 2005 reuses any existing plan it finds, which saves the overhead of recompiling the SQL statement. If no existing execution plan exists, SQL Server 2005 generates a new execution plan for the query. Certain changes in a database can cause an execution plan to be either inefficient or invalid, given the new state of the database. SQL Server detects the changes that invalidate an execution plan and marks such a plan as invalid. A new plan must then be recompiled for the next connection that executes the query. In SQL Server 2000, whenever a statement within a batch causes recompilation, the entire batch, whether submitted via a stored procedure, a trigger, an ad-hoc batch, or a prepared statement, is recompiled. In SQL Server 2005, only the statement inside the batch that causes recompilation is recompiled. Because of this "statement-level recompilation" feature, SQL Server 2005 spends less CPU time and memory on batch recompilations, and it obtains fewer compile locks, which increases the overall performance. One other benefit of statement-level recompilations is that it is no longer necessary to break a long stored procedure into multiple short stored procedures just to reduce the recompilation penalty of the long stored procedure. You can use a new Profiler event called SQL:StmtRecompile, which is under the TSQL event class, to trace statement-level recompilations. In addition, the existing SP:Recompile event under the Stored Procedures event class has been updated to show what query caused the recompilation in the TexTData column on the SP:Recompile row. The following is an example of statement-level recompilation: IF OBJECT_ID('dbo.tblTest') IS NOT NULL DROP TABLE dbo.tblTest; CREATE TABLE dbo.tblTest (C1 INT IDENTITY(1,1) NOT NULL PRIMARY KEY, C2 INT, C3 VARCHAR(20), C4 CHAR(5)); IF OBJECT_ID('dbo.sprocTest') IS NOT NULL DROP PROCEDURE dbo.sprocTest; GO CREATE PROCEDURE dbo.sprocTest AS BEGIN SELECT * FROM sysobjects WHERE 1 = 2; SELECT * FROM master.dbo.sysprocesses; SELECT * FROM dbo.tblTest WHERE C2 % 10 = 0 ORDER BY C4 DESC; END; GO EXEC dbo.sprocTest; CREATE INDEX idxTest ON dbo.tblTest (C4); EXEC dbo.sprocTest; DROP INDEX tblTest.idxTest; EXEC dbo.sprocTest; This script creates a sample table that is accessed in the stored procedure. After the first time the stored procedure is executed, an index is created on column C4, and the stored procedure is executed again. In SQL Server 2000, the entire stored procedure was recompiled during the second execution, whereas in SQL Server 2005, only the SELECT statement on the tblTest is recompiled. To see this in action, you can start a SQL Server 2000 Profiler trace, select the Stored Procedures:SP Recompile event, and select a few columns, including the TexTData column, from the Data Columns tab. Then you can run the preceding script in Query Analyzer. You should see an SP:Recompile entry in Profiler and note that the Textdata column is empty. You have to select another Profiler event, such as SP:Starting or SP:Completed, to find out which stored procedure is being recompiled. Next, you should start SQL Server 2005 Profiler and select the Stored Procedures:SP Recompile and TSQL:StmtRecompile events. Then you run the preceding script in a query window in Management Studio, and you should notice that Profiler shows that only the SELECT statement is recompiled. Profiler should list SP:Recompile followed by SQL:StmtRecompile, but the Textdata for both the rows is the SELECT statement that caused the recompilation (because the schema for the table being accessed changed). |