Whenever SQL Server executes a query, it first has to determine the best way to execute it. This decision involves deciding how and in which order to access and join the data, how and when to perform calculation and aggregations, and so on. This is done by a subsystem called the Query Optimizer. The Query Optimizer uses statistics about data distribution, the metadata relating to the database objects involved, index information, and other factors to calculate multiple possible query plans. For each of these plans, it estimates the cost based on statistics about the data and chooses the plan with the minimal cost for execution. Of course, SQL Server does not calculate all possible plans for every query because, for some queries this calculation might itself take longer than executing the least efficient query plan. Therefore, SQL Server has complex algorithms to find an execution plan with reasonable costs close to the minimum possible costs. After the execution plan is generated it is stored in a buffer cache (where SQL Server allocates most of its virtual memory). It is then executed in the way the execution plan instructs the database engine.
Execution plans in a buffer cache can be reused when the same or a similar query is executed. Therefore, execution plans are kept in the cache as long as possible. To learn more about execution plan caching, see the whitepaper titled: "Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005" at http://www.microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx.
Whether or not SQL Server Query Optimizer can produce an efficient plan for a given query depends on the following factors:
Indexes Like an index in a book, a database index provides the ability to quickly find specific rows in tables. Many indexes can exist on each table. With indexes on tables, SQL Server Query Optimizer can optimize data access by finding the right indexes to use. Without an index, the Query Optimizer has only one choice, which is to scan all the data in the table to find the right data rows. Later in this chapter, you will learn how indexes work and how to design and create them.
Data distribution statistics: SQL Server keeps statistics about data distribution. If these statistics are missing or out of date, the Query Optimizer will not be able to calculate efficient query plans. In most cases, statistics are generated and updated automatically. You will see later in this chapter how this generation works and how the statistics can be managed.
As you can see, execution plan generation is a crucial function for SQL Server performance since the effectiveness of a query plan determines whether it completes within milliseconds, seconds, or even minutes. Query execution plans of poorly performing queries can be analyzed to determine whether indexes are missing, statistics are out of date or missing, or SQL Server has chosen an inefficient plan (this doesn't occur very often).
Of course, it's also possible that a poorly performing query was executed with a good query plan. In these cases, query optimization is not the problem. The problem most likely lies in another area, like the query design, contention, disk I/O (input/output), memory, CPU, the network, and so on. To learn more about these kinds of problems you can read the whitepaper "Troubleshooting Performance Problems in SQL Server 2005" which can be found at http://www.microsoft.com/technet/prodtechnol/sql/2005/tsprfprb.mspx.
Viewing Query Execution Plans
From the Start menu, select All Programs | Microsoft SQL Server 2005 | SQL Server Management Studio. Click the New Query button to open a New Query window and change the database context to AdventureWorks by selecting it from the Available Databases dropdown menu.
Execute the following SELECT statement. The code for this example is included in the sample files as Viewing Query Plans.sql.
SELECT SalesOrderID, OrderQTY FROM Sales.SalesOrderDetail WHERE ProductID = 712 ORDER BY OrderQTY DESC
To show the execution plan of this query, press Ctrl+L or choose Display Estimated Execution Plan from the Query menu. The execution plan is shown below. The estimated execution plan is produced without actually executing the query. The query is optimized by Query Optimizer, but not executed. This facet of Query Optimizer is advantageous when dealing with queries that have long runtimes because it is not necessary to wait until the query completes to see the query plan. The graphical query plan has to be read from right to left and from top to bottom. Every icon represents one operator in the plan, and the data exchanged between these operators is represented by arrows. The thickness of the arrows represents the amount of data that flows between the operators. We won't go into detail explaining every available operator, but the ones you see in this query plan are:
We will discuss the most important operators that SQL Server uses when you learn about indexes and joins. A complete list of operators can be found in the SQL Server Books Online topic "Graphical Execution Plan Icons."
SQL Server accesses the data using a Clustered Index Scan. This scan is the actual data access operation and will be discussed in more detail later.
The data comes to the Sort operator, which sorts the data based on the ORDER BY clause of the statement.
The data is sent to the client.
The cost percentage under each operator icon shows the percentage of the total cost of this query that the operation represents. This number can give you a good idea of which operation uses the most resources to execute. In our case, the Clustered Index Scan is the most costly operation, accounting for 89% of the total cost of the query.
Let the mouse pointer hover over the Clustered Index Scan operator. You will see a yellow window appear, as in the graphic below.
This window provides you with detailed information about the operation. Until now, you only knew that SQL Server was retrieving data using a scan operation. But in this window, you can see that it performs a Clustered Index Scan operation (which will be explained in detail later) on the clustered index of Sales.SalesOrderDetail table and that it searches for ProductID 712. This information is found in the Predicates section. Also, the estimated costs and the estimated number of rows and row size is shown. While the number of rows are estimated based on the statistics SQL Server has for this table, the cost numbers are found based on statistics and on numbers from a reference system. Therefore, cost numbers cannot be used to calculate how long a query will take on a computer. They can only be used to distinguish if one operation is cheaper or more expensive than another.
This operator information can also be seen in the Properties window of SQL Server Management Studio. To open the Properties window, right-click one of the operator icons and select Properties from the context menu.
Query plans can also be saved. To save a query plan, right-click the plan and choose Save Execution Plan As from the context menu. The plan will be saved using an XML format with the extension of .sqlplan. It can be opened with SQL Server Management Studio by choosing Open | File from the File menu.
What you have seen up to now is the estimated execution plan of a query, but it is also possible to show the actual execution plan of a query. The actual execution plan is similar to the estimated execution plan, but it also includes actual numbers (not estimations) for the number of rows, the number of rewinds, etc. To include the actual execution plan in the query, press Ctrl+M or select Include Actual Execution Plan from the Query menu. Then press F5 to execute the query. The query results are displayed as usual, but you will also see the execution plan displayed on the Execution Plan tab.