Optimizing T-SQL Statements | Microsoft SQL Server 2000 Administrators Companion

3 4

In the previous sections, you learned how to view the T-SQL statements running in the system by using Profiler and how to determine the execution plan and data access method by using Query Analyzer. You also were presented with a simple example of modifying a T-SQL statement in order to improve its performance. Armed with these tools, you now can modify many types of T-SQL statements to make them perform better. In this section, you'll learn about several ways that T-SQL statements can be optimized to provide better performance or to use fewer system resources.

Optimizing the Execution Plan

Modifying the execution plan can be difficult, and creating a better execution plan than the query optimizer creates can be even more difficult. The operations most likely to benefit from changes in the execution plan are Join, Group by, order by, and union. You can easily modify these operations by trying various hints and viewing the results, as described in the section "Using Hints" later in this chapter. By changing the hint and viewing the output of Query Analyzer, you might find that you have achieved a more efficient operation.

Unfortunately, no magic formula exists for tuning T-SQL statements for better performance. Because each database is unique and each application is different, modifications must be made on a case-by-case basis.

Choosing a Data Access Method

As mentioned earlier in this chapter, data access methods are essentially the objects SQL Server uses in retrieving data from the database. By analyzing your database and the data it contains, you might be able to optimize the data access method, with the goal of reducing the number of I/O operations.

Modifications to the data access method, like modifications to the execution plan, must be made on a case-by-case basis. The following guidelines will help you choose the data access method that performs best:

Use the best index. Using the best index for an operation is necessary to achieve the best performance possible. The best index for a particular operation is the one that finds the data fastest and uses the fewest I/O operations. You can determine the best index either by using your intimate knowledge of your database and its data or by using Query Analyzer. Query Analyzer lets you try various scenarios to determine which index will retrieve the least number of rows. (Remember that Query Analyzer simply estimates the number of rows that will be returned; to determine the exact number of rows, you must use Profiler.)

NOTE
As mentioned in Chapter 17, indexes are a great benefit to SQL Server but can adversely affect performance if used incorrectly. Monitor the number of indexes that exist per table, especially if many INSERT, UPDATE, and DELETE statement operations are performed. Too many indexes will result in slower performance for these types of operations because additional overhead is needed to modify the indexes.

Use covering indexes. As mentioned in Chapter 17, the use of covering indexes might help you eliminate an I/O operation in the data retrieval process. Instead of having to access the underlying table, you might be able to retrieve the needed data from the index itself.
Reduce the number of rows returned. Determine whether all the data that is returned from queries is actually needed. Modify your T-SQL statements so that they access only the data that is necessary. Do not retrieve rows that will be discarded later. Reducing the number of rows retrieved from the database can be achieved by increasing the selectivity of the query.

Using Hints

You can change the data access method and the execution plan by modifying the T-SQL statement itself, but if you are not careful, this technique can change the functionality of the T-SQL statement. A safer method of optimizing T-SQL statements is using hints. Hints let you specify which operations you want the query optimizer to perform and which objects you want it to use. In this section, you'll learn about the various SQL Server hints and how to use them.

Join Hints

Join hints are used to specify which types of join operations the query optimizer should perform. (If no type is specified in a query, the query optimizer selects one.) With SQL Server, you can perform nested loops joins, hash joins, merge joins, and remote joins. You specify a method of joining by using the following hints:

LOOP Specifies a nested loops join. In a nested loops join, for each row in the outer table, every row in the inner table is checked to see whether the values of specified fields are equal.
HASH Specifies a hash join. In a hash join, one table is reorganized as a hash table. The other table is scanned one row at a time, and the hash function is used to search for equalities.
MERGE Specifies a sort merge join. In a sort merge join, each table is sorted, and then one row at a time from each table is compared with the corresponding row in descending order.
REMOTE Specifies a remote join. A remote join is a join in which at least one participating table is remote.

Let's look at an example of using a join hint. We'll use the example from the "Viewing the Plan for a Join Operation" section earlier in this chapter, and we'll specify a hash join, as follows:

 SELECT OrderID, CustomerID, Employees.EmployeeID, FirstName, LastName, OrderDate FROM Orders, Employees WHERE Orders.EmployeeID = Employees.EmployeeID OPTION (HASH JOIN)

NOTE
Join hints are mutually exclusive—only one can be used at a time.

If you choose to use the SQL-92 syntax for joins, you can specify the type of join by using a hint also. Using the SQL-92 syntax, you could rewrite the preceding query as follows:

 SELECT OrderID, CustomerID, Employees.EmployeeID, FirstName, LastName, OrderDate FROM Orders INNER HASH JOIN Employees ON (Orders.EmployeeID = Employees.EmployeeID)

The topic of join hints is an advanced topic, and there are no rules of thumb concerning when to use join hints. There are many reasons for choosing a particular join operation, such as predicates, the size of each table in the join, and how many tables are being joined. The best way to determine whether changing the join operation will provide better performance is to try each type in Query Analyzer and see which offers the lowest cost. However, Query Optimizer will usually choose the best join operation for you.

Query Hints

Query hints are used to specify how particular query operations should be performed. Query hints are split into three categories: GROUP BY, UNION, and miscellaneous.

GROUP BY Hints The following hints specify how GROUP BY or COMPUTE operations should be performed:

HASH GROUP BY Specifies that a hashing function be used to perform the GROUP BY operation
ORDER GROUP BY Specifies that a sorting operation be used to perform the GROUP BY operation

Using the GROUP BY example we looked at in the "Viewing the Plan for an Aggregate Operation" section earlier in this chapter, we can specify how a hash GROUP BY operation should be performed by using a hint, as follows:

 SELECT CustomerID, SUM(OrderDetails.UnitPrice) FROM Orders, OrderDetails HASH GROUP BY CustomerID OPTION (HASH GROUP)

NOTE
GROUP BY hints are mutually exclusive—only one can be used at a time.

UNION Hints The following hints are used to specify how union operations should be performed:

MERGE UNION A MERGE operation is used to perform the UNION operation.
HASH UNION A hashing function is used to perform the UNION operation.
CONCAT UNION A concatenation function is used to perform the UNION operation.

Here is an example of using a CONCAT UNION hint:

 SELECT OrderID, CustomerID, EmployeeID, OrderDate FROM orders WHERE CustomerID = 'TOMSP' UNION SELECT OrderID, CustomerID, EmployeeID, OrderDate FROM orders WHERE EmployeeID = '4' OPTION (CONCAT UNION)

NOTE
UNION hints are mutually exclusive.

Unfortunately, there is no secret formula that you can use to determine which UNION operation will perform better in your situation. Again, the best route is to use Query Analyzer to try various UNION hints and to see which offers the best cost. Usually, the SQL Server query optimizer will determine the best strategy for UNION operations.

Miscellaneous Hints The following hints can be used to perform various query operations:

FORCE ORDER Forces tables to be accessed in the order in which the tables appear in the query. By default, SQL Server can reorder the table accesses.
ROBUST PLAN Forces the query optimizer to be prepared for the maximum potential row size. Here is an example of using this hint:

 SELECT OrderID, CustomerID, Employees.EmployeeID, FirstName, LastName, OrderDate FROM Orders, Employees WHERE Orders.EmployeeID = Employees.EmployeeID OPTION (ROBUST PLAN)

Table Hints

Table hints are used to control the way in which tables are accessed. Two table hints are described here:

FAST n Replaces FASTFIRSTROWS, which is retained for backward compatibility. Using this hint instructs SQL Server to optimize the retrieval of the first n rows of data.
INDEX = index_name Forces the query optimizer to use the specified index when possible. One of the first examples in this chapter demonstrated how to use an INDEX hint. This example is repeated here:

 SELECT OrderID, CustomerID, EmployeeID, OrderDate     FROM orders WITH (INDEX = EmployeeID)    WHERE EmployeeID = 5    OPTION (FAST 10)

The WITH qualifier is optional.

The INDEX = EmployeeID hint specifies that the EmployeeID index be used. If you specify FAST 10, SQL Server will optimize the retrieval of the first 10 rows (if possible) and then return the rest of the rows.