Index Usage Criteria | Microsoft SQL Server 2000 Unleashed (2nd Edition)

To effectively determine the appropriate indexes that should be created, you need to determine whether they'll actually be used by the SQL server. If an index isn't being used effectively, it's just wasting space and creating unnecessary overhead during updates.

The main criterion to remember is that SQL Server will not use an index for the more efficient bookmark lookup if at least the first column of the index is not included in a valid search argument (SARG) or join clause. Keep this in mind when choosing the column order for composite indexes. For example, consider the following index on the stores table:

 create index nc1_stores on stores (city, state, zip)

Each of the following queries could use the index because they include the first column, city , of the index as part of the SARG:

 select stor_name from stores     where city = 'Frederick'      and state = 'MD'      and zip = '21702' select stor_name from stores    where city = 'Frederick'      and state = 'MD' select stor_name from stores    where city = 'Frederick'      and zip = '21702'

However, the following queries will not use the index for a bookmark lookup because they don't specify the city column as a SARG:

 select stor_name from stores     where state = 'MD'      and zip = '21702' select stor_name from stores    where zip = '21702'

For the index nc1_stores to be used for a bookmark lookup in the last query, you'd have to reorder the columns so that zip was first ”but then the index wouldn't be useful for any queries specifying only city and/or state . To satisfy all the preceding queries in this case would require additional indexes on the stores table.

NOTE

For the past two preceding queries, if you were to display the execution plan information (how to display execution plans is described in Chapter 36, "Query Analysis"), you might see that the queries actually use the nc1_stores index to retrieve the resultset. However, if you look closely, you'll see the queries are not using the index in the most efficient manner ”it is being used to perform an index scan, rather than an index seek.

Not to get too much into query access methods , as this too is covered in more detail in Chapter 35, "Understanding Query Optimization," an index seek is what we are really after.

In an index seek, SQL Server searches for the specific SARG by walking the index tree from the root level down to the specific row(s) with matching index key values and then uses the bookmark value stored in the index key to directly retrieve the matching row(s) from the data page(s)(the bookmark is either a specific row identifier, or the clustered key value for the row).

For an index scan, SQL Server searches all the rows in the leaf level of the index looking for possible matches. If any are found, it then uses the bookmark to retrieve the data row.

Although both use the index, the index scan is still more expensive in terms of I/O than an index seek but slightly less expensive than a table scan, which is why it is used. However, the goal of this chapter is to learn to design indexes that result in index seeks, and when I talk about queries using an index, index seeks are what I am referring to (except for the section on index covering, but that's a horse of a slightly different color ).

You might think that the easy solution to get bookmark lookups on all possible columns is to index all the columns on a table so that any type of search criteria specified for a query can be helped by an index. This strategy might be somewhat appropriate in a read-only DSS environment supporting ad hoc queries, but not likely because many of the indexes probably still wouldn't even be used. As you'll see in the "Index Selection" section in this chapter, just because an index is defined on a column, it doesn't mean that the optimizer is necessarily always going to use it if the search criteria are not selective enough. Also, creating that many indexes on a large table could take up a significant amount of space in the database, increasing the time required to back up and run dbcc checks on the database. As mentioned earlier as well, too many indexes on a table in an online transaction processing (OLTP) environment can generate a significant amount of overhead during inserts , updates, and deletes and have a detrimental impact on performance.

TIP

A common design mistake I come across in my travels is too many indexes defined on tables in OLTP environments. In many cases, some of the indexes are redundant or are never even considered by the SQL Server optimizer to process the queries used by the applications. These indexes end up simply wasting space and adding unnecessary overhead to data updates.

A case in point was one client that had eight indexes defined on a table, four of which had the same column, which was a unique key, as the first column in the index. That column was included in the WHERE clauses for all queries and updates performed on the table. Only one of those four indexes was ever used.

Hopefully, by the end of this chapter, you'll understand why all these indexes were unnecessary and be able to recognize and determine which columns will benefit from having indexes defined on them and which indexes to avoid.