Rebuilding Indexes | Microsoft SQL Server 2000 Administrators Companion

3 4

SQL Server keeps statistics on each index, which describe its uniqueness, or selectivity, and the distribution of the index key values. SQL Server Query Optimizer then uses these statistics to determine which index to use, if any, to best satisfy a particular query. Index statistics are periodically updated by default. However, indexes sometimes become fragmented over a long period of time because of page splits, which physically scatter index pages in the database. As a result, performance worsens. An index can also become off balance, meaning that one portion of the tree has more full index pages than another portion of the tree. You can restore balance and continuity by rebuilding the index. Also, index statistics are re-created when the index is rebuilt. But do not drop and re-create an index. Please read on!

NOTE
By default, index statistics are updated, but you can turn this feature on or off by using the stored procedure sp_autostats.

Another problem with indexes that have become fragmented occurs when an index has more levels than are necessary. More index levels means more I/O operations per index lookup. By rebuilding an index, you can reduce the number of levels and thus lower the number of I/O operations required for all index lookups.

One method for rebuilding an index involves removing the index by hand and then building the index again. For a small table, this option might be acceptable. But for medium-size to large-size tables, do not use this method. It is best to use the options described in this section for rebuilding an index, which do not involve dropping and creating the index again. Here are some reasons why this is true. When nonclustered indexes are created on a clustered table, the nonclustered indexes are based on the cluster keys. When the clustered index is dropped, the nonclustered indexes must be re-created because there is no longer a clustered index on the table. If the clustered index is then created again on the table, the nonclustered indexes must also be re-created a second time! Again, if you drop and then re-create the clustered index, you must re-create the nonclustered indexes twice: when the clustered index is dropped and when it is re-created. If you use the other methods to rebuild the clustered index, the nonclustered indexes will be re-created only once.

The two methods for rebuilding an index, without dropping and re-creating it, are using CREATE INDEX…DROP_EXISTING and using DBCC DBREINDEX. Both of these options will rebuild an index in a single step, and SQL Server knows to reorganize an existing index. Using these methods enable you to avoid the dropping and re-creating of nonclustered indexes when you rebuild a clustered index. These one-step methods also take advantage of the sorted order of the data currently in the index; this data will not have to be sorted again.

CREATE INDEX…DROP_EXISTING is used to rebuild only one index at a time on a table. DBCC DBREINDEX is used with a database name and a table name to rebuild all of the indexes on that table, without having to execute separate commands for each index. See Books Online for the syntax and options for these two commands.

Updating Index Statistics

If you don't have the time or resources to re-create indexes, you can update the index statistics independently. This technique is not nearly as efficient as rebuilding an index because the index might be fragmented, which might be more of a problem than outdated statistics. And this is assuming that you have turned off automatic statistic updates in SQL Server. (Otherwise, your statistics are being updated periodically anyway.) You can update index statistics manually by using the UPDATE STATISTICS command. The syntax is shown here:

 UPDATE STATISTICS table_name [ index_name | (statistics_name   [, statistics_name, ...] ] [ WITH [ FULLSCAN | SAMPLE number {PERCENT | ROWS} ] [ ALL | COLUMNS | INDEX ] [ NORECOMPUTE] ]

The values enclosed in brackets are optional. The only required parameter is table_name. The optional parameters are described in Table 17-2.

If your system experiences a large number of inserts, updates, and deletes, you should rebuild the indexes occasionally to avoid the performance degradation mentioned earlier. If you cannot rebuild the indexes, you should at least update the statistics periodically.

Table 17-2. Optional parameters for use with the UPDATE STATISTICS command

Parameter	Description
index_name	Specifies the index to recompute statistics on. By default, statistics for all of the indexes in the table are recomputed. If index_name is specified, only that index's statistics are recomputed.
statistics_name	Allows you to specify which statistics to recompute. If this value is not specified, all of the statistics are recomputed.
FULLSCAN	Specifies that all rows in the table be read for statistics gathering. Using this parameter is by far the best way to gather statistics, but it is also the most costly in terms of system resources and time.
SAMPLE number PERCENT \| ROWS	Specifies either the number or the percentage of rows on which the statistics are based. By default, SQL Server determines the number of rows to sample. This option cannot be used with the FULLSCAN option.
ALL \| COLUMN \| INDEX	Specifies whether all statistics, column statistics, or just index statistics are gathered.
NORECOMPUTE	Specifies that statistics are not automatically recomputed in the future. To reestablish automatic statistics recomputation, run the com-mand again without the NORECOMPUTE option, or run the sp_autostats stored procedure.