3 4
Creating indexes is not difficult. You create clustered and nonclustered indexes in much the same way, by using wizards provided with Enterprise Manager or by using the SQL command CREATE INDEX. In this section, you'll learn how to create indexes by using these two methods, and you'll also learn about using the fill factor and how to use stored procedures to create a full-text index.
NOTE
Although the wizards are easy to use, if you will be repeating operations or creating multiple similar databases, you might find scripts more convenient. Scripts allow you to both document and reproduce the build process.
Obviously, if you want to create an index on a table, the table must already exist in the database. You can use the Create Index Wizard to create either a clustered or nonclustered index on a table by following these steps:
Figure 17-9. The Create Index Wizard welcome screen.
All of the indexes that have been created on the Customers table are simple indexes (there is only one column listed), and each has been created on a different column. When the Query Optimizer analyzes a query to choose the query execution plan, it will decide which index to use based on the available indexes and on the predicate in the WHERE clause.
Figure 17-10. The Select Database And Table screen.
Figure 17-11. The Current Index Information screen.
Figure 17-12. The Select Columns screen.
You can select the Make This A Unique Index option to specify that the index will be a unique index rather than a nonunique index. You can also specify the fill factor: optimal or fixed. Because rows in an index are stored in sorted order, SQL Server might have to move data to maintain this order. The fill factor option enables you to specify how full the newly created index should be in order to leave space for future inserts. The default fill factor (what you get if you click Optimal) is 0, which specifies full leaf nodes but free space in the upper index nodes. For a more detailed explanation of fill factor, see the section "Using Fill Factor to Avoid Page Splits" later in this chapter.
Figure 17-13. The Specify Index Options screen.
Figure 17-14. The Completing The Create Index Wizard screen.
The order of a composite index is important. An SQL statement can take advantage of an index only if the leading portion of the index is in the WHERE clause of the statement. Figure 17-15 shows the same screen with the index renamed to CustomerAreaIndex and the columns reordered as Region, CompanyName, and ContactName.
Figure 17-15. Reordering the index columns.
With the index columns in this order, an SQL statement must contain Region in its WHERE clause in order for the index to be used, because Region is the leading column. Of course, the statement could have Region and CompanyName, or even Region, CompanyName, and ContactName in the WHERE clause. If you can use all three values in the WHERE clause, you will achieve the best performance because you will be performing the fewest I/O operations. It does not matter in what order the column names appear in the WHERE clause.
CAUTION
If you are creating a unique index and duplicate values are found in the index key, the index creation process will fail.
Creating an index by using the Create Index Wizard is easy, but the process has some disadvantages. In particular, because the Create Index Wizard does not maintain information about the tasks you perform while using it, you must go through the process outlined in this section every time you want to create another index. If you create an index by using a script file, you can use the file over and over again. Also, if you want to re-create the database, you must go through the Create Index Wizard all over again for each index in the database. Remember though, after indexes are created, you can generate SQL scripts for them by using Enterprise Manager.
By using Transact-SQL (T-SQL) to create an index, you can script the command and run it over and over again. You can also modify the index creation script to create other indexes. In addition, this method of creating an index offers you more flexibility because more parameters are available to you. To use this technique to create an index, simply put the T-SQL commands in a file and read the file into OSQL by using the following syntax:
Osql _Uusername _Ppassword < create_index.sql
This command assumes that the file you created is named create_index.sql. You could also execute the script by using Query Analyzer. (See Chapter 13 for more information about this process.)
To create an index by using T-SQL, you must use the CREATE INDEX command. The syntax for using CREATE INDEX is shown here:
CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX index_name ON table_name ( column_name [, column_name, column_name, … ] ) [ WITH options ] [ ON filegroup_name ]
The values within the brackets are optional. You can create the index as unique or nonunique, as clustered or nonclustered, with one or more columns, and with the optional parameters listed in Table 17-1. You can also optionally specify a filegroup in which to place the index.
Table 17-1. Optional parameters for use with CREATE INDEX
Parameter | Description |
---|---|
PAD_INDEX | When used in conjunction with the FILL_FACTOR parameter, indicates that space should be left in the branch nodes, rather than just in the leaf nodes. |
FILL_FACTOR ? number | Specifies how full to make each leaf node; a percentage ranging from 0 through 100. |
IGNORE_DUP_KEY | Specifies that an insert into a unique index with a duplicate value will be ignored and a warning message will be issued. If IGNORE_DUP_KEY is not specified, the entire insert will be rolled back. |
DROP_EXISTING | Specifies that the existing index of the same name should be dropped and the index re-created. This parameter enhances performance when you re-create a clustered index on a table that has nonclustered indexes because separate steps are not required to drop and re-create the nonclustered indexes. |
STATISTICS_NORECOMPUTE | Specifies that statistics data should not be recomputed. This option is not recommended because if execution plans are based on old data, they probably will be less than optimal. Use this parameter only if you plan on updating statistics manually. |
MORE INFO
For further information regarding these optional parameters, go to the Books Online index, look up "CREATE INDEX," and then choose "CREATE INDEX (T-SQL)" from the Topics Found dialog box.
Using T-SQL scripts is preferable to using the Create Index Wizard. Although T-SQL is initially more difficult to use, in the long run, you'll find that it is much easier to create multiple indexes by using T-SQL.
When updates or inserts are made to a table that has indexes, the index pages must also be updated. Index pages are chained together by pointers from one page to another. There are two pointers, one to the next page and one to the previous page. When an index page is full, an update to the index will cause a change in the pointer chain because a new index page must be inserted between two pages (in a process known as an index page split) so that the new information can be put into the right spot in the index chain. SQL Server moves approximately half of the rows in the existing page (where the new data needs to go) to this new index page. The two pages that originally pointed to each other now point to the new page, and the new page points to both of them (forward and back). Now the new index page is pointed to in the right order in the chain, but the index pages are no longer physically in order in the database. (See Figure 17-16.) Eventually, because new index rows are constantly being added to the index (assuming that updates or inserts are occurring) and the index page size is finite, more and more index pages will fill up. When that happens, additional space for new index pages must be found. To create more space, SQL Server continues to perform index page splits, which result in increased system overhead because of additional CPU usage and additional I/O operations. This also causes a fragmented index. The index data is scattered around in the database, causing slower performance.
Figure 17-16. An index page split.
One way to reduce page splitting and fragmentation is by tuning the fill factor of the index nodes. The fill factor specifies the percentage of the node to be filled when you are creating the index, enabling you to leave room for additional index rows. You can specify the fill factor for an index by using the FILL_FACTOR option of the CREATE INDEX T-SQL statement, as described earlier. If a fill factor is not specified in the CREATE INDEX command, the system default will be used. The default is set to whatever value the sp_configure parameter fill factor is set to. This value was set to 0 when you installed SQL Server.
NOTE
The fill factor parameter affects only how an index is created; changing it has no effect after an index has been built.
The value of the fill factor ranges from 0 through 100, indicating the percentage of the index page that is filled. A value of 0 is a special case. When 0 is specified, the leaf nodes are completely filled, but the branch nodes and the root node retain some free space. This value is the SQL Server installation default and usually works well.
A fill factor value of 100 specifies that all index nodes be completely filled when the index is created. This value is optimal for indexes on tables that never have new data inserted into them and are not updated. Both the leaf index nodes and the upper level nodes will be completely packed, and any insert will cause a page split. Read-only tables are ideal for this setting, although deleting data would be OK as it does not cause page splits.
A low fill factor value will leave a lot of space for inserts but requires a lot of extra space to create the index. Unless you will be doing constant inserts into the database, a low value for the fill factor is usually discouraged. If too many page splits are occurring, try reducing the value for the fill factor and rebuilding the index to see whether this lowers the number of page splits.
You can find out the number of page splits per second that your system is experiencing by using the Performance Monitor counter Page Splits/Sec. This counter can be found in the SQL Server: Access Methods object.
If, over time, page splits do occur and your indexes become heavily fragmented, the solution is to rebuild your indexes. Fragmentation can happen even if you use a fill factor that leaves space in the index pages. Eventually, that space can fill up too. See the section "Rebuilding Indexes" later in this chapter for more information.
To use the Full-Text Indexing Wizard to create a full-text index, follow these steps. (The next section will show you how to take advantage of full-text indexes.)
Figure 17-17. The Full-Text Indexing Wizard welcome screen.
Figure 17-18. The Select A Table screen.
Figure 17-19. The Select Table Columns screen, with several columns selected.
Figure 17-20. The Select A Catalog screen.
Figure 17-21. The Select Or Create Population Schedules screen.
From here, you can click Next to continue or choose to create a schedule. If you click Next without creating a population schedule, the full-text index will be created once only, at the completion of the wizard (rather than being re-created on a periodic basis).
NOTE
Because the full-text index is not constantly updated with changes to the underlying database, it might be necessary to update it periodically. The scheduling feature allows you to schedule automatic updates to the full-text index. Once this schedule has been created, the index will be updated according to the schedule.
Figure 17-22. The Completing The SQL Server Full-Text Indexing Wizard screen.
You can also create full-text indexes by using stored procedures. A brief overview of the process of creating a full-text index by using stored procedures is presented here; for the complete syntax, consult SQL Server 2000 Books Online.
Creating a full-text index by using stored procedures is more complicated than using T-SQL commands to create a B-tree index. However, if you're creating a number of full-text index catalogs, it might be worth the trouble to create a script file to perform this task.
Once the full-text index has been created, it's easy to take advantage of it. You can specify T-SQL keywords that take advantage of the full-text indexes: CONTAINS and FREETEXT. The following statement shows how typical string recognition in SQL would be performed if you didn't use a full-text index. As you can see, the LIKE keyword is used in the WHERE clause of the query:
SELECT * FROM Customers WHERE ContactName LIKE '%PETE%'
This statement might not produce the desired result. To use a more user-friendly query and to take advantage of the full-text index, you can use the CONTAINS predicate. The CONTAINS predicate must contain the column name and the desired text, as follows:
SELECT * FROM Customers WHERE CONTAINS(ContactName, ' "PETE" ')
The CONTAINS predicate can find text strings that contain the desired string, such as "PETER" or "PETEY," by using the full-text index.
You can also search full-text indexes by using the FREETEXT keyword. Like CONTAINS, FREETEXT is used in the WHERE clause. FREETEXT can be used to find a word (or related words) whose meaning matches the meaning of a word (or a set of words) given in the FREETEXT call but whose form does not exactly match that of the specified word. This can be done in an SQL statement such as the following:
SELECT CategoryName FROM Categories WHERE FREETEXT(Description, 'Sweets candy bread')
This query might find category names containing such words as "sweetened," "candied," or "breads."