Indexes | Microsoft SQL Server 2005: The Complete Reference: Full Coverage of all New and Improved Features

Why are indexes so important to database applications or database management systems? Well, that answer doesn’t require much thought. An index helps you find data (row values) in a database quickly, without the system’s having to step through every row in the table. Now, if you had a table with only five rows, for example, going to each row and looking at its value in a specific column, a table scan, is not a big deal. But what if the table contained a gazillion rows-not an unusual situation for the average Web site database, or my wife’s contact file?

Imagine if the books we read were only a few paragraphs on a page. Would we still need an index at the back? Not at all. But we all know that most books contain a lot of pages, and without an index, it would be very difficult to find information rapidly without having to first scan through the entire book. I have indexed several books in a past life and the process is identical, from a logical or conceptual viewpoint, to the indexing of data in a table. You pinpoint exactly where in the book-the page number-a word or subject matter exists, and you list that word or subject matter in the index with the page number the reader can find it on. Professional indexers who practically index in their sleep can read through the printed pages at lightning speed and quickly build an index of information on all the pages.

Indexes are critical. This book has one. When I go to a bookstore, I hardly bother with the contents or the author’s bio or what he or she has to say in the preface or introduction. I go directly to the index, look up an item I want information on, and hope I find it. If the item is indexed, I go to the page and read the information to establish whether the book is imparting knowledge on the subject matter I am interested in. If not, I put the book back on the shelf and move on.

Indexing tables is absolutely essential, but there was a time in the early days of the relational engine that indexing caused such a performance hit on databases that indexes were kept to a minimum. Indexes still incur overhead, and we’ll look into that aspect shortly, but without them, looking for information in a table is a very loathsome exercise for the DBMS because it has to go through every row in the table to find the information requested by your query.

Indexes are database objects. A database index is a lot like a book index; it is a list of values in a table that contains the exact storage locations of the rows in the table that contain each value. Indexes can be created for a single column of values in a table or for a combination of columns. Indexes are usually implemented as B-trees. B-trees are sorted on the search key. If we create an index on [company name] and [company number], the B-tree index can efficiently locate a value on either column.

When you build queries, as discussed in the chapters in Part III, they will execute faster and return data to the client if the correct indexes have been created for the data in question. It is obviously wrong to place an index on the company number when all queries are done on the company name. The index for the former value would never be used, and the queries on company name would have the DBMS looking for data like a late moviegoer looking for his or her seat in the dark.

Having said that, SQL Server might at times find that an index scenario has reached a point of diminishing returns (sorry about that pun), and that the indexing process is costing more in overhead than iterating through the table row by row. This is the point at which you need to reevaluate the indexing and either drop the indexes or rearchitect the table, which could mean rebuilding an index solution at the minimum or rethinking the current schema. More about this is a moment.

Indexes on Keys and Constraints

Some indexes are created automatically by SQL Server. For example, when you create a primary key for a table, you automatically get an index thrown in, free of charge. The same applies to a unique constraint.

The Cost of Indexes

I mentioned earlier that indexes do come at a cost. They do engage additional system resources and overhead for their upkeep and operation. For starters, indexes can become rather bulky, which means they can cause the size of the database file to swell. In addition, whenever you add, change, or delete data from a table, the index has to be updated and such maintenance consumes time and resources. Imagine what would happen if after this entire book was proofread and indexed and ready for the printers that I called up the copyeditor and asked her if I could delete a chapter and replace it with new information- the people at the National Hurricane Center would have a heart attack with the storm that would ensue. SQL Server, on the other hand, understands that its data changes all the time, and so it has built-in algorithms to ensure that index regeneration happens as efficiently as possible.

The life of the DBA would be so wonderful if SQL Server were capable of taking care of all the index details. Unfortunately, SQL Server sometimes requires your input in the creation and management of indexes. Indexes are different for every table. The data is different, and the table schemas differ from table to table. Each table might have different keys and constraints on them that impact the search and return performance. The subjects of index design and index tuning thus should not be ignored. We return to indexing in Chapters 12 and 18.