One of the main functions of a system administrator is to allocate, manage, and monitor the space and storage requirements for a database. Estimating the space that a database requires can help you plan your storage layout and determine hardware requirements.
After this lesson, you will be able to
- Estimate the space requirements for a database
Estimated lesson time: 15 minutes
Figure 5.4 lists a number of factors that you should consider when attempting to determine the space requirements of your database.
Figure 5.4 Factors to consider when estimating the size of a database
Consider the following when you estimate the amount of space that your database will occupy:
NOTE
As a starting point, you should allocate 25 percent of the database size to the transaction log for online transaction processing (OLTP) environments. You can allocate a smaller percentage for databases that are used primarily for queries.
After you consider the amount of space that is allocated to the model database, you should estimate the minimum amount of data in each table, including projected growth. This can be calculated by determining the total number of rows, the row size, the number of rows that fit on a page, and the total number of pages required for each table in the database.
To estimate the minimum number of pages required for a table, find out the number of characters for each row and the approximate number of rows that the table will have. Calculate the number of pages, using the following method:
This example calculates the size of a table that has four int columns, three char(30) columns, and one datetime column. The table will have 250,000 rows.
In this example, SQL Server stored 63 rows per page for this table. The total number of pages is 3969, and the table size is 31,752 KB.
Estimating index space becomes more difficult when indexes have two types of pages. The first type, called leaf pages, holds the indexed key values. The second type forms the binary search tree that speeds indexed search and retrieval.
Index pages can intentionally be left unfilled by specifying a fill factor of less than 100 (percent) when building the index. This increases the number of pages in the index but makes table row inserts and updates faster because fewer new index pages need to be allocated during these operations.
A clustered index stores the data rows of a table in the indexed order of the key values of the index. A table can only have one clustered index. A nonclustered index is a separate structure that stores a copy of the key values from the table in indexed order and a pointer to each corresponding data row in the table.
Extra space for a clustered index is taken up only by the b-tree index pages because the data rows in the table are the index leaf pages. Also, if a fill factor is specified when building a clustered index, the data pages of the table are filled only to the level specified by the fill factor, making the table larger.
A nonclustered index consists of the b-tree index pages as well as the leaf pages of the index. The leaf pages hold a pointer to and an indexed copy of the key value of each row in the table. If the table has a clustered index, the pointer in a nonclustered index is the clustered index key value. For this reason, you should use small key values for the clustered index of a table that will have nonclustered indexes, or else the nonclustered indexes can become very large.
The following table shows some examples of the sizes of various indexes added to the table for which the size was calculated in the previous example.
The table first gives the size of an index based on a char(30) key column when there is no clustered index on the table. The size of the same index is then given when there is a clustered index on the table. Finally, the size of the index is given when the index has a fill factor of 50 percent and there is a clustered index on the table. Note that this index is larger than the table! The clustered index used a char(30) key column—this is not recommended and illustrates the dramatic increase in size of nonclustered indexes if you do use such a large key column for a clustered index.
The same figures are then repeated for an index based in an int key column instead of a char(30) key column.
Reminder: the table has 250,000 rows, and is 3969 pages or 31,752 KB in size.
Key | Clustered on table | Fill factor | Pages | Size |
---|---|---|---|---|
One char(30) column | No | 100 | 1360 | 10,880 KB |
One char(30) column | Yes | 100 | 2296 | 18,368 KB |
One char(30) column | Yes | 50 | 4548 | 36,384 KB |
One int column | No | 100 | 560 | 4480 KB |
One int column | Yes | 100 | 1492 | 11,936 KB |
One int column | Yes | 50 | 2944 | 23,552 KB |
There are many factors involved in the estimation of space requirements for a database, including the size of the rows in your tables, the estimated growth of the tables, the number and size of the indexes, and the fill factor for the indexes are some important factors. This lesson gave you the tools to estimate the size requirements of a database, allowing you to manage your resources more effectively.