Capacity Planning for the Disk Subsystem


Now that we have sized the memory and the processor, it's time to size the disk subsystem. Sizing this part of the system is easy because we have already calculated most of the information we require. First we need the total number of I/Os that will be processed through the system. We already have this information from the processor sizing. Second we need the size of the database. This information can be provided by the database designer. When you are sizing the disk subsystem, it is important to realize that you are sizing for either the size of the database or the number of I/Os per second, whichever yields the greater number of disk drives.

Many people are surprised to learn how many disk drives they need for their database. However, additional drives provide more access points to the data. If you have only one access point to the data, you have created a bottleneck. Since all transactions must pass through this bottleneck, response times will increase. The rule of thumb is to have as many access points to the data as you can. If you have more access points to the data, you are less likely to encounter the bottlenecks that might occur with fewer drives. You might also be generating many I/Os per second and might require more disks to accommodate the I/O load than are necessary for the database size.

For example, suppose you have a database system that is 10 GB in size and is generating 140 I/Os per second. Using the 85 percent rule for disk space utilization, you would need one approximately 12-GB drive to accommodate the size of the database. Now, looking at the drive requirement from an I/O point of view, if the disk drives were rated at 70 I/Os per second, three disk drives would be needed to accommodate the number of I/Os per second based on using only 85 percent of the I/O capacity of each drive. Therefore, since the I/O capacity analysis yields the greater result (three disk drives), we should use three disk drives, totaling at least the 12 GB we calculated earlier, each rated at 70 I/Os per second.
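The two calculations above can be sketched in a few lines of Python. The figures (10-GB database, 140 I/Os per second, 12-GB drives rated at 70 I/Os per second, an 85 percent ceiling) come from the example; the function names are illustrative, not from any particular tool:

```python
import math

def drives_for_size(db_size_gb, drive_size_gb, utilization=0.85):
    # Only 85 percent of each drive's capacity should ever be used.
    return math.ceil(db_size_gb / (drive_size_gb * utilization))

def drives_for_io(ios_per_sec, drive_io_rating, utilization=0.85):
    # Likewise, plan for only 85 percent of each drive's rated I/O capacity.
    return math.ceil(ios_per_sec / (drive_io_rating * utilization))

size_drives = drives_for_size(10, 12)   # 10 / (12 * 0.85) = 0.98, so 1 drive
io_drives = drives_for_io(140, 70)      # 140 / (70 * 0.85) = 2.35, so 3 drives
print(max(size_drives, io_drives))      # size for the larger of the two: 3
```

Taking the maximum of the two results is the "whichever yields the greater number of drives" rule stated above.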

Note that this is the minimum configuration; you can use more drives, or higher-capacity drives, if you want. Also note that this analysis ignores effects due to the RAID configuration.

NOTE


When you are sizing a disk subsystem, always apply the 85 percent usage rule both to the size of the database and to the number of I/Os per second that users will generate. Use whichever calculation results in the larger number of drives. Also remember that 85 percent is the absolute maximum usage the disk should ever see. In practice, use a number lower than 85 percent. And remember that too many I/Os per second on the disk drives will cause bottlenecks and therefore prolonged response times.

Now let's take a more detailed look at how to determine the proper number of disk drives needed for your system, taking the RAID configuration into account. You'll need to store three major components: Windows 2000 and SQL Server, the log files, and the database itself. You'll calculate the number of drives you need for each component and then add the three numbers to obtain the total number of drives needed for your system.

Disk Drives for Windows 2000 and SQL Server

First you need to calculate the number of disk drives needed to support the first component—the Windows 2000 Server operating system and the SQL Server database. Usually, you will want these disk drives to be a separate volume set to RAID 1 (mirrored disk drives) for the fastest possible recovery. The number of disk drives may vary depending on size, but usually the Windows 2000 Server operating system and the SQL Server database system can fit on a single disk. Our simple calculation would look like this:

 opsys and SQL disks = (Windows 2000 Server and SQL disk) * (RAID factor increase)

In this case, the result would be two for mirrored disk drives. (Windows 2000 Server and SQL Server are on one disk, and that disk is mirrored in a RAID 1 volume.) Setting the operating system volume to RAID 5 or RAID 0 is not recommended. RAID 5 requires at least three disk drives, and you will want the fastest possible recovery for the operating system and the database executable.

Disk Drives for the Log Files

Second, you need to calculate the number of disk drives necessary to support your system's log files. This number depends largely on the total number of writes per second your transactions will cause. Remember that the information contained on these disks is of the most important kind: these disks provide the audit trails, or "before" images, that will be needed if anything happens to your database. Audit trails enable you to back out partially completed transactions in the event of a failure. The number of writes that will take place was calculated during the processor sizing. Using arbitrary values, suppose the transactions resulted in 1,500,000 writes on a RAID 0 volume. Given that the RAID level that should be used for the log disk drives is RAID 1, we're looking at 3,000,000 writes over an 8-hour period, or 104.16 writes per second. (Remember that using RAID 1 results in twice as many physical writes per transaction as RAID 0.) To calculate the number of drives needed, use the following formula:

log disks = (writes/sec) / (disk I/O capability)

Remember that the disk I/O capability should be 85 percent of its rated maximum. Also, be sure to round up to the next whole number after dividing writes/sec by the disk I/O capability. And finally, be sure to adjust the writes/sec value for any increase in write activity due to the specified RAID level. If we apply the 85 percent ceiling to a disk drive rated at 70 I/Os per second (giving 59.5 I/Os per second per drive), we would need 1.75 disk drives; rounded up, that means two disk drives.
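The log-disk arithmetic can be sketched as follows, using the example's assumed figures (1,500,000 logical writes over an 8-hour work period, drives rated at 70 I/Os per second, an 85 percent ceiling); the variable names are illustrative:

```python
import math

RAID1_WRITE_FACTOR = 2          # each logical write becomes two physical writes on a mirror

logical_writes = 1_500_000      # writes over the work period, from the processor sizing
work_period_sec = 8 * 60 * 60   # 8-hour work period = 28,800 seconds
drive_io_rating = 70            # rated I/Os per second per drive
utilization = 0.85              # never plan beyond 85 percent of rated capacity

physical_writes = logical_writes * RAID1_WRITE_FACTOR   # 3,000,000 physical writes
writes_per_sec = physical_writes / work_period_sec      # about 104.17 writes/sec
log_disks = math.ceil(writes_per_sec / (drive_io_rating * utilization))
print(log_disks)                                        # 2 log disks
```

Note how the RAID 1 write factor is applied before the division, exactly as the formula above requires.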

Disk Drives for the Database

The final step is calculating the number of disk drives that will be required for the database. Remember to calculate the number of drives required based on both the size of the database and the number of I/Os per second, and to use whichever results in the larger number of drives.

For Database Size

To determine the number of disk drives that are required to accommodate the database size, use the following formula:

database disks = (data size) / (disk size) + (RAID factor increase)

Remember that the disk size should be 85 percent of its rated maximum. Also remember to use the same units (for example, KB and MB) for the data size and disk size. The RAID factor increase is the number of extra drives required to support fault tolerance. For RAID 1, this value is equal to the number of disks required to store the database; for RAID 5, one extra drive is needed. For our 10-GB database using RAID 5, we would need two 12-GB disk drives.
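Under the same assumptions (10-GB database, 12-GB drives, 85 percent utilization ceiling), the size-based calculation with its RAID factor can be sketched like this; the function name and RAID labels are illustrative:

```python
import math

def database_disks_by_size(data_gb, disk_gb, raid="RAID5", utilization=0.85):
    # Data drives needed, using only 85 percent of each drive's capacity.
    data_drives = math.ceil(data_gb / (disk_gb * utilization))
    if raid == "RAID5":
        return data_drives + 1    # one extra drive for parity
    if raid == "RAID1":
        return data_drives * 2    # every data drive is mirrored
    return data_drives            # RAID 0: no fault-tolerance overhead

print(database_disks_by_size(10, 12))   # 1 data drive + 1 parity drive = 2
```

The RAID factor increase is added after the size division, matching the formula above.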

NOTE


RAID 5 is recommended for database drives.

For Database I/O

The number of disk drives required to accommodate I/Os can alter the database disk recommendation drastically, as we saw in our earlier simple example. To calculate this value, follow these steps:

  1. Calculate the total number of read I/Os that will be going through the system by using the following formula:

    total reads = (reads per transaction) * (total number of transactions)

    If we assume 500 reads per transaction and 50,000 transactions, we have 25,000,000 total reads.

  2. Determine how many of these read I/Os will be physical reads and how many will be logical reads by using the following formulas:

     total logical reads = (total reads) * (cache hit rate)
     total physical reads = (total reads) - (total logical reads)

    Assuming a target cache hit rate of 90 percent, we will have 22,500,000 total logical reads and 2,500,000 total physical reads.

  3. Convert the total number of physical reads to reads per second by using the following formula:

    physical reads/sec = (total physical reads) / (work period)

    The work period should be the length of time, in seconds, in which the work is to be performed. You will need this value for the calculation of CPU utilization later on. In our example, if we use an 8-hour work period, we will have 86.8 physical reads/sec.

  4. Now calculate the total number of write I/Os that will be going through the system by using the following formula:

     total writes = (writes per transaction) * (number of transactions) * (RAID factor increase)

    If we assume 10 writes per transaction using a RAID 5 system, we have (10) * (50,000) * (3), or 1,500,000 total writes.

  5. Convert the total number of physical writes to writes per second by using the following formula:

    physical writes/sec = (total physical writes) / (work period)

    In this example, we have 1,500,000 physical writes and an 8-hour work period (28,800 seconds), giving us 52.1 physical writes/sec.

  6. Calculate the total number of physical I/Os per second by using the following formula:

     total physical I/Os per second = (physical reads/sec) + (physical writes/sec)

    In this example, we have 86.8 physical reads/sec and 52.1 physical writes/sec, which gives us 138.9 total physical I/Os per second.

  7. Calculate the total number of database disk drives by using the following formula:

     database disks = (total physical I/Os per second) / (disk I/O capability) + (RAID factor increase)

Remember to apply the 85 percent rule when determining the disk I/O capability, and remember that the RAID factor increase is the number of disks required to support fault tolerance. Using 138.9 total physical I/Os, a disk rated at 70 I/Os per second, and RAID 5, we come up with a total of four disk drives—three to support the total I/O and one more for the RAID 5 fault tolerance.
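The whole sequence of steps can be worked through in one short script. All of the figures are the example's assumptions (500 reads and 10 writes per transaction, 50,000 transactions, a 90 percent cache hit rate, an 8-hour work period, a write multiplier of 3 for RAID 5 as used above, and 70-I/O-per-second drives at an 85 percent ceiling):

```python
import math

# Assumed workload figures from the running example.
reads_per_txn = 500
writes_per_txn = 10
transactions = 50_000
cache_hit_rate = 0.90
work_period_sec = 8 * 60 * 60           # 28,800 seconds
raid5_write_factor = 3                  # RAID 5 write multiplier used in the example
drive_io_rating = 70
utilization = 0.85

total_reads = reads_per_txn * transactions             # 25,000,000 total reads
physical_reads = total_reads * (1 - cache_hit_rate)    # ~2,500,000 miss the cache
reads_per_sec = physical_reads / work_period_sec       # about 86.8 reads/sec

total_writes = writes_per_txn * transactions * raid5_write_factor   # 1,500,000
writes_per_sec = total_writes / work_period_sec        # about 52.1 writes/sec

total_io_per_sec = reads_per_sec + writes_per_sec      # about 138.9 I/Os per second
data_drives = math.ceil(total_io_per_sec / (drive_io_rating * utilization))
database_disks = data_drives + 1                       # one extra drive for RAID 5
print(database_disks)                                  # 4 database disks
```

Three drives carry the 138.9 I/Os per second at 59.5 I/Os per second each, and the fourth is the RAID 5 fault-tolerance drive.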

So, based on the database size of 10 GB, we would need two disks, but based on I/O activity, we would need four disks. Therefore, to accommodate the database, we will need four drives, the larger number of the two calculations.

Disk Drives Needed for the System

To find the total number of drives needed for the system, we take the sum of the parts. We need two disk drives for Windows 2000 Server and SQL Server, two disk drives for the log files, and four disk drives for the database, giving us a total requirement of eight drives for the entire system.

REAL WORLD   Leave Some Elbow Room


Most designers use the thresholds (75 percent CPU utilization, 85 percent disk utilization, and so on) as the maximum utilizations. In most cases, you will want to use lesser values. Of course, this choice is not always entirely up to the designer. Outside influences, such as the company's hardware budget, can affect design decisions. A good target for a system is 65 percent maximum CPU utilization and 70 percent disk utilization. However, you should use whatever percentages you find optimal for the types of systems you design.

Collecting Disk Usage Data

Once the system is set up and operational, you should collect disk usage data to keep apprised of any changes that might be necessary. The system might expand to more users (and thus more transactions), the requirement for the database might change (resulting in a larger database size), and so on.

When performing post-capacity planning studies on disk usage, you should track the following counters in Performance Monitor. These counters can be found in the PhysicalDisk object:

  • % Disk Time: Percentage of elapsed time that the selected disk drive is busy servicing read or write requests.
  • % Disk Read Time: Percentage of elapsed time that the selected disk drive is busy servicing read requests.
  • % Disk Write Time: Percentage of elapsed time that the selected disk drive is busy servicing write requests.
  • Avg. Disk Read Queue Length: Average number of read requests that were queued for the selected disk during the sample interval.
  • Avg. Disk Write Queue Length: Average number of write requests that were queued for the selected disk during the sample interval.
  • Avg. Disk Queue Length: Average number of both read and write requests that were queued for the selected disk during the sample interval. It is the sum of the previous two items.
  • Disk I/O Count Per Second: I/O activity to the disk array per second, averaged over the measurement period. This counter is not directly available through Performance Monitor; to arrive at this value, simply add together the values of two counters that are available, Disk Reads/sec and Disk Writes/sec.
  • Disk Space Used: Amount of disk space currently being used by either the database or the operating system. This counter is not available through Performance Monitor; use the Disk Administrator to access this information.
  • Disk Space Available: Amount of disk space currently available. This counter is not available through Performance Monitor; use the Disk Administrator to access this information.

To start the Disk Administrator, click Start, and then choose Programs, Administrative Tools (Common), and finally Disk Administrator. For further information about using the Disk Administrator, click the Help button in the Disk Administrator window.

Analyzing Disk Usage Data

Analyzing disk usage information is a simple process. For example, if we were analyzing a system, we would collect data about available disk space to determine how much space is free. Figure 6-7 shows the usage of the database in terms of available MB.


Figure 6-7. Available disk space predictive analysis.

As you can see, at the beginning of the analysis we have about 2.05 MB of free space out of 6.15 MB, which means that the disk is about 67 percent full. By January 14, 2000, we are down to about 1.5 MB, meaning that the disk is about 75 percent full. Using Microsoft Excel to plot a trend line, we estimate that by February 18, 2000, we will have only about 1.3 MB of available space, meaning that the disk will be about 83 percent full. At this point, the DBA might want to purchase additional disk space.
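A trend line like the one Excel fits can be reproduced with an ordinary least-squares fit. The (day, MB free) samples below are illustrative stand-ins for the figure's data points, not the actual measurements:

```python
# Least-squares linear trend on free-space samples, similar to Excel's trend line.
# The (day, mb_free) samples are illustrative, not the figure's actual data.
samples = [(0, 2.05), (7, 1.8), (14, 1.5)]

n = len(samples)
mean_x = sum(x for x, _ in samples) / n
mean_y = sum(y for _, y in samples) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in samples)
         / sum((x - mean_x) ** 2 for x, _ in samples))
intercept = mean_y - slope * mean_x

# Extrapolate free space out to day 35 to see where the trend is heading.
predicted_free = slope * 35 + intercept
print(round(predicted_free, 2))
```

A negative slope tells you the disk is filling; projecting the line forward gives the date by which additional disk space should be purchased.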



Microsoft SQL Server 2000 Administrator's Companion
ISBN: B001HC0RPI
Year: 2005
Pages: 264