This section walks you through an example disk configuration. The goal is to understand the complexity of designing a disk array for a failover cluster that needs to take into account growth, performance, availability, and cost.
On the CD | A sample configuration worksheet can be found on the CD-ROM that accompanies this book. The file name is Disk_Configuration_Worksheet.doc, and it can be used to hypothesize the following scenarios or serve as a reference for the environment that you put together to become a part of your run book. |
An application is going to be upgraded to use SQL Server 2000. Currently, it is on a stand-alone machine running SQL Server 7.0 with all disks local. For a primary form of availability, you have decided to use failover clustering, which means the disk array needs to be upgraded to one on the cluster-specific HCL (including new controllers if needed) and needs some careful consideration to be optimized for the new disk usage patterns.
Unfortunately, without DBA involvement, management has already purchased the hardware, and you are constrained by the following rules:
The array has already been purchased and has the capacity to hold 16 disks.
12 disks have been purchased, and unless necessary, no more will be bought. You would need to justify the extra disks.
Each disk is 36 GB with a speed of 10,000 rpm.
The array can be configured so that two disks are online spares.
The array is configured to talk over one channel, so splitting the array into, say, two disks of eight that might use more than one channel, is impossible.
Remember the location for tempdb and the quorum in the configuration.
Two years worth of growth should be factored in.
Important | Outside the context of this example, be aware that having no DBA involvement in the hardware decision making is not recommended. All DBAs should be involved with hardware procurement because these decisions affect not only your servers, but your job as well. You would much rather have people praise you for the performance and availability of a system than ask you why it is down or how long it takes to run a query ( I kick off my query before lunch because it takes too long, and the results are there when I get back ). |
As shown in Table 4-3, this application uses three databases:
Sales This database is currently 50 GB, and has historically grown approximately 33 percent per year. This database uses tempdb heavily.
Billing This database is currently 150 GB and has seen growth of about 25 percent per year.
Contacts This database is currently 500 MB in size and grows roughly 10 percent per month.
Database Name | Size at End of Year 1 (in GB) | Size at End of Year 2 (in GB) |
---|---|---|
Sales | 66.5 | 88.5 |
Billing | 187.5 | 234.5 |
Contacts | 1.5 | 4.5 |
Raw Space Totals | 255.5 | 327.5 |
Assuming a 90:10 data-to-transaction log ratio, Table 4-4 shows the breakdown of file sizes to complement Table 4-3.
Database Name | Size at End of Year 1 (in GB) | Size at End of Year 2 (in GB) |
---|---|---|
Sales | 59.85 data/6.65 log | 79.65 data/8.85 log |
Billing | 168.75 data/18.75 log | 211.05 data/23.45 log |
Contacts | 1.35 data/0.15 log | 4.05 data/0.45 log |
Raw Space Totals | 229.95 data/25.55 log | 294.75 data/32.75 log |
Table 4-5 shows the total amount of raw drive space available.
Number of Drives | Raw Space (in GB) |
---|---|
12 | 432 |
13 | 468 |
14 | 504 |
15 | 540 |
16 | 576 |
There are more possible configurations, but the following four samples should give you a good idea of how challenging it is to configure a drive array. These examples all assume no use of files and filegroups. For your own edification, you might want to take these samples and try to think about how files and filegroups can be used with these configurations, and how they would change because of them.
None of the samples uses RAID 1+0/10/0+1. There are not enough drives to use that RAID level given the projected growth. None of these samples takes into account the archiving of data, either.
This sample (shown in Table 4-6) also adheres to the 12-drive limit. Quorum and tempdb each get their own RAID 1 LUN, and all of the data and logs are placed on a single RAID 5 LUN, meaning you will have a huge availability problem if the RAID 5 LUN fails. Notice that the total of 252 GB is not enough space for two years worth of growth, given the projections. The company has to buy more disks. No online spares means the company would have to stock drives in the event of drive failures.
Drive | LUN | RAID | LUN Size (in GB) | Purpose |
---|---|---|---|---|
1 | 1 | 1 | 36 | Quorum |
2 | 1 | |||
3 | 2 | 1 | 36 | Tempdb |
4 | 2 | |||
5 | 3 | 5 | 252 | All data, logs |
6 | 3 | |||
7 | 3 | |||
8 | 3 | |||
9 | 3 | |||
10 | 3 | |||
11 | 3 | |||
12 | 3 | |||
13 | ||||
14 | ||||
15 | ||||
16 |
This sample (shown in Table 4-7) also adheres to the 12-drive limit. The quorum and logs each get their own RAID 1 LUN (fixing the availability problem of Sample 1), and all of the data and tempdb are placed on a single RAID 5 LUN. Again, 252 GB is not enough space for two years worth of growth, given the projections. If the RAID 5 partition fails, you lose all of your databases, but you do have access to your logs in this case. You will have a serious performance problem because tempdb is on the same LUN as all of your data. Once more, the company has to buy more disks. No online spares means the company would have to stock drives in the event of drive failures.
Drive | LUN | RAID | LUN Size (in GB) | Purpose |
---|---|---|---|---|
1 | 1 | 1 | 36 | Quorum |
2 | 1 | |||
3 | 2 | 1 | 36 | All logs |
4 | 2 | |||
5 | 3 | 5 | 252 | All data, tempdb |
6 | 3 | |||
7 | 3 | |||
8 | 3 | |||
9 | 3 | |||
10 | 3 | |||
11 | 3 | |||
12 | 3 | |||
13 | ||||
14 | ||||
15 | ||||
16 |
Now that you know 12 disks are clearly not enough, there is a decision to buy two more disks. This time (Table 4-8), with 14 disks, you decide to split tempdb onto its own RAID 1 set to increase performance. You are still short on data growth space for two years worth of growth. As with Sample 2, your logs are split out, so you do have increased availability, but because all your logs are on one RAID 1 set, you will not get the maximum performance from your logs. You still need more drives. No online spares means the company would have to stock drives in the event of drive failure.
Drive | LUN | RAID | LUN Size (in GB) | Purpose |
---|---|---|---|---|
1 | 1 | 1 | 36 | Quorum |
2 | 1 | |||
3 | 2 | 1 | 36 | All logs |
4 | 2 | |||
5 | 3 | 5 | 252 | All data |
6 | 3 | |||
7 | 3 | |||
8 | 3 | |||
9 | 3 | |||
10 | 3 | |||
11 | 3 | |||
12 | 3 | |||
13 | 4 | 1 | 36 | Tempdb |
14 | 4 | |||
15 | ||||
16 |
The decision to max out the drive array at 16 disks is made. This configuration (Table 4-9) gives the proper amount of drive space, but you could not add online spares even if you wanted to.
Drive | LUN | RAID | LUN Size (in GB) | Purpose |
---|---|---|---|---|
1 | 1 | 1 | 36 | Quorum |
2 | 1 | |||
3 | 2 | 1 | 36 | All logs |
4 | 2 | |||
5 | 3 | 5 | 324 | All data |
6 | 3 | |||
7 | 3 | |||
8 | 3 | |||
9 | 3 | |||
10 | 3 | |||
11 | 3 | |||
12 | 3 | |||
13 | 3 | |||
14 | 3 | |||
15 | 4 | 1 | 36 | Tempdb |
16 | 4 |
This sample (Table 4-10) shows you from a maximum performance and space standpoint how you could potentially configure a disk subsystem if you have enough drives. This does not address files and filegroups, manageability, and hotspotting, which would come from knowing your data ”this just handles space and base performance from a 10,000-foot view.
Drive | LUN | RAID | LUN Size (in GB) | Purpose |
---|---|---|---|---|
1 | 1 | 1 | 36 | Quorum |
2 | 1 | |||
3 | 2 | 1 | 36 | Billing log |
4 | 2 | |||
5 | 3 | 1 | 36 | Contacts log |
6 | 3 | |||
7 | 4 | 1 | 36 | Sales log |
8 | 4 | |||
9 | 5 | Striped mirrors | 216 | Billing data |
10 | 5 | |||
11 | 5 | |||
12 | 5 | |||
13 | 5 | |||
14 | 5 | |||
15 | 5 | |||
16 | 5 | |||
17 | 5 | |||
18 | 5 | |||
19 | 5 | |||
20 | 5 | |||
21 | 6 | Striped mirrors | 108 | Sales data |
22 | 6 | |||
23 | 6 | |||
24 | 6 | |||
25 | 6 | |||
26 | 6 | |||
27 | 7 | 1 | 36 | Contacts |
28 | 7 | |||
29 | 8 | 1 | 36 | Tempdb |
30 | 8 |