2.7 RAID

A chapter about the history of storage wouldn't be complete without a discussion of RAID technology. Yes, there can be vast collections of disk drives, but on their own they are essentially defenseless against failure and data loss.

It follows that mass storage can't succeed without the benefits of a strong protection scheme: RAID. Along with redundant components, a hardware RAID solution is a key feature of modern high-availability disk arrays.

RAID stands for Redundant Array of Independent Disks (originally Redundant Array of Inexpensive Disks). RAID technology groups individual disk drives into a logical disk unit that functions as one or more virtual disks. This can improve reliability, performance, or both.

RAID uses parity data or disk mirroring to allow the disk array to continue operating without data loss after a disk failure. After the failed disk is replaced, the unit automatically rebuilds the lost data in the RAID group from information stored on the other disks in the group. The rebuilt RAID group contains a replica of the information it would have held if the disk module had not failed. This is a key component of high availability.

Various utilities can be used to bind disk modules into logical disk units (called RAID groups).

The industry has defined the following RAID levels:

  • RAID-0 (nonredundant individual access array)

  • RAID-1 (mirrored pair)

  • RAID-1/0 (mirrored RAID-0 group)

  • RAID-3 (parallel access array)

  • RAID-5 (individual access array)

2.7.1 RAID-0

RAID-0 is a performance solution, not a high availability solution. A RAID-0 group contains from 3 to 16 disk modules, and uses block striping. It spreads the data across the disk modules in the logical disk unit, and allows simultaneous I/Os to multiple disks.

Unlike the other RAID levels, RAID-0 doesn't provide any data redundancy, error recovery, or other high availability features. HP doesn't support RAID-0 for precisely this reason.
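
To make the striping idea concrete, here is a minimal sketch, in Python, of how a logical block address could map onto the disks of a RAID-0 group. The function name and the 128-sector stripe element are illustrative assumptions, not any array's actual firmware logic.

    # Illustrative round-robin block striping (hypothetical layout).
    def raid0_locate(logical_block: int, n_disks: int, element_blocks: int):
        """Map a logical block to (disk index, block offset on that disk)."""
        element = logical_block // element_blocks   # which stripe element overall
        disk = element % n_disks                    # round-robin across the group
        stripe = element // n_disks                 # which stripe (row of elements)
        offset = stripe * element_blocks + logical_block % element_blocks
        return disk, offset

    # A 4-disk group with 128-block elements: blocks 0-127 land on disk 0,
    # blocks 128-255 on disk 1, and so on, so neighboring I/Os can proceed
    # on different spindles at the same time.
    for lb in (0, 127, 128, 512):
        print(lb, raid0_locate(lb, n_disks=4, element_blocks=128))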

2.7.2 RAID-1

A RAID-1 group consists of exactly two disk modules bound together as a mirrored pair. One disk is the data disk and the other is the disk mirror. The disk array hardware automatically writes the data to both the data disk and the disk mirror. Disk striping and parity are not used.

In a RAID-1 mirrored pair, if either the data disk or the disk mirror fails, the array uses the remaining disk for data recovery and continues operation until the failed disk can be replaced. If both disks fail, the RAID-1 mirrored pair becomes inaccessible.

RAID-1 is a high availability solution, but it requires twice as many disks to store the user data.
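
A toy model makes the mirrored-pair behavior explicit: every write goes to both members, and a read succeeds as long as either member survives. The class below is a sketch for illustration, not a controller implementation.

    # Toy RAID-1 mirrored pair: writes are duplicated, reads fall back.
    class Raid1Pair:
        def __init__(self):
            self.members = [{}, {}]       # block -> data, one dict per disk
            self.alive = [True, True]

        def write(self, block, data):
            for disk in (0, 1):
                if self.alive[disk]:
                    self.members[disk][block] = data   # both images get the write

        def read(self, block):
            for disk in (0, 1):
                if self.alive[disk]:
                    return self.members[disk][block]
            raise IOError("both members of the mirrored pair have failed")

    pair = Raid1Pair()
    pair.write(7, b"payload")
    pair.alive[0] = False        # the data disk fails...
    print(pair.read(7))          # ...the mirror still serves the read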

2.7.3 RAID-1/0

A RAID-1/0 group contains an even number of disk modules, from 4 to 16. Half of the disk modules are data disks and the other half are disk mirrors. Each disk mirror contains a copy of a data disk, so in essence, a RAID-1/0 group is a mirrored RAID-0 group.

A RAID-1/0 group uses block striping for performance and hardware mirroring for redundancy. The disadvantage of RAID-1/0 is that its overhead cost is double that of RAID-0, because twice as many disk modules are required to store the user data.

When a data disk or a disk mirror fails, the disk array's processor automatically uses the remaining image for data recovery. A RAID-1/0 group can survive the failure of multiple disk modules, as long as one disk module (either the data disk or the disk mirror) in each pair of images continues to operate.
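
The survival rule just described is simple to state in code: the group keeps operating as long as at least one member of every mirrored pair is alive. The pairing below is a hypothetical layout chosen for illustration.

    # RAID-1/0 survival check: pairs = [(data_disk, mirror_disk), ...].
    def raid10_survives(pairs, failed):
        return all(d not in failed or m not in failed for d, m in pairs)

    pairs = [(0, 1), (2, 3), (4, 5), (6, 7)]    # an 8-disk RAID-1/0 group
    print(raid10_survives(pairs, {0, 3, 4}))    # True: one image per pair survives
    print(raid10_survives(pairs, {2, 3}))       # False: both images of one pair lost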

2.7.4 RAID-3

A RAID-3 group consists of 5 disk drives in a disk array, each on a separate internal single-ended SCSI-2 bus. RAID-3 uses disk striping over 4 disks for performance and a fifth, dedicated parity disk for redundancy.

When a failed disk drive is replaced, the disk array's storage processor automatically rebuilds the RAID group using the information stored on the remaining drives. If a data disk fails, the service processor automatically reconstructs all user data from the user data and parity information on the remaining disk modules. If the parity disk fails, the service processor reconstructs the parity information from the user data on the data disks.

Performance degrades while the service processor rebuilds the group, but the disk array continues to operate and all data is accessible during this time.

If 2 of the 5 disk modules in a RAID-3 group fail, the group becomes inaccessible. To guard against this, multiple global hot spares should be configured.

RAID-3 works well for applications using large block I/Os. It is not a good choice for transaction processing systems because the dedicated parity drive is a performance bottleneck. Whenever data is written to a data disk, a write must also be performed to the parity drive.

2.7.5 RAID-5

RAID-5 is usually the default configuration for HP disk arrays. It normally consists of 5 disk drives, but could contain from 3 to 16 drives. Like RAID-3, RAID-5 uses disk striping and parity, but it doesn't use a dedicated parity disk.

In a RAID-5 group, the hardware reads and writes parity information to each drive in the RAID group. For highest availability, the disk drives should each be on a separate internal single-ended SCSI-2 bus.

If a disk drive fails (or there's an internal SCSI bus failure), the disk array's storage processor reconstructs user data and parity information from the user data and parity information on the remaining drives.

Performance degrades while the service processor rebuilds the group, but the disk array continues to operate and all data is accessible during this time.

As with RAID-3, if 2 of the 5 disk modules in a RAID-5 group fail, the group becomes inaccessible. To guard against this, multiple global hot spares should be configured.

RAID-5 is good for multitasking environments. It has the same overhead cost as RAID-3, but provides faster random access because parity is spread across all drives in the RAID group. Data transfers, however, are somewhat slower than with RAID-3, although a disk array with caching capabilities can narrow the gap.
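
The structural difference between RAID-3 and RAID-5 comes down to where the parity for each stripe lives, which a few lines can sketch. The left-symmetric rotation shown is one common RAID-5 layout, assumed here for illustration; a given array may rotate parity differently.

    # Which disk holds parity for a given stripe?
    def parity_disk(stripe, n_disks, dedicated):
        if dedicated:                              # RAID-3: always the same disk
            return n_disks - 1
        return (n_disks - 1 - stripe) % n_disks    # RAID-5: parity rotates

    for stripe in range(5):
        print(stripe,
              "RAID-3 parity on disk", parity_disk(stripe, 5, dedicated=True),
              "| RAID-5 parity on disk", parity_disk(stripe, 5, dedicated=False))

Because every stripe's parity sits on the same disk in RAID-3, every small write funnels through that one disk; rotating the parity, as RAID-5 does, spreads that load across the group.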

2.7.6 Disk Striping

Disk striping is a technique in which data is written to and read from uniformly sized segments across all disk drives in a RAID group simultaneously and independently. The uniformly sized segments are called block stripes.

Hardware disk striping can be implemented by configuring the disk drives in a RAID-1/0 group, a RAID-3 group, or a RAID-5 group. By allowing multiple sets of read/write heads to work on the same I/O operation at the same time, disk striping can enhance performance.

The amount of information simultaneously written to or read from each drive is the stripe element size. The default stripe element size is 128 sectors (at 512 bytes per sector), or 65,536 bytes; the exception is RAID-3 groups, whose stripe element size is fixed at one sector and can't be modified. The stripe element size is configurable and can affect the performance of RAID groups.

The smaller the stripe element size, the more efficiently the written or read data is distributed across the stripes on the disks in the RAID group. The best stripe element size is the smallest size that will only rarely force I/Os to a second stripe. The stripe element size should be an even multiple of 16 sectors (8 KB). The stripe element size becomes an integral part of the logical disk unit and can't be changed without unbinding the RAID group and losing all data on it.

The stripe size is the number of data disks in a RAID group multiplied by the stripe element size. For example, if the stripe element size is the default size of 128 sectors and the RAID group comprises 5 disk modules, the stripe size is 128 x 5, or 640 sectors (327,680 bytes).
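
That arithmetic, worked through in Python with the figures from the example above:

    SECTOR_BYTES = 512
    element_sectors = 128                   # default stripe element size
    disks = 5                               # disk modules in the RAID group

    stripe_sectors = element_sectors * disks
    print(stripe_sectors)                   # 640 sectors
    print(stripe_sectors * SECTOR_BYTES)    # 327,680 bytes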

2.7.7 Mirroring

Mirroring maintains a duplicate copy of a logical disk image on another disk drive. The copy is called a disk mirror. If either the original data disk or the disk mirror is inaccessible, the other disk provides continuous access to the data. The disk array continues running on the good image without interruption. There are two kinds of mirroring:

  • Hardware mirroring, in which the disk array controller automatically and transparently synchronizes the two disk images without user or operating system involvement

  • Software mirroring, in which the host operating system software synchronizes the disk images

You can create a hardware mirror by binding disk drives as a RAID-1 mirrored pair or a RAID-1/0 group. The disk array controller mirrors the data automatically and, in the event of a disk failure, rebuilds the data from the remaining disk image.

With software mirroring, RAID-0 groups or individual disk units with no inherent data redundancy can be mirrored; the operating system maintains the images.

2.7.8 Parity

Parity is a data protection feature that makes data highly available. Parity data makes it possible for a RAID group to survive a number of failures without losing user data.

  • If one data disk module fails, the disk array controller can reconstruct the user data from the remaining user data and parity information.

  • If the parity disk fails, the parity information can be recalculated from the data disks.

  • If each disk in a RAID group is bound on a separate internal bus and one bus fails, the disk on it is inaccessible. After the bus fault is cleared, the disk array controller can rebuild the RAID group from the user and parity information stored on the disks on the other buses.

In all three cases, the rebuilt RAID group contains a replica of the information it would have contained had the disk module or bus not failed.

Parity is calculated on each write I/O by performing a serial binary exclusive OR of the data segments in the stripe written to the data disks in the RAID group. For example, in a RAID group of five disk modules, the data segment written on the first disk is exclusive ORed (XORed) with the data segment written on the second disk. That result is XORed with the write segment on the third disk, and that result in turn with the write segment on the fourth disk. The final result, which is the parity of the write segment, is written to the fifth disk of the RAID group.
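
The XOR chain is easy to demonstrate. The short Python sketch below (segment contents are made up for illustration) computes the parity of four data segments, then shows the property that makes rebuilds possible: XORing the surviving segments with the parity reproduces the lost one.

    # Byte-wise XOR of equal-length segments; XOR of all data segments = parity.
    def xor_segments(segments):
        out = bytearray(len(segments[0]))
        for seg in segments:
            for i, b in enumerate(seg):
                out[i] ^= b
        return bytes(out)

    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # segments on the 4 data disks
    parity = xor_segments(data)                    # written to the fifth disk

    # The third data disk fails: rebuild its segment from the survivors plus parity.
    survivors = [data[0], data[1], data[3], parity]
    print(xor_segments(survivors) == data[2])      # True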

RAID-3 and RAID-5 groups maintain parity data that lets a disk group survive one disk failure without losing data. The group can also survive a single internal bus failure if each disk in the RAID group is bound on a separate internal bus.

2.7.9 Global Hot Spares

The availability of all RAID groups in the disk array can be increased by using one or more disks as global hot spares. A global hot spare is a dedicated, online, backup disk used by the disk array as an automatic replacement when a disk in a RAID group fails. Hot spares cannot be used to store user data during normal disk array operations; after all, they are spares. When any disk in a RAID group fails, the disk array controller automatically begins rebuilding the failed disk's structure on a global hot spare. When the controller finishes this rebuild, the disk group functions as usual, using the global hot spare as a replacement for the failed disk.

After the failed disk has been replaced with a new disk, the disk array controller starts copying the data from the former global hot spare onto the new disk. When the copy onto the replacement disk is completed, the disk array controller automatically frees the global hot spare to serve as a global hot spare again.

A global hot spare is most useful when the highest data availability is a requirement. It reduces the risk of a second disk failure. It also eliminates the time and effort needed for an operator to notice that a disk has failed, find a suitable replacement, remove the failed disk, and install the replacement. Multiple global hot spares can be configured in environments where data availability is crucial. This helps ensure that data remains accessible if multiple disks fail in a RAID group.
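
The life cycle just described can be summarized as a small state machine. The states and names below are hypothetical labels for the stages in the text, not an actual array controller's API.

    from enum import Enum, auto

    class SpareState(Enum):
        STANDBY = auto()       # idle; holds no user data
        REBUILDING = auto()    # controller rebuilds the failed disk onto the spare
        IN_SERVICE = auto()    # spare acts as a full member of the RAID group
        COPYING_BACK = auto()  # data is copied to the replacement disk

    state = SpareState.STANDBY
    state = SpareState.REBUILDING     # a group member fails
    state = SpareState.IN_SERVICE     # rebuild completes
    state = SpareState.COPYING_BACK   # the failed disk is physically replaced
    state = SpareState.STANDBY        # copy-back done; the spare is freed again
    print(state)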


