5.1 RAID Levels

     

RAID (Redundant Array of Inexpensive Disks) was first described in a University of California at Berkeley paper in 1987 by Patterson, Gibson, and Katz. (For some excellent papers that make the case for redundant arrays of inexpensive disks, see ftp://ftp.cs.berkely.edu/pub/raid/papers.) Many software and hardware implementations of RAID have tried to work around the problem of disk drives being the slowest and least reliable components in our computing systems. The original RAID levels categorized by Patterson, Gibson, and Katz still hold true today, although some manufacturers have augmented the list with derivations they say are unique and special to them (EMC's RAID level S). The basic RAID levels can be summarized as follows:

  • RAID 0: Disk striping. A number of smaller physical disks are grouped together to form a larger logical drive. Instead of writing data to a single drive until it is full, we use a striping factor, or stripe size, which indicates the size of the blocks written to a single disk. Consecutive blocks are written to successive disks in a round-robin fashion, and the complete collection of individual blocks forms a logical drive presented to the application as a single device. In a hardware implementation, the "application" is the Operating System itself; in a software implementation of RAID 0, a lower-level subsystem such as LVM presents a logical device to an upper-layer application such as a filesystem. The idea is to match the stripe size to the IO size of the application so that successive IOs are serviced by successive disks (a small mapping sketch follows this list). Read/write performance for small IOs is comparable to that of an individual disk. Read/write performance for large sequential IOs can be significantly improved because successive blocks are fetched from successive disks, spreading the burden of IO across multiple spindles. The drawback of RAID 0 is that the loss of a single disk renders the entire logical device unusable.

  • RAID 1: Disk mirroring. A solution where all data written to the first disk is simultaneously written to a second disk. Multiple mirror copies can be maintained, e.g., a three-way mirror is three copies of the data, sometimes referred to as one original and two mirror copies. Ideally there is no concept of an original disk; all disks can be considered originals, with the possibility of any disk being dropped from the configuration. A key consideration is to ensure that reads and writes to individual disks are not blocked by contention, e.g., two disks on the same controller. This leads to solutions where multiple disks are placed on independent controllers (separate interface cards) or connected via a high-capacity non-blocking controller, e.g., a Service Processor, Controller, or Array Control Processor on a disk array. The drawback here is loss of capacity: with a two-way mirror, we have lost 50 percent of the available capacity; with a three-way mirror, we have lost 66 percent of the available capacity (see the capacity sketch after this list).

  • RAID 0/1: Sector-interleaved groups of mirrored disks. In its purest form, RAID 0/1 sector-interleaves a logical drive across all disks in the stripe set (RAID 0). In addition, another complete set of disks is added to the stripe set, and all data on each original disk is mirrored to the newly added disks. By definition, the new disks have the same sector-interleaved pattern as the original disks. This addresses the RAID 0 problem of a single disk failure rendering the entire RAID set unusable, while giving even better read performance because there are additional spindles from which any single block can be read. If the IO capacity of the underlying interface is sufficient, write performance need not be adversely affected, and we do not incur the read-modify-write IO penalty of RAID 2, 3, 4, or 5. Of all the RAID levels, this provides the best performance, but at the price of overall available capacity (a maximum of 50 percent). Some people would say that RAID 1/0 is not necessarily the same as RAID 0/1: in RAID 1/0 the original volume is mirrored but not necessarily striped; it is the mirrored data that is striped. Most people consider RAID 0/1 and RAID 1/0 to be equivalent, although this is technically imprecise. In most software implementations of RAID (e.g., Veritas VxVM), the distinction between RAID 0/1 and RAID 1/0 is clear and marked.

  • RAID 2: Byte striping with multiple check disks storing ECC (error correction code) data, similar to the ECC (a Hamming code) used to detect errors in system memory and disk blocks. Having multiple ECC disks (a configuration of ten data disks and four ECC disks is not uncommon) allows for multiple disk failures without losing access to the underlying data. Read performance is similar to RAID 0. Write performance can be less than optimal, as ALL blocks in a stripe set have to be read in order to recalculate the ECC data. RAID 2 is commonly regarded as overkill because of the low level and sheer amount of ECC data maintained.

  • RAID 3: Byte striping with a single parity disk. This is sometimes regarded as a step back from RAID 2, as parity information is maintained on only a single disk. The stripe size is sometimes calculated by dividing the underlying Operating System/application IO block size by the number of data disks in the stripe set. This makes read performance for large transfers very good, as multiple spindles are used to fetch the data; small reads perform no better than a single disk. Write performance is extremely unfavorable: for a single OS/application block write, all disks must perform an initial read to fetch the data comprising the stripe, the parity information must be recalculated, and then the data and the parity information must be written back out to disk. This is the read-modify-write performance hit associated with RAID 3 (see the parity sketch after this list). It is exacerbated by the fact that the parity information is stored on a single disk, which can become a considerable bottleneck because a single spindle is performing all parity-based IO.

  • RAID 4: Block striping with a single parity disk. RAID 4 is equivalent to RAID 3 but with a stripe size equal to one OS/application block. Read requests of only a few blocks do not require every disk in the stripe set to be involved in the transfer, so several such requests can be processed in parallel, hopefully increasing the overall throughput of the RAID array. Write performance is still an issue, although writing a single block requires that only the current data block and the parity block be read, modified, and written back to disk. Having only a single parity disk still makes the parity disk an IO hot spot.

  • RAID 5: Block striping with parity spread over all disks. RAID 5 alleviates the bottleneck of the dedicated parity disk by spreading the parity information over all disks in the stripe set. Overall, this gives us more disk spindles performing actual IO of data. In a four-disk RAID 4 set, only three disks can be performing IO to actual data; with a four-disk RAID 5 set, we still have only three disks' worth of capacity, but the IO to actual data blocks will, over time, be spread across all four disks, with each disk acting as the parity disk for some of the stripes. Aggregate read performance is better for RAID 5, as is write performance, because we have alleviated the bottleneck of the single parity disk. Overall usable capacity for RAID 2 is usually approximately 70 percent; for RAID 3, 4, and 5 it is normally 70-90 percent, depending on the size of the stripe set (see the capacity sketch after this list).
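The round-robin block mapping described for RAID 0 above can be illustrated with a short sketch. This is a purely hypothetical Python example (the names NUM_DISKS, STRIPE_SIZE, and raid0_map are invented for illustration and are not part of any HP-UX or LVM interface); it only shows how a logical block number might be translated into a (disk, block-on-disk) pair when the stripe size is expressed in blocks.

    # Hypothetical illustration of RAID 0 round-robin block mapping.
    NUM_DISKS = 4       # disks in the stripe set
    STRIPE_SIZE = 64    # blocks written to one disk before moving to the next

    def raid0_map(logical_block):
        """Map a logical block number to (disk index, block on that disk)."""
        stripe_unit = logical_block // STRIPE_SIZE   # which stripe unit overall
        offset = logical_block % STRIPE_SIZE         # offset within that unit
        disk = stripe_unit % NUM_DISKS               # round-robin disk choice
        row = stripe_unit // NUM_DISKS               # full stripe rows before it
        return disk, row * STRIPE_SIZE + offset

    # Consecutive stripe units land on successive disks:
    for lb in (0, 64, 128, 192, 256):
        print(lb, raid0_map(lb))    # disks 0, 1, 2, 3, then back to disk 0

Matching the application IO size to the stripe size means each large sequential transfer is split across all spindles, which is the performance argument made above.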
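The parity arithmetic behind RAID 3, 4, and 5, and the read-modify-write penalty mentioned above, can be sketched the same way. Again, this is an illustrative assumption-laden sketch rather than anything a real array exposes; parity is simply the bytewise XOR of the data blocks in a stripe.

    # Hypothetical illustration of RAID 3/4/5 parity; real arrays do this
    # in firmware or in the volume-management driver.
    def xor_blocks(*blocks):
        """Bytewise XOR of equal-sized blocks."""
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                out[i] ^= byte
        return bytes(out)

    d0, d1, d2 = b"\x11" * 4, b"\x22" * 4, b"\x44" * 4
    parity = xor_blocks(d0, d1, d2)

    # Any one lost block can be rebuilt by XOR-ing the survivors:
    assert xor_blocks(d1, d2, parity) == d0

    # The read-modify-write penalty for a small write: read the old data
    # and old parity, then write the new data and new parity, where
    # new_parity = old_parity XOR old_data XOR new_data.
    new_d1 = b"\x33" * 4
    new_parity = xor_blocks(parity, d1, new_d1)
    assert new_parity == xor_blocks(d0, new_d1, d2)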
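RAID 5's rotation of the parity block across the stripe set can also be shown briefly. The layout below (parity moving one disk to the left per stripe) is only one common convention and is assumed here for illustration; actual placement varies between implementations.

    # Hypothetical left-rotating parity placement for a four-disk RAID 5 set.
    n = 4
    for stripe in range(8):
        parity_disk = (n - 1 - stripe) % n
        print("stripe", stripe, "-> parity on disk", parity_disk)
    # No single spindle carries all the parity traffic, unlike RAID 3/4.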
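Finally, the capacity figures quoted for mirroring and parity RAID follow directly from the number of mirror copies or parity disks. The helper names below are invented for illustration; the arithmetic is the point.

    # Hypothetical capacity arithmetic for the RAID levels above.
    def mirror_usable(copies):
        """RAID 1: only one copy's worth of space is usable."""
        return 1.0 / copies

    def parity_usable(disks):
        """RAID 3/4/5: one disk's worth of space holds parity."""
        return (disks - 1) / disks

    print(mirror_usable(2))    # 0.50 -> two-way mirror loses 50 percent
    print(mirror_usable(3))    # 0.33 -> three-way mirror loses 66 percent
    print(parity_usable(4))    # 0.75 -> four-disk RAID 3/4/5 stripe set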


