6.1 Terminology

I need to introduce some terminology before we can continue to discuss RAID:

  • Since a disk array involves multiple physical disks, I use the term logical volume for a volume composed of multiple disks, to distinguish it from a volume that resides on a single physical disk.

  • The Mean Time Between Failures (MTBF) refers to the average time between failures of a component. For example, the MTBF of a given disk drive might be specified at 500,000 hours, meaning that the typical disk of that model will run for half a million hours before failing. Since this is nothing more than a statistical measure, an individual disk drive might well fail fifty (or one million) hours into service. These component-based measurements can be applied to the whole system by dividing the component MTBF by the number of components; for example, a 9 TB array composed of one thousand 9 GB disks with an MTBF of 1.2 million hours each would be expected to have an aggregate MTBF of 1,200 hours (about fifty days).

    MTBF(array) = MTBF(disk) / N = 1,200,000 hours / 1,000 disks = 1,200 hours

  • The Mean Time To Data Loss (MTTDL) gives the amount of time that a system can operate before it suffers a failure sufficient to cause data loss. If all the components operate independently and there is no redundancy, the MTTDL is no better than the MTBF of the least reliable component; however, designers go to great lengths to increase the MTTDL even though the MTBF numbers for the components stay constant. For example, a mirrored pair of disks won't suffer data loss unless both drives fail, which is much less likely than one disk failing. It is also critical to note that a high MTTDL does not imply continuous availability: a host controller failure might render a mirrored disk pair inaccessible even though the data is still completely intact.

  • The Mean Time To Data Inaccessibility (MTTDI) is not formally defined; however, it is increasingly important to be able to always access your data, not just protect it from loss. The MTTDI gives the amount of time that a system can operate before it suffers a failure sufficient to make data unavailable.

  • While we won't discuss it here, the Mean Time To Repair (MTTR) is often an important metric, and certainly something that should be monitored. As we'll see, the loss of a disk with some disk array schemes can result in degraded performance and susceptibility to data loss; knowing how long it takes your system to recover from a failed disk (including the time spent physically replacing the failed disk) can be quite useful.
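
The arithmetic behind these terms can be sketched in a few lines of Python. This is a minimal illustration, not a reliability model: the function names are hypothetical, the aggregate calculation assumes identical disks that fail independently, and the mirrored-pair estimate uses a common textbook approximation (MTBF squared over twice the MTTR, reasonable only when the MTTR is much smaller than the MTBF) that does not appear in the text above. The 24-hour repair time is likewise an assumed figure.

```python
def aggregate_mtbf(component_mtbf_hours: float, count: int) -> float:
    """Aggregate MTBF of `count` identical, independent components.

    Any single component failure counts as a failure of the whole system.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return component_mtbf_hours / count


def mirrored_pair_mttdl(disk_mtbf_hours: float, mttr_hours: float) -> float:
    """Rough MTTDL of a two-disk mirror (assumed approximation).

    Data is lost only if the second disk fails while the first is still
    being repaired, which gives roughly MTBF^2 / (2 * MTTR).
    """
    if mttr_hours <= 0:
        raise ValueError("mttr_hours must be positive")
    return disk_mtbf_hours ** 2 / (2 * mttr_hours)


if __name__ == "__main__":
    # The example from the text: one thousand disks at 1.2 million hours each.
    array_hours = aggregate_mtbf(1_200_000, 1000)
    print(array_hours, "hours, or", array_hours / 24, "days")

    # A mirrored pair of 500,000-hour disks with an assumed 24-hour repair time.
    print(mirrored_pair_mttdl(500_000, 24), "hours")
```

The second function also shows why the MTTR matters: halving the repair time doubles the estimated MTTDL of the mirror, even though the MTBF of each disk is unchanged.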



System Performance Tuning, 2002