Chapter 5. Monitoring the Disks


Disk technologies provide some level of protection against failure. To reduce or eliminate system downtime caused by disk failures, use technologies such as Redundant Array of Inexpensive Disks (RAID) or mirroring to provide data redundancy. Parity-based RAID levels store parity data so that recovery is possible if a disk fails. Mirroring maintains multiple copies of the data. Three-way mirroring, in particular, provides added data protection during backups, because mirroring continues on the disks that are not being backed up.
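The parity idea can be illustrated with a small sketch. It assumes the simplest scheme, a byte-wise XOR across equal-sized blocks, which is how single-parity RAID levels are commonly described; real arrays do this in the controller or volume manager, and the function name and sample blocks here are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per data disk.
data = [b"disk", b"fail", b"safe"]

# The parity block would be stored on a separate disk.
parity = xor_blocks(data)

# Simulate losing the second disk, then rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

Because XOR is its own inverse, any one missing block can be recomputed from the others. Once a disk has failed, though, no spare information remains, so a second failure before the rebuild completes cannot be recovered this way.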

With data redundancy, data is protected if an initial failure occurs, but that failure may remove the redundancy and leave a single point of failure (SPOF). A subsequent failure may then result in loss of data. So, even though these technologies protect against single failures, prompt notification of such failures is critical to reducing the risk of downtime. Fixing a failure and restoring data redundancy before the next failure occurs is key to maintaining availability. Data redundancy and fault monitoring do not replace the need for a regular backup strategy.

Although RAID and mirroring may protect against a single hardware failure, many other events can affect the availability of data on the disk and increase the risk of downtime. The key to reducing the risk of failure and downtime is to monitor your critical disk resources. You must know which resources to monitor and what tools are available to do that monitoring. This chapter addresses monitoring of disk resources, from the hardware device level up to filesystem resources. It does not cover backup, recovery, or tape storage devices.
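As a concrete illustration of monitoring at the filesystem level, the following minimal sketch reports filesystems whose usage has reached a threshold. It is not a tool from this book: the function names, the 90 percent default, and the monitored path are assumptions chosen for the example, and it relies only on Python's standard `shutil.disk_usage` call.

```python
import shutil


def filesystem_usage_percent(path):
    """Return the percentage of space used on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_filesystems(paths, threshold_percent=90.0):
    """Return (path, percent_used) pairs at or above the threshold."""
    alerts = []
    for path in paths:
        percent = filesystem_usage_percent(path)
        if percent >= threshold_percent:
            alerts.append((path, percent))
    return alerts


if __name__ == "__main__":
    # Hypothetical list of critical mount points; adjust for the system.
    for path, percent in check_filesystems(["/"]):
        print(f"WARNING: {path} is {percent:.1f}% full")
```

A check like this would typically run periodically and feed whatever notification mechanism the site already uses. It covers only filesystem capacity, not the hardware-level faults discussed earlier.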



UNIX Fault Management: A Guide for System Administrators
ISBN: 013026525X
Year: 1999
Pages: 90
