In this chapter we discussed the following:
Most organizations today require at least some measure of fault tolerance in their networks. The degree of resilience offered on any network is a compromise between risk and cost. For mission- or business-critical applications, such as a public network service provider's switching center, very high levels of availability (say, 99.99 percent or greater) will be required, along with carrier-class equipment.
Resilience is best approached from the top down. Potential failures should be identified at every level, from the wide area circuit design down to individual components. Once these failures are identified, a plan for resolving each one, together with its associated cost, should be devised.
Techniques such as multilink and multipath load sharing are important in the wide area, where backbone links are typically expensive, congested, and critical to the successful operation of the network.
Protocols such as VRRP and HSRP enable key devices such as routers and firewalls to be deployed in a fault-tolerant configuration, transparently to end users.
HA clusters and fault-tolerant systems both provide effective availability solutions. Each has specific advantages depending upon the environment and problems to be solved. In some cases, a combination of HA clusters and fault tolerance is appropriate.
Fault tolerance often goes hand in hand with performance scalability, since the use of parallel paths or multiple systems often has a performance benefit as well.
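To put the availability figures above in concrete terms, the short sketch below converts an availability target into expected downtime per year and shows how placing components in parallel raises overall availability. It is a minimal illustration, not a design tool: the function names are hypothetical, failures are assumed to be independent, and the 99 percent per-device figure is an assumed example value rather than a measurement.

```python
# Minimal availability arithmetic. Assumes independent failures and
# ignores leap years; all function names are illustrative.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def annual_downtime_minutes(availability: float) -> float:
    """Expected downtime per year, in minutes, for a fractional availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def series_availability(*parts: float) -> float:
    """Every part must work, so the individual availabilities multiply."""
    result = 1.0
    for a in parts:
        result *= a
    return result


def parallel_availability(*parts: float) -> float:
    """Any one part suffices, so the group fails only if every part fails."""
    all_fail = 1.0
    for a in parts:
        all_fail *= (1.0 - a)
    return 1.0 - all_fail


if __name__ == "__main__":
    # A 99.99 percent target allows roughly 52.6 minutes of downtime a year.
    print(f"{annual_downtime_minutes(0.9999):.2f} minutes/year")

    # Two devices, each assumed 99 percent available (hypothetical figure):
    print(f"in series:   {series_availability(0.99, 0.99):.4f}")
    print(f"in parallel: {parallel_availability(0.99, 0.99):.4f}")
```

Under these assumptions, two 99 percent devices in parallel reach about 99.99 percent, while the same two devices in series fall to about 98 percent. In practice failures are rarely fully independent (shared power, shared software faults), so the parallel figure should be read as an upper bound.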