6.6 Summary

In this chapter we discussed the following:

  • Most organizations today require at least some measure of fault tolerance in their networks. The degree of resilience offered on any network is a compromise between risk and cost. Mission- or business-critical environments, such as a public network service provider's switching center, require very high levels of availability (say, 99.99 percent or greater) as well as carrier-class equipment.

  • Resilience is best approached from the top down. Potential failures should be identified from the wide area circuit design down to component level. Once these failures are identified, a plan for resolving each failure, together with its associated cost, should be devised.

  • Techniques such as multilink and multipath load sharing are important in the wide area, where backbone links are typically expensive, congested, and critical to the successful operation of the network.

  • Protocols such as VRRP and HSRP enable key devices such as routers and firewalls to be deployed in a fault-tolerant configuration, transparently to end users.

  • HA clusters and fault-tolerant systems both provide effective availability solutions. Each has specific advantages depending upon the environment and problems to be solved. In some cases, a combination of HA clusters and fault tolerance is appropriate.

  • Fault tolerance often goes hand in hand with performance scalability, since the use of parallel paths or multiple systems often has a performance benefit.
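
To put the availability figures in this summary into perspective, the short Python sketch below converts an availability percentage into expected downtime per year and estimates the combined availability of redundant components. It is an illustration only: the function names are hypothetical, a 365-day year is assumed, and the parallel calculation assumes that components fail independently, which real networks rarely satisfy exactly.

```python
# Minimal availability arithmetic (illustrative sketch, not a planning tool).
# Assumption: a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent):
    """Expected downtime per year, in minutes, for a given availability (%)."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def parallel_availability(availabilities):
    """Combined availability of redundant components (fractions, 0 to 1).

    Assumption: failures are independent, so the system is down only
    when every component is down at the same time.
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= (1 - a)
    return 1 - all_down


if __name__ == "__main__":
    # 99.99 percent availability allows roughly 52.6 minutes of downtime a year.
    print(round(downtime_minutes_per_year(99.99), 1))
    # Two independent 99 percent paths in parallel give about 99.99 percent.
    print(round(parallel_availability([0.99, 0.99]), 6))
```

Under these assumptions, two parallel paths that are each 99 percent available reach about the same 99.99 percent figure quoted above, which is one reason redundancy is often a cheaper route to high availability than a single, more reliable component.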


