Chapter 6: Designing Reliable Networks

Overview

Organizations are increasingly reliant on computer networks for business- or mission-critical applications. The scope and size of these networks have expanded so rapidly over the past two decades that considerable effort and expense are now targeted at keeping network resources available, sometimes 24 hours a day, all year. Traditionally, this area of network design has been the preserve of large mainframe sites and those sites requiring high levels of protection (such as nuclear power plants). However, the explosion of Web-based business methods means that many more organizations are now eager to maintain high availability in order to minimize service losses (Table 6.1 illustrates the kinds of losses we could expect by industry type and operation). If the network is poorly designed, and insufficient attention is paid to providing availability in core systems, users can experience anything from slow response times to complete loss of service (referred to as downtime) for extended periods. The technical issues in maintaining high availability are both complex and subtle, and it is the network designer's job to balance loss probability against cost, providing guidance to senior management on the likelihood of failures and their impact on the business.

Table 6.1: Average Downtime Costs by Industry Type and Operation (Source: Dataquest, Perspective, September 30, 1996)

Industry          Operation                          Avg. Cost per Hour of Downtime ($K)
Financial         Brokerage                          6,450.0
Financial         Credit Card/Sales Authorisation    2,600.0
Media             Pay-Per-View                         150.0
Financial         ATM Fees                             145.0
Retail            Home Shopping (TV)                   113.0
Retail            Home Catalog Sales                    90.0
Transportation    Airline Reservations                  89.5
Media             Tele-Ticket Sales                     69.0
Transportation    Package Shipping                      28.0
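To put the hourly figures in Table 6.1 into annual terms, the following minimal sketch multiplies the downtime implied by an availability target by the hourly cost. The availability targets (99%, 99.9%, 99.99%) and the function names are illustrative assumptions, not values or methods taken from the source; only the hourly costs come from the table.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years

# Average cost per hour of downtime in $K, copied from Table 6.1.
COST_PER_HOUR_K = {
    "Brokerage": 6450.0,
    "Credit Card/Sales Authorisation": 2600.0,
    "Airline Reservations": 89.5,
    "Package Shipping": 28.0,
}


def annual_downtime_hours(availability: float) -> float:
    """Hours of downtime per year implied by an availability fraction (0..1)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


def annual_downtime_cost_k(operation: str, availability: float) -> float:
    """Expected yearly downtime cost in $K for one Table 6.1 operation."""
    return annual_downtime_hours(availability) * COST_PER_HOUR_K[operation]


if __name__ == "__main__":
    # Hypothetical availability targets, chosen only for illustration.
    for target in (0.99, 0.999, 0.9999):
        hours = annual_downtime_hours(target)
        cost = annual_downtime_cost_k("Brokerage", target)
        print(f"{target:.2%} available: {hours:6.2f} h/yr, about ${cost:,.0f}K")
```

Under these assumptions, a brokerage operation at 99.9% availability would see roughly 8.76 hours of downtime a year. Because the table reports industry averages, the result is best read as an order-of-magnitude guide rather than a forecast.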

Networks are rarely static environments, and budgets are finite. In practice network designers are required to make a range of pragmatic and technical decisions that address, accept, mitigate, or transfer the risks of failure—all within the constraints of a budget. The designer must also ensure that the solutions provided are scalable, so that additional nodes, services, and capacity can be added without major upheaval and without adversely affecting existing users. Downtime for truly business- and mission-critical systems can equate to losses of millions of dollars per minute; these organizations, therefore, demand high-availability (HA) networks and are often prepared to go to extraordinary lengths to achieve them. HA networks must provide alternate systems and network resources to compensate for critical system or component failures, ideally automatically and with no loss of data.

Failure knows no boundaries in a network design, and the smallest component failure can effectively bring down a whole business without warning (e.g., a failed hard disk controller on your core e-business server could stop all transactions). For practical reasons organizations are invariably broken down into teams responsible for different aspects of IT (desktop support, communications, applications, database, cabling, etc.). When a problem occurs, it is all too common for application staff to blame the network and vice versa. To maintain HA networks, different disciplines must work together, both at the design phase and subsequently. Good diagnostic, monitoring, and management tools can also help.

In practice there is a range of design techniques and vendor solutions available, some standard and some proprietary. Often these techniques are designed to address specific aspects of the design, such as application failure, network failure, system failure, media failure, and component failure. In this chapter we will cover the theoretical assessment of risk and availability, media and topological resilience, load sharing, high-availability protocols, and device and component resilience. Some of the techniques described here are closely related to performance optimization, since by providing resilience through parallel systems or parallel paths we are often able to make use of this extra capacity to provide performance scalability. Performance and resilience are, therefore, frequently tightly bound in network design.
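The value of parallel systems or parallel paths can be illustrated with the standard textbook model for combining component availabilities. This is a minimal sketch that assumes component failures are independent, which real networks often violate (shared power, shared ducts, common software faults). The function names and the 99% figures are hypothetical, and the model is not necessarily the one the chapter develops later.

```python
from math import prod


def series_availability(availabilities):
    """Every component must work: multiply the individual availabilities."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """At least one component must work: 1 minus the chance that all fail."""
    return 1.0 - prod(1.0 - a for a in availabilities)


if __name__ == "__main__":
    link = 0.99  # hypothetical availability of a single link

    # Two links in a chain: both must be up, so availability drops.
    print(f"two links in series:   {series_availability([link, link]):.4f}")

    # Two links side by side: either one is enough, so availability rises.
    print(f"two links in parallel: {parallel_availability([link, link]):.4f}")
```

In this model, adding a second 99% path in parallel raises availability to about 99.99%, while chaining the same two links in series lowers it to about 98.01%. This is one reason redundancy is usually concentrated at the points where a single failure would otherwise take down the whole path.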



Data Networks: Routing, Security, and Performance Optimization