24x7 Operations

   


24x7 environments are a major topic in business today. This section provides an insight into the types of issues that need to be addressed when establishing a high-availability system. Covering every aspect and option thoroughly would take an entire book, but the main points are worth discussing here as well.

There is an increasing need for information and application services to be available on a continuous basis, particularly with regard to e-business. Businesses cannot tolerate either planned or unplanned downtime; the only way to provide this level of service is to create a fully operational environment that is available 24 hours a day, 7 days a week.

The continuous environment places a significantly higher demand on the computer systems because they must be constantly available to deliver the level of service that is required. Several issues must be considered when creating a 24x7 working environment, and the major ones are discussed here.

Solaris and Other Platforms

In this chapter, the book begins to focus on Solaris, whereas the preceding chapters largely addressed general system-management topics. It should be noted, however, that most of the principles outlined here are equally applicable to other platforms, although any solutions provided are specifically tailored for the Solaris and Sun environment.


Configuration of the Environment

The configuration of the 24x7 environment is normally determined by two major factors: the level of criticality attached to the requirement, and money, although not always in that order of priority!

First, the critical nature of some services dictates how the 24x7 environment is to be established: the level of replication required, whether to use a single site or to utilize remote sites for added resilience, and so on. The business needs to recognize the importance of the application being run and the effect of not being able to run it; this is often the deciding factor. A true 24x7 computing environment cannot be achieved using a single system because it creates a single point of failure; there must be at least two systems, and, realistically, they have to be at separate sites. This is because a major disaster, such as fire or flood, could still bring the operation to an abrupt halt if all the systems were housed in the same location.
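The effect of adding a second system can be illustrated with a little arithmetic. The following is a minimal sketch, not a sizing method: the function names and the 99% per-system figure used with it are illustrative assumptions, and the formula assumes that systems fail independently of one another. That assumption breaks down when all systems share one site, which is exactly why separate sites matter.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(per_system: float, systems: int) -> float:
    """Availability of a service that stays up while at least one of
    `systems` redundant systems is up.

    Assumes failures are independent (an assumption, not a guarantee:
    a fire or flood at a shared site takes out every system at once).
    """
    if not 0.0 <= per_system <= 1.0:
        raise ValueError("per_system must be between 0 and 1")
    if systems < 1:
        raise ValueError("systems must be at least 1")
    # The service is down only when every system is down at the same time.
    return 1.0 - (1.0 - per_system) ** systems


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    for count in (1, 2):
        a = combined_availability(0.99, count)  # 0.99 is an assumed figure
        print(f"{count} system(s): availability {a:.4%}, "
              f"about {expected_downtime_hours(a):.1f} hours down per year")
```

Under these assumptions, a single system at 99% is expected to be down for roughly 87.6 hours a year, whereas two such systems together are down for less than one hour.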

Sun Cluster software addresses exactly this type of scenario by allowing a number of computer systems to be "clustered" together so that they collectively provide the overall service to the customer. This provides the required availability because, if one computer fails, the others in the cluster automatically pick up the work that it was carrying out. Sun Cluster also allows elements of the cluster to be separated by up to 10 kilometers, delivering a high level of contingency against disaster.
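The failover behavior described above can be pictured with a toy model. This sketch does not use or imitate the Sun Cluster interfaces; the `Node` and `Cluster` classes, the service names, and the "least-loaded survivor" placement rule are all hypothetical and serve only to show the idea of surviving nodes picking up a failed node's work.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Node:
    """One system in the (hypothetical) cluster."""
    name: str
    healthy: bool = True
    services: List[str] = field(default_factory=list)


class Cluster:
    """Toy cluster that reassigns services when a node fails."""

    def __init__(self, nodes: Iterable[Node]) -> None:
        self.nodes = list(nodes)

    def fail(self, name: str) -> None:
        """Mark a node as failed and move its services to the survivors."""
        failed = next(n for n in self.nodes if n.name == name)
        failed.healthy = False
        survivors = [n for n in self.nodes if n.healthy]
        if not survivors:
            raise RuntimeError("no healthy nodes left; the service is down")
        orphaned, failed.services = failed.services, []
        for service in orphaned:
            # Assumed placement rule: give each service to the survivor
            # currently running the fewest services.
            target = min(survivors, key=lambda n: len(n.services))
            target.services.append(service)

    def placement(self) -> Dict[str, List[str]]:
        """Return which services each node is currently running."""
        return {n.name: list(n.services) for n in self.nodes}


if __name__ == "__main__":
    cluster = Cluster([
        Node("a", services=["db", "web"]),
        Node("b", services=["mail"]),
        Node("c"),
    ])
    cluster.fail("a")
    print(cluster.placement())
```

A real cluster product also has to detect the failure, fence the failed node, and remount shared storage; none of that is modeled here.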

A further interesting aspect of this software is that it supports the Sun StorEdge disk configurations, allowing all of the critical data to be housed independently of any one system, which adds to the resilience. This is achieved through the use of a storage area network (SAN). A SAN is a dedicated network of storage devices, normally with its own server, that is part of the overall computer network but independent of any of the systems that make use of its resources. This means that if any of the clustered systems mentioned previously were to fail, the data would still be available to the remaining members of the cluster.

The cluster solution is fully scalable, which means that the systems comprising the cluster do not all have to be the same. For example, a four-system cluster could include an Enterprise 10000 server, two Enterprise 5000 machines, and one Enterprise 3000 server. This enables companies to continue using previous-generation systems, which can be highly cost-effective when attempting to finance the operation. Clustering solutions are discussed again in Chapter 7, "Disaster Recovery and Contingency Management," as part of disaster recovery. Figure 5.1 shows a simple diagram of a possible clustered environment.

Figure 5.1. The clustered environment provides added resilience and high availability, coupled with automatic failover facilities.


Budgetary Considerations

An extremely important question that the system manager must ask is this: "Who's paying for all this?" Managing and supporting a 24x7 environment is not cheap; indeed, it costs significantly more than just running a "business hours" operation. These costs cover not only the computer hardware and software needed to achieve high availability, but also additional accommodation (especially when multiple sites are being used), extra staff to support the operation, and, of course, extra management to manage it. Somewhere, someone is going to have to pay for it, and it can't be the system manager. Normally, the requesting department (that is, the person or department that originally asked for the operation) is the best place to start. Companies using the chargeback system, described in Chapter 2, "The IT Budget," should be capable of charging the customer for the services being provided, in this case a 24-hour computing environment.

Many system administrators and system managers come under pressure to deliver this level of service without being provided with any additional funding. However, this simply isn't possible without lowering standards of service or quality elsewhere, and that merely shifts the problem rather than solving it.

Staff Considerations

Manning a 24x7 operation involves additional staff: It is not possible to stretch the existing manpower levels to cater to a nonstop operation. Not only must the system administration and support staff be considered, but the management must as well. There is likely to be a requirement for at least one more system manager, who will be responsible for the hours outside the normal business day. Many organizations employ a formal shift roster, usually with either three or four shifts needed to provide the required level of support. It is important to remember that even shift workers and system managers need to have vacations, just like everyone else, and they are also sometimes ill, so any staffing requirements must account for expected attendance levels. Of course, this is expensive because the number of staff needed for a 24x7 operation will be at least double that of a business hours-only operation (double rather than triple because the out-of-hours shift size can often be reduced). Greater use of automation and management tools can further reduce the number of staff needed, but there is no escaping the fact that each shift must be manned and that attendance factors must be considered, too.
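The staffing arithmetic can be sketched as a rough estimate. Everything in this example is an assumption made for illustration: the function name, the five-day working week, the 85% attendance rate (to allow for vacations and illness), and the shift sizes. It is a starting point for discussion, not a rostering method.

```python
import math
from typing import Sequence


def required_headcount(
    seats_per_shift: Sequence[int],
    days_covered_per_week: int = 7,
    workdays_per_person: int = 5,
    attendance_rate: float = 0.85,
) -> int:
    """Estimate the staff needed to keep every shift seat filled.

    seats_per_shift: people needed on duty in each daily shift.
    attendance_rate: assumed fraction of scheduled days a person actually
        attends, after vacations and illness.
    """
    if not 0.0 < attendance_rate <= 1.0:
        raise ValueError("attendance_rate must be in (0, 1]")
    if workdays_per_person < 1 or days_covered_per_week < 1:
        raise ValueError("day counts must be at least 1")
    # Seat-days that must be covered each week.
    seat_days = sum(seats_per_shift) * days_covered_per_week
    # Seat-days one person can be expected to cover each week.
    person_days = workdays_per_person * attendance_rate
    return math.ceil(seat_days / person_days)


if __name__ == "__main__":
    # Business hours only: one shift of four people, five days a week.
    office_hours = required_headcount([4], days_covered_per_week=5)
    # 24x7: three shifts, with reduced out-of-hours shifts (assumed sizes).
    round_the_clock = required_headcount([4, 2, 2])
    print(f"business hours: {office_hours} staff")
    print(f"24x7:           {round_the_clock} staff")
```

With these assumed figures the estimate comes to 5 staff for business hours and 14 for 24x7 cover, which is consistent with "at least double" even when the out-of-hours shifts are half the size of the day shift.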


   


Solaris System Management
Solaris System Management (New Riders Professional Library)
ISBN: 073571018X
EAN: 2147483647
Year: 2001
Pages: 101
Authors: John Philcox
