A server cluster is a group of independent nodes that work together as a single system. They share a common cluster database that enables recovery in the event of the failure of any node. A server cluster uses a jointly connected resource, generally a disk array on a shared SCSI bus or Fibre Channel, which is available to all nodes in the cluster. Each Windows 2000 Advanced Server node in the cluster must have access to the array, and each node in the cluster must be able to communicate at all times with the other nodes in the cluster.
Windows 2000 supports server clusters only on machines running Advanced Server or Datacenter Server. Additionally, as shipped, Advanced Server supports only two-node clusters using a shared disk resource, through either Fibre Channel or a shared SCSI bus. Both nodes of the cluster must be running TCP/IP for networking and should have at least one dedicated network interconnect available. To avoid a single point of failure, a second network interconnect is highly recommended.
To understand and implement server clusters, it is important to understand several new concepts and their ramifications, as well as specialized meanings for certain terms.
A cluster has two distinct types of networks: the private network that's used to maintain communications between nodes in the cluster and the public network that clients of the cluster use to connect to the services of the cluster. Each of these networks can share the same network card and physical network cabling, but it is a good practice to keep them separate. Having them separate gives you an alternate path for interconnection between the nodes of the cluster. Because the interconnect between the nodes of a cluster is a potential single point of failure, it should always be redundant. The cluster service uses all available networks, both private and public, to maintain communications between nodes.
Always Have at Least Two Interconnects
If you have only a single method of communication in a cluster, the failure of that interconnect has a 50 percent chance of making the entire cluster unavailable to its clients. That is hardly why you opted for a highly available technology like clustering. Here's what happens when the two nodes of a cluster can no longer communicate. When communications fail, each node recognizes that it can no longer talk to the other node and concludes that the other node has failed. Each node therefore attempts to take over the functions of the cluster by itself. The nodes are now "partitioned," and as each node attempts to take over the functions of the entire cluster, it starts by trying to gain control of the quorum resource (discussed later in the section entitled Types of Resources) and thus the shared disk on which the quorum resides. Whichever node fails to gain control of the quorum resource is automatically shut down, while the other node carries on the processes of the cluster. However, because either node has an equal chance of gaining control of the quorum resource, there's a 50 percent chance that the node with the failed network card wins, leaving all the services of the cluster unavailable.
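The arbitration described above can be sketched in a few lines. This is a minimal, illustrative model, not the actual SCSI reserve/release protocol the Cluster service uses; the class and function names are hypothetical, and a thread lock stands in for the disk reservation on the quorum device.

```python
import threading

class QuorumDisk:
    """Stand-in for the reserve/release arbitration on the quorum disk."""
    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None

    def try_reserve(self, node):
        # Only one node's reservation can succeed; the loser must shut down.
        if self._lock.acquire(blocking=False):
            self.owner = node
            return True
        return False

def on_partition(node, quorum):
    """What each partitioned node does when it can no longer see its peer."""
    if quorum.try_reserve(node):
        return f"{node}: won arbitration, taking over all cluster groups"
    return f"{node}: lost arbitration, shutting down cluster service"

quorum = QuorumDisk()
print(on_partition("NodeA", quorum))  # first node to arbitrate wins
print(on_partition("NodeB", quorum))  # the other is shut down
```

The key point the sketch captures: arbitration is winner-take-all, and nothing about it knows *which* node actually has the broken network card.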
A node is a member of a server cluster. It must be running Windows 2000 Advanced Server or Datacenter Server and Windows Clustering. It must also be running TCP/IP, must be connected to the shared cluster storage device, and must have at least one network interconnect to the other nodes in the cluster.
Groups are the units of failover. Each group contains one or more resources; should any resource within the group fail, all of them fail over together according to the failover policy defined for the group. A group can be owned by only one node at a time, and all resources within the group run on that node. If a resource within the group fails and must be moved to an alternate node, all other resources in that group move with it. When the cause of the failure on the originating node is resolved, the group fails back to its original location, based on the failback policy for the group.
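The group-as-unit-of-failover rule can be modeled directly. This is a hedged sketch, not the Cluster service's implementation; the `Group` class, its resource names, and the `failback` flag are all illustrative.

```python
class Group:
    """A failover group: all resources in it move between nodes together."""
    def __init__(self, name, resources, owner, failback=True):
        self.name = name
        self.resources = resources
        self.owner = owner
        self.preferred = owner      # the node the group normally runs on
        self.failback = failback    # failback policy for the group

    def fail_over(self, surviving_node):
        # A single resource failure moves every resource in the group.
        self.owner = surviving_node
        return [(r, surviving_node) for r in self.resources]

    def fail_back(self):
        # When the preferred node recovers, the failback policy decides
        # whether the group returns to it or stays where it is.
        if self.failback:
            self.owner = self.preferred
        return self.owner

g = Group("SQL Group",
          ["Disk X:", "IP Address", "Network Name", "SQL Service"],
          owner="NodeA")
print(g.fail_over("NodeB"))  # every resource now lands on NodeB
print(g.fail_back())         # policy returns the group to NodeA
```

Note that `fail_over` has no way to move only one resource: that is the point of the group abstraction.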
Any physical or logical entity that can be brought online or taken offline can be a server cluster resource. A resource can be owned by only one node at a time and is managed as part of the cluster. The quorum resource is a special resource: it is the repository of the cluster's configuration data and the recovery logs that allow the cluster to recover in the event of a failure. The quorum resource must be controllable by a single node, it must provide physical storage for the recovery logs and cluster database, and it must use the NTFS file system. The only quorum resource type shipped with Windows 2000 is the Physical Disk resource (this and other resource types are described in the next section), but it is possible that other quorum resource types will be developed and certified by third parties.
Windows 2000 Advanced Server includes several different resource types; the sections that follow examine each of these resource types and the role they play in a server cluster. The available cluster resource types are Physical Disk, DHCP, WINS, Print Spooler, File Share, Internet Protocol, Network Name, Generic Application, and Generic Service.
The Physical Disk resource type is the fundamental resource type, required as a minimum for all server clusters. It is used for the quorum resource, which determines which node in the cluster controls all other resources, and it is used to manage a shared cluster storage device. A Physical Disk resource has the same drive letter on all nodes of the cluster.
The DHCP service provides IP addresses and various other TCP/IP settings to clients, and the WINS service provides dynamic resolution of NetBIOS names to IP addresses. Both can be run as a resource of the cluster, providing for high availability of these critical services to network clients. For failover to work correctly, the DHCP and WINS databases must reside on the shared cluster storage.
The Print Spooler resource type lets you cluster print services, making them fault tolerant and saving a tremendous number of help desk calls when the print server fails. It also prevents the problem of people simply clicking the Print button over and over when there's a problem, resulting in a very long and repetitious print queue.
To be clustered, a printer must be connected to the server through the network. Obviously, you can't connect the printer to a local port such as a parallel or USB port directly attached to one of the nodes of the cluster. The client can address the printer either by name or by IP address, just as it would a nonclustered printer on the network.
In the event of a failover, all jobs that are currently spooled to the printer are restarted. Jobs that are in the process of spooling from the client are discarded.
You can use a server cluster to provide a high-availability file server using the File Share resource type. The File Share resource type lets you manage your shared file systems in three different ways: as a standard file share in which only the top-level folder is visible as a share name, as shared subfolders in which the top-level folder and each of its immediate subfolders are shared with their own names, or as a stand-alone Distributed file system (Dfs) root.
The Internet Protocol resource type is used to manage the IP addresses of the cluster. When an Internet Protocol resource is combined with a Network Name resource and one or more applications, you can create a virtual server. Virtual servers allow clients to continue to use the same name to access the cluster even after a failover has occurred. No client-side management is required, because, from the client perspective, the virtual server is unchanged.
The Generic Application resource type allows you to manage regular, cluster-unaware applications in the cluster. A cluster-unaware application that is to be used in a cluster must, at a minimum, use TCP/IP to communicate with its clients, be able to store its data in a configurable location (so that the data can reside on the shared cluster storage), and allow clients to retry and reconnect after a transient failure.
When you install a generic, cluster-unaware application, you have two choices: you can install it onto the shared cluster storage, or you can install it individually on each node of the cluster. The first method is certainly easier because you install the application only once for the whole cluster. However, if you use this method you won't be able to perform a rolling upgrade of the application, because it appears only once. (A rolling upgrade is an upgrade of the application in which the workload is moved to one server while the application on the other server is upgraded and then the roles are reversed to upgrade the first server.)
To give yourself the ability to perform rolling upgrades on the application, you need to install a copy onto each node of the cluster. You need to place it in the same folder and path on each node. This method uses more disk space than installing onto the shared cluster storage, but it permits you to perform rolling upgrades, upgrading each node of the cluster separately.
Finally, server clusters support one additional type of resource—the Generic Service resource. This is the most basic resource type, but it does allow you to manage your Windows 2000 services as a cluster resource.
Windows 2000 server clusters allow you to define the failover and failback policies for each group or virtual server. This ability enables you to tune the exact behavior of each application or group of applications to balance the need for high availability against the overall resources available to the cluster in a failure situation. Also, when the failed node becomes available again, your failback policy determines whether the failed resource is immediately returned to the restored node, is maintained at the failed-over node, or migrates back to the restored node at some predetermined point in the future. These options allow you to plan for the disruption caused when a shift in node ownership occurs, limiting the impact by timing it for off-hours.
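The failback choices described above (return immediately, stay put, or migrate back during a predetermined window) amount to a simple policy check. The sketch below is illustrative only; the policy encoding and the function name are assumptions, not the actual cluster administration API, which expresses the same idea through per-group failback settings and allowed failback hours.

```python
from datetime import time

def failback_allowed(policy, now):
    """Decide whether a recovered node may take its group back.

    policy is one of:
      "prevent"                   -- group stays on the failed-over node
      "immediate"                 -- group returns as soon as the node is back
      ("window", start, end)      -- group returns only during off-hours
    (Hypothetical encoding for illustration.)
    """
    if policy == "prevent":
        return False
    if policy == "immediate":
        return True
    _, start, end = policy
    return start <= now <= end

# Fail back only between 1 A.M. and 5 A.M. to limit disruption.
print(failback_allowed(("window", time(1), time(5)), time(3)))   # True
print(failback_allowed(("window", time(1), time(5)), time(14)))  # False
```

Scheduling failback for off-hours trades a second, planned interruption for control over when it happens.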
When planning your server cluster, you'll need to think ahead to what your goal is for the cluster and what you can reasonably expect from it. Server clusters provide for extremely high availability and resource load balancing, but you need to make sure your hardware, applications, and policies are appropriate.
The most common cluster configuration is static load balancing. In this scenario, the cluster is configured so that some applications or resources are normally hosted on one node while others are normally hosted on another node. If one node fails, the applications or resources on the failed node fail over to the other node, providing high availability of your resources in the event of failure and balancing the load across the cluster during normal operation. The limitation of this configuration is that in the event of a failure, your applications will all attempt to run on a single node, and you will need to implement procedures either to limit the load by reducing performance or availability, or to not provide some less-critical services during a failure. Another possibility for managing the reduced load-carrying capacity during a failure scenario is to have "at risk" users and applications that can be shut off or "shed" during periods of reduced capacity, much like power companies do during peak load periods when their capacity is exceeded.
When you configure your cluster for static load balancing, it's important to take steps quickly to manage load during periods of failure. Failure to shed load can lead to catastrophic failure, or to a slowdown so extreme that it amounts to the same thing, and then no one will have access to the resources and applications of the cluster.
The cluster configuration with the highest availability and reliability for critical applications is to run one node of the cluster as a "hot spare." This scenario requires that each server node be sufficiently powerful to run the entire load of the cluster by itself. You then configure all the applications and resources to run on a single node, with the other node sitting idle. In the event of failure on the primary node, the applications fail over to the idle node and continue with full capability. After the primary node is back online it can continue as the new hot spare, or you can force the applications back to the primary node, depending on the needs of your environment.
This scenario provides full and complete fault tolerance in the event of the failure of one of the nodes, but it has the greatest hardware cost. Use this configuration only when your applications or resources are critical enough that the extra hardware expense is easier to accept than any reduction in capacity during a failure.
Another cluster configuration is called "load shedding" or partial failover. In this configuration, critical applications and resources are designed to fail over to the other node in the cluster in the event of a failure, but noncritical applications and resources will be unavailable until the cluster is back to full functionality. The critical resources and applications are thus protected in a failure situation but noncritical ones simply run as though they were on a stand-alone server.
In this configuration, you might, depending on capacity and load conditions, have to configure the noncritical applications and resources on both nodes to be unavailable in the event of a failure on either node. This allows you to maintain a high level of performance and availability of your most critical applications while shedding the load from less critical applications and services when necessary. This strategy can be very effective when you must, for example, service certain critical applications or users under any and all circumstances but can allow other applications and users with a lower priority to temporarily fail.
You can create a server cluster that has only a single node, which allows you to take advantage of the virtual server concept to simplify the management and look of the resources on your network. Having a single node doesn't give you any additional protection against failure or any additional load balancing over that provided by simply running a single stand-alone server, but it allows you to easily manage groups of resources as a virtual server.
This scenario is an effective way to stage an implementation. You create the initial virtual server, putting your most important resources on it in a limited fashion. Then, when you're ready, you add another node to the server cluster and define your failover and failback policies, giving you a high-availability environment with minimal disruption to your user community. In this scenario, you can space hardware purchases over a longer period while providing services in a controlled test environment.
Capacity planning for a server cluster can be a complicated process. You need to thoroughly understand the applications that will be running on your cluster and make some hard decisions about exactly which applications you can live without and which ones must be maintained under all circumstances. You'll also need a clear understanding of the interdependencies of the resources and applications you'll be supporting.
The first step is to identify your groups or virtual servers. Make a comprehensive list of all the applications in your environment, and then determine which ones need to fail over and which ones can be allowed to fail but should still run on a virtual server.
Next determine the dependencies of these applications and what resources they need to function. This information allows you to group dependent applications and resources in the same group or virtual server. Keep in mind that a resource can't span groups, so if multiple applications depend on a resource, such as a Web server, they must all reside in the same group or on the same virtual server as the Web server and thus share the same failover and failback policies.
A useful mechanism for getting a handle on your dependencies is to list all your applications and resources and draw a dependency tree for each major application or resource. This helps you visualize not only the resources that your application depends on directly, but also the second- and third-order dependencies that might not be obvious at first glance. For example, a cluster that is used as a high-availability file server uses the File Share resource, and it makes perfect sense that this File Share resource depends on the Physical Disk resource. It also depends on the Network Name resource, and the Network Name resource in turn depends on the Internet Protocol resource. Thus, although the File Share resource isn't directly dependent on the Internet Protocol resource, when you draw the dependency tree you will see that they all need to reside in the same group or on the same virtual server. Figure 16-2 illustrates this dependency tree.
Figure 16-2. The dependency tree for a File Share resource.
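Because a resource can't span groups, the set of resources that must share a group is the transitive closure of the dependency tree. The walk can be sketched as follows; the dictionary encoding of the tree is an illustration, not how the cluster stores dependencies.

```python
def full_dependencies(resource, depends_on):
    """Walk the dependency tree to find every resource that must live in
    the same group, including second- and third-order dependencies."""
    seen = set()
    stack = [resource]
    while stack:
        r = stack.pop()
        for dep in depends_on.get(r, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

# The dependency tree from the file-server example: File Share depends on
# Physical Disk and Network Name; Network Name depends on the IP address.
deps = {
    "File Share": ["Physical Disk", "Network Name"],
    "Network Name": ["IP Address"],
}
print(sorted(full_dependencies("File Share", deps)))
# ['IP Address', 'Network Name', 'Physical Disk']
```

The IP address shows up in the result even though the File Share never references it directly, which is exactly the kind of hidden dependency the tree is meant to expose.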
Finally, as you're determining your cluster capacity, you need to plan for the effect of a failover. Each server must have sufficient capacity to handle the additional load imposed on it when a node fails and it is required to run the applications or resources that were owned by the failed node.
The disk capacity for the shared cluster storage must be sufficient to handle all the applications that will be running in the cluster as well as to provide the storage that the cluster itself requires for the quorum resource. Be sure to provide enough RAM and CPU capacity on each node of the cluster so that the failure of one node won't overload the other node to the point that it too fails. You can manage this possibility to some extent by determining your real service requirements for different applications and user communities and reducing the performance or capacity of those that are less essential during a failure. However, such planned load shedding might not be sufficient and frequently takes a significant amount of time to accomplish, so give yourself some margin to handle the initial surge during failover.
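The sizing rule above, that each surviving node must absorb the failed node's critical load with margin left for the failover surge, can be expressed as a quick back-of-the-envelope check. All names and numbers here are illustrative assumptions (loads expressed as fractions of one node's capacity, a 20 percent headroom figure), not values from any real planning tool.

```python
def surviving_node_ok(nodes, loads, failed):
    """Check that every surviving node can absorb the failed node's
    critical load while keeping headroom for the initial failover surge."""
    HEADROOM = 0.2  # keep 20% spare until planned load shedding kicks in
    extra = loads[failed]["critical"]  # only critical groups fail over
    return all(
        loads[n]["normal"] + extra <= (1 - HEADROOM) * loads[n]["capacity"]
        for n in nodes if n != failed
    )

# Hypothetical two-node cluster, statically load balanced:
loads = {
    "NodeA": {"normal": 0.40, "critical": 0.30, "capacity": 1.0},
    "NodeB": {"normal": 0.35, "critical": 0.25, "capacity": 1.0},
}
print(surviving_node_ok(["NodeA", "NodeB"], loads, failed="NodeA"))  # True
```

If the check fails for either node's failure, you either need bigger nodes or a more aggressive load-shedding plan.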