Planning, Implementing, and Maintaining Highly Available Servers


When we discuss highly available solutions, we can look at the problem in two distinctly different ways: hardware and software. Windows Server 2003 provides two types of software-based high availability: clustering and network load balancing (NLB).

Clustering is accomplished when you take a group of independent servers and put them together into one large collective entity that is accessed as if it were a single system. Incoming requests for service can be evenly distributed across multiple cluster members or may be handled by one specific cluster member.

The Microsoft Cluster Service (MSCS) in Windows Server 2003 provides highly available, fault-tolerant systems through failover. When one of the cluster members (nodes) is unable to respond to client requests, the remaining cluster members respond by distributing the load among themselves, thus answering all existing and new connections and requests for service. In this way, clients see little, if any, disruption in the service being provided by the cluster. Cluster nodes are kept aware of the status of other cluster nodes and their services through the use of heartbeats. Heartbeats track the status of each node and also carry updates to the cluster configuration. Clustering is most commonly used for database, messaging, and file/print servers. Windows Server 2003 supports up to eight nodes in a cluster.
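Heartbeat-driven failover can be pictured with a short sketch. This is a conceptual model only: the interval, miss limit, class layout, and the way groups are redistributed are assumptions made for illustration, not the actual MSCS implementation or its defaults.

```python
import time

# Illustrative values only; not the real MSCS heartbeat settings.
HEARTBEAT_INTERVAL = 1.2   # seconds between heartbeats
MISSED_LIMIT = 5           # missed heartbeats before a node is presumed failed

class Node:
    def __init__(self, name, groups):
        self.name = name
        self.groups = list(groups)        # cluster groups this node currently owns
        self.last_heartbeat = time.time()
        self.online = True

def receive_heartbeat(node):
    """Record a heartbeat (which, conceptually, also carries configuration updates)."""
    node.last_heartbeat = time.time()

def check_cluster(nodes):
    """Mark silent nodes as failed and move their groups to surviving nodes."""
    now = time.time()
    survivors = [n for n in nodes if n.online]
    for node in list(survivors):
        if now - node.last_heartbeat > HEARTBEAT_INTERVAL * MISSED_LIMIT:
            node.online = False
            survivors.remove(node)
            if not survivors:
                break                      # no node left to take over the load
            # Spread the failed node's groups across the remaining nodes.
            for i, group in enumerate(node.groups):
                target = survivors[i % len(survivors)]
                target.groups.append(group)
                print(f"Failing over {group} from {node.name} to {target.name}")
            node.groups.clear()
```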

Windows Server 2003 also provides network load balancing (NLB), in which all incoming connection requests are distributed to members of an NLB cluster using a mathematical algorithm. NLB clustering is best used in situations in which clients can connect to any server in the cluster, such as Web sites, Terminal Services servers, and VPN servers. You can configure how the client interacts with the NLB cluster as well, such as allowing the client to use multiple NLB cluster members during a single connection (acceptable for Web sites) or forcing the client to use the same cluster member for the entire connection period (a necessity for VPN and Terminal Services servers). Windows Server 2003 NLB clusters can contain as many as 32 nodes.

Although you can use both clustering and NLB in your final design, such as in the case of an e-commerce site that uses NLB for front-end Web servers and clustering for back-end SQL servers, you cannot use both technologies on the same server.

When a network load balancing cluster is created, port rules are used to determine what type of traffic is to be load-balanced across the cluster nodes. Within the port rule is the additional option to configure port rule filtering, which determines how the traffic will be load-balanced across each of the cluster nodes.

In an NLB cluster, every cluster node can answer for the cluster's IP address; thus, every cluster node receives all inbound traffic by default. When each node receives the inbound request, it either responds to the requesting client or drops the packet if the client has an existing session in progress with another node. If no port rule is configured to specifically define how traffic on the specific port is to be handled, the request is handled by the cluster node with the lowest configured priority value (the default host). This may result in decreased performance by the NLB cluster as a whole if the traffic is not meant to be or cannot be load-balanced.

Port rules allow you to change this behavior in a deliberate and controlled fashion. Think of port rules as the network load balancing equivalent of a firewall rule set. When you configure port rules to allow traffic on the specific ports you require to reach the NLB cluster and configure an additional rule to drop all packets not meeting any other port rules, you can greatly improve the performance of the NLB cluster by allowing it to drop all packets that are not allowed to be load-balanced. From an administrative and security standpoint, port rules allow for easier server monitoring due to the limited number of ports that must be monitored.
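That firewall-style behavior can be sketched as a simple first-match rule table. The rule layout, field names, and action strings here are assumptions made for illustration, not the actual NLB data structures; the point is only how matching and the trailing "drop everything else" rule work together.

```python
# Each rule covers a port range and names a filtering mode; first match wins in this sketch.
PORT_RULES = [
    {"ports": range(80, 81),   "protocol": "TCP", "mode": "multiple"},  # HTTP, load-balanced
    {"ports": range(443, 444), "protocol": "TCP", "mode": "single"},    # HTTPS, one SSL host
    {"ports": range(0, 65536), "protocol": "ANY", "mode": "disabled"},  # drop everything else
]

def handle_packet(dst_port, protocol):
    """Return the action a cluster node would take for an inbound packet."""
    for rule in PORT_RULES:
        if dst_port in rule["ports"] and rule["protocol"] in (protocol, "ANY"):
            if rule["mode"] == "disabled":
                return "drop"                      # Disable Port Range: discard silently
            if rule["mode"] == "single":
                return "send to designated host"   # Single Host filtering
            return "load-balance across nodes"     # Multiple Host filtering
    return "pass to default host"                  # no rule matched

print(handle_packet(443, "TCP"))   # send to designated host
print(handle_packet(8080, "TCP"))  # drop
```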

You can configure how NLB clusters load-balance traffic across cluster nodes; this process is referred to as filtering. By configuring filtering, you can specify whether only one node or multiple nodes within the NLB cluster are allowed to respond to multiple requests from the same client during a single session (connection). The three filtering modes are as follows:

  • Single Host - When this filtering mode is configured, all traffic that meets the port rule criteria is sent to a specific cluster node. The Single Host filter might be used in a Web site that has only one SSL server; thus, the port rule for TCP port 443 would specify that all traffic on this port must be directed to that one node.

  • Disable Port Range - This filtering mode instructs the cluster nodes to ignore and drop all traffic on the configured ports without any further action. This type of filtering can be used to prevent ports and port ranges from being load-balanced.

  • Multiple Host - The default filtering method, Multiple Host, specifies that all active nodes in the cluster are allowed to handle traffic. When Multiple Host filtering is enabled, the host affinity must be configured. Affinity determines how clients interact with the cluster nodes and varies depending on the requirements of the applications that the cluster is providing. The following three types of affinity can be configured (a sketch showing how each type maps a client to a node follows this list):

    • None - This affinity type allows any node within the cluster to answer each inbound client request, so successive requests from the same client may be handled by different nodes. This type of affinity results in increased speed but is suitable only for providing static content to clients, such as static Web sites and FTP downloads. Typically, no cookies are generated by the applications running on clusters that are configured for this type of affinity.

    • Class C - This affinity type causes all inbound client requests from a particular Class C address space to be directed to a specific cluster node. This type of affinity allows a user's state to be maintained but can be overloaded or fooled if all client requests are passed through a single firewall or proxy server.

    • Single - This affinity type maintains all client requests on the same node for the duration of the session (connection). This type of affinity provides the best support for maintaining user state data and is often used when applications that generate cookies are running on the cluster.
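Here is a rough sketch of how each affinity type might map a client to a node. The node list and the hash function are stand-ins chosen for illustration; Microsoft does not publish NLB's distribution algorithm in this form, so treat this only as a model of the behavior described above.

```python
import hashlib

NODES = ["node1", "node2", "node3", "node4"]

def pick_node(client_ip, client_port, affinity):
    """Choose the node that answers a request, given the configured affinity."""
    if affinity == "none":
        # IP and port both feed the hash, so successive connections from the
        # same client can land on different nodes.
        key = f"{client_ip}:{client_port}"
    elif affinity == "classc":
        # All clients in the same /24 (Class C) network stick to one node.
        key = ".".join(client_ip.split(".")[:3])
    elif affinity == "single":
        # Each client IP sticks to one node for the whole session.
        key = client_ip
    else:
        raise ValueError("unknown affinity")
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return NODES[digest % len(NODES)]

print(pick_node("203.0.113.25", 50001, "single"))   # always the same node for this IP
print(pick_node("203.0.113.25", 50002, "none"))     # may differ per connection
```

With Single affinity the key ignores the source port, so every connection from 203.0.113.25 lands on the same node; with None affinity the port participates in the hash, so each new connection may be balanced to a different node.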

The mathematical algorithm used by network load balancing sends inbound traffic to every host in the NLB cluster. The inbound client requests can be distributed to the NLB cluster nodes through one of two methods: unicast or multicast. Although both methods send the inbound client requests to all hosts by sending them to the media access control (MAC) address of the cluster, they go about it in different ways.

When you use the unicast method, all cluster nodes share an identical unicast MAC address. To do so, NLB overwrites the original MAC address of the cluster network adapter with the unicast MAC address that is assigned to all the cluster nodes. When you use the multicast method, each cluster node retains the original MAC address of its cluster network adapter. The cluster network adapter is then assigned an additional multicast MAC address, which is shared by all the nodes in the cluster. Inbound client requests can then be sent to all cluster nodes by using the multicast MAC address.

The unicast method is usually preferred for NLB clusters unless each cluster node has only one network adapter installed in it. Recall that in any clustering arrangement, all nodes must be able to communicate not only with the clients, but also among themselves. Because NLB replaces the MAC address of the cluster network adapter when unicast is used, the cluster nodes cannot use that adapter to communicate among themselves. If only one network adapter is installed in each cluster node, you need to use the multicast method.
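That rule of thumb can be condensed into a tiny decision helper. The function below is purely illustrative and is not part of any NLB tool or API.

```python
def choose_cluster_operation_mode(adapters_per_node: int) -> str:
    """Illustrative only: pick unicast or multicast based on NICs per node."""
    if adapters_per_node >= 2:
        # A second adapter remains available for node-to-node traffic, so the
        # cluster adapter can safely give up its own MAC for the shared unicast MAC.
        return "unicast"
    # With a single adapter, unicast would leave the nodes unable to reach each
    # other; multicast keeps the original MAC and adds a shared multicast MAC.
    return "multicast"

print(choose_cluster_operation_mode(1))  # multicast
print(choose_cluster_operation_mode(2))  # unicast
```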

NLB uses a group of between 2 and 32 servers and distributes inbound requests among them in a fashion that permits the maximum amount of load handling with a minimal amount of downtime. Each NLB cluster node contains an exact copy of the static and dynamic content that every other NLB cluster node has; in this way, it doesn't matter which NLB cluster node receives the inbound request, except in the case of host affinity where cookies are involved. The NLB cluster nodes use heartbeats to keep aware of the status of all nodes.

Clustering, on the other hand, uses a group of between 2 and 8 servers that all share a common storage device. Recall that a cluster resource is an application, service, or hardware device that is defined and managed by the cluster service. The cluster service (MSCS) monitors these cluster resources to ensure that they are operating properly. When a problem occurs with a cluster resource, MSCS attempts to correct the problem on the same cluster node. If the problem cannot be corrected (such as a service that cannot be successfully restarted), the cluster service fails the resource, takes the cluster group offline, moves it to another cluster node, and restarts the cluster group there. MSCS clusters also use heartbeats to determine the operational status of other nodes in the cluster.
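The restart-then-failover sequence can be summarized in a short sketch. The node methods (restart, take_offline, bring_online) and the attribute names are hypothetical placeholders, not the real MSCS interfaces; the sketch only captures the order of operations described above.

```python
def handle_resource_failure(resource, group, current_node, preferred_nodes,
                            restart_threshold=3):
    """Model of the restart-then-failover sequence (placeholder node methods)."""
    # Step 1: try to correct the problem on the node that currently owns the resource.
    for _ in range(restart_threshold):
        if current_node.restart(resource):
            return f"{resource} restarted on {current_node.name}"
    # Step 2: restarts exhausted, so fail the resource and move the whole group.
    current_node.take_offline(group)
    for candidate in preferred_nodes:
        if candidate is not current_node and candidate.online:
            candidate.bring_online(group)
            return f"{group} brought online on {candidate.name} after failover"
    return f"{group} could not be brought online on any node"
```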

Two clustering modes exist:

  • Active/Passive - One node in the cluster is online and provides services. The other nodes in the cluster are online but do not provide any services or applications to clients. If the active node fails, the cluster groups that were running on that node are failed over to the passive node. The passive node then changes its state to active and begins to service client requests. The passive nodes cannot be used for any other purpose during normal operations because they must remain available for a failover situation. All nodes should be configured identically to ensure that no performance loss is experienced when failover occurs.

  • Active/Active - One instance of the clustered service or application runs on each node in the cluster. If a failure of a node occurs, that instance is transferred to one of the running nodes. Although this clustering mode allows you to make use of all cluster nodes to service client requests, it can cause significant performance degradation if the cluster was already operating at a very high load at the time of the failure, as the short example below illustrates.
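A quick back-of-the-envelope illustration of that warning, using made-up load figures:

```python
# Two-node Active/Active cluster, each node running its own instance.
# The load figures below are invented purely to illustrate the risk.
node_load = [0.65, 0.70]          # fraction of each node's capacity in use
surviving_load = sum(node_load)    # load the remaining node must carry alone

print(f"Surviving node after failover: {surviving_load:.0%} of its capacity")
# 135%: the survivor is overcommitted, which is the degradation noted above.
```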

Each resource group within the cluster has a prioritized listing of the nodes that are supposed to act as its host.

You can configure failover policies for each resource group to define exactly how each group behaves when a failover occurs. You must configure the following three settings (a short sketch that pulls them together follows this list):

  • Preferred nodes - An internal prioritized list of available nodes for resource group failovers and failbacks. Ideally, all nodes in the cluster are in this list, in the order of priority you designate.

  • Failover timing - The resource can be configured for immediate failover if the resource fails, or the cluster service may be configured to try to restart the resource a specified number of times before failover actually occurs. The failover threshold value should be equal to or less than the number of nodes in the cluster.

  • Failback timing - Failback can be configured to occur as soon as the preferred node is available again, or only during a specified period of time, such as when the load is at its lowest, so as to minimize service disruptions.
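These three settings can be pictured together in a small sketch. The class, field names, and failback time window below are illustrative assumptions, not the actual Cluster Administrator settings or any real API.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class FailoverPolicy:
    # Field names are invented for this sketch, not the Cluster Administrator labels.
    preferred_nodes: List[str]                  # highest priority first
    restart_attempts: int = 3                   # tries before failover actually occurs
    failback_window: Tuple[int, int] = (1, 5)   # failback allowed only 01:00-05:00

def next_owner(policy: FailoverPolicy, failed_node: str, online_nodes: set) -> str:
    """Walk the preferred-node list to pick the next owner after a failure."""
    for node in policy.preferred_nodes:
        if node != failed_node and node in online_nodes:
            return node
    return ""

def may_fail_back(policy: FailoverPolicy, hour_of_day: int) -> bool:
    """Allow failback only inside the configured low-load window."""
    start, end = policy.failback_window
    return start <= hour_of_day < end

policy = FailoverPolicy(preferred_nodes=["nodeA", "nodeB", "nodeC"])
print(next_owner(policy, "nodeA", {"nodeB", "nodeC"}))  # nodeB
print(may_fail_back(policy, 3))                          # True
print(may_fail_back(policy, 14))                         # False
```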


