Introduction to Exchange Server Clusters


What's all the buzz about high availability? Why is everyone so intent on achieving the Utopia of server availability: five nines? It really all comes down to one thing: economics. The economics of today's Internet-centric world demand that critical services and servers be available 100% of the time. In the absence of perfection (which no one has delivered yet), the bar for highly available solutions has been set at five nines: 99.999% uptime. What exactly does that equate to, though?

Five nines availability allows critical services to be offline for no more than about 5.25 minutes per year. That's an unbelievably low number, no matter how you look at it. But that's the goal of highly available solutions. As you might know, 5 minutes per year is barely enough time to apply a hot fix, much less a service pack. The answer to this problem is highly available server solutions. When discussing highly available solutions, there are two distinctly different ways to look at the problem: one based on hardware and one based on software. Windows Server 2003 provides you with two types of software-based high availability: clustering and Network Load Balancing (NLB). In this chapter, we examine the pertinent details of clustering as it relates to supporting Exchange Server 2003 computers.
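
To see where that figure comes from, the permitted downtime is simply the unavailable fraction of a year. The following short Python sketch is purely illustrative arithmetic, not anything you would run on a cluster:

    MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes in a non-leap year

    for availability in (0.99, 0.999, 0.9999, 0.99999):
        downtime = (1 - availability) * MINUTES_PER_YEAR
        print(f"{availability:.3%} uptime allows {downtime:,.2f} minutes of downtime per year")

    # The last line of output is roughly the 5.25 minutes per year cited above for five nines.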

Note

Although the discussion in this chapter centers on using the Microsoft Cluster Service (MSCS) to create highly available Exchange server solutions, there is still a small amount of Network Load Balancing knowledge that you should possess. NLB is not supported for creating highly available Exchange clusters, such as those you would find on the back end of a front-end/back-end implementation, but it can be used on front-end servers that are running IIS and Exchange Server. These front-end servers are typically used to support Outlook Web Access (OWA) connections. Due to the nature of IIS, NLB is a very effective way to provide highly available IIS-based services. If you want to learn more about both of the clustering methods available in Windows Server 2003, check out MCSE 70-293 Training Guide: Planning and Maintaining a Windows Server 2003 Network Infrastructure by Will Schmied and Rob Shimonski, Que Publishing, 2003.


Of course, having any solution in place, highly available or not, is of little use if disaster strikes and removes it from operation. Environmentally or intentionally caused disasters are a fact of life that you simply cannot afford to ignore. Although you might not be able to prevent your servers from experiencing a disaster condition, you can prevent extended downtime and the temporary unavailability of the network by implementing a well-planned and well-practiced disaster-recovery plan, as we discuss later in this chapter.

Clustering is accomplished when you group independent servers into one large collective entity that is accessed as if it were a single system. Incoming requests for service can be evenly distributed across multiple cluster members or can be handled by one specific cluster member.

The Microsoft Cluster Service (MSCS) in Windows Server 2003 provides highly available, fault-tolerant systems through failover. When one of the cluster members (nodes) cannot respond to client requests, the remaining cluster members respond by distributing the load among themselves, thus responding to all existing and new connections and requests for service. In this way, clients see little, if any, disruption in the service being provided by the cluster. Cluster nodes are kept aware of the status of other cluster nodes and their services through the use of heartbeats. A heartbeat is used to keep track of the status of each node and also to communicate updates to the cluster configuration. Clustering is most commonly used to support database, messaging, and file/print servers. Windows Server 2003 supports up to eight nodes in a cluster.

High-Availability Terminology

Although we don't typically take pages within a chapter to define key terms, the terminology associated with clustering is somewhat esoteric and a good understanding of it is key to successfully implementing and managing any clustered solution. Although the following list of terms is not all-inclusive, it represents some of the more important ones you should understand:

  • Cluster A group of two or more independent servers that operate together and are viewed and accessed as a single resource.

  • Cluster resource A network application, service, or hardware device (such as a network adapter or storage system) that is defined and managed by the cluster service.

  • Cluster resource group A defined set of resources contained within a cluster. Cluster resource groups are used as failover units within a cluster. When a cluster resource group fails and cannot be automatically restarted by the cluster service, the entire cluster resource group is placed in an offline status and failed over to another node.

  • Cluster virtual server A cluster resource group that has a network name and IP address assigned to it. Cluster virtual servers are accessible by their NetBIOS name, Domain Name System (DNS) name, or IP address.

  • Convergence The process by which NLB clustering hosts determine a new, stable state among themselves and elect a new default host after the failure of one or more cluster nodes. During convergence, the total load on the NLB cluster is redistributed among all cluster nodes that share the handling of traffic on specific ports, as determined by their port rules.

  • Heartbeat A network communication sent among individual cluster nodes at intervals of no more than 500 milliseconds (ms); used to determine the status of all cluster nodes.

  • Failback The process of moving a cluster group (either manually or automatically) back to the preferred node after the preferred node has resumed cluster membership. For failback to occur, it must be configured for the cluster group, including the failback threshold and selection of the preferred node.

  • Failover The process of a cluster group moving from the currently active node to a designated, functioning node in the cluster group. Failover typically occurs when the active node becomes unresponsive (for any reason) and cannot be recovered within the configured failure threshold period.

  • Node An individual server within a cluster.

  • Quorum disk The disk drive that contains the definitive cluster-configuration data. Clustering with MSCS requires the use of a quorum disk and requires continuous access to the data contained within the quorum disk. The quorum disk contains vital data about the nodes participating in the cluster, the applications and services defined within the cluster resource group, and the status of each node and cluster resource. The quorum disk is typically located on a shared storage device.

How Does Clustering Work?

Clustering uses a group of between two and eight servers that all share a common storage device. Recall that a cluster resource is an application, service, or hardware device that is defined and managed by the cluster service. The cluster service (MSCS) monitors these cluster resources to ensure that they are operating properly. When a problem occurs with a cluster resource, MSCS attempts to correct the problem on the same cluster node. If the problem cannot be corrected, such as a service that cannot be successfully restarted, the cluster service fails the resource, takes the cluster group offline, moves it to another cluster node, and restarts the cluster group there.
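
The following Python fragment is a conceptual sketch of that restart-then-failover decision. None of the names correspond to real MSCS interfaces, and the restart threshold shown is an invented example value:

    import random

    RESTART_THRESHOLD = 3          # hypothetical number of in-place restart attempts

    def try_restart(resource):
        """Stand-in for an attempt to restart the failed resource on the same node."""
        return random.random() < 0.5   # succeeds or fails at random, for illustration only

    def handle_resource_failure(resource, current_node, standby_nodes):
        for _ in range(RESTART_THRESHOLD):
            if try_restart(resource):
                return current_node            # the resource recovered in place
        # Restart attempts exhausted: fail the resource, take its resource group
        # offline, and fail the whole group over to another node in the cluster.
        print(f"Failing over the group containing {resource} to {standby_nodes[0]}")
        return standby_nodes[0]

    new_owner = handle_resource_failure("SMTP service", "NODE1", ["NODE2"])
    print(f"Group is now owned by {new_owner}")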

MSCS clusters also use heartbeats to determine the operational status of other nodes in the cluster.

Two clustering modes exist:

  • Active/Passive One node in the cluster is online providing services. The other nodes in the cluster are online but do not provide any services or applications to clients. If the active node fails, the cluster groups that were running on that node are failed over to the passive node. The passive node then changes its state to active and begins to service client requests. The passive nodes cannot be used for any other purpose during normal operations because they must remain available for a failover situation. All nodes should be configured identically, to ensure that when failover occurs, no performance loss is experienced.

  • Active/Active One instance of the clustered service or application runs on each node in the cluster. If a node fails, its instance is transferred to one of the remaining nodes. Although this clustering mode enables you to use all cluster nodes to service client requests, it can cause significant performance degradation if the cluster was already operating under a very heavy load at the time of the failure.

Tip

Clustering in Exchange Server 2003 is limited to two nodes when installing on Windows 2000 Advanced Server Service Pack 4 (SP4), four nodes when installing on Windows 2000 Datacenter Server SP4, and eight nodes when installing on Windows Server 2003, Enterprise Edition or Windows Server 2003, Datacenter Edition. These are limitations associated with the operating systems themselves, not with Exchange Server 2003.


Exchange Clustering Specifics

You might be wondering which mode is better: active/passive or active/active. When you use the active/active mode, you can deploy Exchange in a cluster with only two nodes, which is the maximum supported for active/active operation. Each node in that cluster runs its own Exchange virtual server (EVS; recall the definition of the cluster virtual server), for a total of two EVS instances in the cluster. Should one of the nodes fail, the single remaining node is loaded with the resources from both servers, possibly resulting in an overloaded condition that causes it to fail as well. In addition, to ensure that reliable failover can occur, each node in an active/active cluster can host a maximum of only 1,900 active mailboxes, which is far fewer than an Exchange server might normally hold.

On the other hand, if you implement an active/passive mode cluster (as Microsoft recommends), you can achieve a much more reliable and robust solution. An active/passive cluster must contain at least one active node and at least one passive node; however, you cannot exceed eight nodes in total. As an example, suppose you had an eight-node active/passive cluster. You might configure six nodes as active and the remaining two as passive. This gives you a multilayer backup plan should more than one active node fail within a short period of time.

Finally, you must also bear in mind that a single Exchange server is limited to four storage groups. This is typically not a problem when you use an active/passive cluster, but it becomes an acute problem when you use an active/active cluster. If one of the active nodes has three storage groups and the other node also has three, the surviving node, already at three of its four allowed storage groups, can mount only one of the failed node's groups; the other two storage groups will not be mounted, making their mailboxes or public folders unavailable to clients. This is one more reason why active/passive clustering is the best way to cluster your Exchange servers.
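
The following Python fragment illustrates the arithmetic behind that scenario. It is only a sketch; the storage group counts are example values, and only the four-group ceiling comes from Exchange Server 2003 itself:

    STORAGE_GROUP_LIMIT = 4     # Exchange Server 2003 limit per server (or cluster node)

    surviving_node_groups = 3   # storage groups already mounted on the surviving node
    failed_node_groups = 3      # storage groups that must move after the failover

    mountable = min(failed_node_groups, STORAGE_GROUP_LIMIT - surviving_node_groups)
    stranded = failed_node_groups - mountable
    print(f"{mountable} storage group(s) can be mounted on the survivor; {stranded} remain offline")
    # Output: 1 storage group(s) can be mounted on the survivor; 2 remain offline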

Cluster Models

Three distinctly different cluster models exist for configuring your new cluster. You must choose one of the three models at the beginning of your cluster planning because the chosen model dictates the storage requirements of your new cluster. The three models are presented in the following sections in order of increasing complexity and cost.

Single-Node Cluster

The single-node cluster model, shown in Figure 7.1, has only one cluster node. The cluster node can make use of local storage or an external cluster storage device. If local storage is used, the local disk is configured as the cluster storage device. This storage device is known as a local quorum resource. A local quorum resource does not make use of failover and is most commonly used as a way to organize network resources in a single network location for administrative and user convenience. This model is also useful for developing and testing cluster-aware applications.

Figure 7.1. The single-node cluster can be used to increase service reliability and also to prestage cluster resource groups.



Despite its limited capabilities, this model does offer the administrator some advantages at a relatively low entry cost:

  • The cluster service can automatically restart services and applications that might not be capable of automatically restarting after a failure. This increases the reliability of network services and applications.

  • The single node can be clustered with additional nodes in the future, preserving the resource groups that you have already created. You simply need to join the additional nodes to the cluster, move the quorum to the shared cluster storage device, and configure the failover and failback policies for the resource groups so that they make use of the newly added nodes.

Tip

By default, the New Server Cluster Wizard creates the single-node cluster using a local quorum resource if the cluster node is not connected to a cluster storage device.


Single-Quorum Cluster

The single-quorum cluster model, shown in Figure 7.2, has two or more cluster nodes that are configured so that each node is attached to the cluster storage device. All cluster configuration data is stored on a single cluster storage device. All cluster nodes have access to the quorum data, but only one cluster node runs the quorum disk resource at any given time.

Figure 7.2. The single-quorum cluster shares one cluster storage device among all cluster nodes.



Majority Node Set Cluster

The majority node set cluster model, shown in Figure 7.3, has two or more cluster nodes that are configured so that the nodes might or might not be attached to one or more cluster storage devices. Cluster configuration data is stored on multiple disks across the entire cluster, and the cluster service is responsible for ensuring that this data is kept consistent across all of the disks. All quorum traffic travels in an unencrypted form over the network using server message block (SMB) file shares. This model provides the advantage of being able to locate cluster nodes in two geographically different locations; they do not all need to be physically attached to the shared cluster storage device.

Figure 7.3. The majority node set cluster model is a high-level clustering solution that allows for geographically dispersed cluster nodes.



Even if all cluster nodes are not located in the same physical location, they appear as a single entity to clients. The majority node set cluster model provides the following advantages over the other clustering models:

  • Clusters can be created without cluster disks. This is useful when you need to make applications available for failover but have some other means of replicating data among the storage devices.

  • If a local quorum disk becomes unavailable, it can be taken offline while the rest of the cluster remains available to service client requests.

However, you must abide by some requirements when implementing majority node set clusters to ensure that they are successful:

  • A maximum of two sites can be used.

  • The cluster nodes at either site must be capable of communicating with each other with less than a 500ms response time so that the heartbeat messages can accurately indicate the correct status of the cluster nodes.

  • A high-speed, high-quality wide area network (WAN) or virtual private network (VPN) link must be established between sites so that the cluster's IP address appears the same to all clients, regardless of their location on the network.

  • Only the cluster quorum information is replicated among the cluster storage devices. You must provide a proven effective means to replicate other data among the cluster storage devices.

The primary disadvantage of this clustering model is that if too many nodes fail, the cluster loses its quorum and fails. Table 7.1 shows the maximum number of cluster nodes that can fail before the cluster itself fails.

Table 7.1. Maximum Number of Node Failures a Majority Node Set Cluster Can Tolerate

Number of Nodes in the Cluster    Maximum Number of Failed Nodes Tolerated
2                                 0
3                                 1
4                                 1
5                                 2
6                                 2
7                                 3
8                                 3


As shown in Table 7.1, the majority node set cluster remains operational as long as a majority (more than half) of the initial cluster nodes remains available.
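
The pattern in Table 7.1 follows directly from that majority requirement: the cluster keeps quorum only while more than half of its original nodes are running. The following Python sketch is illustrative only, but it reproduces the table:

    def tolerated_failures(total_nodes):
        majority = total_nodes // 2 + 1    # nodes needed to maintain quorum
        return total_nodes - majority      # failures the cluster can survive

    for n in range(2, 9):
        print(f"{n}-node cluster tolerates {tolerated_failures(n)} failed node(s)")
    # Reproduces Table 7.1: 2->0, 3->1, 4->1, 5->2, 6->2, 7->3, 8->3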

Note

Majority node set clusters most likely will be the clustering solution of the future because of their capability to geographically separate cluster nodes. This further increases the reliability and redundancy of your clustering solution. However, Microsoft currently recommends that you implement majority node set clustering only in very specific instances and only with close support provided by your original equipment manufacturer (OEM), independent software vendor (ISV), or independent hardware vendor (IHV).


Cluster Operation Modes

You can choose from four basic cluster operation modes when using a single-quorum cluster or a majority node set cluster. These operation modes are specified by defining the cluster failover policies accordingly, as discussed in the next section, "Cluster Failover Policies." The four basic cluster operation modes are listed here:

  • Failover Pair This mode of operation is configured by allowing applications to fail over between only two specific cluster nodes. Only the two desired nodes should be placed in the possible owner list for the service of concern.

  • Hot Standby (N+I) This mode of operation helps you reduce the expenses and overhead associated with dedicated failover pairs by consolidating the spare node for each failover pair into a single node. This provides a single cluster node that is capable of taking over the applications from any active node in the event of a failover. Hot standby is often referred to as active/passive, as discussed previously in this chapter. Hot standby is achieved through a combination of using the preferred owners list and the possible owners list. The preferred node is configured in the preferred owners list and designated as the node that will run the application or service under normal conditions. The spare (hot standby) node is configured in the possible owners list.

  • Failover Ring This mode of operation has each node in the cluster running an instance of the application or service. If a node fails, the application or service on the failed node is moved to the next node in the sequence. The failover ring mode is achieved by using the preferred owner list to define the order of failover for a given resource group; each resource group's list should start on a different node so that the failed workload moves around the ring.

  • Random This mode of operation lets the cluster randomly select the node to which a resource group fails over. The random failover mode is configured by providing an empty preferred owner list for each resource group. (A configuration sketch follows this list.)
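
As a rough illustration of how these four modes map onto the preferred and possible owner lists, consider the following Python sketch. The node names and dictionary layout are invented for illustration; they are not real Cluster Administrator settings or APIs:

    # Failover pair: only two specific nodes may ever own the resource group.
    failover_pair = {"preferred_owners": ["NODE1"],
                     "possible_owners": ["NODE1", "NODE2"]}

    # Hot standby (N+I): NODE4 is the single spare that backs up each active node.
    hot_standby = {"preferred_owners": ["NODE1"],
                   "possible_owners": ["NODE1", "NODE4"]}

    # Failover ring: every node is a preferred owner; each group's list starts on
    # a different node so that failed workloads move around the ring.
    failover_ring = {"preferred_owners": ["NODE2", "NODE3", "NODE4", "NODE1"]}

    # Random: an empty preferred owner list lets the cluster pick the target node.
    random_mode = {"preferred_owners": []}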

Now that you've been introduced to failover, let's examine cluster failover policies.

Cluster Failover Policies

Although the actual configuration of failover and failback policies is discussed later in this chapter, it is important to discuss them briefly here to properly acquaint you with their use and function. Each resource group within the cluster has a prioritized listing of the nodes that are supposed to act as its host.

You can configure failover policies for each resource group to define exactly how each group behaves when a failover occurs. You must configure three settings, summarized in the sketch that follows this list:

  • Preferred nodes A prioritized list of the nodes that can host the resource group during failovers and failbacks. Ideally, all nodes in the cluster appear in this list, in the order of priority that you designate.

  • Failover timing The resource can be configured for immediate failover if the resource fails, or the cluster service can be configured to try to restart the resource a specified number of times before failover actually occurs. The failover threshold value should be equal to or less than the number of nodes in the cluster.

  • Failback timing Failback can be configured to occur as soon as the preferred node is available or during a specified time, such as when peak load is at its lowest, to minimize service disruptions.
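
The following Python sketch summarizes those three settings for a hypothetical Exchange resource group. The field names and values are invented for illustration; the actual settings are made through Cluster Administrator, as discussed later in this chapter:

    exchange_group_policy = {
        "preferred_nodes": ["EXCH-NODE1", "EXCH-NODE2", "EXCH-NODE3"],  # priority order
        "failover_threshold": 3,      # restart attempts before the group fails over
        "failover_period_hours": 6,   # window in which those attempts are counted
        "failback_enabled": True,
        "failback_window": ("01:00", "05:00"),  # fail back only during off-peak hours
    }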


