8.7 Microsoft cluster basics

Microsoft clusters use the shared-nothing model, which means that each server owns and manages local devices (e.g., disks) as specific cluster resources. Clusters also include common devices that are available to all of the nodes, but only one node owns and manages each of these devices at any one time. For example, an Exchange virtual server that supports one storage group usually places its transaction logs on a volume, which we will call L: for the moment. The L: volume is visible to all of the servers in the cluster, but only the server that currently hosts the Exchange virtual server running the storage group can access it. If a failure occurs and the cluster transitions the virtual server to another physical server in the cluster, that server takes ownership of the L: volume.
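
To make the ownership rule concrete, here is a minimal Python sketch (purely hypothetical; it does not call any Microsoft cluster API) that models a shared disk such as L: as a resource with exactly one owner at a time, where a failover simply transfers ownership to a surviving node:

    # Hypothetical illustration of the shared-nothing ownership model.
    # Nothing here calls a real cluster API; it only models the rule that a
    # shared resource (e.g., the L: volume) has exactly one owner at a time.
    class SharedDisk:
        def __init__(self, name, owner):
            self.name = name
            self.owner = owner    # the single node that currently owns the disk

        def access(self, node):
            # Only the owning node may touch the disk at any moment.
            if node != self.owner:
                raise PermissionError(f"{node} does not own {self.name}")
            return f"{node} reads/writes {self.name}"

        def fail_over(self, new_owner):
            # A cluster transition hands ownership to another node.
            print(f"{self.name}: ownership moves {self.owner} -> {new_owner}")
            self.owner = new_owner

    log_volume = SharedDisk("L:", owner="NODE1")
    print(log_volume.access("NODE1"))      # allowed: NODE1 owns L:
    log_volume.fail_over("NODE2")          # NODE1 fails; the cluster moves ownership
    print(log_volume.access("NODE2"))      # now only NODE2 can access L: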

8.7.1 Resources

Microsoft cluster management services take care of the complex interaction between the physical servers in the cluster, the virtual servers they host, and the resources such as disks that they use, including the management of the different network addresses (names and IP addresses) used by clients to access cluster resources. In this context, a resource is any physical or logical component that you can bring online or take offline within the cluster, but only a single server can own or manage the resource at one time. A network interface card (NIC) is an example of a physical resource, while an IP address is an example of a logical resource.

Each server in the cluster has its own system disk, memory, and copy of the operating system. Each server is responsible for some or all of the resources owned by the cluster, depending on the current state of the cluster. For example, in a two-node cluster, where one node has just failed, the single surviving node hosts its own unique resources (such as its system disk) as well as all the shared cluster resources and the applications that depend on those resources. When the failed server is available again, the cluster redistributes the shared resources to restore equilibrium.

8.7.2 Resource groups and other cluster terms

The number of resources used in a cluster can be quite large, so cluster services use resource groups as the fundamental unit of management within a cluster; resource groups also represent the smallest unit that can fail over between nodes. A resource group holds a collection of resources both for the cluster itself (its network name, IP address, etc.) and for applications. For management purposes, clusters define Exchange virtual servers as resource groups. The shared-nothing model prevents different nodes within the cluster from attempting to own resources or resource groups simultaneously, so all the resources that make up an Exchange virtual server must run on a single node. In fact, if you have enough processing power, you can run multiple Exchange virtual servers on a single physical computer, something that is interesting in the software laboratory but not recommended for production.

Resource groups can contain both logical and physical resources. For Exchange, the logical resources include the name of the virtual server and its IP address as well as the set of services that make up Exchange. The physical resources include details of any shared disks (used to hold the binaries, Store, and logs). Resources often have dependencies on other resources: conditions that must be satisfied before the resource can come online. The properties of a resource or resource group state any dependencies that exist. For example (Figure 8.4), an Exchange virtual server cannot come online unless it has a valid IP address to allow clients to connect. You can only bring Exchange resources online in dependency order.

Figure 8.4: Resource dependency.

Dependencies also exist on standard Exchange servers, the best example being the Information Store service, which cannot start if the System Attendant is not running. Note that dependencies cannot span resource group boundaries, since this would complicate cluster management enormously and create situations where resource dependencies might be scattered across various physical servers. In our example, you could not create a dependency for an Exchange virtual server on an IP address that is part of a different resource group.
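
In code terms, bringing a resource group online amounts to a topological sort of its resources. The sketch below is a hypothetical model (the resource names and dependencies are illustrative, not the exact Exchange resource model) that computes a valid online order for a simplified Exchange virtual server:

    # Hypothetical sketch: compute an online order for the resources in one
    # resource group by honoring declared dependencies (a topological sort).
    from graphlib import TopologicalSorter   # Python 3.9+

    # resource -> set of resources that must already be online
    dependencies = {
        "IP Address":          set(),
        "Network Name":        {"IP Address"},
        "Physical Disk L:":    set(),
        "System Attendant":    {"Network Name", "Physical Disk L:"},
        "Information Store":   {"System Attendant"},
        "SMTP Virtual Server": {"Information Store"},
    }

    online_order = list(TopologicalSorter(dependencies).static_order())
    print("Bring online in this order:", online_order)
    # Taking the group offline simply reverses the order.
    print("Take offline in this order:", list(reversed(online_order)))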

Figure 8.5 shows the resource groups and resources for a very simple cluster; in this case, the cluster consists of a single physical server. Even on a single-node cluster, the basic principles still apply, so we can see details of the cluster's own resources as well as the resources that make up an Exchange virtual server. Notice that Exchange presents to the cluster, as resources, all of the services that you would expect to see on a standard Exchange server. The resources also include some elements that are under the control of IIS, such as the different protocol virtual servers used by Exchange (IMAP, SMTP, POP3, and HTTP).

Figure 8.5: Cluster groups and resources.

Before going too far, we should first explain the various names used in a cluster, which include:

  • The name of the cluster (in this case, HPQNETCL1)

  • The names of each of the physical servers (nodes) that make up the cluster: here we have two physical servers (HPQNET-CLNODE1 and HPQNET-CLNODE2), the computers on which Windows, the cluster services, and applications such as Exchange run.

  • The names of each of the virtual servers that the cluster hosts: Clients do not connect to the cluster, nor do they connect to a physical computer. Logically, they look for the name of the Exchange server that holds their mailboxes. This cluster supports only one Exchange virtual server (HPQNET-EVS1), which runs on a physical server that is part of the cluster. Cluster services move a virtual server from one physical server to another within the cluster. Moves do not affect clients, because the cluster services take care of redirecting incoming client requests to the combination of hardware and software that represents the virtual server within the cluster at that point in time.

It makes sense to decide upon and use naming conventions for cluster systems and virtual servers so that their purpose is obvious at a glance. Some practical definitions of other important cluster terms include:

  • Generically cluster aware:   A mode where an application achieves cluster awareness through the generic cluster support DLL, meaning that the application has not been specially upgraded to support clusters and can operate on only one node of the cluster at a time. Microsoft supplies the generic cluster support DLL to allow vendors (including its own development groups) to run applications on a cluster with minimum effort.

  • Purpose-built cluster aware:   A mode where an application is cluster aware through special application-specific code, which enables it to take full advantage of cluster capabilities and to run on all nodes of the cluster concurrently. Exchange implements its support for clusters through EXRES.DLL, which the setup program installs when you install Exchange on a cluster. EXRES.DLL acts as the interface between the Exchange virtual server and the cluster. At the same time, setup installs EXCLUADM.DLL to enable the cluster administration program to manage Exchange components so that they respond to calls such as "come online," "go offline," and so on. With these components installed, the core of Exchange can run on all nodes in a cluster (active-active mode), but some older or less frequently used code does not support this mode or cannot run at all on a cluster.

  • Cluster registry:   A repository, separate from the standard system registry, used to track the cluster configuration and details about resources and resource groups. The quorum resource holds the cluster registry. A mechanism called "global update" publishes information about cluster changes to members of the cluster.

  • Members (or nodes):   The physical computers that make up the cluster. In production, clusters range from a two-node cluster to an eight-node cluster (on Windows Server 2003 Enterprise Edition), although you can build a single-node cluster for training or test purposes.

  • Quorum resource (Figure 8.6):   Most Windows clusters use a disk quorum, literally a physical disk that holds the cluster registry and other data necessary to track the current state of the cluster, plus the information needed to transfer resource groups between nodes. While Exchange 2003 does not have any direct involvement with quorums (this is the responsibility of the operating system), you can install Exchange clusters with disk quorums as well as local and majority node set quorums. A local quorum is only available to a single-node cluster (also known as a "lone wolf" cluster), which you would typically use in a disaster recovery scenario, while a majority node set quorum is usually found in stretched clusters where multiple systems use a disk fabric to communicate across several physical locations. In this situation, network interruptions may prevent all the systems from coming online at the same time, so majority node set quorums allow the cluster to function once a majority of the nodes connect. For example, once five nodes in an eight-node cluster connect, a quorum exists.

    Figure 8.6: The cluster quorum.

Cluster purists may not agree with some of the definitions offered here. However, they are functional rather than precise and provide enough foundation to proceed.

8.7.3 Installing Exchange on a cluster

You must have the following resources to install Exchange on a cluster:

  • An IP address and a network name for each virtual server. You cannot use dynamic IP addresses.

  • The physical hardware for the cluster nodes, ideally balanced in terms of CPU (number and speed) and memory.

  • Physical shared disk resources configured to hold the Store databases and transaction logs.

It is best to create a separate resource group for each Exchange virtual server in the cluster and then move the storage used for the databases and so on into that resource group. Installing Exchange on a cluster is no excuse to ignore best practice for the Store, so make sure that you place the databases and the transaction logs on separate physical volumes. Interestingly, the number of available drive letters may cause some design problems on very large Exchange clusters, since you have to allocate different drive letters to each storage group and perhaps to the volume holding the transaction logs for each storage group. This problem does not occur when you deploy Exchange 2003 on Windows 2003 clusters, because you can use mount points to overcome the lack of available drive letters. By convention, clusters use drive Q: for the quorum resource and M: for ExIFS (as on all other Exchange servers).

Remember that on Windows 2003 clusters, you have to install components such as IIS and ASP.NET on each node before you can install Exchange. Exchange 2003 requires Microsoft DTC, so you have to create it as a cluster resource before you install Exchange.

Equipped with the necessary hardware, you can proceed to install the cluster, choosing anything from an active-passive or active-active two-node configuration up to an eight-node cluster where seven nodes are active and one is passive. Installing the cluster is reasonably straightforward, and defining the number of storage groups and databases is the only issue that you have to pay much attention to afterward. The enterprise edition of Exchange 2000 or 2003 supports up to four storage groups of five databases each. Each virtual server running in a cluster can support up to these limits, but such a configuration runs into problems when a failure occurs, because Exchange cannot transfer the storage groups over to another cluster node. Consider this scenario: You have two virtual servers, each configured with three storage groups of three databases. A failure occurs and Exchange attempts to transfer the three storage groups from the failed server to the virtual server that is still active. The active virtual server can accept one storage group and its databases and then encounters the limit of four storage groups, so a full transition is impossible. Cluster designs, therefore, focus on failure scenarios to ensure that the remaining virtual servers can take the load and never exceed the limits. In a very large cluster, where each virtual server supports two storage groups, you may only be able to handle a situation where two or three servers fail concurrently, depending on the number of storage groups each virtual server supports.
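
The arithmetic behind such failure scenarios is simple enough to script. The following Python sketch (a hypothetical planning aid, not a Microsoft tool; the greedy placement is deliberately simplistic) checks whether the storage groups of failed virtual servers can be re-homed without any surviving node exceeding the four-storage-group limit:

    # Hypothetical planning check: can surviving nodes absorb the storage
    # groups of failed virtual servers without exceeding the four-storage-group
    # limit that applies to any node hosting Exchange?
    MAX_STORAGE_GROUPS = 4   # Exchange 2000/2003 Enterprise Edition limit

    def can_absorb(survivor_sgs, failed_sgs):
        """survivor_sgs: storage-group counts already on the surviving nodes.
           failed_sgs:   storage-group counts of the virtual servers to re-home
                         (each must land whole on a single surviving node)."""
        survivors = sorted(survivor_sgs)            # try the emptiest node first
        for sg_count in failed_sgs:
            for i, current in enumerate(survivors):
                if current + sg_count <= MAX_STORAGE_GROUPS:
                    survivors[i] = current + sg_count
                    break
            else:
                return False                        # nowhere to place this server
        return True

    # The scenario from the text: two virtual servers with three storage groups
    # each; one fails and 3 + 3 exceeds the limit, so the transition fails.
    print(can_absorb(survivor_sgs=[3], failed_sgs=[3]))        # False
    # Two storage groups per virtual server on a four-node cluster, one failure:
    print(can_absorb(survivor_sgs=[2, 2, 2], failed_sgs=[2]))  # True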

8.7.4 What clusters do not support

The vast majority of Exchange code runs on a cluster, but you should think of clusters primarily as a mailbox server platform, because of some limitations on connector support. In addition, you would never use clusters for front-end servers, because those systems do not need the high level of resilience and failover that clusters can provide, and clusters are too expensive for that role.

Most of the components not supported by clusters are old or of limited interest to the general messaging community. These are:

  • NNTP

  • Exchange 2000 Key Management Server

  • Exchange 2000 Instant Messaging

  • Exchange 2000 Chat

  • MTA-based connectors (GroupWise, Lotus Notes, cc:Mail, Microsoft Mail, IBM PROFS, IBM SNADS)

  • Exchange 2000 Event Service

  • Site Replication Service

You can break down these components into a set of old connectors that depend on the MTA and are being phased out in favor of SMTP connections; subsystems such as Instant Messaging, which Exchange 2003 does not support; and the Site Replication Service, which is only needed while you migrate from Exchange 5.5. The exception is NNTP, but very few people use Exchange as an NNTP server or to accept NNTP newsfeeds, simply because other lower-cost servers are better at the job. In addition, using a cluster for NNTP is total overkill.

8.7.5 Dependencies

Figure 8.7 illustrates the resource models implemented in Exchange 2000 and Exchange 2003. The resource model defines dependencies between the various components that run in a cluster. The Exchange 2000 resource model centers on the System Attendant and the Store, so if either of these processes fails, it affects many other processes. By comparison, the Exchange 2003 resource model removes many of the previous dependencies on the Store and makes the System Attendant process the sole "must-be-alive" process for a cluster to function. The change improves failover times by reducing the processes that have to be stopped and restarted if a problem occurs; this is entirely logical, because the protocol stacks have a dependency on IIS rather than the Store.

Figure 8.7: Exchange cluster resource models.

8.7.6 Clusters and memory fragmentation

When Microsoft released Exchange 2000, system designers looked forward to a new era of high-end email servers built around active-active clusters, a promise that was further embellished when Exchange 2000 SP1 provided the necessary support for Windows 2000 Datacenter Edition to enable four-way active-active clusters. System designers look to clustering to provide high degrees of both system resilience and availability and often as a way to consolidate a number of servers into a smaller set of large clusters.

Exchange 5.5 supports active-passive two-node clustering, meaning that one physical system or node actively supports users while its mate remains passive, waiting to be brought into action through a cluster state transition should the active system fail. This is an expensive solution, because of the need for multiple licensed copies of the application, operating system, and any associated third-party utilities (e.g., backup or antivirus programs), as well as the hardware. Active-active clusters provide a better "bang" for your investment, because all of the hardware resources in the cluster are available to serve users.

Unfortunately, active-active clusters ran into virtual memory fragmentation problems within the Store, and this issue prevents Exchange from taking full advantage of clustering. Exchange implements Store partitioning by establishing a storage group as a cluster resource that is transitioned (along with all its associated databases and transaction logs) if a problem occurs. However, while everything looked good on the theoretical front, clustering has not been so good in practice. Exchange uses dynamic buffer allocation (DBA) to manage the memory buffers used by the Store process. DBA sometimes gives administrators heart palpitations, because they see the memory used by STORE.EXE growing rapidly to a point where Exchange seems to take over the system. This behavior is by design, since DBA attempts to balance the demands of Exchange, which wants to keep as many Store buffers and as much data in memory as possible, against the needs of other applications. On servers that only run Exchange, it is quite normal to see the Store take large amounts of memory and keep it, because there are no other competing applications that need this resource.

During normal operation, Windows allocates and deallocates virtual memory in various sizes to the Store to map mailboxes and other structures. Some allocations must be contiguous, such as the approximately 10 MB of memory required to mount a database, but as time goes by it may become difficult for Windows to provide the Store with enough contiguous virtual memory, because the address space has become fragmented. In concept, this is similar to the fragmentation that occurs on disks, and usually it does not cause too many problems, except during cluster state transitions.
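
A toy example makes the distinction clear: what matters is not how much virtual memory is free in total, but how large the biggest contiguous block is. The numbers below are invented purely for illustration:

    # Hypothetical illustration of virtual memory fragmentation: plenty of free
    # memory overall, yet no single block is large enough to mount a database.
    free_blocks_mb = [4, 8, 2, 6, 4, 9, 3, 7, 5]   # sizes of free blocks, in MB

    total_free = sum(free_blocks_mb)                # 48 MB free in total
    largest_block = max(free_blocks_mb)             # but only 9 MB contiguous
    needed_to_mount_db = 10                         # roughly what the Store needs

    print(f"Total free virtual memory:      {total_free} MB")
    print(f"Largest contiguous free block:  {largest_block} MB")
    if largest_block < needed_to_mount_db:
        print("Mount fails: no contiguous block is large enough, even though "
              "total free memory looks ample.")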

During a cluster state transition, the cluster must move the storage groups that were active on a failed node to one or more other nodes in the cluster. Storage groups consist of a set of databases, so the Store has to be able to initialize the storage group and then mount the databases to allow users to access their mailboxes. You can track this activity through event 1133 in the application event log (see left-hand screen shot in Figure 8.8).

Figure 8.8: Allocating resources to mount a database, and a failure.

On a heavily loaded cluster, the Store may be unable to mount the databases because not enough contiguous virtual memory is available, in which case you will see an event such as 2065, shown in the right-hand screen shot in Figure 8.8. Thus, we arrive at a situation where the cluster state transition occurs but the Store is essentially brain dead, because the databases are unavailable.

It is worth noting that this kind of situation only occurs on heavily loaded systems, but remember that server consolidation and building big, highly resilient systems are prime driving factors for system designers to consider clusters in the first place. After receiving problem reports, Microsoft analyzed the data and realized that it had a problem. It began advising customers to limit cluster designs to lower numbers of concurrently supported clients (1,000 in Exchange 2000, 1,500 in SP1, and 1,900 in SP2, going a little higher with SP3[5]) when running in active-active mode.

Because MAPI is the most functional and feature-rich protocol, MAPI clients usually generate the heaviest workload for Exchange, so these numbers reflect a MAPI load. Outlook Web Access clients generate much the same type of demand as MAPI clients. Other client protocols (such as IMAP4 and POP3) typically exercise fewer functions and generate a lighter workload for the server, so you may be able to support more client connections before the virtual memory problem appears. Your mileage will vary, and a solid performance and scalability test is required to settle on any final cluster configuration. The test must be realistic and include all of the software incorporated in the final design.

From Exchange 2000 SP3 onward, the Store includes a new virtual memory management algorithm, which changes the way it allocates and frees virtual memory. The key changes are:

  • JET top-down allocation: Prior to SP3, the JET database engine allocated virtual memory for its needs from the bottom up in 4-KB pages. Other processes that require virtual memory (Store, epoxy, IIS, etc.) also allocate from the bottom up, but in different sizes. This method of managing memory can result in virtual memory fragmentation when multiple processes continuously request and release virtual memory. SP3 changed the JET virtual memory allocation to a top-down model to eliminate contention for resources with other system processes. In practical terms, the top-down model results in less virtual memory fragmentation, because small JET allocations pack together tightly. It also allows the Store process to access larger contiguous blocks of virtual memory over sustained periods of load.

  • Max open tables change: When the JET database engine initially starts, it requests the virtual memory necessary to maintain a cache of open tables for each storage group. The idea is to have tables cached in memory to avoid the need to go to disk and page tables into and out of memory as the Store services client requests. SP2 allocates enough memory for each storage group to hold 80,000 tables open, which requires a sizable amount of virtual memory. SP3 reduces the request to 27,000 open tables per storage group. The reduction in the request for memory does not seem to affect the Store's performance and increases the size of the virtual memory pool available to other processes. In addition, lowering the size of MaxOpenTables leads to fewer small allocations by JET.

Experience to date demonstrates that servers running SP3 encounter fewer memory problems on high-end clusters. Thus, if you want to run a cluster or any high-end Exchange server, make sure that you carefully track the latest release of the software in order to take advantage of the constant tuning of the Store and other components that Microsoft does in response to customer experience.

The problems with virtual memory management forced Microsoft to express views on how active a cluster should be. Essentially, Microsoft's advice is to keep a passive node available whenever possible, meaning that a two-node cluster runs in active-passive mode and a four-node cluster is active on three nodes and passive on the fourth. Of course, this approach matters most if the cluster supports heavy load generated by clients, connectors, or other processing. Clusters that support a small number of clients and perhaps run only a single storage group with a few databases on each active node usually operate successfully in a fully active manner, because virtual memory fragmentation is less likely to occur.

By definition, because a "fresh" node is always available in an active-passive configuration, clusters can support higher numbers of users per active node, perhaps up to 5,000 mailboxes per node. The exact figure depends on the system configuration, the load generated by the users, the type of clients used, and careful monitoring of virtual memory on the active nodes as they come under load. There is no simple and quick answer to the "how many users will a system support" question here, and you will need to work through a sizing exercise to determine the optimum production configuration. See the Microsoft white paper on Exchange clustering posted on its Web site for more details about how to monitor clustered systems, especially regarding the use of virtual memory.

8.7.7 Monitoring virtual memory use

Exchange incorporates a set of performance monitor counters that you can use to check virtual memory use on a cluster. Table 8.2 lists the essential counters to monitor.

Table 8.2: Performance Counters to Monitor Virtual Memory (all belong to the MSExchangeIS performance object)

  • VM largest block size: Size in bytes of the largest free virtual memory block

  • VM total free blocks: Total number of free virtual memory blocks

  • VM total 16 MB free blocks: Total number of free virtual memory blocks larger than or equal to 16 MB

  • VM total large free block bytes: Total number of bytes in free virtual memory blocks larger than or equal to 16 MB

Figure 8.9 shows the performance monitor in use on a cluster. In this case, there is plenty of virtual memory available, so no problems are expected. If available virtual memory begins to decline as the load on a cluster grows, Exchange logs a warning event 9582[6] when the largest free block of virtual memory inside STORE.EXE drops below 32 MB, and flags the same event again, this time with an error status, when no contiguous block of virtual memory larger than 16 MB exists. After the Store reaches this threshold, the cluster can become unstable and stop responding to client requests, and you will have to reboot. Microsoft Knowledge Base article 317411 explains some of the steps that you can take to capture system information to assist troubleshooting if virtual memory problems occur.

Figure 8.9: Monitoring virtual memory.
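
If you prefer to watch these counters from a script rather than the performance monitor GUI, the sketch below samples the largest-free-block counter with the standard Windows typeperf utility and applies the 32-MB warning and 16-MB error thresholds described above. Treat it as an assumption-laden example: the counter path is taken from Table 8.2 and should be verified against your own server, and the CSV parsing is deliberately minimal.

    # Hypothetical monitoring sketch (not a Microsoft-supplied tool): sample the
    # MSExchangeIS "VM Largest Block Size" counter and compare it with the
    # thresholds at which the Store logs event 9582.
    import subprocess

    COUNTER = r"\MSExchangeIS\VM Largest Block Size"
    WARN_BYTES = 32 * 1024 * 1024    # warning 9582 below this
    ERROR_BYTES = 16 * 1024 * 1024   # error 9582 below this

    def sample_counter(path):
        # typeperf -sc 1 takes a single sample and prints CSV:
        # a quoted header line, then "timestamp","value".
        out = subprocess.run(["typeperf", path, "-sc", "1"],
                             capture_output=True, text=True, check=True).stdout
        data_line = [line for line in out.splitlines() if line.startswith('"')][1]
        return float(data_line.split(",")[-1].strip().strip('"'))

    largest_block = sample_counter(COUNTER)
    mb = largest_block / (1024 * 1024)
    if largest_block < ERROR_BYTES:
        print(f"ERROR: largest free block is {mb:.1f} MB; a restart is needed soon.")
    elif largest_block < WARN_BYTES:
        print(f"WARNING: largest free block is {mb:.1f} MB; schedule a restart.")
    else:
        print(f"OK: largest free block is {mb:.1f} MB.")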

You may also see event 9582 immediately after a failover to a passive node, if the passive node has ever hosted the same virtual server that the cluster now wishes to transition. Each node maintains a stub STORE.EXE process, and the memory structures within the Store process may already be fragmented before a transition occurs, leading to the error. You can attempt to transition the virtual server to another node in the cluster and then restart the server that has the fragmented memory, or, if a passive node is not available, you will have to restart the active node. The rewrite of the virtual memory management code included in Exchange 2000 SP3 generates far fewer problems of this nature, and you are unlikely to see event 9582 triggered under anything but extreme load.

Microsoft made many changes to virtual memory management in Exchange 2003 and, generally speaking, the situation is much better: you should not see 9582 events logged as frequently as on an Exchange 2000 server. In addition, Microsoft incorporated a new safety valve into the Store process that kicks in if the Store signals the warning 9582 event. When this happens, the Store requests a one-time reduction (or back-off) of the ESE buffer to free up an additional 64-MB block of virtual memory. The net effect is that the Store can use this memory to handle the demand that caused the amount of free virtual memory to drop to critical limits. However, because the Store releases the virtual memory from the ESE buffer, server performance is affected and you cannot ignore the event. Instead, you should schedule a server reboot as soon as convenient. The advantage of the one-time reduction is that you have the opportunity to schedule the reboot in a graceful manner, but it is not an excuse to keep the server up and running, because the 9582 error event will eventually occur again and you will then have to conduct an immediate reboot.

Note that some third-party products, particularly virus checkers, can affect how the Store uses virtual memory. If you run into problems, check that you have the latest version of any third-party product and monitor the situation with the product enabled and then disabled to see if it makes a difference.

8.7.8 RPC client requests

The RPC Requests performance counter for the Store (MSExchangeIS) tracks the number of outstanding client requests that the Store is handling. On very large and heavily loaded clusters, the workload generated by clients may exceed the capacity of the Store, and requests begin to queue. Normally, if the server is able to handle the client workload, the number of outstanding requests should be zero or very low. If the value of the RPC Requests counter exceeds 65, Exchange may lose connectivity to the Global Catalog server, and clients experience a "server stall." Outlook 2003 clients that operate in cached Exchange mode experience fewer interruptions during cluster transitions or when servers have other problems, so you may want to deploy Outlook 2003 clients alongside Exchange 2003 clusters to isolate users as much as possible from server outages.
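
The same sampling approach used for the virtual memory counters works here too. The short, hypothetical check below flags the 65-request threshold mentioned above; again, verify the counter path on your own system:

    # Hypothetical check of the Store's outstanding RPC request queue.
    import subprocess

    out = subprocess.run(
        ["typeperf", r"\MSExchangeIS\RPC Requests", "-sc", "1"],
        capture_output=True, text=True, check=True).stdout
    data_line = [line for line in out.splitlines() if line.startswith('"')][1]
    rpc_requests = float(data_line.split(",")[-1].strip().strip('"'))

    if rpc_requests > 65:
        print(f"RPC Requests = {rpc_requests:.0f}: the Store is backing up; "
              "check Global Catalog connectivity and overall server load.")
    else:
        print(f"RPC Requests = {rpc_requests:.0f}: within normal range.")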

8.7.9 Deciding for or against a cluster

Assuming that you have the knowledge to properly size, configure, and manage an Exchange cluster, Table 8.3 lists some of the other factors that companies usually take into account before they decide to put clusters into production.

Table 8.3: Pros and Cons of Exchange Clusters

Pro: Clusters allow you to update software (including service packs) on a rolling basis, one node at a time. This ensures that you can provide a more continuous service to clients, because you do not have to take the cluster totally offline to update software.
Con: If you plan software upgrades properly, schedule them for low-demand times (e.g., Sunday morning), and communicate the necessary downtime to users well in advance, you can take down a standard server to apply an upgrade without greatly affecting users. Routine maintenance is necessary for all systems, so planning a software upgrade at the same time is not a big problem. Microsoft hot fixes are often untested on clusters when released to customers, so it is a mistake to assume that you can apply every patch to a cluster. In addition, third-party product upgrades do not always support rolling upgrades, and you can only apply the upgrade to the active node.

Pro: Clusters provide greater system uptime by transitioning work to active members of the cluster when problems occur.
Con: Clusters are expensive and may not justify the additional expense over a well-configured standard server in terms of additional uptime.

Pro: Active-active clusters are a great way to spread load across all the servers in a cluster.
Con: Memory management problems limit the number of concurrent clients that an active-active cluster supports, so many clusters run in active-passive mode to ensure that transitions can occur.

Pro: Clusters provide protection against failures in components such as motherboards, CPUs, and memory.
Con: Clusters provide no protection against storage failures, so they have an Achilles heel.

Con: Because clusters are not widely used, a smaller choice of add-on software products is available for both Windows and Exchange.

Con: Clusters require greater experience, knowledge, and attention to detail from administrators than standard servers.

Con: Clusters do not support all Exchange components and therefore are only useful as mailbox servers.

Con: Failures in the shared disk subsystem remain the Achilles heel of clusters: A transition from one node to another that depends on a failed disk will not work.

When many companies reviewed their options for Exchange server configurations, they decided not to use clusters and opted for regular servers instead. Common reasons cited by administrators include:

  • Not all locations in the organization require (or can fund) the degree of uptime that a cluster can provide. Deployment and subsequent support are easier if standard configurations are used everywhere, and the total investment required to support Exchange is less.

  • Administrators can be trained on a single platform without having to accommodate the "what if" scenarios that clusters introduce.

  • The choice of third-party products is much wider if clusters are not used.

  • The hardware and software used by the cluster are expensive.

  • Experience of Exchange 5.5 clusters had not been positive.

Every company is different, and the reasons why one company declines to use clusters may not apply elsewhere. Compaq was the first large company to complete a migration to Exchange 2000, and it opted not to use clusters. As it happens, the Exchange organization at Compaq does include a couple of clusters, but they support only small user populations, in groups that have the time and interest to maintain the clusters. In addition, none of the clusters at Compaq uses active-active clustering. On the other hand, many companies operate two-node and four-node production-quality clusters successfully. In all cases, these companies have dedicated the required effort and expertise to deploy and manage the clusters.

8.7.10 Does Exchange 2003 make a difference to clusters?

The combination of Windows 2003 and Exchange 2003 introduces a new dimension to consider when you look at clusters. The major improvements are:

  • The dependency on Windows 2000 Datacenter Edition is gone, so you can now deploy up to eight-node clusters without the additional expense that Datacenter Edition introduced. Now that the Enterprise Edition of Exchange 2003 supports up to eight nodes in a cluster, administrators have a lot more flexibility in design.

  • Windows 2003 and Exchange 2003 both make changes that contribute to better control of memory fragmentation, which may increase the number of MAPI clients that a cluster supports. Windows and Exchange also make better use of large amounts of memory, because Microsoft has gained more experience with how to use memory above 1 GB when it is available.

  • You can use drive mount points to eliminate the Windows 2000/Exchange 2000 restriction on the number of available drive letters, which limits the number of available disk groups in a cluster. This is important when you deploy more than ten storage groups spread across multiple cluster nodes.

  • Assuming that you use appropriate hardware and backup software, you can use the Volume Shadow Copy Service (VSS) API introduced in Windows 2003 to take hot snapshot backups. This is critical, because clusters cannot attain their full potential if administrators limit the size of the databases they are willing to deploy, in turn limiting the number of mailboxes that a cluster can host.

  • The Recovery Storage Group feature lets administrators recover from individual database failures more quickly and without having to deploy dedicated recovery servers.

  • The Store is faster at moving storage groups from failed servers to active nodes.

In addition, if you deploy Outlook 2003 clients in cached Exchange mode, there is potential to support more concurrent MAPI clients per cluster node, because the clients generate fewer RPC operations against the server: much of the work that previous generations of MAPI clients did against server-based data is now executed against client-side data. However, we are still in the early days of exploring this potential, and hard results are not yet available.

To Microsoft's credit, it is using clusters to test the technology and help consolidate servers. For its Exchange 2003 deployment, Microsoft has a "datacenter class" cluster built from seven nodes that support four Exchange virtual servers. Four large servers (HP Proliant DL580 G2 systems with quad 1.9-GHz Xeon III processors and 4 GB of RAM) take the bulk of the load by hosting the Exchange virtual servers, each supporting 4,000 mailboxes with a 200-MB default quota. A passive server is available to handle outages, and two other "auxiliary" servers are available to perform backups and handle other administrative tasks. Microsoft performs backups to disk and then moves the virtual server that owns the disks holding the backup data to the dedicated backup nodes, a technique necessary to handle the I/O load generated when they move the data to tape for archival. All the servers connect to an HP StorageWorks EVA5000 SAN, and the storage design makes heavy use of mount points to allocate disk areas for databases, transaction logs, SMTP work areas, and so on. Supporting 16,000 mailboxes on a large cluster demonstrates that you can deploy clusters to support large numbers of users. Of course, not all of the users are active at any time, and the administrators pay close attention to memory fragmentation in line with best practice, along with normal administrative tasks.

One thing is certain: It is a bad idea simply to install a cluster because you want to achieve highly reliable Exchange. An adequately configured and well-managed standalone server running the latest service pack is as likely to attain a "four nines" SLA as a cluster.

8.7.11 Clusters in summary

Microsoft did its best to fix the problems with memory fragmentation, but there is no doubt that Exchange 2000 clusters have been a disappointment. As with Exchange 5.5 clustering, which initially promised a lot and ended up being an expensive solution for the value it delivered, the problems have convinced many who considered Exchange clusters to look at other alternatives, notably investing in standalone servers that share a Storage Area Network (SAN). In this environment, you devote the major investment to building resilience through storage rather than clusters. If you have a problem with a server, you still end up with affected users, but the theory is that the vast majority of problems experienced with Exchange are disk related rather than caused by software or other hardware components. Accordingly, if you take advantage of the latest SAN technology to provide the highest degree of storage reliability, you may have a better solution to the immediate need for robustness. Going with a SAN also offers some long-term advantages, since you can treat servers as discardable items, planning to swap them out for newer computers as they become available, while your databases stay intact and available in the SAN.

Fools rush in to deploy clusters where experienced administrators pause for thought. There is no doubt that Exchange clusters are more complex than standard servers. Experience demonstrates that you must carefully manage clusters to generate the desired levels of uptime and resilience. Those who plunge in without investing the necessary time to plan, design, and deploy generally encounter problems, after they have installed some expensive hardware, that they might have avoided with standard servers. On the other hand, those who know what they are doing can manage clusters successfully and attain the desired results. At the end of the day, it all comes down to personal choice.

The early reports of successful deployments of Exchange 2003 clusters, including Microsoft's own, are encouraging and we can hope that the changes in Windows 2003, Exchange 2003, and Outlook 2003, as well as improvements in server and storage technology and third-party software products, all contribute to making Exchange clusters a viable option for more deployments. The challenge for Microsoft now is to continue driving complexity out of cluster software and administration so that it becomes as easy to install a cluster as it is to install a standard server. That day is not yet here.

I remain positive about clusters. Provided that you carefully plan cluster configurations and then deploy them with system administrators who have the appropriate level of knowledge about the hardware, operating system, and application environment, clusters do a fine job; I am still content to have my own mailbox located on an Exchange cluster. The problem is that there have been too many hiccups along the road, and clusters have not achieved their original promise. Work is continuing to improve matters, but in the interim, anyone who is interested in clustering Exchange servers should consider all options before making a final decision.

[5] . At the time of writing, Microsoft has not yet completed its testing to identify suggested levels of mailbox support for Exchange 2003.

[6] . Article 314736 describes how incorrect use of the /3GB switch in BOOT.INI on Exchange 2000 servers can also generate event 9582.


