Examining Windows Server 2003 Clustering Technologies


Windows Server 2003 provides two clustering technologies, which are included on the Enterprise and Datacenter server platforms. Clustering is the grouping of independent server nodes that are accessed and viewed on the network as a single system. When an application is run from a cluster, the end user can connect to a single cluster node to perform his work, or each request can be handled by multiple nodes in the cluster. In cases where data is read-only, the client may request data and receive the information from all the nodes in the cluster, improving overall performance and response time.

The first clustering technology Windows Server 2003 provides is Cluster Service, also known as Microsoft Cluster Service (MSCS). The Cluster Service provides system fault tolerance through a process called failover. When a system fails or is unable to respond to client requests, the clustered services are taken offline and moved from the failed server to another available server, where they are brought online and begin responding to existing and new connections and requests. Cluster Service is best used to provide fault tolerance for file, print, enterprise messaging, and database servers.
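The failover sequence described above can be sketched in a few lines of Python. This is only an illustrative model, not how the Cluster Service is implemented: the `Node` class, the `fail_over` function, and the node and group names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    online: bool = True
    groups: list = field(default_factory=list)  # clustered services hosted here


def fail_over(failed: Node, nodes: list) -> Node:
    """Move every group from a failed node to the first other online node."""
    failed.online = False
    target = next((n for n in nodes if n.online and n is not failed), None)
    if target is None:
        raise RuntimeError("no available node to host the clustered services")
    # The groups leave the failed node and are brought online on the target.
    target.groups.extend(failed.groups)
    failed.groups = []
    return target


# Hypothetical two-node cluster: NODE-A fails and NODE-B takes over its groups.
node_a = Node("NODE-A", groups=["SQL group", "File share group"])
node_b = Node("NODE-B")
new_owner = fail_over(node_a, [node_a, node_b])
print(new_owner.name, new_owner.groups)
```

The model picks the first available node for simplicity; a real cluster chooses the target according to its configured ownership settings.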

The second Windows Server 2003 clustering technology is network load balancing (NLB), which is best suited to provide fault tolerance for front-end Web applications and Web sites, Terminal servers, VPN servers, and streaming media servers. NLB provides fault tolerance by having each server in the cluster individually run the network services or applications, removing any single points of failure. Certain applications (for example, Terminal Services) require a client to connect to the same server during the entire session, while clients viewing Web sites can request pages from any node in the cluster during a visit. How client/server communication is divided and balanced across the servers depends on the application's needs.
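The difference between session-bound and any-node traffic can be illustrated with a small hashing sketch. It assumes a simplified model in which a connection is mapped to a node by hashing the client address. The SHA-256 hash, the `pick_node` function, and the server names are stand-ins chosen for illustration and are not NLB's actual filtering algorithm.

```python
import hashlib

NODES = ["WEB1", "WEB2", "WEB3"]  # hypothetical NLB cluster members


def pick_node(client_ip: str, client_port: int, affinity: bool) -> str:
    """Map a connection to a node by hashing the client address.

    With affinity, only the client IP is hashed, so every connection from
    that client lands on the same node. Without affinity, the source port is
    hashed too, so connections from one client can spread across nodes.
    """
    key = client_ip if affinity else f"{client_ip}:{client_port}"
    digest = hashlib.sha256(key.encode()).digest()
    return NODES[int.from_bytes(digest[:4], "big") % len(NODES)]


# Terminal Services style: the whole session must stay on one server.
sticky = {pick_node("10.0.0.25", port, affinity=True) for port in range(50000, 50050)}
# Web site style: any node may answer each request.
spread = {pick_node("10.0.0.25", port, affinity=False) for port in range(50000, 50050)}
print(len(sticky), len(spread))
```

With affinity enabled, the set of chosen nodes always contains exactly one server, which is the behavior a session-oriented application needs.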

Note

Microsoft does not support running both MSCS and NLB on the same computer due to potential hardware sharing conflicts between the two technologies.


Reviewing Cluster Terminology

Before you can design and implement MSCS and NLB clusters, you must understand certain clustering terminology. The following list describes key terms associated with Windows Server 2003 clustering:

  • Cluster A cluster is a group of independent servers that are accessed and viewed on the network as a single system.

  • Node A node is an independent server that is a member of a cluster.

  • Cluster resource A cluster resource is a network application or service defined and managed by the cluster application. Some examples of cluster resources are network names, IP addresses, logical disks, and file shares.

  • Cluster resource group Cluster resources are contained within a cluster in a logical set called a cluster resource group, commonly referred to as a cluster group. Cluster groups are the units of failover within the cluster. When a cluster resource fails and cannot be restarted automatically, the entire cluster group is taken offline and failed over to another available cluster node.

  • Cluster virtual server A cluster virtual server is a cluster resource group that contains a network name and IP address resource. Virtual server resources are accessed either by the domain name system (DNS) or NetBIOS name resolution or directly from the IP address. The name and IP address remain the same regardless of which cluster node the virtual server is running on.

  • Cluster heartbeat The cluster heartbeat is the communication maintained between individual cluster nodes to determine node status. Typically, heartbeat communication between nodes must take no longer than 500 milliseconds; otherwise, the nodes may conclude that a failure has occurred and begin cluster group failovers.

  • Cluster quorum disk The cluster quorum disk maintains the definitive cluster configuration data. MSCS uses a quorum disk or disks and requires continuous access to the cluster configuration data contained within it. The quorum contains configuration data defining which server nodes actively participate in the cluster, what applications and services are defined in the cluster, and the current states of the resources and the individual nodes. This data is used to determine whether a particular resource group or groups need to be failed to an available cluster node in the event of a failure on an active node. If a cluster node loses access to the quorum, the Cluster Service will fail on that node. In a typical MSCS cluster, the quorum resource is located on a shared storage device.

  • Local quorum resource Like the quorum resource, the local quorum contains the cluster configuration data. Unlike the standard quorum device that is usually housed on a shared disk, the local quorum is kept on a node's local disk. The local quorum resource was created for single-node cluster configurations, commonly used for cluster application development and testing.

  • Majority Node Set (MNS) resource The MNS resource is the quorum resource used for a Majority Node Set cluster. The MNS resource maintains consistent configuration data across all the nodes in the cluster. If the MNS quorum is lost, it can be recovered by "forcing the quorum" on a remaining cluster node. Refer to the Windows Server 2003 online help and look for the topic "Forcing the Quorum in a Majority Node Set Cluster."

  • Generic cluster resource Generic cluster resources were created to define cluster-unaware applications within a cluster group. This gives the ability to fail the resource over to another node in the cluster when the active node fails. This resource is not monitored by the cluster application; therefore, application failure does not result in a restart or failover scenario. Generic cluster resources include the generic application, generic script, and generic service resources. For more information on these resources, refer to the Windows Server 2003 Help and Support tool and search for "generic cluster resources."

  • Cluster-aware application A cluster-aware application provides a mechanism by which the Cluster Service can test the application availability to determine whether it is functioning as desired. When a cluster-aware application fails, the cluster can stop and restart the application as necessary on the same node and, if necessary, move it to another available node where it can be restarted.

  • Cluster-unaware application A cluster-unaware application can run on a cluster, but the application itself is not monitored by the Cluster Service. This means that the cluster can fail over the application only in the event that another resource fails in the cluster group. If the application stops responding, the cluster is not aware and therefore cannot restart it. Keep in mind that there are other ways to manage cluster-unaware applications outside the cluster, and in some cases these approaches may be the only option. For more information on how to install and configure generic applications, refer to the Windows Server 2003 Help and Support and search for "generic application resource type."

  • Failover Failover is the process of a cluster group moving from the current active node to another available node in the cluster. Failover occurs when a server becomes unavailable or when a resource in the cluster group fails and cannot recover within the failure threshold.

  • Failback Failback is the process of a cluster group moving back to a preferred node after the preferred node resumes cluster membership. Failback must be configured within a cluster group for this to happen. The cluster group must have a preferred node defined and a failback threshold configured. A preferred node is the node you would like your cluster group to run on during regular cluster operation. When a group fails back, the cluster performs the same operation as a failover, but the move is triggered by a server rejoining or resuming cluster operation instead of by a server or resource failure.
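The failover and failback definitions above can be modeled with a short sketch. It is a simplified illustration under stated assumptions: the `ClusterGroup` class, both functions, and the threshold value of 3 are hypothetical, and the failback threshold is reduced to a single on/off flag.

```python
from dataclasses import dataclass


@dataclass
class ClusterGroup:
    name: str
    preferred_node: str
    owner: str
    failure_threshold: int = 3   # hypothetical: restarts allowed before failover
    failures: int = 0
    allow_failback: bool = True  # simplification of the failback settings


def on_resource_failure(group: ClusterGroup, available_nodes: list) -> str:
    """Restart in place until the threshold is exceeded, then fail over."""
    group.failures += 1
    if group.failures <= group.failure_threshold:
        return "restart"
    others = [n for n in available_nodes if n != group.owner]
    if not others:
        return "offline"  # nowhere to move the group
    group.owner = others[0]
    group.failures = 0
    return "failover"


def on_node_joined(group: ClusterGroup, node: str) -> bool:
    """Fail back only if failback is enabled and the preferred node returned."""
    if group.allow_failback and node == group.preferred_node and group.owner != node:
        group.owner = node
        return True
    return False


group = ClusterGroup("Print group", preferred_node="NODE-A", owner="NODE-A")
actions = [on_resource_failure(group, ["NODE-A", "NODE-B"]) for _ in range(4)]
print(actions, group.owner)   # three restarts, then a failover to NODE-B
moved_back = on_node_joined(group, "NODE-A")
print(moved_back, group.owner)
```

The group moves only after the restart attempts are exhausted, and it returns to NODE-A only because a preferred node is defined and failback is allowed.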

Note

Plan carefully when considering failback. For more information, refer to the "Configuring Failover and Failback" section later in this chapter.


Active/Passive Clustering Mode

Active/passive clustering occurs when one node in the cluster provides clustered services while the other available node or nodes remain online but do not provide services or applications to end users. When the active node fails, the cluster groups previously running on that node are failed over to the passive node, which then changes from the passive to the active state and begins servicing client requests.

This configuration is usually implemented with database servers that provide access to data that is stored in only one location and is too large to replicate throughout the day. One advantage of Active/Passive mode is that if each node in the cluster has similar hardware specifications, there is no performance loss when a failover occurs. The only real disadvantage of this mode is that the passive node's hardware resources cannot be leveraged during regular daily cluster operation.

Note

Active/passive configurations are a great choice for keeping cluster administration and maintenance as low as possible. For example, the passive node can be used to test updates and other patches without directly impacting production. However, it is nonetheless important to test in an isolated lab environment or, at a minimum, after hours or during predefined maintenance windows.


Active/Active Clustering Mode

Active/active clustering occurs when one instance of an application runs on each node of the cluster. When a failure occurs, two or more instances of the application can run on one cluster node. The advantage of Active/Active mode over Active/Passive mode is that the physical hardware resources on each node are used simultaneously. The major disadvantage of this configuration is that if you are running each node of the cluster at 100% capacity, in the event of a node failure, the remaining active node assumes 100% of the failed node's load in addition to its own, greatly reducing performance. As a result, it is critical to monitor server resources at all times and ensure that each node has enough resources to take over the other node's responsibilities if the other node should fail.
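The capacity concern can be made concrete with some simple arithmetic. The sketch below assumes that every node carries the same load, that loads simply add, and that a failed node's work is shared evenly among the survivors; the function names are hypothetical.

```python
def load_after_failure(per_node_load: float, node_count: int) -> float:
    """Per-node load once one node fails and survivors share its work evenly."""
    if node_count < 2:
        raise ValueError("need at least two nodes to fail over")
    return per_node_load * node_count / (node_count - 1)


def max_safe_load(node_count: int) -> float:
    """Highest per-node load that still fits on the survivors after one failure."""
    return (node_count - 1) / node_count


# Two-node active/active cluster at different utilization levels.
for load in (0.40, 0.50, 0.75):
    after = load_after_failure(load, node_count=2)
    status = "ok" if after <= 1.0 else "overloaded"
    print(f"{load:.0%} per node -> {after:.0%} on the survivor ({status})")
```

Under these assumptions, each node in a two-node active/active cluster needs to stay at or below 50% utilization for the survivor to absorb a failure without being overloaded.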




Microsoft Windows Server 2003 Unleashed (R2 Edition)
ISBN: 0672328984
Year: 2006
Pages: 499