Lesson 1: Introduction to Server Clusters | MCSE Training Kit (Exam 70-226): Designing Highly Available Web Solutions with Microsoft Windows 2000 Server Technologies (MCSE Training Kits)

Windows 2000 Advanced Server and Windows 2000 Datacenter Server provide high availability by allowing a server in a cluster to take over and run a service or application that was running on another cluster member that failed. These services or applications are provided by means of virtual servers. To users, a virtual server appears as a single system. The cluster can provide any number of virtual servers, limited only by the capacity of the servers in the cluster and the storage available to provide the required performance. Administrators control the cluster servers as a single unit and can administer the cluster remotely. The Cluster service provides many benefits, including rolling upgrade support, improved use of hardware resources, greater availability, and ease of user access. This lesson will introduce you to the various components that make up clusters and the Cluster service.

After this lesson, you will be able to

Identify the components and objects that make up the Cluster service
Provide an overview of the server cluster architecture and how it’s implemented in a Windows 2000 Server network

Estimated lesson time: 30 minutes

Overview of Server Clusters

A server cluster is a group of computers working together as a single system to ensure that mission-critical applications and resources remain available to clients. Each computer, known as a node, must be running Windows 2000 Advanced Server or Windows 2000 Datacenter Server. Every node is attached to one or more common storage devices used by every node in the cluster. Clustering allows users and administrators to access and manage the nodes as a single system rather than as separate computers.

The Cluster service is a Windows 2000 service that is made up of components on each node that perform cluster-specific activity. One of the primary activities is to manage resources, which are the hardware and software components within the cluster. The instrumentation mechanism that the Cluster service provides for managing resources is the resource dynamic-link library (DLL). Resource DLLs define resource abstractions, communication interfaces, and management operations.

Windows NT Server 4, Enterprise Edition, supported a clustering technology named Microsoft Cluster Server, which provided much of the same functionality as the Cluster service in Windows 2000.

A resource is online when it’s available and providing its service to the cluster. Resources include physical hardware devices, such as disk drives and network cards, and logical items, such as Internet Protocol (IP) addresses, applications, and application databases. Each node in the cluster has its own local resources. However, the cluster also has common resources, such as a common data storage array and private cluster network. Each node in the cluster can access these common resources. One special common resource is the quorum resource, a dedicated physical resource in the common cluster disk array that plays a critical role in cluster operations. It must be present for node operations—such as forming or joining a cluster—to occur.

Server Cluster Software

A server cluster runs several pieces of software that fall into two categories: the software that makes the cluster run (clustering software) and the software that you use to administer the cluster (administrative software). Table 4.1 describes each of these types of software.

Table 4.1 Clustering and Administrative Software on a Server Cluster

Type of Software	Description
Clustering	Clustering software enables a cluster’s nodes to exchange specific messages that trigger the transfer of resource operations at the appropriate times. The two main pieces of clustering software are the Resource Monitor and the Cluster service.
Administrative	Administrators use cluster management applications to configure, control, and monitor clusters. Windows 2000 provides Cluster Administrator for this purpose. Any computer running Windows NT 4 Service Pack 3 or later, regardless of whether it’s a cluster node, can install Cluster Administrator.

Server Cluster Components

Windows 2000 Advanced Server and Windows 2000 Datacenter Server use the server cluster components to create server clusters that provide high availability, easy manageability, and enhanced scalability. These components work together to manage the cluster objects. Figure 4.1 shows how the cluster components relate to applications of various types and to each other within a single cluster node.

Table 4.2 describes the components that are shown in Figure 4.1.

Figure 4.1 - Components of the Cluster service

Table 4.2 Components of the Cluster Service

Component	Description
Cluster service	Manages all cluster-specific activity, including managing cluster objects and configurations, coordinating with other instances of the Cluster service, handling event notification, facilitating communication among components, and performing failover operations. One instance of the Cluster service is running on each node in a cluster.
Resource Monitor	Acts as an intermediary between the Cluster service and a resource DLL. The Resource Monitor transfers requests from the Cluster service to the appropriate DLL and delivers status and event information from the DLL to the Cluster service.
Resource DLL	Manages cluster resources of a particular type. Each DLL can manage one or more resource types.
Cluster Administrator	Management application used to configure, control, and monitor clusters. Cluster Administrator allows you to manage cluster objects, establish resource groups, initiate failover, handle maintenance, and monitor cluster activity.
Cluster automation server	Exposes a set of 32-bit Component Object Model (COM) objects to any scripting language that supports automation. Cluster automation server enables object-oriented design and the use of high-level languages, simplifying the process of creating a cluster management application.
Cluster database	Resides in the Windows 2000 registry on each cluster node and is also known as the cluster hive. The cluster database contains information about all physical and logical elements in a cluster, including cluster objects, their properties, and configuration data. Each node’s Cluster service maintains a consistent, updated image of the cluster database through global updates and periodic checkpointing. The quorum resource contains a copy of the cluster database as well.
Network and disk drivers	Monitors the status of all network paths between nodes, detects communication failure, and routes messages. Each node in the cluster runs an instance of the Cluster Network Driver.
Cluster application programming interface (API)	Acts as an interface to the Cluster Administrator, Cluster automation server, the Cluster service, and cluster-aware applications.
Resource API	Acts as an interface to any Resource Monitors and Resource DLLs.
IP Address and Disk	Two types of resources.

Server Cluster Objects

The Cluster service manages physical and logical units known as cluster objects. Each object is associated with the following attributes, controls, and functions:

One or more properties, or attributes, that define the object and its behavior within the cluster
A set of cluster control codes used to manipulate the object’s properties
A set of object management functions used to manage the object through the Cluster service

This section provides an overview of the objects managed by the Cluster service.

Networks

A network (sometimes called an interconnect) performs one of the following roles in a cluster:

A private network that carries internal cluster communication
A public network that provides client systems with access to cluster application services
A public-and-private network that carries internal cluster communication and that provides client systems with access to cluster application services
A network that is neither public nor private that carries traffic unrelated to cluster operation

Preventing Network Failure

The Cluster service uses all available private and public-and-private networks for internal communication. You should configure multiple networks as private or public-and-private to protect the cluster from a single network failure. If only one such network is available and it fails, the cluster nodes stop communicating with one another. When two nodes are unable to communicate, they’re said to be partitioned. After two nodes become partitioned, the Cluster service automatically shuts down on one node to guarantee the consistency of application data and the cluster configuration. If the Cluster service were not shut down on one node, cluster resources could become unavailable to client systems.

Node-to-Node Communication

The Cluster service doesn’t use public networks for internal communication, even if a public network is the only available network. For example, suppose a cluster has Network A configured as private and Network B configured as public, and Network A fails. Because Network B is public, the Cluster service doesn’t use it, and the nodes stop communicating with each other.
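The rule described above can be expressed as a small check. The following is only an illustrative sketch: the Role enumeration and the nodes_can_communicate function are hypothetical names, not part of any Cluster service interface. It models the single point made in the text, that only private and public-and-private networks may carry internal cluster traffic.

```python
from enum import Enum


class Role(Enum):
    PRIVATE = "private"
    PUBLIC = "public"
    MIXED = "public-and-private"
    NONE = "none"


# Roles that may carry internal (node-to-node) cluster traffic.
INTERNAL_ROLES = {Role.PRIVATE, Role.MIXED}


def nodes_can_communicate(networks, failed):
    """Return True if at least one working network may carry internal traffic.

    networks: dict mapping a network name to its Role
    failed:   set of network names that are currently down
    """
    return any(
        role in INTERNAL_ROLES and name not in failed
        for name, role in networks.items()
    )


# Network A is private, Network B is public: losing A partitions the nodes.
single = {"A": Role.PRIVATE, "B": Role.PUBLIC}
# Configuring B as public-and-private removes the single point of failure.
redundant = {"A": Role.PRIVATE, "B": Role.MIXED}

print(nodes_can_communicate(single, failed={"A"}))
print(nodes_can_communicate(redundant, failed={"A"}))
```

With Network B configured as public, the first call reports that the nodes can no longer communicate; with Network B configured as public-and-private, the second call reports that they still can.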

Network Interfaces

A network interface is a card or other network adapter that connects a computer to a network. Windows 2000 Advanced Server and Windows 2000 Datacenter Server keep track of all network interfaces in a server cluster. As a result, you can view the state of all cluster network interfaces from a cluster management application, such as Cluster Administrator. Windows 2000 automatically detects the addition and removal of network interfaces.

Nodes

A node is a member of a server cluster. Windows 2000 Advanced Server supports two nodes in a cluster, and Windows 2000 Datacenter Server supports four nodes. Nodes must be either domain controllers or member servers authenticated by domain controllers. Nodes have their own resources, such as a hard disk and a dedicated network interface card (NIC) for private cluster network communication. Nodes in a cluster also share access to cluster resources on an external disk storage system called the clustered disk.

Every node is attached to one or more cluster storage devices, either directly or through Fibre Channel hubs or switches. Each cluster storage device contains multiple disks arranged in sets or arrays, and each set or array is configured as a specific RAID type. These sets of disks or arrays are typically known as logical unit numbers (LUNs). Usually, each LUN, or virtual disk, will have a specific Small Computer System Interface (SCSI) ID associated with it, and Windows 2000 will interpret each LUN as a physical disk in Disk Management. The disks store all the cluster’s configuration and resource data. Each disk can be owned by only one node at any one time, but ownership can be transferred between nodes. The result is that each node has access to all cluster configuration data.

All nodes in the cluster are grouped under a common name, the cluster name, which you use when accessing and managing the cluster.

Resource Groups

A resource group is a logical collection of cluster resources. Typically, a resource group is made up of logically related resources such as applications and their associated peripherals and data. However, it can contain cluster entities that are related only by administrative needs, such as an administrative collection of virtual server names and IP addresses. A resource group can be owned by only one node at a time, and individual resources within a group must exist on the node that currently owns the group. At any given instant, different servers in the cluster can’t own different resources in the same resource group.

Each resource group has an associated cluster-wide policy that specifies which server the group prefers to run on and which server the group should move to in case of a failure. Each group also has a network service name and address to enable network clients to bind to the services provided by the resource group. In the event of a failure, resource groups, which typically fail over, are moved as atomic units from the failed node to an available node in the cluster.

Each resource in a group may depend on other resources in the cluster. Dependencies are relationships between resources that indicate which resources need to be started and be available before another resource can be started. For example, a database application might depend on the availability of a disk, IP address, and network name to be able to start and provide services to other applications and clients.

The scope of any identified dependency is limited to resources within the same resource group. Cluster-managed dependencies can’t extend beyond the resource group, because resource groups are failed over, moved, and brought online and offline independently.
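The scoping rule can be checked mechanically. The sketch below is a hypothetical helper, not a Cluster service function. It assumes two plain dictionaries, one mapping each resource to its group and one mapping each resource to the resources it depends on, and it reports every dependency that would cross a group boundary.

```python
def cross_group_dependencies(group_of, depends_on):
    """List (resource, dependency) pairs whose two ends sit in different groups."""
    return [
        (resource, dependency)
        for resource, dependencies in depends_on.items()
        for dependency in dependencies
        if group_of[resource] != group_of[dependency]
    ]


# Hypothetical layout: a database group and a separate web group.
group_of = {
    "db": "SQL Group",
    "disk": "SQL Group",
    "name": "SQL Group",
    "ip": "SQL Group",
    "web": "Web Group",
}
depends_on = {
    "db": ["disk", "name"],
    "name": ["ip"],
    "web": ["disk"],  # invalid: the disk belongs to a different group
}

print(cross_group_dependencies(group_of, depends_on))
```

Only the web-to-disk dependency is reported, because the two groups can fail over independently and the disk might not be on the node that hosts the web resource.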

Three concepts important to the management of resource groups are virtual servers, failover, and failback. This section discusses these concepts and explains how they apply to resource groups.

Virtual Servers

One of the benefits of the Cluster service is that applications and services running on a server cluster can be exposed to users and workstations as virtual servers. To users and clients, connecting to an application or service running as a clustered virtual server appears to be the same process as connecting to a single physical server. In fact, any node in the cluster can host the connection to a virtual server. The user or client application doesn’t know which node is actually hosting the virtual server.

Any nonclustered service or application can run on a cluster node without being managed as a virtual server.

Multiple virtual servers representing multiple applications can be hosted in a cluster, as shown in Figure 4.2.

Figure 4.2 - Physical view of virtual servers under the Cluster service

Figure 4.2 illustrates a two-node cluster with four virtual servers; two virtual servers exist on each node. The Cluster service manages the virtual server as a resource group, with each virtual server resource group containing two resources: an IP address and a network name that’s mapped to the IP address.

Application client connections to a virtual server are made by a client session that knows only the IP address that the Cluster service publishes as the virtual server’s address. The client view is simply a view of individual network names and IP addresses. Using the example of a two-node cluster supporting four virtual servers, the client view of the cluster nodes and four virtual servers is illustrated in Figure 4.3.

As shown in Figure 4.3, the client is aware of only the IP addresses and names and doesn’t need to detect information about the physical location of any of the virtual servers. This allows the Cluster service to provide highly available support for the applications running as virtual servers.

Figure 4.3 - Client view of the Cluster service virtual servers

In the event of an application or server failure, the Cluster service moves the virtual server resource group to another node in the cluster. When such a failure occurs, the client detects a failure in its session with the application and attempts to reconnect in exactly the same manner as the original connection. The client is able to establish a new connection because the Cluster service maps the published IP address of the virtual server to a surviving node in the cluster during recovery operations. The client doesn’t need to know that the application is now physically hosted on a different node in the cluster.
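The reconnection behavior can be illustrated with a toy model. Everything in the following sketch is assumed for illustration: the VirtualServer and Cluster classes, the node names, and the addresses are hypothetical, and the choice of surviving node is deliberately simplistic. The point it shows is that the client uses the same published address before and after the failure.

```python
from dataclasses import dataclass


@dataclass
class VirtualServer:
    network_name: str
    ip_address: str
    owner: str  # node currently hosting the resource group


class Cluster:
    def __init__(self, nodes, virtual_servers):
        self.up = set(nodes)
        self.by_ip = {vs.ip_address: vs for vs in virtual_servers}

    def connect(self, ip_address):
        """Return the node that answers for a published address."""
        vs = self.by_ip[ip_address]
        if vs.owner not in self.up:
            raise ConnectionError(f"{vs.network_name} is not reachable")
        return vs.owner

    def fail_node(self, node):
        """Mark a node as failed and move its virtual servers to a survivor."""
        self.up.discard(node)
        if not self.up:
            return
        survivor = sorted(self.up)[0]  # simplistic choice, for illustration only
        for vs in self.by_ip.values():
            if vs.owner == node:
                vs.owner = survivor


cluster = Cluster(
    ["NODE1", "NODE2"],
    [
        VirtualServer("VS1", "10.0.0.11", "NODE1"),
        VirtualServer("VS2", "10.0.0.12", "NODE2"),
    ],
)

before = cluster.connect("10.0.0.11")  # the client knows only the address
cluster.fail_node("NODE1")
after = cluster.connect("10.0.0.11")   # same address, different physical host
print(before, after)
```

The client code that calls connect does not change between the two calls; only the mapping held by the cluster does.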

Note that while this provides high availability of the application or service, session state information related to the failed client session is lost unless the application is designed or configured to store client session data on disk for retrieval during application recovery. The Cluster service enables high availability but doesn’t provide application fault tolerance unless the application itself supports fault-tolerant transaction behavior. For example, the Microsoft Dynamic Host Configuration Protocol (DHCP) service stores client data and can recover from failed client sessions. DHCP client IP address reservations are saved in the DHCP database. If the DHCP server resource fails, you can move the DHCP database to an available node in the cluster, restart the DHCP service, and use the restored client data from the DHCP database.

Failover

The Cluster service implements failover automatically when hardware or application failure occurs. You can also trigger failover manually. The algorithm for both situations is identical, except that in a manually initiated failover, resources are gracefully shut down. In the case of failure, they’re forcefully shut down.

When an entire node in the cluster fails, its resource groups are moved to one or more available servers in the cluster. Automatic failover is similar to planned administrative reassignment of resource ownership, except that automatic failover is more complicated because the normal shutdown phase isn’t gracefully performed on the failed node.

When automatic failover occurs, the Cluster service determines which resource groups were running on the failed node and which nodes should take ownership of the various groups. All nodes in the cluster that are capable of hosting the resource groups negotiate among themselves for ownership. This negotiation is based on node capabilities, current load, application feedback, or the node preference list. The node preference list is part of the resource group properties and is used to assign a resource group to a node. Once negotiation of the resource group is complete, all nodes in the cluster update their databases and keep track of which node owns the resource group.

For each resource group, you can specify (in a node preference list) a preferred server and one or more prioritized alternatives. This prioritizing enables cascading failover, in which a resource group can survive multiple server failures, each time cascading, or failing over, to the next server on its node preference list. Cluster administrators can set up different node preference lists for each resource group on a server so that, in the event of a server failure, the groups are distributed among the cluster’s surviving servers.
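Cascading failover can be pictured as a walk down the preference list. The following sketch assumes a preference list is simply an ordered list of node names; the next_owner function and the group and node names are hypothetical, and the real negotiation also weighs factors such as load that are not modeled here.

```python
def next_owner(preference_list, up_nodes):
    """Return the first node on the preference list that is still up, or None."""
    for node in preference_list:
        if node in up_nodes:
            return node
    return None


# Two groups that both prefer N1 but list their alternatives in a different order.
preferences = {
    "G1": ["N1", "N2", "N3"],
    "G2": ["N1", "N3", "N2"],
}

up = {"N1", "N2", "N3"}

up.discard("N1")  # first failure: the groups spread across the survivors
first = {group: next_owner(prefs, up) for group, prefs in preferences.items()}

up.discard("N2")  # second failure: G1 cascades to the next node on its list
second = {group: next_owner(prefs, up) for group, prefs in preferences.items()}

print(first)
print(second)
```

After the first failure the two groups land on different nodes because their lists differ; after the second failure both end up on the one remaining node.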

An alternative to cascading failover is N+1 failover. In N+1, the node preference lists of all cluster groups identify the standby cluster nodes to which resources should be moved during first failover. The standby nodes are servers in the cluster that are mostly idle or whose workload can be easily preempted when a failed server’s workload must be moved to the standby node.

When choosing between cascading failover and N+1 failover, a key issue for cluster administrators is the location of the cluster’s excess capacity for accommodating the loss of a server. With cascading failover, the assumption is that every other server in the cluster has some excess capacity to absorb a portion of any other failed server’s workload. With N+1 failover, it’s assumed that the "+1" standby server is the primary location of excess capacity.
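A rough calculation shows where the spare capacity has to sit in each scheme. The sketch below rests on simplifying assumptions that are not from the lesson: every active node carries the same load, a single node fails, and under cascading failover its load is spread evenly across the survivors. The function names are hypothetical.

```python
def cascading_headroom(load_per_node, active_nodes):
    """Spare capacity each survivor needs if one node's load is spread evenly."""
    return load_per_node / (active_nodes - 1)


def n_plus_1_headroom(load_per_node):
    """Spare capacity the standby node needs to absorb one failed node's load."""
    return load_per_node


# Four active nodes, each carrying 60 units of work.
per_survivor = cascading_headroom(60, 4)
on_standby = n_plus_1_headroom(60)
print(per_survivor, on_standby)
```

Under these assumptions, cascading failover needs 20 units of headroom on each of the three surviving nodes, while N+1 failover needs the full 60 units on the standby node and none on the active ones.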

Failback

When a node comes back online, the Failover Manager (which manages resources and groups and initiates failover operations) can decide to move some resource groups back to the recovered node. This process is referred to as failback. A resource group’s properties must have a preferred owner defined in order to fail back to a recovered or restarted node. Resource groups for which the recovered or restarted node is the preferred owner will be moved from the current owner to the recovered or restarted node. The Cluster service provides protection against failback of resource groups at peak processing times or for nodes that haven’t been correctly recovered or restarted. A resource group’s failback properties may include the hours of the day during which failback is allowed, plus a limit on the number of times failback is attempted.
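The failback decision can be sketched as a simple predicate. The dictionary keys and the may_fail_back function below are hypothetical and chosen only for illustration; they model the preferred-owner requirement and the allowed hours, and they leave out the limit on failback attempts.

```python
def may_fail_back(group, recovered_node, hour):
    """Decide whether a group may move back to a node that has rejoined.

    group: dict with 'preferred_owner', 'current_owner', and 'failback_window'
           (start hour inclusive, end hour exclusive, both in the range 0-23)
    """
    if group["preferred_owner"] != recovered_node:
        return False  # no preferred owner, or a different one: no failback
    if group["current_owner"] == recovered_node:
        return False  # the group is already where it prefers to be
    start, end = group["failback_window"]
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end  # the window wraps past midnight


group = {
    "preferred_owner": "NODE1",
    "current_owner": "NODE2",
    "failback_window": (22, 4),  # allow failback only between 22:00 and 04:00
}

print(may_fail_back(group, "NODE1", hour=14))  # peak hours
print(may_fail_back(group, "NODE1", hour=23))  # inside the window
```

With this window, a node that recovers in the middle of the day does not pull its groups back until the evening.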

Resources

A resource is any physical or logical component that has the following characteristics:

Can be brought online and taken offline
Can be managed in a server cluster
Can be hosted (owned) by only one node at a time

To manage resources, the Cluster service communicates to a resource DLL through a Resource Monitor. When the Cluster service makes a request of a resource, the Resource Monitor calls the appropriate entry-point function in the resource DLL in order to check and control the resource’s state.
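The intermediary role of the Resource Monitor can be pictured as a dispatcher. The sketch below is a Python analogy only: real resource DLLs are native libraries that implement the Resource API, and the class and method names used here (IpAddressResource, ResourceMonitor, online, offline, is_alive) are hypothetical stand-ins rather than that API.

```python
class IpAddressResource:
    """Stand-in for a resource DLL that manages one resource type."""

    def __init__(self):
        self.state = "offline"

    def online(self):
        self.state = "online"

    def offline(self):
        self.state = "offline"

    def is_alive(self):
        return self.state == "online"


class ResourceMonitor:
    """Forwards requests to the handler registered for a resource type."""

    def __init__(self):
        self.handlers = {}

    def register(self, resource_type, handler):
        self.handlers[resource_type] = handler

    def request(self, resource_type, operation):
        """Call the named entry point and report the resulting state."""
        handler = self.handlers[resource_type]
        getattr(handler, operation)()
        return handler.state


monitor = ResourceMonitor()
monitor.register("IP Address", IpAddressResource())

status_after_online = monitor.request("IP Address", "online")
status_after_offline = monitor.request("IP Address", "offline")
print(status_after_online, status_after_offline)
```

The caller never touches the handler directly, which mirrors how the Cluster service reaches a resource only through the Resource Monitor.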

Dependent Resources

A dependent resource is one that requires another resource, known as a dependency, to operate. For example, a network name must be associated with an IP address. Because of this requirement, a Network Name resource is dependent on an IP Address resource. Dependent resources are taken offline before their dependencies; likewise, they are brought online after their dependencies. A resource can specify one or more resources on which it’s dependent.

A resource can also specify a list of nodes, known as preferred nodes, on which it’s able to run. When you organize resources into groups, it’s important to consider preferred nodes and dependencies.

A dependency tree is a series of dependency relationships. For example, the SQL Server resource depends on the SQL Server Network Name resource, and the Network Name resource depends on the IP Address resource. A dependent resource and all of its dependencies must be in the same resource group.
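The online and offline ordering follows directly from the dependency tree. The sketch below uses the standard library graphlib module (Python 3.9 or later) to order the SQL Server example from the text; the online_order and offline_order helpers are hypothetical names, and the dictionary is an assumed representation of the dependencies.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on.
depends_on = {
    "SQL Server": {"SQL Server Network Name"},
    "SQL Server Network Name": {"IP Address"},
}


def online_order(depends_on):
    """Dependencies first, so each resource starts after what it needs."""
    return list(TopologicalSorter(depends_on).static_order())


def offline_order(depends_on):
    """Dependents first, so nothing loses a dependency while still running."""
    return list(reversed(online_order(depends_on)))


print(online_order(depends_on))
print(offline_order(depends_on))
```

The IP Address resource comes online first and goes offline last, with the SQL Server resource at the opposite end of both sequences.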

Cluster Service Architecture

The Cluster service is designed as a separate, isolated set of components that work together with the Windows 2000 Advanced Server and Windows 2000 Datacenter Server operating systems. This design avoids introducing complex processing system schedule dependencies between the Cluster service and the operating system. However, some changes in the base operating system are required to enable cluster features. These changes include the following:

Support for dynamic creation and deletion of network names and addresses
Modification of the file system to enable closing open files during disk drive dismounts
Modification of the input/output (I/O) subsystem to enable sharing disks and volume sets among multiple nodes

Apart from these changes and other minor modifications, cluster capabilities are built on top of the Windows 2000 foundation.

The Cluster service is based on a shared-nothing model of cluster architecture. This model refers to how servers in a cluster manage and use local and common cluster devices and resources. In the shared-nothing cluster, each server owns and manages its local devices. Devices common to the cluster, such as a common disk array and connection media, are selectively owned and managed by a single server at any given time.
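The ownership rule of the shared-nothing model can be sketched in a few lines. The CommonDevice class and the device and node names below are hypothetical; the sketch shows only that a common device accepts work from its current owner and that ownership moves as a whole from one node to another.

```python
class CommonDevice:
    """A cluster-wide device that exactly one node owns at any given time."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner

    def write(self, node, data):
        if node != self.owner:
            raise PermissionError(f"{node} does not own {self.name}")
        return f"{node} wrote {len(data)} bytes to {self.name}"

    def transfer(self, new_owner):
        """Hand the device to another node, for example during failover."""
        self.owner = new_owner


array = CommonDevice("Disk Q:", owner="NODE1")
print(array.write("NODE1", b"log"))

array.transfer("NODE2")
print(array.write("NODE2", b"log"))
```

After the transfer, a write attempt from NODE1 would raise PermissionError, because the two nodes never access the device concurrently.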

The shared-nothing model makes it easier to manage disk devices and standard applications. This model doesn’t require any special cabling or applications and enables the Cluster service to support standard Windows 2000–based and Windows NT–based applications and disk resources.

The Cluster service uses standard Windows 2000 Server drivers for local storage devices and media connections. In addition, the Cluster service supports several connection media for the external common devices that need to be accessible by all servers in the cluster. External storage devices that are common to the cluster require SCSI devices that support standard Peripheral Component Interconnect (PCI)–based SCSI connections as well as SCSI over Fibre Channel and SCSI bus with multiple initiators.

Fibre Channel connections are SCSI devices hosted on a Fibre Channel bus instead of a SCSI bus. Conceptually, Fibre Channel technology encapsulates SCSI commands within the Fibre Channel and makes it possible to use the SCSI commands that the Cluster service is designed to support. Fibre Channel is a technology for 1-Gbps data transfer that maps common transport protocols such as SCSI and IP, merging networking and high-speed I/O in a single connectivity technology. Fibre Channel technology gives you a way to address the distance and the address-space limitations of conventional channel technologies.

Figure 4.4 illustrates components of a two-node server cluster that may be composed of servers running Windows 2000 Advanced Server, Windows 2000 Datacenter Server, or Windows NT Server 4, Enterprise Edition, with shared storage device connections that use SCSI or SCSI over Fibre Channel.

Figure 4.4 - Two-node server cluster running Windows 2000 Advanced Server, Windows 2000 Datacenter Server, or Windows NT Server 4, Enterprise Edition

Windows 2000 Datacenter Server supports four-node clusters and requires device connections that use a Fibre Channel connection, as shown in Figure 4.5.

Figure 4.5 - Four-node server cluster running Windows 2000 Datacenter Server

Quorum Disks

One of the most important components of the cluster storage system is the quorum disk, which is a single disk in the system designated as the quorum resource. The quorum disk provides persistent physical storage across system failures. The cluster configuration is kept on the disk, and all nodes in the cluster must be able to communicate with the node that owns it. The configuration data, in the form of recovery logs, contains details of all of the changes that have been applied to the cluster database. This process provides node-independent storage for cluster configuration and state data.

Quorum Resource and the Cluster Database

The cluster database is an integral part of the formation of a server cluster. When a node joins or forms a cluster, the Cluster service must update the node’s private copy of the cluster database. When a node joins a cluster, the Cluster service can retrieve the data from the other active nodes. However, when a node forms a cluster, no other node is available. The Cluster service uses the quorum resource’s recovery logs to update the node’s cluster database. To ensure cluster unity, Windows 2000 uses the quorum resource to ensure that only one set of active, communicating nodes is allowed to operate as a cluster. A node can form a cluster only if it can gain control of the quorum resource. A node can join a cluster or remain in an existing cluster only if it can communicate with the node that controls the quorum resource.
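The form and join rules can be modeled with a toy arbitration object. The sketch below is an assumption-laden illustration: the QuorumResource class and the start_node function are hypothetical, and control of the quorum resource is reduced to a single owner field. It shows why a partitioned node cannot start a second cluster.

```python
class QuorumResource:
    def __init__(self):
        self.owner = None

    def try_acquire(self, node):
        """Grant control to the first node that asks; refuse everyone else."""
        if self.owner is None:
            self.owner = node
        return self.owner == node


def start_node(node, quorum, reachable):
    """Return 'joined', 'formed', or 'stopped' for a node that is starting.

    reachable: set of nodes that this node can communicate with
    """
    if quorum.owner is not None and quorum.owner in reachable:
        return "joined"   # it can talk to the node that controls the quorum
    if quorum.try_acquire(node):
        return "formed"   # it gained control of the quorum resource
    return "stopped"      # partitioned from the owner and unable to take over


quorum = QuorumResource()
results = [
    start_node("N1", quorum, reachable=set()),    # first node up
    start_node("N2", quorum, reachable={"N1"}),   # can reach the quorum owner
    start_node("N3", quorum, reachable=set()),    # cut off from N1
]
print(results)
```

N1 forms the cluster, N2 joins it, and N3 stays out, so only one set of communicating nodes operates as the cluster.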

Lesson Summary

A server cluster is a group of independent computer systems working together as a single system to ensure that mission-critical applications and resources remain available to clients. The Cluster service refers to the collection of components on each node that perform cluster-specific activity, and resource refers to the hardware and software components within the cluster that are managed by the Cluster service. A server cluster runs several pieces of software that fall into two categories: the software that makes the cluster run (clustering software) and the software that you use to administer the cluster (administrative software). Windows 2000 Advanced Server and Windows 2000 Datacenter Server use the server cluster components to create server clusters. These components include the Cluster service, Resource Monitor, resource DLL, Cluster Administrator, Cluster automation server, the cluster database, and network and disk drivers. Cluster objects are the physical and logical units that the Cluster service manages. Objects include server cluster networks, network interfaces, nodes, resource groups, and resources. The Cluster service is based on a shared-nothing model of cluster architecture. One of the most important components of the cluster storage system is the quorum disk, which is a single disk in the system designated as the quorum resource. The cluster database is an integral part of the formation of a server cluster.