Lesson 2: Planning Server Cluster Configurations

The Cluster service in Windows 2000 Advanced Server and Windows 2000 Datacenter Server provides a foundation for server clusters. When one server in a cluster fails or is taken offline, another server in the cluster takes over the operations of the failed server. Clients using server resources experience little or no interruption of their work because support for resources is moved from one server to the other. When implementing Windows clustering into a network design, you must consider many factors and prepare the environment that supports the clusters. For example, you must select which applications to run on a server cluster, and you must determine failover policies for resource groups. This lesson focuses on those aspects of planning a server cluster that you should consider when designing your network.

After this lesson, you will be able to

  • Plan a server cluster and identify the possible failures that can interrupt access to resources
  • Choose applications to run on the cluster
  • Determine failover policies for resource groups, choose a domain model, and plan resource groups

Estimated lesson time: 25 minutes

Planning a Server Cluster

You should consider a number of steps when planning a server cluster, including identifying network risks, choosing applications to run on the cluster, choosing a domain model, choosing a cluster model, planning resource groups, determining failover policies, planning fault-tolerant storage, and determining capacity requirements. This section discusses each of these steps.

Identifying Network Risks

With Windows 2000, you can use server clusters to provide increased availability. However, server clusters aren’t designed to protect all components of your workflow in all circumstances. For example, clusters aren’t an alternative to backing up data; they protect the availability of data only, not the data itself.

When you configure a cluster, you should identify any possible single points of failure in your network environment. In general, you should try to minimize those points of failure and provide mechanisms that will maintain service when a failure occurs.

Windows 2000 Advanced Server and Windows 2000 Datacenter Server include built-in features (in addition to the Cluster service) that protect certain computer and network processes during failure. These features include two redundant array of independent disks (RAID) implementations: mirroring (RAID-1) and striping with parity (RAID-5). You should note, however, that software implementations of RAID are used to protect a computer’s internal drives, not the external storage used by the cluster.

To further increase the availability of network resources and prevent the loss of data, do the following:

  • Consider having replacement disks and controllers available at your site. Always make sure that any spare parts you keep on hand exactly match the original parts, including network and SCSI components. The cost of two spare SCSI controllers can be a small fraction of the cost of having hundreds of clients unable to use data.
  • Consider providing uninterruptible power supply (UPS) protection for individual computers and the network itself, including hubs, bridges, and routers. Computers running Windows Server support UPS. Many UPS solutions provide power for 5 to 20 minutes, which is long enough for the operating system to do an orderly shutdown when power fails.

Choosing Applications to Run on the Cluster

You can adapt many, but not all, applications to run on a server cluster. Of those that can, you don’t need to set them all up as cluster resources. The following criteria determine whether an application can adapt to server clustering failover mechanisms:

  • The application must use Transmission Control Protocol/Internet Protocol (TCP/IP) for its network communications, either directly or through Distributed Component Object Model (DCOM), Named Pipes, or remote procedure call (RPC) running over TCP/IP. Applications that use only NetBIOS Enhanced User Interface (NetBEUI) or Internetwork Packet Exchange (IPX) protocols can’t take advantage of cluster failover.
  • The application must be able to store its data in a configurable location, that is, on the disks attached to shared buses. Some applications that can’t store their data in a configurable location can still be configured to fail over. However, in such cases access to the application data is lost at failover because the data is available only on the disk of the failed node.
  • The application must support NTLM authentication. Clients can’t use Kerberos to authenticate a connection to a virtual server.
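
The three criteria above can be treated as a simple checklist. The following Python sketch is only an illustration of that checklist: the AppProfile class, its field names, and the failover_concerns function are hypothetical and aren't part of Windows 2000 or the Cluster API.

```python
from dataclasses import dataclass

# Protocols the lesson lists as usable for failover (all carried over TCP/IP).
SUPPORTED_PROTOCOLS = {"TCP/IP", "DCOM", "Named Pipes", "RPC"}


@dataclass
class AppProfile:
    """Hypothetical description of a server application being evaluated."""
    name: str
    protocols: set                # network protocols the application uses
    configurable_data_path: bool  # can its data be placed on a shared-bus disk?
    supports_ntlm: bool


def failover_concerns(app):
    """Return the lesson's criteria that the application does not meet."""
    concerns = []
    if not (app.protocols & SUPPORTED_PROTOCOLS):
        concerns.append("uses no TCP/IP-based protocol")
    if not app.configurable_data_path:
        concerns.append("data location is not configurable")
    if not app.supports_ntlm:
        concerns.append("does not support NTLM authentication")
    return concerns


database = AppProfile("database", {"TCP/IP"}, True, True)
legacy = AppProfile("legacy", {"NetBEUI", "IPX"}, False, True)
print(failover_concerns(database))
print(failover_concerns(legacy))
```

An empty list means the application meets all three criteria. A nonconfigurable data location is reported as a concern rather than a hard failure because, as noted above, such an application can sometimes still be configured to fail over, although its data is then unavailable after failover.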

In addition to these specifications, client applications that connect to the server application must be able to retry and recover from temporary network failures. During failover, client applications experience a temporary loss of network connectivity. If the client application is configured to recover from temporary network connection problems, it’s able to continue operating after a server failover.
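
One common way for a client to recover from a temporary loss of connectivity is to retry the operation with an increasing delay. The sketch below shows that general technique in Python; the function name, the number of attempts, and the delay values are assumptions chosen for illustration, not values prescribed by the Cluster service.

```python
import time


def call_with_retry(operation, attempts=5, initial_delay=1.0, backoff=2.0,
                    sleep=time.sleep):
    """Run operation(), retrying on ConnectionError with growing delays."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # the outage outlasted the retry budget
            sleep(delay)      # wait while the group comes online elsewhere
            delay *= backoff  # wait longer before the next attempt
```

A client would wrap each request to the virtual server in call_with_retry. The total time covered by the retries should be longer than the time the resource group needs to come online on the surviving node.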

Cluster-Aware and Cluster-Unaware Applications

Applications that can be failed over can be divided into two groups: those that support the Cluster API and those that don’t. Applications that support the Cluster API are defined as cluster-aware. These applications can register with the Cluster service to receive status and notification information, and they can use the Cluster API to administer clusters. Applications that don’t support the Cluster API are defined as cluster-unaware. If cluster-unaware applications meet the TCP/IP and remote-storage criteria, you can still use them in a cluster and often configure them to fail over.

In either case, applications that keep significant state information in memory aren’t the best applications for clustering because information that’s not stored on disk is lost at failover.

Choosing a Domain Model

Nodes in a server cluster must belong to the same domain. The cluster nodes, which must be configured with Windows 2000 Advanced Server or Windows 2000 Datacenter Server, can be either member servers or domain controllers. If you configure your cluster nodes as domain controllers, you must account for the additional overhead that’s incurred by the domain controller services. If you configure the cluster nodes as member servers, the cluster’s availability depends on the availability of the domain controller, which must be high.

In large networks running on Windows 2000, domain controllers can require substantial resources to replicate the directory and authenticate clients. For this reason, many applications, such as Microsoft SQL Server and Message Queuing, should not be installed on domain controllers in order to maximize performance. However, if you have a very small network in which account information rarely changes and in which users don’t log on and off frequently, you can use domain controllers as cluster nodes.

Choosing a Server Cluster Model

Server clusters can be categorized into different configuration models. You should choose a cluster model that best matches your organization’s needs. Cluster models are discussed in more detail in Lesson 3, "Choosing a Server Cluster Model."

Planning the Resource Groups

You can take six steps to organize your applications and other resources into groups. This section reviews each of these steps.

Listing All Server-Based Applications

Make a list of all applications that will run on the cluster nodes, regardless of whether or not you plan to use them with the Cluster service. You can determine your capacity needs by adding up the resources necessary for each resource group and the resources necessary for those applications and services that will run independently of the Cluster service.
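
The addition described above can be sketched in a few lines of Python. The workload names and the RAM and disk figures below are invented for illustration and don't come from this lesson.

```python
def total_requirements(workloads):
    """Sum RAM and disk needs across every workload that runs on the nodes."""
    totals = {"ram_mb": 0, "disk_gb": 0}
    for workload in workloads:
        totals["ram_mb"] += workload["ram_mb"]
        totals["disk_gb"] += workload["disk_gb"]
    return totals


# Illustrative figures only. Clustered groups and applications that run
# independently of the Cluster service are both counted.
workloads = [
    {"name": "SQL Server group", "clustered": True, "ram_mb": 1024, "disk_gb": 40},
    {"name": "File and print group", "clustered": True, "ram_mb": 256, "disk_gb": 80},
    {"name": "Backup agent", "clustered": False, "ram_mb": 64, "disk_gb": 2},
]
print(total_requirements(workloads))
```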

Sorting the List of Applications

Determine which applications on your list can use failover and which applications will reside on cluster nodes but won’t use failover (because it’s inconvenient, unnecessary, or impossible to configure). Although you don’t set failover policies for these applications or arrange them in groups, they still use a portion of the server capacity.

Before clustering an application, review the application license or check with the application vendor. Each application vendor sets its own licensing policies for applications running on clusters.

Listing All Other Resources

Determine which hardware, connections, and operating system software a server cluster can protect in your network environment. For example, the Cluster service can fail over print spoolers to protect client access to printing services and fail over file-server resources to maintain client access to files. In both cases, failover affects capacity requirements, such as the random access memory (RAM) needed to service the clients.

Listing All Dependencies for Each Resource

Once you have a complete list of all the resources, determine which ones are your core resources, and then determine which ones support the core resources. For example, a SQL Server resource would be your core resource, and Network Name, IP Address, and Disk resources would support the SQL Server resource. All these resources must be in the same group to ensure that the Cluster service keeps interdependent resources together at all times.

Making Preliminary Grouping Decisions

Once you’ve listed all your resources and their dependencies, you’re ready to make a preliminary decision about how to group these resources. In many cases resource groupings are very apparent because dependencies restrict how you can group some resources.

When grouping resources, you should adhere to these guidelines:

  • A resource and its dependencies must be together in a single group.
  • A resource can’t span groups.

For example, if several applications depend on a particular resource, you must include all of those applications with that resource in a single group. Suppose, for example, a Web-server application provides access to Web pages and that those Web pages provide result sets that clients access by querying an SQL-database application through the use of Hypertext Markup Language (HTML) forms. If you put the Web server and the SQL database in the same group, the data for both core applications can reside on a specific disk volume. Because both applications exist within the same group, you can also create an IP address and network name specifically for this resource group.
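
The two guidelines imply that resources linked by dependencies, directly or through a shared resource, end up in the same group. The following Python sketch derives such preliminary groups from a dependency list. The resource names are hypothetical, and the approach (finding connected sets of resources) is only one way to work through the list; it isn't a feature of the Cluster service.

```python
from collections import defaultdict


def preliminary_groups(dependencies):
    """Group resources so each one shares a group with everything it depends on.

    dependencies maps a resource name to the list of resources it depends on.
    """
    # A dependency ties both ends together, so record it in both directions.
    neighbors = defaultdict(set)
    for resource, needs in dependencies.items():
        neighbors[resource]  # make sure resources with no dependencies appear
        for needed in needs:
            neighbors[resource].add(needed)
            neighbors[needed].add(resource)

    groups, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Walk outward from this resource to collect everything tied to it.
        stack, members = [start], set()
        while stack:
            current = stack.pop()
            if current in members:
                continue
            members.add(current)
            stack.extend(neighbors[current] - members)
        seen |= members
        groups.append(sorted(members))
    return groups


# Hypothetical resources: the Web server and the SQL database share Disk S.
dependencies = {
    "Web Server": ["Disk S"],
    "SQL Database": ["Disk S"],
    "File Share": ["Disk F", "Network Name FS"],
    "Network Name FS": ["IP Address FS"],
    "Print Spooler": ["Disk P"],
}
for group in preliminary_groups(dependencies):
    print(group)
```

Because the Web server and the SQL database both depend on Disk S, they land in one group, whereas the file share and the print spooler remain separate until you choose to combine them for administrative convenience.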

When not restricted by resource dependencies, you can organize groups by administrative convenience. For example, you might put file-sharing and print-spooling resources (along with their dependencies) into one group because viewing those particular applications as a single entity makes it easier to administer the network. You can give this group a unique name for the part of your organization it serves, such as Accounting File and Print. Whenever you need to intervene with that department’s file- and print-sharing activities, you’d look for this group in Cluster Administrator.

Making Final Grouping Assignments

After you list the resources that you want to group together, assign a different name to each group and create a dependency tree. A dependency tree is useful for visualizing the dependency relationships between resources.

To create a dependency tree, first write down all the resources in a particular group. Then draw arrows from each resource to each resource on which the resource directly depends.

A direct dependency between resource A and resource B means that no intermediary resources are between the two resources. An indirect dependency occurs when a transitive relationship exists between resources. For example, if resource A depends on resource B and resource B depends on resource C, there’s an indirect dependency between resource A and resource C, rather than a direct one.

Figure 4.6 shows the resources in a final grouping assignment in a dependency tree.

Figure 4.6 - A simple dependency tree

In Figure 4.6 the File Share resource depends on the Network Name resource, which in turn depends on the IP Address resource. However, the File Share resource doesn’t directly depend on the IP Address resource.
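
The difference between direct and indirect dependencies can be checked mechanically. The Python sketch below records only the three resources named in the preceding paragraph; the function names are hypothetical and used only for illustration.

```python
def all_dependencies(resource, direct):
    """Return every resource that `resource` depends on, directly or indirectly."""
    found = set()
    stack = list(direct.get(resource, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            # Follow the arrows onward from this resource.
            stack.extend(direct.get(current, []))
    return found


def indirect_dependencies(resource, direct):
    """Return dependencies reached only through an intermediary resource."""
    return all_dependencies(resource, direct) - set(direct.get(resource, []))


# Each entry lists the arrows drawn from a resource in the dependency tree.
direct = {
    "File Share": ["Network Name"],
    "Network Name": ["IP Address"],
    "IP Address": [],
}
print(all_dependencies("File Share", direct))
print(indirect_dependencies("File Share", direct))
```

Here the File Share resource has one direct dependency (Network Name) and one indirect dependency (IP Address).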

Determining Failover Policies for Groups

You must assign failover policies for each group of resources in your cluster. These policies determine exactly how a group behaves when failover occurs. You can choose which policies are most appropriate for each resource group you set up.

Failover policies for groups include three settings:

  • Failover timing: You can set a group for immediate failover when a resource fails, or you can instruct the Cluster service to try to restart the group a designated number of times before failover occurs. If it’s possible to overcome the resource failure by restarting all resources within the group, then set the Cluster service to restart the group.
  • Preferred node: You can set a group so that it always runs on a designated node whenever that node is available. This is useful if one of the nodes is better equipped to host the group.
  • Failback timing: You can set a group to fail back to its preferred node as soon as the Cluster service detects that the failed node has been restored, or you can instruct the Cluster service to wait until a specified hour of the day, such as after peak business hours.
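
To see how the three settings interact, consider the following Python sketch. It is a simplified model for planning purposes only: the FailoverPolicy class, its field names, and the use of a start and end hour for failback are assumptions made for illustration and don't reflect the actual Cluster service property names or behavior.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FailoverPolicy:
    """Hypothetical model of the three group settings described above."""
    restart_attempts: int = 0              # 0 means fail over immediately
    preferred_node: Optional[str] = None   # None means no preferred node
    failback_window: Optional[Tuple[int, int]] = None  # (start_hour, end_hour)


def action_on_failure(policy, restarts_so_far):
    """Decide whether to restart the group in place or move it to another node."""
    if restarts_so_far < policy.restart_attempts:
        return "restart"
    return "failover"


def should_fail_back(policy, current_node, restored_node, hour):
    """Decide whether the group should return to its preferred node at this hour."""
    if policy.preferred_node is None or restored_node != policy.preferred_node:
        return False
    if current_node == policy.preferred_node:
        return False  # the group is already where it belongs
    if policy.failback_window is None:
        return True   # fail back as soon as the node is restored
    start, end = policy.failback_window
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end  # window wraps past midnight


policy = FailoverPolicy(restart_attempts=3, preferred_node="NODE1",
                        failback_window=(22, 4))
print(action_on_failure(policy, restarts_so_far=1))
print(should_fail_back(policy, "NODE2", "NODE1", hour=14))
print(should_fail_back(policy, "NODE2", "NODE1", hour=23))
```

In this model, the group is restarted up to three times before it fails over, and it returns to NODE1 only between 22:00 and 04:00, after peak business hours.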

Planning Fault-Tolerant Storage

Many groups include disk resources for disks on shared buses. In some cases, these are simple physical disks, but in other cases they’re complex disk subsystems containing multiple disks. Almost all resource groups depend on the disks on the shared buses. An unrecoverable failure of a disk resource results in certain failure of all groups that depend on that resource.

For these reasons, you might decide to use special methods to protect your disks and disk subsystems from failures. One common solution is the use of a hardware-based RAID solution. RAID support ensures the high availability of data contained on disk sets in your clusters. Some of these hardware-based solutions are considered fault tolerant, which means that data isn’t lost if a member of the disk set fails. You might also use a storage area network (SAN), which can be located on- or off-site.

You can’t use software fault-tolerant disk sets for cluster storage.

Hardware RAID

The Microsoft Windows Hardware Compatibility List contains many different hardware RAID configurations for clusters. Because many hardware RAID solutions provide power, bus, and cable redundancy within a single cabinet and track the state of each component in the hardware RAID firmware, they provide data availability with multiple redundancy, protecting against multiple points of failure. Hardware RAID solutions can also use an onboard processor and cache. Windows 2000 can use these disks as standard disk resources.

When implementing hardware RAID, you should use redundant RAID controllers to make sure that the controller won’t be a single point of failure.

Storage Area Networks

A SAN is a high-speed, special-purpose network (or subnetwork) that interconnects different kinds of data storage devices with an associated data server on behalf of a larger network of users. A SAN is typically part of the overall network of computing resources, and it’s usually located close to other computing resources. However, a SAN can extend to remote locations for backup and archival storage, using WAN carrier technologies such as Asynchronous Transfer Mode (ATM) or Synchronous Optical Network (SONET).

SANs support disk mirroring, backup and restore, the archival and retrieval of archived data, data migration from one storage device to another, and the sharing of data among different servers in a network. SANs can incorporate subnetworks with network-attached storage systems.

Determining Capacity Requirements

After you assess your clustering needs, you’re ready to determine how many servers you need and with what specifications, such as memory and hard disk storage. Capacity planning for clusters is discussed in Chapter 7, "Capacity Planning."

Making a Decision

The process of planning your server configuration has several steps. In each of these steps you must decide which configuration is best suited to your organization. Table 4.3 describes the decisions that you must make for each of these steps.

Table 4.3 Planning Your Server Cluster

Step | Description
Identifying network risks | When implementing clusters and the environment in which they’re located, you should minimize the number of single points of failure and provide mechanisms that maintain service when a failure occurs or that minimize the amount of unscheduled downtime.
Choosing applications to run on the cluster | The server applications must use TCP/IP and be able to specify where application data is stored. Client applications that connect to the server applications must be able to retry the connection and recover from temporary network failures. Server applications that keep significant state information in memory aren’t good candidates for clustering.
Choosing a domain model | You can configure nodes in a cluster as member servers or domain controllers. In either case the nodes must belong to the same domain. If you configure the nodes as member servers, the availability of the cluster depends on the availability of the domain controller.
Choosing a server cluster model | Server clusters can be categorized into different configuration models. Clustering models are discussed in more detail in Lesson 3, "Choosing a Server Cluster Model."
Planning the resource groups | You should follow six steps when organizing resource groups: listing applications, sorting applications, listing other resources, listing dependencies, making preliminary grouping decisions, and making final grouping decisions.
Determining failover policies for groups | You can assign failover policies to each group of resources in a cluster. Failover policies include three settings: Failover Timing, Preferred Node, and Failback Timing.
Planning fault-tolerant storage | You should protect the clustering shared storage from failures; however, you can’t use software fault-tolerant disks in that storage. Hardware-based RAID and SAN, along with redundant controllers, offer a highly available solution for cluster data.
Determining capacity requirements | After you assess your clustering needs, you should determine your capacity requirements. This process is discussed in Chapter 7, "Capacity Planning."

Recommendations

When planning a Windows 2000 Advanced Server or Windows 2000 Datacenter Server cluster, you should adhere to the following guidelines:

  • Eliminate single points of failure in hardware, software, and external dependencies. Use redundant services, components, and network connections, and keep replacement components on hand.
  • Choose server applications that use TCP/IP, that allow you to specify where application data is stored, and that don’t keep significant amounts of state data in memory.
  • Configure failover policies to meet your organization’s specific needs. Configure the Failover Timing setting to restart applications if all resources within the group can be restarted to overcome failure. If one node is better equipped to host a group, configure the Preferred Node setting so that the group always runs on that node whenever it’s available. In this case you should also configure the Failback Timing setting to fail back to the preferred node as soon as it has been restored.
  • When configuring nodes as member servers, ensure that domain controller services for those nodes are highly available.
  • When planning resource groups, be sure to group together a resource and its dependencies into one group and don’t allow a resource to span groups. If several applications depend on a single resource, all those applications and the additional resource must be in one group. Group planning should also take into consideration administrative efficiency, such as combining file-sharing resources and print-spooling resources into a single group.
  • Implement hardware-based RAID or a SAN and redundant controllers to make the cluster storage fault tolerant.

Example: A Server Cluster for Northwind Traders

Northwind Traders imports gift items from Southeast Asia into the United States. The company sells these items to wholesale outlets in the United States and Europe. The company is setting up a Web-based system that will allow wholesale customers to place orders online. The site’s goal is to be available all day, every day to accommodate various time zones and work schedules. The network includes a database that contains customer, product, and order information. Northwind Traders plans to use the Cluster service in Windows 2000 Advanced Server to provide highly available data.

Before implementing the cluster, Northwind Traders will use the planning process outlined in this lesson to determine how to set up the cluster. The first step is to ensure that any single point of failure in the network is eliminated. The Web site and its network infrastructure will use redundancy throughout the network to achieve high availability. For example, redundant LANs and power sources will be used to prevent failure.

Northwind Traders is using SQL Server 2000 to manage the database because SQL Server uses TCP/IP and the application is able to specify where application data is stored. A resource group will be created that contains SQL Server and any dependent resources. The failover policies for the group will be configured as follows: the Failover Timing setting will be configured to first try restarting resources before failover occurs. A preferred node, however, won’t be designated. The servers are configured as member servers, so the domain controller services for that domain are designed to be highly available. Each cluster node will utilize redundant Fibre Channel host bus adapters (HBAs) to connect to a SAN. Each Fibre Channel HBA will be cross-connected to separate switches for redundancy. Each switch will also be connected to redundant Fibre Channel controllers on the external Fibre Channel storage array. The storage array itself should already have redundant internal components and built-in fault tolerance. This configuration eliminates any single point of failure throughout the entire SAN.
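
You can test a design like this one for single points of failure by removing each component in turn and checking whether a path from the node to the storage array still exists. The Python sketch below does this for one node. The component names are hypothetical, and the exact cabling (HBA-A to Switch1, HBA-B to Switch2) is an assumed reading of the description above, not a detail taken from Figure 4.7.

```python
from collections import deque


def build_links(pairs):
    """Turn a list of cable connections into a two-way adjacency map."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def reachable(links, source, target, removed):
    """Breadth-first search from source to target, skipping the removed component."""
    queue, seen = deque([source]), {source}
    while queue:
        current = queue.popleft()
        if current == target:
            return True
        for neighbor in links.get(current, ()):
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def single_points_of_failure(links, source, target):
    """List components whose loss alone cuts every path from source to target."""
    components = set(links) - {source, target}
    return sorted(c for c in components
                  if not reachable(links, source, target, c))


# Assumed cabling for one cluster node in the Northwind Traders design.
redundant = build_links([
    ("Node1", "HBA-A"), ("Node1", "HBA-B"),
    ("HBA-A", "Switch1"), ("HBA-B", "Switch2"),
    ("Switch1", "Ctrl1"), ("Switch1", "Ctrl2"),
    ("Switch2", "Ctrl1"), ("Switch2", "Ctrl2"),
    ("Ctrl1", "Array"), ("Ctrl2", "Array"),
])
print(single_points_of_failure(redundant, "Node1", "Array"))
```

With two HBAs, two switches, and two controllers, the list is empty. If both HBAs were cabled to a single switch instead, that switch would be reported as a single point of failure. The sketch treats the storage array as the endpoint, so it relies on the array’s own internal redundancy, as the design does.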

Figure 4.7 shows how the two servers are connected to the corporate network and the SAN. Notice that dual NICs are used for network connectivity: one for client communication and one for the private cluster communication.

Figure 4.7 - Cluster configuration with SAN

Lesson Summary

When implementing Windows clustering into a network design, you must plan the configuration of specific components within the Cluster service and prepare the environment that supports the clusters. You should minimize the number of single points of failure in your environment and provide mechanisms that maintain service when a failure occurs. Clustered applications must use TCP/IP and be able to specify where the application data is stored. You should assign the failover policies for each group of resources in your cluster to determine how a group behaves. Nodes in a server cluster can be either member servers or domain controllers, and server clusters can be categorized into different configuration models. You should follow six steps when organizing resource groups: listing applications, sorting applications, listing other resources, listing dependencies, making preliminary grouping decisions, and making final grouping decisions. You can use hardware-based RAID and redundant controllers to make cluster storage fault tolerant. After you assess your clustering needs, you’re ready to determine how many servers you need and with what specifications, such as memory and hard disk storage.



Microsoft Corporation - MCSE Training Kit. Designing Highly Available Web Solutions with Microsoft Windows 2000 Server Technologies
MCSE Training Kit (Exam 70-226): Designing Highly Available Web Solutions with Microsoft Windows 2000 Server Technologies (MCSE Training Kits)
ISBN: 0735614253
Year: 2001
Pages: 103
