What Affects Terminal Server Availability? | Terminal Services for Microsoft Windows Server 2003: Advanced Technical Design Guide (Advanced Technical Design Guide series)

To determine what is required for your environment, let's look at the various components of a Terminal Server system from the perspective of redundancy. Figure 7.1 shows a generic view of the technical components required for a user to access an application hosted on Terminal Server.

Figure 7.1: The Terminal Server components that must be functional

The majority of this chapter will focus on how to configure your Terminal Servers in a load-balanced group. However, as Figure 7.1 illustrates, when creating a highly available environment, you can't just focus on the Terminal Servers. It does no good to build a fantastic load-balanced Terminal Server environment only to have your web server go down. Therefore, we'll first discuss high-availability strategies for several components of a Terminal Server environment, including:

Client devices.
Network connections.
Web interface servers.
Terminal Servers.
User and application data.

Throughout the remainder of this section, we'll analyze how each component in Figure 7.1 affects overall redundancy and what can be done to that component to strengthen the redundancy of the overall system. All of these components work together as a system. Refrain from thinking about the redundancy of one component without considering the redundancy of others. Your Terminal Server environment is only as strong as its weakest link.

Client Device Redundancy

By this point you are well aware that a major architectural advantage to thin client environments is that any user can connect from any client device. If a client device ever fails, a user can begin using a different device and pick up right where he left off.

Chapter 9 will present the issues to consider when designing your client device strategy. These issues can be summarized as follows: With RDP clients, apply "high availability" not by changing anything on the client itself, but rather by having a spare client device available to quickly replace a failed unit. Many companies that employ true thin clients at business locations will keep one or two spares at each site as quick replacements in the event of device failures.

Network Connection Redundancy

In a Terminal Services environment (or any thin client environment), users instantly become unproductive if their network connection is lost. Short of running dual network cables to every user's client device, you can configure clients to point to multiple Terminal Servers on multiple network segments (as covered in Chapter xx).

If your users connect to Terminal Servers over a WAN link, it's a good idea to provide some type of failover should that line go down. Generally the decision to do this will be dependent on the cost of the failover connection.

The costs associated with network bandwidth in a large Terminal Server environment can be complex. Each concurrent Terminal Server user requires approximately 15 to 30 Kbps of bandwidth between the client and the Terminal Server. While that is fairly efficient for a single user, the amount of bandwidth required to provision hundreds or thousands of end users can be considerable.

Think about a single remote site environment. Assume you have a remote site connected to your LAN via a T1. There are 30 users at that remote site using Terminal Server to access one of their primary applications. If that link is down due to either a router failure or line failure, how much down time would be acceptable for those users?

If only a few users required immediate access to the application and the rest could be down for several hours, then a backup ISDN dialup line might be acceptable. However, if all 30 users required immediate access to the application, you may have to configure a full T1 as a backup line.
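The arithmetic behind this example can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: it uses the 15 to 30 Kbps per-user range given above, the standard T1 line rate of 1,544 Kbps, and it assumes the ISDN backup is a basic rate line with both 64 Kbps channels bonded (the example does not specify the ISDN type). It ignores any other traffic sharing the link.

```python
# Rough sizing of a primary and a backup WAN link for Terminal Server users.
T1_KBPS = 1544        # full T1 line rate
ISDN_BRI_KBPS = 128   # assumption: ISDN basic rate, two bonded 64 Kbps channels


def required_kbps(users, per_user_kbps):
    """Bandwidth needed if every user is active at the given per-user rate."""
    return users * per_user_kbps


def users_supported(link_kbps, per_user_kbps):
    """Whole number of concurrent users a link can carry at the given rate."""
    return link_kbps // per_user_kbps


low = required_kbps(30, 15)    # light sessions
high = required_kbps(30, 30)   # heavy sessions
print(f"30 users need roughly {low} to {high} Kbps")
print(f"A T1 carries that load: {high <= T1_KBPS}")
print(
    "An ISDN backup carries about "
    f"{users_supported(ISDN_BRI_KBPS, 30)} to "
    f"{users_supported(ISDN_BRI_KBPS, 15)} users"
)
```

Under these assumptions the ISDN line only covers a handful of users, which is why it suits the "few users need immediate access" case and not the case where everyone does.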

This example is very simplistic. To make any network system fully redundant you must ensure that all of the components in the chain are redundant, including the WAN lines, the routers the lines are connected to, and the switches that the routers are connected to. That way, if any single component between the client and the server fails, connectivity is still available.

On Terminal Servers you should also ensure that the network components are redundant all the way to the server. This process would include putting dual network cards in your Terminal Servers and configuring them for failover in case one stopped working. Best practices suggest that each network card be connected to a different switch so that the servers can still function if a switch is lost.
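The effect of making every link in the chain redundant can be illustrated with a simple availability calculation. The sketch below assumes that component failures are independent and uses made-up availability figures purely for illustration; real values would come from your vendors and your own outage history.

```python
def series(*availabilities):
    """Availability of a chain in which every component must work."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def redundant(a, copies=2):
    """Availability of identical components where any one copy is enough."""
    return 1.0 - (1.0 - a) ** copies


# Hypothetical per-component availabilities, for illustration only.
wan_line, router, switch, nic = 0.995, 0.999, 0.999, 0.999

single_path = series(wan_line, router, switch, nic)
dual_path = series(
    redundant(wan_line), redundant(router), redundant(switch), redundant(nic)
)

print(f"single path:             {single_path:.4%}")
print(f"every component doubled: {dual_path:.4%}")
```

Note that the single path is always less available than its weakest component, which is why doubling only one element of the chain buys little if the others remain single points of failure.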

Web Interface Server Redundancy

If the TS advanced client or a custom portal is used to provide access to applications (as covered in Chapter 11), you must take the necessary steps to make certain that a working web page appears whenever users enter the URL into their browsers. Let's examine the steps that can be taken to ensure that the web server does not become the Achilles' heel of your Terminal Server environment.

Ensuring Users Can Find a Web Server

How will you guarantee that your users will always be able to connect to a functioning web server, even if your primary server is down? Luckily, people have been focusing on creating redundant websites for years, and there's nothing proprietary about Terminal Server environments that would prevent your web site from working like any other. Four common ways of ensuring website availability are:

Connection to the server via a DNS name.
Use of smart DNS or load balancing for server connections.
Clustering the web servers.
Creation of a manual backup address.

Option 1. Use a DNS Name to Connect to the Web Server

If users connect to an RDP website via a DNS name rather than an IP address, that name can be configured to point to any IP address. If something happens to the main server, the DNS record can be modified to point to a backup server. The disadvantage here is that the failover must be done manually.

Advantages of Using a DNS Name for Redundancy

Quick to implement
Transparent to end users
Inexpensive

Disadvantages of Using a DNS Name for Redundancy

Manual failover

Option 2. Use Load Balancing or Third-Party Smart DNS to Connect to the Web Server

Windows web servers can be configured in a load-balanced cluster just like Terminal Servers, allowing multiple servers to respond to web page requests. In the event that one server goes down, the others will accept all the requests. (Load balancing is covered later in this chapter.)

Some third-party load balancers allow for "health checks" to be performed constantly on the servers they are load balancing. These products can generally be configured to poll the service you are attempting to load balance (such as IIS) to ensure that the server is alive and responding to the proper requests.
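The idea of a health check can be sketched as follows. This is a minimal illustration of the concept, not how any particular load balancer implements it: it treats a server as healthy when its web page answers with HTTP 200 within a timeout. The server URLs are hypothetical, and the usage example substitutes a stub probe so that it runs without touching the network.

```python
import http.client
import urllib.error
import urllib.request


def http_probe(url, timeout=5):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, DNS failure, timeout, HTTP error status, etc.
        return False


def healthy_servers(urls, probe=http_probe):
    """Return the servers that currently pass the probe, in original order."""
    return [url for url in urls if probe(url)]


# Hypothetical server names; a stub probe stands in for the network here.
pool = ["http://web1.example.com/", "http://web2.example.com/"]
down = {"http://web1.example.com/"}
print(healthy_servers(pool, probe=lambda url: url not in down))
```

A load balancer that runs a check like this on a schedule can stop sending requests to a server whose web service has failed even though the machine itself still responds on the network.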

In addition, some third-party products offer what they call "Smart DNS." These packages are a step up from normal load balancing, which usually works only when the servers are on the same subnet. Products of this type (such as an F5 Big IP DNS controller) tie in at the DNS level, where the URL is resolved, and provide the ultimate availability since they can load balance across IP subnets. Not only can you lose a web server, but you can also lose an entire site or the connection to a site, and your users will still be able to connect to another server.

The disadvantages to using this type of product are that it must tie into your DNS and is generally very expensive if you are only going to use it for a set of web servers.

Advantages of Using Third-Party Load Balancing or Smart DNS

Provides health checks that native load balancing does not.
Transparent to end users.
The ultimate in cross-site redundancy.

Disadvantages of Using Third-Party Load Balancing or Smart DNS

Extremely expensive for a simple solution.
Must tie into your DNS solution.
Can be complicated to configure.

Option 3. Create a Web Server Cluster

Many web servers can be configured in a cluster, allowing one web server to take over if the other fails. Cluster failover is automatic, although the hardware and software needed to run a cluster can be expensive. (Web server clustering with Internet Information Services requires the Enterprise edition of Windows Server 2003, which is much more expensive than the Standard edition.)

Advantages of Building a Web Cluster for Redundancy

Fast, automatic failover.

Disadvantages of Building a Web Cluster for Redundancy

Specialized cluster hardware and software can be pricey.

Option 4. Manual Backup Address

Some people decide to configure two identical web servers and instruct their users to try the alternate address if the first is not available. This option is cheap and easy to implement, although it requires that your users remember a second address.

Advantages of Using a Manual Address for Redundancy

Inexpensive.

Disadvantages of Using a Manual Address for Redundancy

Requires user competence.

Terminal Server Redundancy

Two different strategies can be used to increase the redundancy and availability of your actual Terminal Servers:

Try to make each individual server's hardware as redundant as possible.
View each Terminal Server as "expendable." Build redundancy with extra servers.

Chapter 5 outlined strategies for the "cluster / silo" model of deploying Terminal Servers and detailed the advantages and disadvantages of building large or small servers. This section builds upon that discussion by addressing the design option of whether you should approach server redundancy with "quality" or "quantity."

The exact approach that you take will depend on your environment. How would you define "high availability" as it relates to your environment? Does it mean that users' sessions can never go down, or does it mean that they can go down as long as they are restored quickly?

Option 1. Build Redundancy with High Quality Servers

One approach to making Terminal Servers highly available is to increase the redundancy of the systems themselves. This option usually involves servers with redundant hardware including disks, power supplies, network cards, fans, and memory. (Today's newest servers have RAID-like configurations for redundant memory banks.)

Advantages of Building Servers with Redundant Hardware

By using redundant server hardware, you're assured that a simple hardware failure will not kick users off the system.

Disadvantages of Building Servers with Redundant Hardware

No economies of scale. Every server must contain its own redundant equipment.
Employing this strategy still doesn't mean that your servers are bullet-proof.
What happens if you lose a server even after all of your planning? Will you have the capacity to handle the load?

Option 2. Build Redundancy with a High Quantity of Servers

As outlined in Chapters xx and xx, you'll most likely need to build multiple identical servers to support all of your users and their applications regardless of your availability strategy. In most cases it's more efficient to purchase an extra server (for N+1 redundancy) than it is to worry about many different redundant components on each individual server.

Advantages of Building Extra Servers

Better economies of scale.
You'll have the capacity to handle user load shifts after a server failure.

Disadvantages of Building Extra Servers

If a simple failure takes down a server, all users on that server will need to reconnect to establish their RDP sessions on another server.
An extra server might cost more than simply adding a few redundant components as needed.

If you have applications that cannot go down (because users would lose work), you'll have to spend money buying redundant components for individual servers. However, if it's okay to lose a server as long as the user can instantly connect back to another server, you can use the "high quantity" approach. Even without redundant components, losing a server is a rare event. Users are always safer on a server than on their workstations since the configuration and security rights are configured properly on the server. Traditional environments don't have redundant components on every single desktop and are still widely accepted. It should be acceptable not to have redundant components on servers as long as users can connect back in as soon as a server fails.
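The N+1 sizing behind the "high quantity" approach can be sketched as a short calculation. The user count and per-server capacity below are hypothetical; the point is only to show how one spare server lets the farm absorb a single failure without running short of capacity.

```python
import math


def servers_needed(total_users, users_per_server, spares=1):
    """Servers required to carry the load, plus spares (N+1 when spares=1)."""
    base = math.ceil(total_users / users_per_server)
    return base + spares


def survives_failures(total_users, users_per_server, servers, failures=1):
    """True if the remaining servers can still hold every user."""
    return (servers - failures) * users_per_server >= total_users


# Hypothetical farm: 400 concurrent users, 50 users per server.
farm = servers_needed(400, 50)
print(f"servers to build: {farm}")
print(f"survives one failure: {survives_failures(400, 50, farm)}")
print(f"survives without the spare: {survives_failures(400, 50, farm - 1)}")
```

Users on the failed server still lose their sessions and must reconnect; the spare only guarantees that there is room for them when they do.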

User and Application Data

To correctly determine which actions you should take to ensure that your data is highly available, you must first classify your data. All data can be divided into two categories:

Unique data is crucial to your business and unique to your environment. This category includes user profiles, home drives, databases, and application data.
Non-unique data is anything that you can load off of a CD from a vendor, such as Windows Server 2003, SQL Server, and your applications.

As outlined in previous chapters, you must ensure that your Terminal Servers only contain non-unique data. Unique data should be stored elsewhere, such as on a SAN or NAS device, as shown in Figure 7.2.

Figure 7.2: Redundant servers with data on a SAN

In this environment, your data is protected if you lose one or more Terminal Servers. Your SAN should have the necessary redundancy built into it, such as RAID, multiple power supplies, multiple controller cards, and multiple interfaces to the servers. Instead of using a SAN, you can use a standard Windows 2000 Server file share driven by a Microsoft Cluster.

Advantages of Using a SAN or RAIS for Data Redundancy

Quick recovery in the event of a failure.

Disadvantages of Using a SAN or RAIS for Data Redundancy

Doesn't work in smaller environments.
Requires an "extra" server (for N+1 redundancy).
Since all your unique data is on a SAN or NAS, you'd better make sure that it's backed up.

Terminal Services License Service

As you know from Chapter 4, after your Terminal Server has been active for 120 days it must be able to contact a license server. If it can't, users' connections are refused. Make sure that you have redundant license servers since a failure there could render all of your Terminal Servers useless.

Configure two license servers, one as a "primary" and one as a backup. The primary server will contain all TS CALs for the site, and the secondary (backup) server will contain no licenses.

The primary server services all license requests since it has licenses that are installed and ready to hand out to the servers. The backup is there "just in case" and acts as a place to which you can restore the license server database if required.

If the primary license server fails, the secondary license server will provide temporary services until the primary is available again. There are three possible outcomes in this situation:

Clients that already have licenses will continue to connect since Terminal Server does not look to the licensing server unless it requires a license.
Clients with temporary licenses that are expiring are given a seven-day grace period to contact a licensing server with active licenses. You have a full week to get your primary license server running again.
Clients without any type of license will be issued a temporary license by the secondary licensing server. These licenses (as stated in Chapter 4) allow for full use of the system for 90 days.

In the event of a failure of your primary license server, you have a minimum of seven days to restore your primary license server and its license database or to restore your license database to the secondary server.
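The three outcomes above, and the reason the recovery window is seven days, can be summarized in a small model. This is only a restatement of the rules described in this section; the names are invented for illustration and it does not call any real Terminal Services licensing interface.

```python
from enum import Enum


class ClientLicense(Enum):
    """License state of a client at the moment the primary server fails."""
    LICENSED = "already holds a license"
    TEMPORARY_EXPIRING = "holds an expiring temporary license"
    NONE = "holds no license"


GRACE_DAYS = 7        # grace period for expiring temporary licenses
TEMPORARY_DAYS = 90   # lifetime of a newly issued temporary license


def days_before_impact(state):
    """Days the client keeps working with the primary down.

    Returns None when this section states no time limit for that state.
    """
    if state is ClientLicense.LICENSED:
        return None            # no license server contact is needed
    if state is ClientLicense.TEMPORARY_EXPIRING:
        return GRACE_DAYS      # must reach a server with active licenses
    return TEMPORARY_DAYS      # secondary issues a temporary license


# The recovery window is set by the most urgent client state.
limits = [days_before_impact(s) for s in ClientLicense]
window = min(d for d in limits if d is not None)
print(f"restore the primary within {window} days")
```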

When designing your redundant license server solution, resist the urge to install licenses on multiple license servers (unless you have business reasons for doing so as outlined in Chapter 4). The scenario laid out in the previous paragraphs represents a tried and true method.