Let's look now at the actual mechanisms that can be used to load balance your Servers. There are three different ways this can be achieved:
Windows Network Load Balancing
Third-party hardware load balancing routers
Third-party software load balancing products
Microsoft Windows Network Load Balancing ("NLB") is the "free" out-of-the-box software load balancing solution available for Windows 2003-based Terminal Servers. NLB is available with all editions of Windows Server 2003, although your Terminal Servers must be running at least the Enterprise edition of Windows to use the Session Directory.
Network Load Balancing works by assigning a single virtual IP address to a group of servers. You then assign a DNS name to that virtual IP address. RDP clients connect to this DNS name, and the system automatically connects each user to the least-busy server.
Under the hood, Network Load Balancing enables all of the configured nodes on a single subnet to detect incoming network traffic for the cluster's virtual IP address. (When using Windows NLB, all servers must be on the same subnet.) On each Terminal Server in the cluster, the Network Load Balancing driver acts as a layer between the network adapter driver and the TCP/IP stack, passing up to the host only the portion of the incoming traffic that host should handle.
Windows Network Load Balancing works at the network level by distributing incoming client requests among the hosts. Windows NLB is limited to a maximum of 32 hosts in any one cluster.
Also, as its name implies, Windows Network Load Balancing can only determine which server is least busy based on network load. If a server has failed but is still responding on the network, the NLB system will continue to send users to it.
It's the "free" solution that's built into Windows.
Load calculations are based only on network load.
You can't natively load-balance more than 32 servers.
All servers must be located on the same subnet.
What if you need to load balance more than 32 Terminal Servers?
One major limitation of Windows Network Load Balancing is that you can only use it to load balance 32 servers. If you need more than 32 servers in your cluster, you must implement one of the following options:
Move to a third-party hardware or software load balancing solution as described later in this chapter.
Combine multiple groups of NLB clusters with round robin DNS servers.
Let's take a closer look at this second option. In this case, your DNS servers should be configured with a round-robin entry containing each cluster's virtual IP address, so that clients connect to the clusters in equal proportion. Make sure that each cluster has the same number of servers, or adjust your round robin ratio accordingly.
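To see why the ratio of DNS entries matters, here's a minimal Python sketch of round-robin resolution across two cluster virtual IPs. The addresses and cluster sizes are hypothetical; a real DNS server rotates through its A records in much the same way:

```python
from itertools import cycle

# Hypothetical virtual IPs for two NLB clusters (example addresses).
# Cluster A (10.0.1.100) has twice as many servers as cluster B
# (10.0.2.100), so it gets two DNS entries for every one of B's
# to keep the per-server load roughly even.
dns_entries = ["10.0.1.100", "10.0.1.100", "10.0.2.100"]

rotation = cycle(dns_entries)

def resolve(name):
    """Return the next address in the rotation, imitating how a
    DNS server cycles through multiple A records for one name."""
    return next(rotation)

# Six lookups: cluster A is handed out twice as often as cluster B.
answers = [resolve("clus01.example.com") for _ in range(6)]
print(answers)  # cluster A appears twice as often as cluster B
```

Note that this rotation is blind: the resolver keeps handing out an address whether or not the cluster behind it is reachable, which is exactly the weakness discussed next.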
At this point you may be thinking that a DNS round robin solution could suffice for simple load balancing. Before you go down that path, remember that there are reasons why it's called DNS round robin and not DNS load balancing.
If a server in an NLB cluster fails, the failure will be detected by the other servers (through the cluster's heartbeat packets), and new RDP connections will be distributed only among the remaining Terminal Servers. However, a DNS round robin scheme will continue to send connections to the failed server until a change is manually made to the DNS entry.
This book is not meant to be an exhaustive study of Windows Network Load Balancing. However, we'll cover some of the Terminal Server-specific items that you probably won't find in other papers covering NLB.
There are only a few requirements that all servers must meet to use Windows NLB:
Have at least one network interface configured for Load Balancing.
Be on the same subnet.
Share a common (virtual) IP address.
In an ideal world, each of your Terminal Servers within the cluster would have two network cards. The first would be used for the "front-end" RDP traffic between clients and server. The second would be used for "back-end" services and data access.
All versions of Windows Server 2003 come with Network Load Balancing installed. To use it, all you have to do is enable it on the network card that you intend to use for RDP connections (Control Panel | Network Connections | Right-click on your network card | Properties | Check the box next to the "Network Load Balancing" option).
Once you enable NLB, you must configure it (Network adapter properties | Highlight "Network Load Balancing" | Click the "Properties" button). There are several configuration options to understand when using NLB in a Terminal Server environment.
The Properties button leads you to a window with three tabs—Cluster Parameters, Host Parameters, and Port Rules.
On the Cluster Parameters tab, you'll first enter the virtual IP address, subnet mask, and DNS name that your cluster will use. These should be the same on all Terminal Servers in the cluster.
Then you'll select a cluster operation mode. Windows NLB has the ability to work in two different modes: "unicast" and "multicast."
Regardless of the mode you choose, NLB creates a new virtual MAC address and assigns it to the network card that has NLB enabled; all hosts in the cluster share this virtual MAC. All incoming packets are then received by every server in the cluster, and each server's NLB driver is responsible for filtering out the packets that are not meant for that server.
When in unicast mode, NLB replaces the network card's original MAC address. When in multicast mode, NLB adds the new virtual MAC to the network card, but also keeps the card's original MAC address.
Both unicast and multicast modes have benefits and drawbacks. One benefit of unicast mode is that it works out of the box with all routers and switches (since each network card only has one MAC address). The disadvantage is that since all hosts in the cluster have the same MAC and IP address, they cannot communicate with each other via their NLB network cards. A second network card is required for communication between the servers.
Multicast mode does not have this problem, since the servers can communicate with each other via the original addresses of their NLB network cards. However, the fact that each server's NLB network card operating in multicast mode has two MAC addresses (the original one and the virtual one for the cluster) causes problems of its own. Most routers reject the ARP replies sent by hosts in the cluster, because each reply maps a unicast IP address to a multicast MAC address. The router considers this invalid and rejects the update to its ARP table. In this case you'll need to manually configure the ARP entries on the router. (Don't worry if you're lost at this point. Just be aware that if you're using multicast mode, you'll need to get one of your network infrastructure people involved.)
The bottom line is that you don't want to use unicast in a Terminal Server environment unless you have two network cards. (That way, you can still connect to a specific Terminal Server if you need to via another adapter and another IP address.) If your servers have only a single network card, then you'll want to use the multicast mode.
The "Host Priority" is a unique number assigned to each server in the cluster. This number (an integer) identifies the node in the cluster and determines the default order in which traffic is delivered to the servers. Priorities run from lowest to highest, with the lowest-numbered host handling all traffic not otherwise covered by the load balancing port rules.
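The default-host behavior can be sketched in a few lines of Python. The host names and priority numbers here are made up for illustration:

```python
# Hypothetical host-priority table (host name -> priority number).
priorities = {"TS01": 1, "TS02": 2, "TS03": 3}

def default_host(prios):
    """Traffic that doesn't match any port rule is handled by the
    host with the lowest priority number in the cluster."""
    return min(prios, key=prios.get)

print(default_host(priorities))  # TS01
```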
The Port Rules tab allows you to configure how load-balancing works within the cluster. By default, a rule is created that equally balances all TCP/IP traffic across all servers. To use NLB for a Terminal Server cluster, you'll need to change some settings.
First add a new rule (Port Rules tab | Add button) that will specify how RDP traffic is to be load-balanced. Configure the port range as 3389 to 3389 to ensure that this new rule only applies to RDP traffic. Select the "TCP" option in the protocols area and "Multiple Host" as your filtering mode.
The "Affinity" setting determines whether a specific client's requests will continue to be routed to the same server (such as the first server it connected to) based on the client's IP address. If you're using the Session Directory, no affinity is required, and you can set this to "None." If you are not using the Session Directory, set this rule to "Single" affinity so that a client will always be serviced by the same server and users can reconnect to their disconnected sessions.
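Here's a simplified Python model of how affinity changes which server services a client. This is only an illustration of the idea, not Microsoft's actual algorithm: every node runs the same deterministic hash over the incoming connection, and only the "owner" of the resulting bucket accepts it. With "Single" affinity only the client IP is hashed, so repeat connections from one client always land on the same server:

```python
import zlib

hosts = ["TS01", "TS02", "TS03"]  # hypothetical cluster nodes

def pick_host(client_ip, client_port, affinity):
    """Every node computes this same hash independently, so they all
    agree on who accepts the connection without any per-packet
    coordination. With 'single' affinity only the IP is hashed;
    with no affinity the source port is included, so separate
    connections from one client may land on different servers."""
    key = client_ip if affinity == "single" else f"{client_ip}:{client_port}"
    return hosts[zlib.crc32(key.encode()) % len(hosts)]

# With single affinity, two connections from the same client
# (different source ports) are serviced by the same server.
a = pick_host("192.168.5.20", 50001, "single")
b = pick_host("192.168.5.20", 50002, "single")
print(a == b)  # True
```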
Finally, the "Load weight" setting determines the share of users/load this server should handle. The cluster algorithm divides the server's load weight by the total of all the servers' weights to calculate a load index for each server, allowing you to route more connections to larger servers.
A simple example is a two-server cluster, the first server having a quad processor configuration and the second having a dual processor configuration. Through load testing, you have determined that the quad can handle exactly twice the number of users as the dual. The dual can be configured with a load weight of 50 and the quad with a load weight of 100. In this configuration, the quad would receive twice as much traffic as the dual. The default load weight setting is "Equal," which assumes all servers in the cluster can handle an equal amount of load.
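The load-index arithmetic from this example can be sketched as follows (the server names are just labels):

```python
# Hypothetical load weights from the two-server example:
# the quad-processor server handles twice the load of the dual.
load_weights = {"quad": 100, "dual": 50}

def traffic_share(weights):
    """Divide each server's load weight by the total of all weights
    to get the fraction of new connections it should receive."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

shares = traffic_share(load_weights)
print(shares)  # quad gets 2/3 of new connections, dual gets 1/3
```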
As we discussed earlier, NLB clustering is extremely complex. Nevertheless, you should be able to create a basic configuration for lab testing fairly simply. The following settings will work for almost every environment and allow you to easily configure RDP load balancing:
Cluster Parameters Tab
Cluster IP Address: Common IP shared between all servers
DNS name of cluster: Shared DNS name (should resolve to the common cluster IP)
Host Parameters Tab
Priority (unique host identifier): Start at 1 and work up as you add servers. Each must be unique
Dedicated IP address: IP address of the NIC that will accept load-balanced requests
Subnet mask: Subnet mask of the NIC configured for load balancing
Port Rules Tab
Cluster IP Address: If only using one, leave the default at "All"
Port range: 3389 to 3389
Protocols: The default of "Both" will work; so will "TCP"
Filtering mode: Multiple Hosts, with Affinity set to "None" (if you're not using the Session Directory, you can set this to "Single")
Leave the remaining settings at their default values. (You can also use these settings for load balancing your web servers. Just change the port rule from 3389 to 80.)
Once your cluster is up and running:
Check that each server's dedicated IP address is unique, and that the cluster IP address is identical on every server in the cluster.
Verify that any load-balanced applications are installed and configured on all cluster servers. Remember that Windows NLB is not aware of higher-level applications and does not start or stop applications or services on each server.
Make sure that the dedicated IP address is always listed first (before the cluster IP address) in the Internet Protocol (TCP/IP) Properties dialog box, so that responses to connections originating from a host return to that same host.
Make sure that both the dedicated IP address and the cluster IP address are static IP addresses. They cannot be DHCP addresses.
Do not enable Network Load Balancing on a computer that is part of a "real" Microsoft cluster services cluster. Microsoft does not support this configuration.
Even though it's "free," Network Load Balancing has some weaknesses. In addition to the disadvantages listed previously, many people want their load-balancing tools to check the health of individual servers or to build load indexes based on CPU utilization or the number of active sessions.
For this functionality, you'll need to turn to third-party tools. There are hardware- and software-based solutions for load balancing Windows 2003 Terminal Servers.
One alternative to Windows Network Load Balancing is to use a hardware load balancing device. This is a piece of hardware that sits between your Terminal Servers and your users and is able to intelligently route users to the least-busy Terminal Server.
Often referred to as "layer 7 switches" or "layer 7 load balancers" (due to the layer of the OSI model in which they operate), examples include Cisco's LocalDirector, F5 Networks' BIG-IP, Nortel's Alteon, and Foundry's ServerIron.
RDP traffic is sent to the switch using a single IP address (a virtual IP for the cluster). The switch then load balances the traffic between the Terminal Servers based on the algorithms programmed into the device. Additionally, the manufacturers will often use a heartbeat ping against the servers in the cluster to make sure they're still available.
Some of the more advanced hardware load balancers come with software agents that can be installed on the Terminal Servers. These agents report even more information about each server's current load back to the load balancer. F5, for example, offers an agent that performs health monitoring of the servers in the cluster, including CPU, memory, and disk utilization, helping to ensure the most efficient load balancing of RDP traffic. NLB doesn't look at any of these metrics.
Using a hardware load balancer is ultimately more stable and scalable than using NLB or round robin DNS. These devices are expensive, however (some are in the $30,000 to $35,000 range), and you'll need multiple units to avoid making the load balancer a single point of failure for your Terminal Server cluster.
Load balancing calculations are based on several server metrics, including CPU usage and the number of current user sessions.
As "closed" systems they are extremely reliable.
They are expensive.
You'll need more than one to achieve true redundancy
When dealing with a load balancing device, it's important to understand how that device works on the network since it affects how your users connect to the Terminal Server and how session reconnection with the Session Directory service works.
Figure 7.4 (next page) outlines the process that takes place when a user connects to a Terminal Server through a hardware load balancer.
Figure 7.4: The user connection process through a hardware load balancer
A client connects to a cluster via the cluster's DNS name, in this case "clus01."
The load balancer routes the client to the least busy Terminal Server within the cluster—TS01.
The client logs onto the server.
TS01 authenticates the user and then queries the Session Directory to see if that user has a disconnected session on any other server in the cluster.
In this case, the Session Directory indicates that the user has a disconnected session on TS05.
TS01 sends its authentication information back to the client in an encrypted format. It also sends back a load balance packet containing the IP address of TS05.
The client uses this information to seamlessly connect to TS05.
TS05 then reconnects the user to their disconnected session.
As you'll learn in Chapter 12, sometimes Terminal Servers are on a private network behind a firewall. (A firewall might use NAT with your Terminal Servers all on the private 10.x subnet.) In these cases, you'll need to make special provisions to use the Session Directory.
The IP address of the Terminal Server that's recorded in the Session Directory is not valid to the RDP client device. To address this issue, configure your Terminal Servers so that they pass a routing token back to the client device instead of the actual server's IP address. In turn, the RDP client presents this token to the load balancer, and the load balancer deciphers it and routes the user to the proper server. This process is outlined in Figure 7.5 (facing page).
Figure 7.5: Load balancing in NAT environments
The user has an existing disconnected session on TS03.
When the user reconnects, the load balancer decides that TS01 has the least load, and the user is routed to it.
Since TS01 is configured to use a Session Directory, it queries the Session Directory once the user authenticates and discovers that the user has a disconnected session on TS03.
TS01 passes the new server information to the client. The IP address is the same (the IP address of the load balancer), but it also embeds a routing token into the package for the client.
Internally, the load balancer associates this routing token with TS03.
The client reconnects to the cluster's virtual IP address, but this time it also provides the routing token.
The load balancer notices that the client has presented a routing token. Therefore, instead of sending the user to the least-busy server, it routes the user to the server it has associated with that particular routing token—TS03 in this case.
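The token table a load balancer might keep internally can be sketched like this. This is a conceptual model only; real devices implement this in their own proprietary ways, and the server names and functions here are hypothetical:

```python
import secrets

# The load balancer's internal mapping of routing tokens to servers
# (a hypothetical structure for illustration).
token_table = {}

def issue_token(server):
    """When a server reports that the user's disconnected session
    lives elsewhere, the load balancer associates an opaque token
    with that server and hands the token to the client."""
    token = secrets.token_hex(8)
    token_table[token] = server
    return token

def route(client_token=None):
    """A reconnecting client that presents a valid token is sent to
    the associated server; everyone else goes to the least-busy one."""
    if client_token in token_table:
        return token_table[client_token]
    return "least-busy server"

token = issue_token("TS03")  # user's disconnected session is on TS03
print(route(token))          # TS03
print(route())               # least-busy server
```

The key point is that the token is opaque to the client: it never learns the server's private IP address, which is exactly what makes this scheme work behind NAT.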
To configure your Terminal Servers to use a routing token with a hardware load balancer, uncheck the "IP Address Redirection (uncheck for token redirection)" box in the properties of a server's Session Directory configuration (Terminal Services Configuration MMC snap-in | Server Settings | Session Directory).
Hardware load balancing, while a huge improvement over Windows NLB, has some significant drawbacks. Primarily, the hardware load balancers are very expensive. If your company is lucky enough to have existing hardware load balancers then you can simply "add" your Terminal Servers to them (if you have enough spare ports, that is).
Alternately, many people choose to use third-party software products to add sophisticated load balancing capabilities to Terminal Server. Among these products are:
Citrix MetaFrame Presentation Server
DAT Panther Server
Tarantella New Moon Canaveral iQ
Terminal-Services.NET WTSportal Pro
All of these software products improve upon Terminal Server's "out of the box" functionality. The prices vary drastically. Some of these products cost almost $400 per user, but they add functionality across the board. Other products add only load-balancing features to Terminal Server, for around $100 per server (with unlimited users).
Since many of these products are useful in many different ways (in addition to load balancing), they are covered fully (and are compared to each other) in the Appendix.