Network Load Balancing Enhancements


Let’s conclude this chapter with a brief look at enhancements to Network Load Balancing (NLB) in Windows Server 2008. This particular list of new features and enhancements is shorter than others in this chapter.

First, although the overall architecture and functionality of NLB remain the same as far as deploying and managing this feature are concerned, the picture under the hood is quite different: the NLB driver has essentially been rewritten to conform to the new NDIS 6.0 filter driver model used in Windows Server 2008. As shown in Figure 9-4, the NLB driver is a kernel-mode driver that runs on each server in an NLB cluster, and this is essentially the same as in previous versions of Windows Server.

Figure 9-4: How NLB works

The biggest reason for rewriting the NLB driver from scratch was to make it an NDIS 6.0 lightweight filter module. The result is a cleaner, lighter, and faster driver than the NDIS 5.1 intermediate driver that NLB used in Windows Server 2003.

One of the most valued improvements made in Windows Server 2008 is full IPv6 support for NLB servers. In other words, IPv6 nodes can now join an NLB cluster, and IPv6 traffic can be load-balanced between nodes. There is also support now for multiple dedicated IP addresses (DIPs). This means that NLB clusters can now have multiple IPv6 DIPs, in addition to the support for multiple virtual IP addresses (VIPs) that existed in previous versions.

Another helpful improvement is consolidated management using Network Load Balancing Manager: you no longer need to work with the network configuration user interface on every single node of the cluster. This welcome change should reduce NLB configuration problems. NLB Manager is also more reliable because of WMI enhancements that enable automatic recovery of the repository when it becomes corrupted or is accidentally deleted.

Other NLB enhancements include the following:

  • Improved DoS attack protection for interested apps. Using a public callback interface, NLB can notify applications of SYN attacks so that steps can be taken to remediate the problem.

  • Support for a rolling upgrade of NLB clusters from Windows Server 2003 to Windows Server 2008.

  • Support for unattended installation of NLB.

  • Support for NLB in Server Core.

Let’s end this chapter with a couple of insights from experts at Microsoft regarding new features and enhancements to NLB in Windows Server 2008. First let’s learn how you can use the public WMI provider to add health monitoring and dynamic load balancing to applications running on your NLB cluster:

From the Experts: Add Health Monitoring to Your NLB App!

The Network Load Balancing (NLB) service does not monitor the health of your application. Instead, it allows the application developer to determine how healthy a load-balanced application is. Since each application has its own notion of load and health, measuring and monitoring these quantities is best achieved by the application itself. By using collected measurements from your application and NLB’s public WMI provider, it is a relatively simple task to add load and health monitoring to your load-balanced application.

If your application has a service that runs on each node of the NLB cluster, or a service that runs on a single (master) node that can communicate with the other nodes in the cluster, this service can double as a monitoring service that periodically queries each node for performance data and application-specific load and health information. Queries for performance data can be made locally or remotely using WMI. For example, you can query a particular node for its CPU load or the number of active TCP connections (the latter can also be determined by running the nlb params command locally and parsing the output). Queries for application-specific data can be made locally or remotely using the application's protocol. For example, you can send a request to a particular node targeted at the port the application is listening on and measure the amount of time it takes to get a response. Even if your load-balanced application does not have its own service to issue these queries from (this is generally true of Web sites that run on Microsoft Internet Information Services (IIS) or some other Web server), you can still gather load and health data by writing a script that periodically issues queries to each node. A VBScript script running in a loop on one node, for example, can issue WMI or application-specific queries to every other node in the cluster. The ultimate goal is to gather enough data to determine how healthy and loaded each instance of the application is.
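As a rough illustration of the application-specific query described above, the following Python sketch times how long a node takes to accept a TCP connection on the application's port. It is a simplification: connection time stands in for a full request/response exchange in the application's own protocol, and the function name, timeout, host, and port are assumptions chosen for the example.

```python
import socket
import time


def measure_connect_time(host, port, timeout=2.0):
    """Return the seconds taken to open a TCP connection, or None on failure.

    Hypothetical helper: a refused connection, an unreachable host, or a
    timeout is reported as None so the caller can treat the node as
    unresponsive on that port.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the address and performs the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refusal, unreachable hosts, and timeouts.
        return None
```

A monitoring loop could call this helper for every node at a fixed interval and record the results, for example `measure_connect_time("node1.example.com", 80)`, where the host name is a placeholder.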

Once you have gathered all the appropriate load and health metrics from each node, you need to act on this information. If you find that a given node has become unresponsive, either because the application instance is experiencing problems or because the machine itself has died, you may want to remove this node from the NLB cluster. You can do this by executing the DrainStop or Stop method on the instance of the MicrosoftNLB_Node class running on that node (refer to MSDN documentation of the MicrosoftNLB_Node class). Keep in mind that these operations will affect all traffic being handled by the node and will eventually remove it from the cluster. If the problem is confined to a particular port or virtual IP address-port combination, you can use the Drain/DrainEx or Disable/DisableEx methods to drain or disable the affected port rule instead. Once the problem goes away or the machine has been recovered, you can use the Enable/EnableEx methods to resume traffic handling on a per-port rule basis, or the Start method to restart cluster operations on a previously stopped node. Congratulations! You have added a simple but effective health monitoring scheme to your load-balanced application.
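The decision logic in the preceding paragraph might be sketched as follows. This is only an outline under stated assumptions: the NodeStatus type and plan_actions function are hypothetical, the method names (DrainStop, Start, Drain, Enable) are taken from the text, and actually invoking them through the NLB WMI provider is left out. Choosing DrainStop over Stop, and Drain over Disable, is an arbitrary choice made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeStatus:
    """Hypothetical summary of what the monitor knows about one node."""
    responsive: bool                                   # did the node answer its probes?
    stopped: bool = False                              # did the monitor stop it earlier?
    failing_ports: set = field(default_factory=set)    # port rules failing right now
    drained_ports: set = field(default_factory=set)    # port rules drained earlier


def plan_actions(status):
    """Return (method_name, port) pairs; port is None for node-wide methods."""
    if not status.responsive:
        # Node-wide problem: take the node out of the cluster once.
        return [] if status.stopped else [("DrainStop", None)]
    if status.stopped:
        # The node has recovered: resume cluster operations.
        return [("Start", None)]
    # Problems confined to individual port rules.
    actions = [("Drain", port)
               for port in sorted(status.failing_ports - status.drained_ports)]
    actions += [("Enable", port)
                for port in sorted(status.drained_ports - status.failing_ports)]
    return actions
```

Keeping the decision separate from the WMI calls makes the policy easy to test on its own; the monitoring service would translate each returned pair into the corresponding method call on the affected node.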

It may not always be the case that you want to drain or disable all traffic associated with a port rule. For example, you may find that a given application instance is responsive but severely overloaded, in which case the best course of action might be to temporarily reduce the amount of load it is configured to handle, and restore this amount only after things have subsided. You can achieve this by adjusting the LoadWeight property of the MicrosoftNLB_PortRuleEx class running on that node (refer to MSDN documentation of the MicrosoftNLB_PortRuleEx class). By changing this quantity, you can decrease or increase the amount of future traffic handled by the node on that port rule. Congratulations! You have added a simple but effective dynamic load balancing scheme to your load-balanced application.
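One way to picture this adjustment is a small policy function that nudges the weight down when a node is overloaded and back up once the load subsides. The sketch below is illustrative only: the function name, the CPU thresholds, the step size, and the 10 to 100 bounds are all assumptions, and writing the resulting value to the LoadWeight property through WMI is not shown.

```python
def adjust_load_weight(current, cpu_percent, high=85.0, low=50.0,
                       step=10, floor=10, ceiling=100):
    """Return a new load weight for a port rule based on measured CPU load.

    Hypothetical policy: thresholds, step, and bounds are example values,
    not settings defined by NLB.
    """
    if cpu_percent >= high:
        # Overloaded: hand the node a smaller share of future traffic.
        return max(floor, current - step)
    if cpu_percent <= low:
        # Load has subsided: gradually restore the node's share.
        return min(ceiling, current + step)
    # Between the thresholds: leave the weight alone to avoid oscillation.
    return current
```

The gap between the two thresholds is deliberate: without it, a node hovering near a single cutoff would have its weight changed on every polling cycle.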

By monitoring the health of your application across the cluster, and making appropriate adjustments to the load handled by each node, you will increase the overall responsiveness, reliability, and performance of your load-balanced application, all in a way that makes sense to your app.

–Siddhartha Sen

Software Design Engineer, Clustering & High Availability Group, Windows Server


And last but not least, here are some helpful troubleshooting tips when you have Network Load Balancing deployed in your environment:

From the Experts: Tips on Troubleshooting NLB Issues

If you see that some of your clients are not getting serviced by NLB hosts, you can take the following steps to isolate the issue:

  1. The first thing to check is whether the application running on top of all hosts in a cluster is behaving as expected. When an application running on top of a host dies, NLB doesn't automatically move the traffic to a different host in the cluster. The trick to narrowing down the problem is to first see whether the issue occurs with a one-node NLB cluster (stop all other hosts and then the one being tested). If you can isolate the host, try to reproduce the problem without NLB bound.

  2. Next, start Network Load Balancing Manager from a client or host that has access to all the hosts in the cluster. If Network Load Balancing Manager gives you any errors, try to fix them. Most of the time, the errors shown by Network Load Balancing Manager can be fixed by reapplying the last known configuration on the host you connect to. This can be done by right-clicking the cluster name in Network Load Balancing Manager, selecting cluster properties, and clicking OK.

  3. Next, verify that all your port rules are correct. To do this, right-click the cluster, select cluster properties, and take a look at the Port Rules tab. Rules are often defined incorrectly, so make sure you read the description of how the various port rules behave, and be sure you understand the difference between single affinity, no affinity, disabled rules, rules with different weight, default host rules, and so on.

  4. The next step in troubleshooting would be to check whether the information shown by Network Load Balancing Manager is consistent with the output of command-line utilities like the nlb params and nlb display commands.

  5. The next step in triaging would be to make sure each host in the cluster is seeing all the incoming traffic. This can be done by sending ICMP ping commands to the cluster from a few clients. If ping works, then also make sure you can connect to other services (RPC, WMI, and so on) on each host. This can be done by starting Network Monitor on each host. Network Monitor can be downloaded from http://www.microsoft.com/downloads/details.aspx?FamilyID=&displaylang=en. You should see client traffic received on each host. In your network capture you should also see NLB heartbeats (an Ethernet broadcast packet with the bytes 0x886f after the source address in the Ethernet frame) being exchanged among the hosts. If traffic is being handled by only one host, make sure that your switch has not learned the MAC address of the cluster.

    –Amit Date

    Software Design Engineer in Test, Clustering & High Availability Group, Windows Server
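If you export raw frames from a capture, the heartbeat check mentioned in step 5 can be approximated in a few lines of Python. This sketch assumes untagged Ethernet II frames (no 802.1Q VLAN tag), so the two bytes that follow the 6-byte destination and 6-byte source addresses sit at offsets 12 and 13. The function and constant names are hypothetical.

```python
NLB_HEARTBEAT_ETHERTYPE = b"\x88\x6f"   # the bytes 0x886f named in step 5
BROADCAST_MAC = b"\xff" * 6             # Ethernet broadcast destination


def is_nlb_heartbeat(frame):
    """True if the two bytes after the source address are 0x886f.

    Assumes an untagged frame: destination (6 bytes), source (6 bytes),
    then the two-byte type field.
    """
    return len(frame) >= 14 and frame[12:14] == NLB_HEARTBEAT_ETHERTYPE


def is_broadcast(frame):
    """True if the frame is addressed to the Ethernet broadcast address."""
    return len(frame) >= 6 and frame[:6] == BROADCAST_MAC
```

Counting the frames for which is_nlb_heartbeat returns True, grouped by source address, gives a quick indication of whether every host is sending heartbeats and whether each host can see the others.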





Microsoft Windows Server Team - Introducing Windows Server 2008
ISBN: 0735624216
Year: 2007
Pages: 138
