Server Health Checking | Optimizing Network Performance with Content Switching: Server, Firewall and Cache Load Balancing

We saw in the previous section that distribution metrics are a very important factor in a content switch deployment; now, let's consider another: determining the health status of the associated real servers or network resources. The distribution metric you choose determines how a particular server or resource is selected, but it is equally important to decide on a set of criteria that show the server to be "healthy." Before we look at some of the commonly available health checking mechanisms, we should first consider what it means for a server to be healthy.

In the context of an HTTP server, for example, we might consider it sufficient that we can open a TCP connection to the server. This view, however, gives us little indication of the server's ability to actually serve the content that the user requires. In considering this, we're introducing the concept of the layer at which we health check. Sending an ICMP Echo Request, commonly known as a "ping," to the interface of the device proves nothing more than that the IP stack on that interface is operating and that there is a valid IP route between the source and the server. By moving our health check intelligence up the seven-layer model to the Transport and Application layers, we're able to associate the health check with the very thing that the user is interested in: content.

Most modern content switches are able to offer health checking at Layers 2 through 7 and use techniques such as scripting to extend this even further. The type of health check that you choose to deploy will depend largely on the nature of the application and how it is deployed. For many implementations, Layer 4 health checks, such as opening and closing TCP sockets, are sufficient. For more complex deployments where the application is potentially spread across multiple server resources, the use of Layer 7 health checks and beyond might be required. Let's look at some of the most common approaches.

Link-Based Health Checks

The most basic form of health check works at Layer 2 and checks the link status on the port through which the real server can be reached. This type of mechanism offers no insight into the IP layer or above on the real server, but it might be useful for devices such as intrusion detection systems (IDSs), which operate in stealth mode and do not run an IP stack on their network interfaces.

ARP Health Checks

Still working at Layer 2, an ARP health check causes the content switch to send an ARP request for the real server's IP address, or for that of the next-hop router if the server is off subnet. Again, the ARP health check offers little insight into the correct operation of the real server from an application point of view. ARP health checks are useful for devices such as firewalls that will typically block all non-essential forms of IP traffic. Even firewalls with the strictest of enforcement policies will accept ARP requests for their interfaces. An ARP health check does not extend further than the local subnet, so if the real servers are located another routed hop away, the health check will only extend to the router interface leading to the real server.

ICMP Health Checks

Ping is a commonly used network diagnostic and troubleshooting utility, and an ICMP health check works in the same way. ICMP Echo Request packets are sent from the content switch to the real server and, if the IP stack is operating correctly, the real server will send an Echo Reply to indicate that the path is up. ICMP health checks offer the first real indication that the path between the content switch and the real server is operational from an IP perspective, but they still do not prove that a particular application will be available on that server. ICMP health checks are useful for health checking IP devices and paths where there is no concept of an application endpoint to test.
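The Echo Request and Echo Reply exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular content switch implements it: the function names, the payload, and the timeout are hypothetical, and opening a raw ICMP socket normally requires administrative privileges.

```python
import os
import socket
import struct
import time


def internet_checksum(data):
    """One's-complement sum over 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident, seq, payload=b"health-check"):
    """Build an ICMP Echo Request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


def icmp_health_check(host, timeout=2.0):
    """Return True if host answers one Echo Request (IPv4 only).

    Assumption: the caller is allowed to open a raw socket; without that
    privilege the socket() call raises PermissionError.
    """
    ident = os.getpid() & 0xFFFF
    with socket.socket(socket.AF_INET, socket.SOCK_RAW,
                       socket.IPPROTO_ICMP) as sock:
        try:
            sock.sendto(build_echo_request(ident, 1), (host, 0))
            deadline = time.monotonic() + timeout
            while True:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                sock.settimeout(remaining)
                packet, _addr = sock.recvfrom(1024)
                # Raw sockets deliver the IP header too; skip over it.
                ihl = (packet[0] & 0x0F) * 4
                if len(packet) < ihl + 8:
                    continue
                icmp_type, _code, _csum, r_id, r_seq = struct.unpack(
                    "!BBHHH", packet[ihl:ihl + 8])
                # Type 0 is Echo Reply; match it to our own request.
                if icmp_type == 0 and r_id == ident and r_seq == 1:
                    return True
        except OSError:
            # Timeout or unreachable network: treat the path as down.
            return False
```

A True result here says only that the IP path is up, which is exactly the limitation noted above.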

TCP Health Checks

With TCP health checking, we're able to begin to determine the health of not only the path to the real server, but also the applications listening there. A typical TCP health check initiates a TCP three-way handshake between the content switch and the real server to prove that the application is listening on the assigned port. The content switch will then either perform a graceful TCP FIN and FIN-ACK teardown, or simply issue a TCP reset (RST) to end the connection.

TCP health checks presuppose one thing: that the application will close the TCP listening port if it is terminated. It is possible that an application that is not terminated gracefully, or has "crashed," will leave an erroneous TCP listener on its application port. In this instance, a TCP-based health check will be unable to determine that the application itself has failed, and the content switch will probably continue to send user requests to that server.
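As a rough illustration of the handshake-then-teardown check described above, the following Python sketch opens a connection and then closes it either gracefully or with a reset. The function name and parameters are hypothetical, and the SO_LINGER packing shown assumes a Unix-like system.

```python
import socket
import struct


def tcp_health_check(host, port, timeout=2.0, reset=False):
    """Return True if a TCP three-way handshake to host:port completes."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or timed out: treat the server as down.
        return False
    try:
        if reset:
            # SO_LINGER enabled with a zero timeout makes close() send
            # a RST instead of the normal FIN / FIN-ACK exchange.
            # Assumption: the struct layout is two ints, as on Linux.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                            struct.pack("ii", 1, 0))
    finally:
        sock.close()
    return True
```

For example, `tcp_health_check("192.0.2.10", 80)` would test a Web server on a documentation-range address chosen here purely for illustration. Because the operating system completes the handshake on behalf of the listening socket, this check can succeed even when the application never accepts the connection, which is the weakness just described.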

Application Health Checks

Many content switches now offer pre-written application health checks for widely used Internet applications such as HTTP, SMTP, FTP, POP3, and DNS, although this is by no means an exhaustive list. Application health checks work by allowing the administrator to define a piece of application data that is used to interact with the application on the server. In HTTP terms, for example, this might be the URL of a test Web page that should be retrieved. In this instance, the content switch will open a TCP connection to the real server on port 80 (or another defined port) as described earlier and send an HTTP GET request for the configured URL. In terms of FTP, this might be a username and password combination used to log on to the FTP server and perform a directory listing.
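The HTTP case can be sketched with Python's standard http.client module. This is a minimal example rather than a vendor implementation: the function name, the default path /health.html, and the optional body match are all assumptions made for illustration.

```python
import http.client


def http_health_check(host, path="/health.html", port=80,
                      expect_status=200, expect_body=None, timeout=2.0):
    """GET the configured URL and judge health from the response.

    expect_body, if given, is a bytes value that must appear somewhere
    in the response body for the server to be considered healthy.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()
    except (OSError, http.client.HTTPException):
        # No connection, timeout, or a malformed reply: unhealthy.
        return False
    finally:
        conn.close()
    if resp.status != expect_status:
        return False
    if expect_body is not None and expect_body not in body:
        return False
    return True
```

Matching on a known string in the page, and not only on the status code, guards against a server that answers every request with a generic error page.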

Application health checks are useful for determining things such as inter-server dependencies. Imagine, for example, that a Web server requires the functionality of an associated application or database server in order to accept incoming user requests. By defining the application check URL as a CGI or ASP page, for example, the Web server can be made to run further checks against those back-end servers each time the page is requested. This creative use of application health checks gives some indication of their usefulness in understanding the user application on the server and using this to determine the true server health.
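The logic behind such a dependency-aware check page might look like the sketch below. It is written in Python as a stand-in for the CGI or ASP page mentioned above; the function name, the 503 status for failure, and the shape of the dependency_checks argument are assumptions for illustration only.

```python
def health_page(dependency_checks):
    """Run every back-end check and build the health page response.

    dependency_checks maps a dependency name to a zero-argument
    callable that returns True when that dependency is usable.
    Returns an (HTTP status code, body text) pair.
    """
    failed = []
    for name, check in dependency_checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that raises is treated the same as a failure.
            ok = False
        if not ok:
            failed.append(name)
    if failed:
        return 503, "FAIL: " + ", ".join(failed)
    return 200, "OK"
```

A content switch configured to expect a 200 status, or the string OK, from this page would then take the Web server out of service whenever one of its dependencies is unavailable.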

Health Check Scripting

If application health checks allow some level of flexibility in how the content switch interacts with the real servers, then scripted health checks take this even further. Many content switches also offer the ability to write custom scripts that interact with the real server by opening and closing TCP ports, sending and receiving application data, and a whole host of other options. Scripting languages and capabilities differ between vendor products, but in general, they offer the flexibility to interact with the real servers and resources in order to get around problems presented by even the most complex applications.
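Vendor scripting languages vary, so the following Python sketch only illustrates the general send-and-expect idea behind a scripted check. The step format, a list of ("send", data) and ("expect", data) pairs, and the function name are hypothetical and do not correspond to any specific product.

```python
import socket


def run_health_script(host, port, steps, timeout=2.0):
    """Run a send/expect script against host:port; True if all steps pass.

    steps is a list of (action, data) pairs where action is "send" or
    "expect" and data is a bytes value.
    """
    buffer = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            for action, data in steps:
                if action == "send":
                    sock.sendall(data)
                elif action == "expect":
                    # Read until the expected bytes show up, the server
                    # closes the connection, or the timeout expires.
                    while data not in buffer:
                        chunk = sock.recv(4096)
                        if not chunk:
                            return False
                        buffer += chunk
                    # Discard everything up to and including the match.
                    buffer = buffer[buffer.index(data) + len(data):]
                else:
                    raise ValueError("unknown action: %r" % (action,))
    except OSError:
        # Connection failure or timeout at any step: unhealthy.
        return False
    return True
```

A script for a mail server, for instance, might expect the greeting code, send a QUIT command, and expect the closing code: `[("expect", b"220"), ("send", b"QUIT\r\n"), ("expect", b"221")]`.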