Why High Availability and Load Balancing Are Different

One of the keys to successful systems is to clearly define the requirements before embarking on the implementation path. Determining whether a system needs high availability or load balancing or both is essential. So, what is the difference between high availability and load balancing? Well, aside from fundamentally different definitions, nothing.

High availability: Remaining available despite the failure of one or more components; usually used to describe a system or a service; above and beyond fault tolerant.

Load balancing: Providing a service from multiple separate systems where resources are effectively combined by distributing requests across these systems.

Almost all load-balancing hardware vendors in existence advertise their devices as providing high availability and load balancing. They take a single device and magically make a set of 10 servers work together for a tenfold performance increase. Pay the price, buy the product, and all your pain will disappear. Yeah. Right.

Aside from the fundamental foresight that statement lacks, this is an absolute falsehood. The idea that a web cluster of 10 machines can use each box to its optimal capacity is a misconception. That would require knowing all the requests that will come in the future, how long they will take, what resources they will occupy, and exactly what the available resources are on every machine in the cluster at the time that a request arrives to be load balanced. That information isn't available, so obviously you cannot build an optimal system. In fact, most solutions don't even use best-of-breed performance metrics in their decision making, which makes them more a simple "load distributor" than a "load balancer."
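The distinction between a "load distributor" and a "load balancer" can be sketched in a few lines. This is a minimal illustration, not any vendor's algorithm: the server names, the active-request counts, and both function names are hypothetical, and "fewest active requests" stands in for whatever metric a real device might consult.

```python
from itertools import cycle

def round_robin(servers):
    """A load *distributor*: rotates through servers and ignores their load."""
    return cycle(servers)

def least_loaded(active_requests):
    """A simple load *balancer*: consults a metric (here, active requests)
    and picks the server that currently reports the lowest value."""
    return min(active_requests, key=active_requests.get)

# Hypothetical cluster state, for illustration only.
servers = ["web1", "web2", "web3"]
active = {"web1": 12, "web2": 3, "web3": 7}

rotation = round_robin(servers)
distributed = [next(rotation) for _ in range(4)]  # fixed rotation, load ignored
balanced = least_loaded(active)                   # decision driven by the metric
```

Even the second function is far from optimal: it sees only a snapshot of one metric and knows nothing about how expensive the next request will be.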

Keep in mind that the goal is not to build an optimal system (that is impossible), but rather to run the business well. What you want is a system that meets the business needs, is efficient, and has a total cost of ownership that is reasonable over the short and long term.

Load Balancing Is Not High Availability

Load balancing has the effect of unifying separate systems to accomplish a common goal. Available resources from each machine contribute to a virtual resource pool, and that pool is drawn upon to accomplish a computing task (for example, serving a web page). Each contributing machine can have a different amount of available resources. Here is where it starts to touch on high availability.

If a machine in that pool suddenly has no resources to provide, tasks can still be completed using the resources available from other machines. So in effect, if a machine crashes, a valid load-balancing configuration will refrain from allocating tasks to that machine (or using its resources) until it becomes available once again.
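The behavior just described, skipping a machine while it is down and using it again once it returns, might look like the following sketch. The `Pool` class, its method names, and the machine names are assumptions made for illustration; how a real device detects a crash (health checks, timeouts) is left out entirely.

```python
class Pool:
    """Hypothetical pool that rotates over machines, skipping any marked down."""

    def __init__(self, machines):
        self.machines = list(machines)
        self.healthy = set(self.machines)
        self._next = 0

    def mark_down(self, machine):
        self.healthy.discard(machine)

    def mark_up(self, machine):
        if machine in self.machines:
            self.healthy.add(machine)

    def pick(self):
        # Try each machine at most once per call; skip the unavailable ones.
        for _ in range(len(self.machines)):
            machine = self.machines[self._next % len(self.machines)]
            self._next += 1
            if machine in self.healthy:
                return machine
        raise RuntimeError("no machines available")

pool = Pool(["a", "b", "c"])
pool.mark_down("b")                        # "b" crashes
picks = [pool.pick() for _ in range(4)]    # tasks go only to "a" and "c"
pool.mark_up("b")                          # "b" is eligible again
```

Note that everything here depends on the single `Pool` object continuing to run.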

This is not a solution to high availability. In fact, it doesn't solve the problem at all; rather, it moves the problem into a different tier of components. Often, this just shifts the responsibility from systems administrator (SA) to network administrator (NA). Although this is good for the SA and bad for the NA (from a responsibility perspective), the business has not gained or lost any availability assurances, though it has likely lost a good chunk of money on building the solution.

High Availability Is Not Load Balancing

High availability simply means resilience in the face of failures. Nowhere in its definition does it imply efficient resource usage or increased capacity. In hot-standby systems specifically, one system performs all the work while the other sits idle, waiting for the active system to crash. In these configurations, one system is contributing 100% of its resources, whereas all others are contributing 0%.
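A hot-standby pair can be modeled roughly as follows. This is a sketch under stated assumptions: the `HotStandbyPair` class, its methods, and the node names are hypothetical, and failure detection is reduced to an explicit `fail_over()` call.

```python
class HotStandbyPair:
    """Hypothetical two-node pair: one node does all the work, one waits."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby
        self.handled = {primary: 0, standby: 0}  # requests served per node

    def handle(self):
        # Every request goes to the active node; the standby does nothing.
        self.handled[self.active] += 1
        return self.active

    def fail_over(self):
        # The standby takes over; capacity is unchanged, only availability.
        self.active, self.standby = self.standby, self.active

pair = HotStandbyPair("node1", "node2")
for _ in range(5):
    pair.handle()
before = dict(pair.handled)   # node1 has served everything, node2 nothing
pair.fail_over()              # node1 crashes
after = pair.handle()         # service continues on node2
```

The service survives the failure, yet at no point do two nodes share the work: the pair never provides more capacity than a single node.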

Although some two-node failover and peer-based, highly available systems offer the flexibility of using both components at the same time, this should not be considered a load-balancing solution. A legitimate load-balancing scheme must have some intelligent mechanism of allocating the available resources. These systems, although capable of providing concurrent service on more than one component, do not incorporate any intelligence into the process of allocating their resources.

It is possible to build a system that so tightly couples high availability and load balancing that they are indistinguishable, but, armed with the previous information, I challenge you to find a web-centric high availability/load balancing device where you can't partition the feature sets as clearly as the concepts of high availability and load balancing are presented here.