Load Balancing


OpenBSD's load balancing, integrated with PF, lets you share a group of addresses among many connections, for traffic both entering and leaving the network. You can make multiple application servers show up as a single host, or share several public IP addresses among your NAT users. This is also called "address pooling," because several IP addresses are in a common pool.

In many situations, IP-based load balancing is more akin to "load scattering"; traffic is almost randomly sprayed across the addresses pooled together. OpenBSD includes the ability to specify exactly how you want your load balancing to work.

Types of Load Balancing

PF supports four distinct types of load balancing: round-robin, random, source-hash, and bitmask.

The default, round-robin, sends each connection to a different IP address in the group, looping back to the beginning once it has gone through every IP. The first incoming connection goes to the first address in the pool, the second to the second address, and so on. This is quite quick and easy to implement, but may not be suitable for all applications being load-balanced. You may have problems if using round-robin load balancing for SSL web servers or certain web applications that track user information by IP address. (A properly written enterprise-level web application won't do that, but those applications are not as common as you might think.) The advantage to round-robin load balancing is that you can use any combination of IP addresses in your pool. For example, you could have three identical web servers with IP addresses of 192.168.0.2, 192.168.0.8, and 192.168.0.9 behind your OpenBSD firewall, and want them to share the IP address of 209.69.178.26. This is the textbook application for load balancing.
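To make the rotation concrete, here is a minimal Python sketch of round-robin selection over the three web servers from the example. It only illustrates the idea; it is not how PF is implemented, and the names pool, round_robin, and assignments are made up for this illustration.

```python
from itertools import cycle

# Backend pool from the example above; order matters for round-robin.
pool = ["192.168.0.2", "192.168.0.8", "192.168.0.9"]

def round_robin(addresses):
    """Return an iterator that walks the pool in order, wrapping forever."""
    return cycle(addresses)

picker = round_robin(pool)

# Five consecutive connections walk the pool and wrap back to the start.
assignments = [next(picker) for _ in range(5)]
print(assignments)
```

Notice that the fourth connection lands back on the first server. A client that opens several connections in a row can touch every server in the pool, which is exactly why applications that track users by IP address get confused.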

Random load balancing sends each connection to a random IP address in the pool. This has all of the disadvantages of round-robin load balancing, with the additional restriction that the address pool must be a proper network block instead of just a list of IP addresses. You can use an address pool of 192.168.1.0/26, but not the IP addresses 192.168.1.2, 192.168.1.3, and 192.168.1.44; the latter is not a network block. See Chapter 5 for a reminder of network blocks and netmasks. For example, if you have four identical web servers behind your firewall, with IP addresses of 192.168.1.4, 192.168.1.5, 192.168.1.6, and 192.168.1.7 (also known as the netblock 192.168.1.4/30), you could use random load balancing to assign these to a single public IP address.
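If you are not sure whether a set of addresses forms a proper network block, you can check with a few lines of Python and the standard ipaddress module. This is a sketch for sanity-checking your own address lists, not anything PF runs; the helper name as_single_block is made up for this illustration.

```python
import ipaddress

def as_single_block(addresses):
    """Return the one network covering exactly these addresses, or None."""
    nets = list(ipaddress.collapse_addresses(
        ipaddress.ip_network(a) for a in addresses))
    return nets[0] if len(nets) == 1 else None

servers = ["192.168.1.4", "192.168.1.5", "192.168.1.6", "192.168.1.7"]
scattered = ["192.168.1.2", "192.168.1.3", "192.168.1.44"]

print(as_single_block(servers))    # 192.168.1.4/30
print(as_single_block(scattered))  # None
```

The four servers collapse into one /30, so they qualify as a pool for random load balancing. The scattered list does not collapse into a single block, so it can only be used with round-robin.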

Source-hash load balancing uses a hash of the client's source IP address to assign connection requests to a pool address. This means that every machine that connects is always assigned to the same address, which alleviates many of the application problems of round-robin and random load balancing. The catch is that the address pool must be a proper network block, just as in the random example. By default PF uses a new random hash key each time you reload the ruleset, but you can specify a static key if you like. Using a fixed key means that each client gets an identical IP address even across firewall reloads.
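The following Python sketch shows why hashing the source address gives each client a stable assignment. It is only a model of the idea: the hash function used here (SHA-256) and the helper name pick_address are assumptions chosen for illustration, and PF's internal hash is different.

```python
import hashlib
import ipaddress

def pick_address(client_ip, pool_cidr, key):
    """Map a client to one pool address using a keyed hash of its source IP."""
    pool = list(ipaddress.ip_network(pool_cidr))
    # Mix the key with the client's packed address, then reduce to an index.
    material = key.encode() + ipaddress.ip_address(client_ip).packed
    digest = hashlib.sha256(material).digest()
    index = int.from_bytes(digest[:4], "big") % len(pool)
    return str(pool[index])

# Same client, same key: the answer never changes.
first = pick_address("10.0.0.57", "192.168.1.4/30", "SomeRandomHashString")
again = pick_address("10.0.0.57", "192.168.1.4/30", "SomeRandomHashString")
print(first == again)
```

With a fixed key, the same client always maps to the same pool address. Change the key, as PF does by default on every reload, and the assignments may be reshuffled.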

The bitmask load-balancing method isn't actually load-balancing, but instead a way of doing NAT or redirection across two blocks of IP addresses of identical size. This might look similar to bi-directional NAT on a massive scale, but doesn't map connections in the other direction and only works if you have two network blocks. Technically, this "grafts" the network portion of one address onto the host portion of the other. In practice, this means that the first address of one block is permanently mapped to the first address of the second block. The second address of the first block is mapped to the second address of the second block, and so on. For example, you might have several servers on the inside of your firewall, all with IP addresses in the range 192.168.0.16/28. If you want to "static NAT" all of these servers to the public IP addresses 209.69.178.16/28, you could use bitmask load-balancing. 192.168.0.16 would be visible at 209.69.178.16, 192.168.0.22 would be at 209.69.178.22, and so on. While I chose these network numbers to be easy to understand, you could just as well map addresses between wildly different blocks such as 192.168.0.224/28 and 209.69.178.16/28; 192.168.0.224 (the first number in the first block) would become 209.69.178.16 (the first number in the second block) in that case.
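The "grafting" is easier to see as arithmetic. This Python sketch models the mapping described above using the standard ipaddress module; the function name bitmask_map is made up for this illustration, and the code models the concept rather than PF's own code.

```python
import ipaddress

def bitmask_map(address, target_block):
    """Graft the target block's network bits onto the address's host bits."""
    target = ipaddress.ip_network(target_block)
    # Keep only the host portion of the original address.
    host_bits = int(ipaddress.ip_address(address)) & int(target.hostmask)
    # Combine it with the network portion of the target block.
    return str(ipaddress.ip_address(int(target.network_address) | host_bits))

print(bitmask_map("192.168.0.16", "209.69.178.16/28"))   # 209.69.178.16
print(bitmask_map("192.168.0.22", "209.69.178.16/28"))   # 209.69.178.22
print(bitmask_map("192.168.0.224", "209.69.178.16/28"))  # 209.69.178.16
```

The last line is the "wildly different blocks" case: 192.168.0.224 is the first address of its /28, so it lands on the first address of the target block.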

Now that you understand what sorts of load balancing are available, let's see how they work in practice.

Outbound Load Balancing

OpenBSD's default NAT conceals an entire network behind a single public IP address, which is suitable for most home offices and small businesses. Each TCP/IP connection uses a unique port number, however, and a single IP address has only 65,536 ports. If you have a large network with a lot of active Internet users, you may run out of TCP/IP ports on that single IP! That's where you need to implement outbound load balancing.

For round-robin load balancing the syntax is very similar to the standard NAT rules, but you specify your public IP addresses as a list in braces.

 nat on fxp0 from 10.0.0.0/8 to any -> { 209.69.178.25, 209.69.178.26 }

We use the standard nat keyword to indicate that this is a NAT rule, and give our external interface name (fxp0) and the internal IP address range (10.0.0.0/8). The last part of the rule is the new part, where we list both of our public NAT IP addresses in braces. Outbound traffic on this network will undergo round-robin NAT.

As discussed in the round-robin section earlier, this can cause problems for programs that track connections by IP address. If you have enough client computers that you must use outbound load balancing, some user almost certainly has a business-critical Internet application that will choke on this setup. Random load balancing causes the same problem, and with this many IP addresses in use, bitmask load balancing is absurd. Source-hash load balancing gives a simulation of persistence by assigning each NAT client to the same public IP address. Specify source-hash at the end of the configuration line to use it. Remember to use only a valid network block for your translated addresses, or you will get an error like this.

 # pfctl -f pf.conf
 pf.conf.lb:8: only round-robin valid for multiple redirection addresses
 pfctl: Syntax error in file: pf rules not loaded
 #

Instead, drop the braces and just list the network block.

 nat on fxp0 from 10.0.0.0/8 to any -> 209.69.178.18/31 source-hash

Each time you load the ruleset, OpenBSD will hash your private IP address with a random value and use the result to assign the client a public IP address. If you want the client to always get the same public IP address, you can specify a hash value. This hash can either be a string, or a hex number. If you use hex, the value must be 32 characters, but you can use a string of any length.

 nat on fxp0 from 10.0.0.0/8 to any -> 209.69.178.18/31 source-hash SomeRandomHashString

Outbound bitmask-style load balancing looks very similar, even though the effects are quite different.

 nat on fxp1 from 10.0.1.0/24 to any -> 209.69.178.0/24 bitmask 

This will map each IP address beginning with 10.0.1 to the IP address beginning with 209.69.178 that has the same final number. For example, 10.0.1.37 becomes 209.69.178.37.

Inbound Load Balancing

The most common use for incoming load balancing is to balance load between several Web or application servers, concealing them behind a single IP address. For example, No Starch Press could have four high-capacity web servers providing their content and ordering site, all behind the public web server at 66.80.60.21.

 rdr on fxp1 proto tcp from any to 66.80.60.21 port 80 -> { 192.168.1.4, 192.168.1.5 }

While this quickly gets long when you have many web servers, you can of course use macros to simplify it.

 webservers = "{ 192.168.1.4, 192.168.1.5, 192.168.1.6, 192.168.1.7 }"
 rdr on fxp1 proto tcp from any to 66.80.60.21 port 80 -> $webservers

This defaults to round-robin load balancing. The good part about this is that you can easily change the servers in the pool. If one of your content servers fails, you can remove its IP from the pool, reload your ruleset, and business will proceed normally. This is an excellent trick to meet high service level agreements for Windows web servers. The catch is that with round-robin redirection, both your application and SSL might have problems. You can work around the latter problem by using an SSL accelerator to provide session negotiation, but rewriting your application may be more problematic.

If your web servers occupy a proper network block (a power-of-two number of consecutive addresses, such as two, four, or eight), you can use the source-hash option to get around some of these problems. The catch is that you won't be able to easily move servers into or out of the pool; while you can renumber application servers, or give one server multiple IP addresses, it's not nearly as simple as just editing the configuration file and reloading the ruleset. Still, it's an option. Here, we split the load on http://www.nostarch.com among application servers at 192.168.1.4, 192.168.1.5, 192.168.1.6, and 192.168.1.7, using source-hash load balancing:

 rdr on fxp1 proto tcp from any to 66.80.60.21 port 80 -> 192.168.1.4/30 source-hash

Again, you could add a static hashing value to make every application client always visit the same web server.

Unlike some commercial load balancers, PF does not check to confirm that individual application servers are up and running, and will not automatically remove failed application servers from the pool. There is wide interest in implementing this functionality, but it hasn't happened yet. It may be available by the time you read this, so I advise you to check pf.conf(5) and the OpenBSD mailing list archives to learn the latest.




Absolute OpenBSD: Unix for the Practical Paranoid
ISBN: 1886411999
Year: 2005
Pages: 298
