Deploying Firewall Load Balancing


Firewall load balancing comes in many guises, and depending on your requirements, the basic method may suit you. It is important to note that content switches are not always used for firewall load balancing, and we will briefly discuss the other methods available to give you better insight into the benefits of using content switches.

Using VRRP

By using a mechanism such as VRRP (or, in some cases, their own proprietary implementation of this standard), firewall manufacturers allow firewalls to be deployed while still providing the resilience that organizations demand. While this allows for resilience, it does very little for performance. VRRP is a standards-based method by which multiple devices share a virtual address. This address typically comprises a Layer 2 MAC address and a Layer 3 IP address, and it can be either a dedicated address or one associated with one of the devices.

When deployed on a firewall cluster, all packets are sent to the VRRP address, and the device acting as master answers the ARP request, ensuring that all traffic is sent via its interface. If the master device fails, the other devices participating in the VRRP process negotiate among themselves to determine which will take over as master. This process typically takes about three seconds. In most scenarios, only two devices run VRRP, so on the failure of the master, the secondary device detects the lack of response and takes over as master after three seconds. If more devices are involved, the fail-over time may increase by about one second.
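
To make the timing concrete, here is a minimal Python sketch of the election and fail-over behavior just described. It is not a real VRRP implementation; the device names, priorities, and one-second advertisement interval are illustrative, with the roughly three-second fail-over corresponding to three missed advertisements.

```
# Minimal sketch of VRRP-style master election and failover timing.
# Not a real VRRP implementation; names and the simple priority rule
# are illustrative only.

ADVERT_INTERVAL = 1.0  # seconds between master advertisements

class VrrpDevice:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority
        self.alive = True

def elect_master(devices):
    """The highest-priority live device answers ARP for the virtual address."""
    live = [d for d in devices if d.alive]
    return max(live, key=lambda d: d.priority) if live else None

def master_down_interval():
    # A backup declares the master dead after roughly three missed
    # advertisements, which is where the ~3 second failover comes from.
    return 3 * ADVERT_INTERVAL

fw1, fw2 = VrrpDevice("fw1", priority=200), VrrpDevice("fw2", priority=100)
cluster = [fw1, fw2]
assert elect_master(cluster) is fw1

fw1.alive = False                                   # master fails
print(f"failover after ~{master_down_interval():.0f}s")
print("new master:", elect_master(cluster).name)    # fw2 takes over
```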

It is important to understand that the state tables on the firewalls need to be synchronized; otherwise, a fail-over will result in dropped packets. This synchronization is typically configured within the firewall software or can be achieved using a third-party software package. This is an effective solution for low-usage sites that require resilience but are not greatly concerned with increasing throughput.

Using Software-Based Solutions

Many third-party manufacturers provide software that allows load balancing and load sharing of firewalls. These products have their merits, but they also rely heavily on VRRP, multicast, or some proprietary method by which to communicate. One thing often overlooked when using software to provide resilience is that firewalls are traditionally where bottlenecks occur, due to the processing needed to inspect traffic and enforce policies. Adding overhead to the device to maintain state tables, share load information, and redirect traffic to the correct nominated firewall impacts the very performance that you are trying to increase. There will always be a place for this type of deployment, but the cost of implementing this software should be weighed against the cost of implementing content switches, which not only allow for firewall load balancing but, depending on the content switch used, can also provide value-added features such as server load balancing, WCR, and other popular applications.

Using Content Switches

Content switches provide the most effective method to deploy comprehensive security while still ensuring excellent performance. With the majority of content switch manufacturers, firewall load balancing requires the use of a clean and a dirty switch placed on either side of the firewalls. This can be a single switch on either side, or switches in pairs, as shown in Figure 9-1.

Figure 9-1. Clean and dirty side switches needed for firewall load balancing to maintain session state.

[Image: graphics/09fig01.gif]

The reason for this is that maintaining state through the firewalls is critical. As discussed earlier, a stateful firewall needs to see the entire session to ensure that it is valid. Therefore, by placing a content switch at the top and bottom of the firewalls, creating what is often called a firewall load balancing sandwich, the switches can ensure that traffic is sent into the network via one firewall and that the return path is sent out through the same firewall, thus achieving statefulness.

Achieving this is relatively simple. Each content switch manufacturer has its own method of accomplishing it, but the basic concept is very similar: a preconfigured path is essential.

Creating the Paths

Different content switch manufacturers support different numbers of firewalls that can be load balanced within a firewall load balancing sandwich; this can vary from a maximum of 16 up to 256. While we need at least two firewalls to make effective use of load balancing, the number that you want to use depends on the requirements of your network. Regardless of how many we want to load balance, one thing will always be the same: how do we ensure traffic flows through a particular firewall? We have to create a path that the content switch understands.

One method is to configure a path by detailing every IP interface that the traffic will pass through, be they on the ingress or egress of a device, and then configure the switch to forward the packets from a session through that path, similar to "connect the dots." The bottom switch will obviously need to have the same path configured to maintain a stateful session. This is a fairly effective method that allows multiple firewalls to be configured. However, it doesn't scale well when using content switches and VRRP to provide resilience.

The other process, which allows for multiple content switches on the clean and dirty sides, is to use the interfaces on the clean and dirty switches as the endpoints of the paths. Configuring these interfaces as real servers allows the content switch to health check them, ensuring that the path is active and usable. In addition, by using load balancing techniques within the configuration, the path can be changed dynamically on a link, firewall, or switch failure. However, a certain amount of static configuration is still needed for this process to operate efficiently. First, a routing table needs to be defined from the perspective of the content switch, containing all the routes from the clean to the dirty side. For each firewall, at least two routes need to be configured, so the more firewalls within the sandwich, the more routes must be configured. Once these routes have been configured, the health check is run through the different firewalls, ensuring that they are active and passing data. Figure 9-2 shows how we would set up interfaces as real servers and how the content switches see the different paths.

Figure 9-2. Creating paths for firewall load balancing to ensure that the traffic flow is stateful from the clean to dirty side, and vice versa.

[Image: graphics/09fig02.gif]
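
As a rough illustration of the static configuration just described, the following Python sketch models the opposite side's switch interfaces as real servers and the per-firewall routes. All names and addresses are hypothetical, and actual vendor configuration syntax differs.

```
# Hypothetical model of the static configuration: the opposite side's
# switch interfaces are defined as real servers (health-check targets),
# and each firewall needs at least two routes so the switch knows the
# path through it. All names and addresses are illustrative.

real_servers = {            # clean-side switch interfaces, seen from the dirty side
    "R1": "10.1.1.1",       # content switch 2, IF1
    "R2": "10.1.2.1",       # content switch 2, IF2
}

routes = [                  # (destination real server, next-hop firewall interface)
    ("R1", "192.168.1.2"),  # toward the clean side via Firewall 1
    ("R1", "10.1.1.2"),     # return direction via Firewall 1
    ("R2", "192.168.2.2"),  # toward the clean side via Firewall 2
    ("R2", "10.1.2.2"),     # return direction via Firewall 2
]

def routes_required(num_firewalls: int, per_firewall: int = 2) -> int:
    """The more firewalls in the sandwich, the more routes to configure."""
    return num_firewalls * per_firewall

print(routes_required(4))   # a four-firewall sandwich needs at least 8 routes
```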

Let's look at the paths through the firewall from the viewpoint of content switch 1.

Traffic will flow from Real 1 (R1) on Interface 1 (IF1), through Firewall 1 (IF2), to content switch 2's R1 (IF1). Should this link or the firewall fail, traffic from this switch will instead flow from R2 (IF2), via Firewall 2 (IF2), through to content switch 2's R2 (IF2). The same method would be used from content switch 3, and the reverse is true from content switches 2 and 4. As another level of redundancy, should a content switch fail, the real servers are backed up by the real servers on the secondary content switch; in other words, R1 and R2 are backed up by R3 and R4. This provides a very resilient and dynamic configuration.
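
That backup relationship can be pictured as a simple lookup, as in this hedged Python sketch. The real server names follow the figure, and the selection logic is illustrative rather than any vendor's actual algorithm.

```
# Sketch of the redundancy described above: each real server (path
# endpoint) is backed up by the corresponding real server on the
# secondary clean-side switch. Names are illustrative.

backups = {"R1": "R3", "R2": "R4"}

def select_endpoint(real, healthy):
    """Fall back to the backup real server if the primary path is down."""
    if real in healthy:
        return real
    backup = backups.get(real)
    return backup if backup in healthy else None

print(select_endpoint("R1", healthy={"R2", "R3", "R4"}))  # -> "R3"
```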

This method allows multiple firewalls to be deployed, as additional firewalls can be added to the configuration at a later date; all that is required is adding paths for the new firewalls to the configuration. It provides the ability to scale to many firewalls while still maintaining just two content switches (or four if resilience is required).

This example used Layer 2 switches, but that is not necessary, as we will see later in this chapter.

Health Checking Firewalls

The key feature of firewall load balancing is the content switch's ability to health check through the firewall. This ensures not only that the interface is active, but that the firewall is actually passing traffic. Typically, ICMP (Internet Control Message Protocol) is used, but HTTP is often preferred by security administrators, as ICMP is stateless and not something a security administrator would necessarily allow through the firewall, due to its perceived vulnerability. Regardless of which health check is used, it is important to configure a policy allowing the health check between all interfaces, with the initiating interfaces (or real servers) as the source IP addresses. The policy only needs to allow data between the clean and dirty sides of the firewall sandwich, and because the SIP, DIP, and protocol type are known, it can be tightened to minimize any security exposure.
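
As an illustration only, the following Python sketch shows the shape of an HTTP health check sourced from a specific interface address, so that the firewall policy can match on known SIP, DIP, and protocol. The addresses are hypothetical, and a real content switch performs this in its own firmware rather than in a script.

```
# Hedged sketch of an HTTP health check sent *through* a firewall,
# sourced from a specific switch interface so the firewall policy can
# match on SIP/DIP/protocol. Addresses are illustrative.
import socket

def http_health_check(target_ip, source_ip, port=80, timeout=2.0):
    """Return True if an HTTP response comes back through the firewall."""
    try:
        with socket.create_connection((target_ip, port), timeout=timeout,
                                      source_address=(source_ip, 0)) as s:
            s.sendall(b"GET / HTTP/1.0\r\nHost: %b\r\n\r\n" % target_ip.encode())
            return s.recv(12).startswith(b"HTTP/")
    except OSError:
        return False  # path or firewall is down; take it out of rotation

# e.g., check clean-side real server 10.1.1.1 from dirty-side interface 192.168.1.1:
# http_health_check("10.1.1.1", "192.168.1.1")
```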

Traffic Flow through a Firewall Load Balanced Sandwich

Redirection is what makes firewall load balancing work. It is the feature, discussed in Chapter 7, Persistence, Security, and the Internet, that allows us to load balance multiple firewalls. By creating filters that use the redirection function, we can redirect traffic through any firewall and still maintain state. We will, of course, have had to configure the paths beforehand, as it is this information that the content switch uses when making the load balancing decision. Figure 9-3 illustrates this.

  1. Traffic enters the dirty side content switch and hits the redirection filter. The SIP and DIP are used to determine a value; the algorithm used to create this value is similar to that discussed in Chapter 5, Basic Server Load Balancing (see the hash sketch after Figure 9-3). This value is then associated with a real server on the clean side of the sandwich. The real server selected for that particular SIP/DIP combination will always be the same, due to the algorithm.

  2. As the content switch knows about the real server but has only one preconfigured route to it, it selects that route and forwards the packet to the next hop. In this case, that is the firewall.

  3. The firewall receives this packet, still with the original SIP and DIP intact, and decides based on its policy whether to allow or deny it. On validation, it forwards the packet to the destination address via traditional routing. In this case, it is the VRRP master that forwards the packet.

  4. All subsequent packets are routed through, and as the content switch is state aware, the session table has been updated with the selected real server, thus ensuring that the firewall load balancing is performed as quickly as possible.

Figure 9-3. Inbound traffic flow in a typical firewall load balanced configuration showing how the path is selected.

[Image: graphics/09fig03.gif]
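
The following Python sketch illustrates the kind of SIP/DIP hash described in step 1. Vendor algorithms differ, but the essential property, which matters for the return flow described next, is that the hash is symmetric: swapping the source and destination addresses must select the same path. Sorting the address pair before hashing is one simple, hypothetical way to achieve this.

```
# Minimal sketch of the redirection hash. The key property is symmetry:
# hashing (SIP, DIP) and (DIP, SIP) must pick the same firewall path.
# The actual vendor algorithm differs; this is illustrative only.
import ipaddress

def select_path(sip: str, dip: str, num_paths: int) -> int:
    a = int(ipaddress.ip_address(sip))
    b = int(ipaddress.ip_address(dip))
    lo, hi = sorted((a, b))               # order-independent
    return (lo ^ hi) % num_paths          # index into the real-server paths

inbound  = select_path("203.0.113.7", "10.1.1.80", num_paths=2)
outbound = select_path("10.1.1.80", "203.0.113.7", num_paths=2)
assert inbound == outbound                # same firewall in both directions
```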

It is important to understand that this DIP could be any address anywhere within the network: another router or appliance, or even another firewall. Most probably it would be a VIP for some load-balanced service. On high-end content switches, this could be configured on the clean side switches. It should be pointed out here that not all content switch manufacturers support multiple applications on their devices, so additional hardware may be required to allow for server load balancing. While this is not ideal and increases network complexity, it is a limitation of the switch's software. Therefore, when deploying firewall load balancing and server load balancing, make sure that the content switch can support both simultaneously if required.

The return traffic flow is illustrated in Figure 9-4.

  1. Returning traffic enters the clean side content switch and hits the redirection filter. At this stage, the SIP and DIP have been reversed. Because we use both the SIP and DIP, it is important that the algorithm used to create this value yields the same result when the two addresses are interchanged, as in the hash sketch earlier; if this were not the case, firewall load balancing would not work. This value will be the same as that of the inbound session, thus selecting the same real server. The content switch will update its session table for all subsequent packets within that session.

  2. As the content switch knows about the real server but has only one route to it, it selects that route and forwards the packet to the next hop; in this case, the firewall. As can be seen, it is critical that the return path is configured through the same firewall for the corresponding real server.

  3. The firewall receives this packet with the SIP and DIP intact. It checks its session table and, based on the session's validity, forwards the packet to the next hop: the dirty side content switches. They in turn forward it on, with the VRRP master sending the packet to the correct next hop.

Figure 9-4. Outbound traffic flow in a typical firewall load balanced configuration.

[Image: graphics/09fig04.gif]

One thing to remember when following the traffic flow is that the dirty side switches do not actually send the data to the clean side interface (or real server). They merely use it as a mechanism to create a load balanced scenario, and it is this that allows multiple firewalls to be configured. Once the data arrives at the selected firewall, it is handled by that firewall and routed or denied based on the policy. The firewall will not send it on to the real servers unless the DIP is for that server, as would be the case for the health checks.

NATing Firewalls

NATing firewalls create all sorts of issues when performing firewall load balancing, and the majority of early content switch designs preferred not to use NAT on load-balanced firewalls. The reason is that the load balancing metric used to determine the route uses both the SIP and DIP. For ingress traffic, the SIP/DIP hash will create one value and the packet will be routed through the selected firewall. This firewall then NATs the SIP, DIP, or both, and routes the packet onward. On the return, the content switch performs the hash based on the SIP/DIP, but these addresses have not yet been NATed back to the originals, so the hash can yield a different value from that of the ingress packet. The route selected will then be through another firewall, and that firewall will drop the session because it has no corresponding entry in its state table. As is plain to see in Figure 9-5, NAT breaks the firewall load-balancing sandwich. Fortunately, some content switch manufacturers have realized the necessity of NATing firewalls and have created a method to overcome this limitation.

Figure 9-5a. Breaking the sandwich with NAT.

[Image: graphics/09fig05a.gif]

Figure 9-5b. Breaking the sandwich with NAT.

[Image: graphics/09fig05b.gif]
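
Reusing the select_path() function from the earlier hash sketch, this short demonstration shows the failure mode: once the firewall NATs an address, the return packet hashes on a different address pair and can select a different firewall. The addresses are illustrative.

```
# Demo of the failure mode above. With NAT, the reply hashes on a
# different address pair and may select a different firewall, which
# then drops the session. Addresses are illustrative.
import ipaddress

def select_path(sip, dip, num_paths):           # as in the earlier sketch
    a, b = (int(ipaddress.ip_address(x)) for x in (sip, dip))
    lo, hi = sorted((a, b))
    return (lo ^ hi) % num_paths

client, vip, real = "203.0.113.7", "198.51.100.10", "10.1.1.25"

inbound  = select_path(client, vip, num_paths=2)    # hashed on the pre-NAT pair
# The firewall NATs the DIP to the real address, so the reply carries
# a different pair and the clean-side switch hashes differently:
outbound = select_path(real, client, num_paths=2)

print(inbound, outbound)   # 1 and 0: the reply exits the "wrong" firewall
```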

This is achieved by configuring firewall load balancing as normal on the dirty side switches, and configuring the clean side switches to record the source MAC address in addition to all other session information. This feature is typically a single configuration statement. It allows the clean content switch to record the source MAC of all traffic that ingresses the switch from the firewall; as it is the firewall that routed the traffic, the SMAC will be that of the firewall. On the return path, the content switch looks up the session entry in its session table, sees that there is a MAC address associated with it, and substitutes the destination MAC address with the recorded SMAC. Layer 2 processing then forwards the packet to the correct firewall. This is illustrated in Figure 9-6.

Figure 9-6. MAC address persistence shows how to overcome NATing firewalls for inbound and outbound traffic.

[Image: graphics/09fig06.gif]
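
Here is a minimal Python sketch of the idea, with hypothetical session-table keys: the clean side switch records the firewall's source MAC on ingress and substitutes it as the destination MAC on the matching return flow.

```
# Sketch of MAC address persistency on the clean-side switch: record
# the source MAC of traffic arriving from a firewall, then on the
# return path substitute the destination MAC with the recorded one so
# Layer 2 forwarding reaches the same (NATing) firewall. Illustrative only.

sessions = {}  # (sip, dip, sport, dport) -> firewall source MAC

def ingress_from_firewall(sip, dip, sport, dport, smac):
    sessions[(sip, dip, sport, dport)] = smac   # remember which firewall routed it

def egress_dmac(sip, dip, sport, dport, default_dmac):
    # Reverse the tuple to look up the inbound session's entry.
    return sessions.get((dip, sip, dport, sport), default_dmac)

ingress_from_firewall("203.0.113.7", "10.1.1.80", 33000, 80,
                      smac="00:aa:bb:cc:dd:01")
print(egress_dmac("10.1.1.80", "203.0.113.7", 80, 33000,
                  default_dmac="00:00:00:00:00:00"))
# -> 00:aa:bb:cc:dd:01 (the packet is Layer 2-forwarded back to Firewall 1)
```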

This ingenious method is often called MAC address persistency, and it allows network administrators to use content switches with NAT or proxy type devices that rely on IP addresses to ensure that traffic is passed back to them. MAC address persistency can be used with a multitude of different applications and is not just associated with firewall load balancing.


