Whether you are designing a new network or working with an existing infrastructure, it helps to treat components and requirements of your environment as elements of a unified perimeter security architecture. Doing so allows you to identify scenarios in which devices might be configured in an inconsistent or even conflicting manner, and lets you tune the design to match your needs. Based on specifics of your environment, you will decide, for instance, whether a single packet-filtering router on the edge of your network will provide sufficient protection, or whether you need to invest in multiple firewalls, set up one behind another, to properly segment your network. A good place to start designing your perimeter architecture is determining which resources need to be protected.
Determining Which Resources to Protect
In the realm of network security, we focus on ensuring confidentiality, integrity, and availability of information. However, the notion of information is too general and does not really help to make decisions that account for specifics in a particular situation. For some organizations, the information that needs to be protected is credit card and demographics data; for others, it might be legal agreements and client lists. Information can also take the form of application logic, especially for websites that rely on dynamically generated content. To decide what kind of network perimeter will offer adequate protection for your data, you need to look at where the information is stored and how it is accessed.
Modern computing environments tend to aggregate information on servers. This makes a lot of sense because it is much easier to keep an eye on data that is stored centrally; some of the problems plaguing peer-to-peer file-sharing systems such as Kazaa demonstrate the difficulties of providing reliable access to information that is spread across many machines. Because we all face the limitations of overworked administrators and analysts, we often benefit from minimizing the number of resources that need to be set up, monitored, and secured.
Make sure you know what servers exist on your network, where they are located, what their network parameters are, and what operating systems, applications, and patches are installed. If you have sufficient spending power, you might consider taking advantage of enterprise system management software such as HP OpenView, Microsoft Systems Management Server (SMS), and CA Unicenter to help you with this task.
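Even without enterprise management software, the inventory can start as a structured list that you check for gaps. The Python sketch below is a minimal illustration of the idea; the host names, addresses, and patch identifiers are invented, and a real environment would pull this data from an asset database or a tool such as SMS or OpenView.

```python
# Hypothetical server inventory records. In practice, this data would come
# from an asset database or enterprise management software, not hand-typed.
servers = [
    {"host": "web01", "ip": "192.0.2.10", "os": "Windows 2000",
     "apps": ["IIS"], "patches": {"MS02-018"}},
    {"host": "db01", "ip": "192.0.2.20", "os": "Solaris 8",
     "apps": ["Oracle"], "patches": set()},
]

def missing_patch(inventory, required_patch):
    """Return the hosts that do not list the required patch as installed."""
    return [s["host"] for s in inventory if required_patch not in s["patches"]]

print(missing_patch(servers, "MS02-018"))  # -> ['db01']
```

Even a crude check like this surfaces the question that matters for perimeter design: which systems are exposed, and which of them are behind on patches.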
If the infrastructure you are protecting is hosting multitier applications, you need to understand the role of each tier, typically represented by web, middleware, and database servers, and their relationship to each other. In addition to documenting technical specifications for the server and its software, be sure to record contact information for the person who is responsible for the business task that the system performs. You will probably need to contact that person when responding to an incident associated with the system.
End-user workstations serve as an interface between the technical infrastructure that powers the computing environment and the people who actually make the business run. No matter how tightly you might want to configure the network's perimeter, you need to let some traffic through so that these people can utilize resources on the Internet and function effectively in our web-dependent age. To create a perimeter architecture that adequately protects the organization while letting people do their work, you need to make sure that the design reflects the way in which the workstations are used and configured.
For example, if your Windows XP workstations never need to connect to Windows servers outside the organization's perimeter, you should be able to block outbound Server Message Block (SMB) traffic without further considerations. On the other hand, if your users need to access shares of a Windows server over the Internet, you should probably consider deploying a VPN link across the two sites.
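The egress decision described above can be made concrete with a toy rule model. The sketch below is in Python purely for illustration; the port numbers are the well-known NetBIOS/SMB ports, the trusted peer address is invented, and a real perimeter would enforce this on the border router or firewall, not in application code.

```python
# Toy egress-filter model: block outbound SMB unless the destination is a
# trusted VPN peer. The peer address is invented for illustration.
SMB_PORTS = {137, 138, 139, 445}        # NetBIOS and direct-hosted SMB
TRUSTED_VPN_PEERS = {"203.0.113.5"}     # hypothetical remote site

def allow_outbound(dst_ip, dst_port):
    """Return True if an outbound connection should be permitted."""
    if dst_port in SMB_PORTS:
        return dst_ip in TRUSTED_VPN_PEERS   # SMB only across the VPN link
    return True                              # other traffic passes this check

print(allow_outbound("198.51.100.7", 445))   # False: SMB to the open Internet
print(allow_outbound("203.0.113.5", 445))    # True: SMB over the VPN link
```

The point of the exercise is that the rule set should fall out of how the workstations are actually used: if no legitimate outbound SMB exists, the trusted-peer set is empty and the rule becomes a flat deny.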
Similarly, evaluating the likelihood that your system administrators will routinely patch the workstations might help you decide whether to segment your environment with internal firewalls. Organizations that can efficiently distribute OS and application updates to users' systems are less likely to be affected by an attacker gaining access to a workstation and then attacking other internal systems. At the same time, if the centralized control channel is compromised in this configuration, the attack could affect many systems.
When examining your workstations, look at the kind of applications and operating systems they are running and how patched they are. Is data stored only on your servers, or do users keep files on their desktops? Don't forget to take into consideration personal digital assistant (PDA) devices; corporate users don't hesitate to store sensitive information on their Palms, BlackBerries, Pocket PCs, and smartphones. You also need to make special provisions for traveling and telecommuting users; by going outside of your core network, telecommuters will unknowingly expand your defense perimeter.
Bridges, switches, and routers interconnect your computing resources and link you to partners and customers. In earlier chapters, we talked about the role of the router and explained the need to secure its configuration. Modern high-end switches offer configuration complexities that often rival those of routers, and they should be secured in a similar manner. Moreover, Virtual LAN (VLAN) capabilities of such switches might require special considerations to make sure attackers cannot hop across VLANs by crafting specially tagged Ethernet frames.
Is this process beginning to look like an audit of your environment? In many respects, it is. Given budget and time limitations, you need to know what you are protecting to determine how to best allocate your resources.
Make sure you know what devices are deployed on your network, what function they serve, and how they are configured. Hopefully, you will not keep finding devices you did not think existed on the network. It is not uncommon to see an organization with "legacy" systems that were set up by people who are long gone and that everyone is afraid to touch for fear of disrupting an existing process. Be mindful of network devices that terminate private or VPN connections to your customers or partners inside your network. You need to evaluate your level of trust with the third party on the other end of the link to determine the impact of such a device on your network's perimeter.
When looking for possible entry points into your network, don't forget about modems that might be connecting your servers or workstations to phone lines. Modems offer the attacker a chance to go around the border firewall, which is why many organizations are banishing modems from internal desktops in favor of VPN connections or centralized dial-out modem banks that desktops access over TCP/IP. With the increasing popularity of VPNs, the need to connect to the office over the phone is gradually decreasing. Yet it is common to find a rogue desktop running a remote control application such as pcAnywhere or Remote Desktop over a phone line.
Sometimes, a business need mandates having active modems on the network, and your security policy should thoroughly address how they should and should not be used. For example, modems that accept inbound connections might need to be installed in data centers to allow administrators out-of-band access to the environment. In such cases, your security architecture should take into account the possibility that the modem might provide backdoor access to your network. To mitigate this risk, pay attention to host hardening, consider deploying internal firewalls, and look into installing hardware authentication devices for controlling access to the modems. (Such devices are relatively inexpensive and block calls that do not present a proper authentication "key.")
When looking for devices that can be used to store sensitive data or provide access to information, private branch exchange (PBX) systems often slip people's minds. However, even back in the era when voice communications were completely out of band with IP networks, attackers knew to tap into organizations' voice mail systems over the phone, fishing for information and setting up unauthorized conference calls and mail boxes. With the growing popularity of IP-based telephony systems, the distinction between voice and data is beginning to fade. For instance, in a hybrid system offered by ShoreTel, traditional analog phones are employed, but the dialing process can be controlled using a desktop agent over TCP/IP, and voice mail messages are stored as .wav files on a Windows-based server.
If it can be accessed over the network, it should be accounted for in the design of the perimeter.
Don't forget to consider other devices that comprise your computing infrastructure, such as modern printers and copiers. Manufacturers of these systems often embed advanced services into these devices to ease management without regard for security. Don't be surprised if your Sharp AR-507 digital copier comes with built-in FTP, Telnet, SNMP, and HTTP servers, all with highly questionable authentication schemes.3
Indeed, modern networks are heterogeneous and support communications among a wide range of resources. We need to understand what servers, workstations, and various network-aware devices exist on the network to determine what defenses will provide us with adequate protection. Another factor that contributes toward a properly designed security architecture is the nature of threats we face. Potential attackers who could be viewed as a threat by one organization might not be of much significance to another. The next section is devoted to determining who it is we are protecting ourselves from, and it's meant to help you assess the threat level that is appropriate for your organization.
Determining Who the Potential Attackers Are
We think we know who our enemy is. We're fighting the bad guys, right? The perception of who is attacking Internet-based sites changes with time. Sometimes we look for attackers who are specifically targeting our sites, from inside as well as outside, to get at sensitive information. In other situations, we feel inundated with "script kiddy" scans, in which relatively inexperienced attackers run canned scripts in an attempt to reach the "low hanging fruit" on the network. Lately, automated agents such as remote-controlled Trojans, bots, and worms have been making the rounds and threatening our resources in new and imaginative ways that we will discuss in this section.
In reality, only you can decide what kind of attacker poses the greatest threat to your organization. Your decision will depend on the nature of your business, the habits and requirements of your users, and the value of the information stored on your systems.
Each category of attacker brings its own nuances to the design of the network's defenses. When operating under budget constraints, you will find yourself assigning priority to some components of your security infrastructure over others based on the perceived threat from attackers that worry you the most.
Why would anyone be targeting your network specifically? Presumably, you have something that the attacker wants, and in the world of computer security, your crown jewels tend to take the form of information. A determined outsider might be looking for ways to steal credit card numbers or other sensitive account information about your customers, obtain products at a cost different from what you are offering them for, or render your site useless by denying service to your legitimate customers. The threat of a denial of service (DoS) attack is especially high for companies with relatively high profiles.
If your organization provides services accessible over the Internet, a determined outsider might be interested in obtaining unauthorized access to such services, instead of specifically targeting your information. In some cases, you might be concerned with corporate espionage, where your competitors would be attempting to obtain your trade secrets, important announcements that have not been released, your client lists, or your intellectual property. Such an attack is likely to have significant financing, and might even incorporate the help of an insider.
The difficulty of protecting against a determined outsider is that you have to assume that with sufficient money and time to spare, the attacker will be able to penetrate your defenses or cause significant service disruptions. To counteract such a threat, you need to estimate how much the potential attacker is likely to spend trying to penetrate your defenses, and build your perimeter with this threat profile in mind. Additionally, intrusion detection presents one of the most effective ways of protecting against a determined attacker because it offers a chance to discover an attack in its reconnaissance stage, before it escalates into a critical incident. A properly configured intrusion detection system (IDS) also helps to determine the circumstances of an incident if an attacker succeeds at moving beyond the reconnaissance stage.
The threat of a determined insider is often difficult to counteract, partly because it is hard to admit that a person who is working for the organization might want to participate in malicious activity. Nonetheless, many high-profile criminal cases involve a person attacking the organization's systems from the inside. With a wave of layoffs hitting companies during economic downturns, disgruntled ex-employees have been causing something of a stir at companies that have not recognized this as a potential risk.
The insider does not need to penetrate your external defense layers to get access to potentially sensitive systems. This makes a case for deploying internal firewalls in front of the more sensitive areas of your network, tightening configurations of corporate file and development servers, limiting access to files, and employing internal intrusion detection sensors. Note that because an internal attacker often knows your environment, it is much harder to detect an insider attack early on, as opposed to an attack from the outside.
Even without getting into the argument of whether insider attacks are more popular than the ones coming from the outside, the ability of an internal attacker to potentially have easy and unrestricted access to sensitive data makes this a threat not to be taken lightly.
The term script kiddy is at times controversial due to its derogatory nature. It typically refers to a relatively unsophisticated attacker who does not craft custom tools or techniques, but instead relies on easy-to-find scripts that exploit common vulnerabilities in Internet-based systems. In this case, the attacker is not targeting your organization specifically, but is sweeping through a large number of IP addresses in hopes of finding systems that are vulnerable to published exploits or that have a well-known backdoor already installed.
The nature of script kiddy attacks suggests that the most effective way of defending against them involves keeping your system's patches up to date, closing major holes in network and host configurations, and preventing Trojans from infecting your internal systems.
A hybrid variant of the "script kiddy" attack might incorporate some elements of a determined outsider threat and would involve an initial sweep across many network nodes from the outside. This activity would then be followed up by a set of scripted attacks against systems found to be vulnerable to canned exploits. In such scenarios, IDSs are often effective at alerting administrators when the initial exploratory phase of the attack begins.
In one example of a hybrid script kiddy attack, Raphael Gray (a.k.a. "Curador") harvested tens of thousands of credit card numbers in 2000. He started the attack by using a search engine to locate potentially vulnerable commerce sites and then exploited a known vulnerability to gain unauthorized access to those sites. Referring to the attack, Raphael said, "A lot of crackers don't like what I did. They consider me to be a script kiddy, someone who can't program in any language, because I used an old exploit instead of creating a new one. But I've been programming since I was 11."6 By the way, one of the credit card numbers Raphael obtained belonged to Bill Gates.
Automated Malicious Agents
The beginning of the century shifted the spotlight away from attacks performed directly by humans to those that were automated through the use of malicious agents. Fast-spreading worms such as Code Red and Nimda demonstrated the speed with which malicious software can infect systems throughout the Internet, as well as our inability to analyze and respond to these threats early in the propagation process. By the time we had detected and analyzed the Nimda worm, it had infected an enormous number of personal and corporate systems. Persistent worms such as Beagle and NetSky have demonstrated the difficulty of eliminating worm infections on an Internet scale, even after they have made their presence known for months. The "success" of Beagle and NetSky is, in part, due to their email-based propagation mechanisms, which allowed malicious code to slip through many perimeter defenses.
Maintaining a perimeter that is resilient against worm-based attacks requires keeping abreast of the latest vulnerabilities and applying patches as soon as they are released. To further limit the scope of the agent's potential influence, you should consider segmenting your infrastructure based on varying degrees of security levels of your resources. Antivirus products can also be quite effective at dampening the spread of malicious programs, but they are limited by their ability to recognize new malicious code. Unfortunately, as we have learned from the past worm experiences, virus pattern updates might not be released in time to prevent rapid infection of vulnerable systems.
Defining Your Business Requirements
When designing the perimeter defense infrastructure, we need to keep in mind that the purpose of addressing information security issues is to keep the business running; security is a means, not an end. The design must accommodate factors such as services provided to your users or customers, fault tolerance requirements, performance expectations, and budget constraints.
Cost is an ever-present factor in security-related decisions. How much are you willing to spend to protect yourself against a threat of an attack or to eliminate a single point of failure? For instance, SANS Institute spent three months and significant resources putting all their servers in a highly secure Network Operations Center (NOC). Then when the Code Red worm hit, they experienced 50 times more traffic in 24 hours than during any previous peak. They were secure against intrusions, but the architecture became a single point of failure.
How much should you spend? As we discussed earlier, this depends on the perceived value of both access to and control of the resources you are protecting. Making informed choices about spending less on one component of a defense infrastructure allows us to spend more on another layer that might require additional funding. When looking at the cost of a security component, we should examine not only its purchase price but also the cost of deploying, maintaining, and monitoring it.
When calculating the cost of adding a perimeter security component, you might conclude that mitigating the risk it would protect you against is not worth the money it would cost to deploy and maintain it. In that case, you might consider employing a less thorough but more affordable solution. For example, alternatives to purchasing a relatively expensive commercial IDS product might include obtaining an open source package such as Snort, an outsourced security monitoring solution, or an additional firewall. (An additional firewall might mitigate the same risk in a less expensive manner, depending on your environment.) Even when using "free" software such as Snort or an academic version of Tripwire, be sure to take into account the cost of the administrator's time that will be spent installing and maintaining the new system.
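A back-of-the-envelope calculation shows why administrator time matters. All figures below are invented for illustration; plug in your own license costs and labor rates.

```python
# Three-year cost comparison: commercial IDS appliance versus "free" Snort
# plus administrator time. Every figure here is hypothetical.
admin_hourly_rate = 60          # USD per hour of administrator time
years = 3

# Appliance: one-time license plus roughly 2 admin hours per month of upkeep.
commercial = 12000 + years * 12 * 2 * admin_hourly_rate

# Snort: no license fee, but roughly 10 admin hours per month to install,
# tune, and maintain the sensor and its signatures.
snort = 0 + years * 12 * 10 * admin_hourly_rate

print(commercial, snort)   # 16320 21600
```

Under these (invented) assumptions, the "free" solution is the more expensive one over three years; with a cheaper administrator or fewer maintenance hours, the comparison flips. The point is to run the arithmetic rather than assume zero-cost software is free.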
Of course, when setting up your network's perimeter, you need to know what services have to be provided to your users and customers. In a typical business, you will probably want to block all nonessential services at the network's border. In a university, you might find that unrestricted access is one of the "services" provided to the university's users. In that case, you will probably be unable to block traffic by default; instead, you will filter out only traffic that is most likely to threaten your environment.
Protecting your resources against threats that come through a channel that needs to be open for business use is not easy. For instance, if you decide to allow ICMP through your network for network troubleshooting purposes and are fighting an ICMP flood attack, blocking ICMP traffic at your ISP's upstream routers is likely to free up some bandwidth for services that absolutely must be accessible. If, on the other hand, you are being flooded with SYN packets or HTTP requests targeting TCP port 80 and you are a web-based e-commerce site, asking the ISP to block this traffic at their routers is probably not an option. (You can try blocking traffic from specific sources, but in a DDoS attack, you might have a hard time compiling a complete list of all attacking addresses.) Only a defense-in-depth architecture has a chance of protecting you from attacks that might come through legitimately open channels.
When analyzing your business requirements, you need to look at the expected performance levels for the site that is protected by the security infrastructure. As we add layers to perimeter defense, we are likely to increase packet latency because traffic might be traversing multiple filtering engines, sanity checks, and encryption mechanisms. If performance is a serious consideration for your business, you might be able to justify spending money on equipment upgrades so that the impact of additional security layers is minimized. Alternatively, you might decide that your business hinges on fast response times and that your budget does not allow you to provision appropriately performing security hardware or software. In the latter case, you might need to accept the risk of decreased security for the sake of performance.
Establishing performance expectations in the design phase of your deployment, before the actual implementation takes place, is a difficult but necessary task. Indeed, it's often hard to estimate how much burden an IPSec encryption tunnel will put on your router, or how many milliseconds will be added to the response time if a proxy server is in place, instead of a stateful firewall. If your design goes over the top with the computing power required to provide adequate performance, you might not have much money left to keep the system (or the business) running. At the same time, not allocating proper resources to the system early on might require you to purchase costly upgrades later.
Inline Security Devices
Consider the architecture that incorporates multiple inline firewalls located one behind another. For a request to propagate to the server located the farthest from the Internet, the request might need to pass through the border router and several firewalls. Along the packet's path, each security enforcement device will need to make a decision about whether the packet should be allowed through. In some cases, you might decide that the security gained from such a configuration is not worth the performance loss. In others, you might be willing to accept the delay in response to achieve the level of security that is appropriate for your enterprise. Or you might devise alternative solutions, such as employing a single firewall, to reach a sufficient comfort level without chaining multiple inline firewalls.
The Use of Encryption
Other significant effects on performance are associated with the use of encryption due to the strain placed on the CPU of the device that needs to encrypt or decrypt data. You might be familiar with SSL and IPSec accelerator cards or devices that you can employ to transfer the encryption duties to a processor that is dedicated and optimized for such tasks. These devices are not cheap, and the business decision to purchase them must take into account the need to provide encryption, the desired performance, and the cost of purchasing the accelerators.
For example, most online banking applications require the use of SSL encryption to protect sensitive data as it travels between the user's browser and the bank's server. Usually, all aspects of the user's interaction with the banking application are encrypted, presumably because the bank was able to justify the expense of purchasing the computing power to support SSL. Cost is a significant factor here because you need to spend money on SSL accelerators to achieve the desired performance.
At the same time, you will probably notice that SSL encryption is not used when browsing through the bank's public website. This is probably because the information provided on the public site is not deemed to be sensitive enough to justify the use of encryption.
Among other considerations relating to the site's performance is the system's ability to handle large amounts of log data. Availability of detailed logs is important for performing anomaly detection, as well as for tuning the system's performance parameters. At the same time, enabling verbose logging might inundate a machine's I/O subsystem and quickly fill up the file system. Similarly, you must balance the desire to capture detailed log information with the amount of network bandwidth that will be consumed by logs if you transport them to a centralized log archival system. This is an especially significant issue for situations in which you need to routinely send log data across relatively slow wide area network (WAN) links.
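A quick estimate often settles the question of whether centralized logging will fit on a given link. The event rate, record size, and link speed below are invented for illustration:

```python
# Estimate the bandwidth consumed by shipping logs to a central archive
# over a WAN link. All figures are hypothetical.
events_per_second = 200       # firewall/IDS events generated at the site
bytes_per_event = 300         # average size of one log record
link_kbps = 256               # capacity of the WAN link, kilobits per second

log_kbps = events_per_second * bytes_per_event * 8 / 1000
utilization = log_kbps / link_kbps

print(log_kbps, utilization)  # 480.0 1.875 -- the logs alone exceed the link
```

With these numbers, verbose logging would consume nearly twice the link's capacity, which argues for local buffering, summarization, or compression before the data crosses the WAN.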
The amount of fault tolerance that should be built into your environment depends on the nature of your business. The reason we discuss fault tolerance in an information security book is that eliminating single points of failure is a complex task that strongly impacts the architecture of your network.
When designing fault tolerance of the infrastructure, you need to look at the configuration of individual systems, consider the ways in which these systems interact with each other within the site, and, perhaps, offer geographic redundancy to your users.
Looking at a single machine, you examine its disk subsystem, number of processors, redundancy of system boards and power supplies, and so on. You need to decide how much you are willing to pay for the hardware such that if a disk or a system board fails, the machine will continue to function. You will then weigh that amount against the likelihood this will happen and the extent of damage it will cause.
You might also ask yourself what the consequences would be if a critical process on that system failed, and what is involved in mitigating that risk. The answer might be monitoring the state of that process so that it is automatically restarted if it dies. Alternatively, you might consider running a program that duplicates the tasks of the process; for instance, you might want to run both LogSentry and Swatch to monitor your log files. Keep in mind that many applications were not designed to share the host with multiple instances of themselves. Moreover, running multiple instances of the application on the same system does not help when the host itself fails, taking down all the processes that were supposed to offer redundancy. When running a duplicate process on the same host is not appropriate or sufficient, look into duplicating the whole system to achieve the desired level of redundancy.
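The restart-if-it-dies idea can be sketched in a few lines of Python. This is an illustration of the concept, not a substitute for a production monitoring tool; the restart cap exists only so the example terminates.

```python
import subprocess
import sys
import time

def watchdog(cmd, max_restarts):
    """Restart cmd whenever it exits, up to max_restarts times."""
    restarts = 0
    proc = subprocess.Popen(cmd)
    while restarts < max_restarts:
        if proc.poll() is not None:          # the process has died
            restarts += 1
            proc = subprocess.Popen(cmd)     # bring it back up
        time.sleep(0.01)                     # avoid a busy loop
    proc.wait()                              # reap the last instance
    return restarts

# A command that exits immediately stands in for a crashing daemon.
count = watchdog([sys.executable, "-c", "pass"], max_restarts=3)
print(count)   # 3
```

A real deployment would log each restart, back off on repeated crashes, and alert an operator; the sketch shows only the core poll-and-respawn loop.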
Redundant components that are meant to fulfill the same business need are usually considered to be operating in a cluster. We might set up such clusters using hardware and software techniques for the most important systems in the environment, such as the database servers. Similarly, network and security devices can operate in a cluster, independently of how other components of your infrastructure are set up.
If a clustered component is actively performing its tasks, it is considered to be active; if the component is ready to take on the responsibility but is currently dormant, it is considered to be passive. Clusters in which all components are active at the same time provide a level of load balancing and offer performance improvements, albeit at a significant cost. For the purpose of achieving intrasite redundancy, it is often sufficient to deploy active-passive clusters where only a single component is active at one time.
Redundancy of the network is a significant aspect of intrasite redundancy. For example, if Internet connectivity is very important to your business, you may need to provision multiple lines linking the network to the Internet, possibly from different access providers. You may also decide to cluster border routers to help ensure that failure of one will not impact your connection to the Internet. Cisco routers usually accomplish this through the use of the Hot Standby Router Protocol (HSRP). With HSRP, multiple routers appear as a single "virtual" router.
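On Cisco IOS, an HSRP pairing for one of the two routers might be sketched as follows; the addresses, interface name, and group number are invented, and the exact commands vary by platform and IOS version.

```
! Hypothetical HSRP configuration for one router of a redundant pair.
! Hosts use the virtual address 192.0.2.1 as their default gateway;
! whichever router is active answers for it.
interface FastEthernet0/0
 ip address 192.0.2.2 255.255.255.0
 standby 1 ip 192.0.2.1
 standby 1 priority 110
 standby 1 preempt
```

The second router carries a matching configuration with its own real address and a lower priority; `preempt` lets the preferred router reclaim the active role after it recovers from a failure.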
Many commercial firewall and VPN products also provide clustering mechanisms that you can use to introduce high availability to that aspect of your perimeter, should your business justify the added cost. Check Point, for example, offers the ClusterXL add-on to provide automated failover and load-balancing services for its FireWall-1/VPN-1 products. A popular third-party solution for Check Point's products that achieves similar results is StoneBeat FullCluster. Nokia firewall/VPN appliances also offer their own failover and load-balancing solutions. Cisco PIX firewalls can also be clustered to ensure high availability, including active-active load-balancing configurations.
When researching firewall clustering solutions, keep in mind the difference between technologies that provide failover and load-balancing capabilities, and those that only support failover. Failover mechanisms ensure high availability of the configuration, but they do not necessarily focus on improving its performance through load balancing.
For stateful firewalls to function as part of a unified cluster, one of the following needs to take place: the cluster members must share state table information, or each network session must always pass through the same cluster member.
If neither of these requirements is met, a stateful communication session may be interrupted by a firewall device that doesn't recognize a packet as belonging to an active session. Modern firewall clustering methods typically operate by sharing the state table between cluster members. This setup allows stateful failover to occur: if one of the clustered devices fails, the other will be able to process existing sessions without an interruption. Let's take a closer look at why sharing state information is vital to a firewall cluster.
Consider a situation where two FireWall-1 devices, set up as an active-active ClusterXL cluster, are processing an FTP connection. The cluster member the communication traveled through initially preserves FTP transaction information in its state table. Because ClusterXL can share state table information across the cluster, even if the returning FTP data channel connection came through the other cluster member, it would be processed correctly: the device would compare the connection to the state table, recognize that it is part of an active session, and allow it to pass. This process functions similarly in an active-passive cluster failover. When the formerly passive firewall receives communication that is part of an active session, it can process this communication as if it were the cluster member that originally handled the session.
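The FTP scenario can be modeled with a toy shared state table consulted by both cluster members. The class, addresses, and session key below are invented for illustration; real products replicate state over a dedicated synchronization link rather than a shared in-memory set.

```python
# Minimal model of two clustered stateful firewalls sharing one state table.
shared_state = set()   # stands in for the replicated cluster state table

class FirewallMember:
    def __init__(self, name, state_table):
        self.name = name
        self.state = state_table

    def open_session(self, session_key):
        # Recording the session here makes it visible to the whole cluster.
        self.state.add(session_key)

    def permits(self, session_key):
        # Any member recognizes a session opened through any other member.
        return session_key in self.state

fw_a = FirewallMember("fw-a", shared_state)
fw_b = FirewallMember("fw-b", shared_state)

# The FTP control channel opens through fw-a...
fw_a.open_session(("192.0.2.10", "198.51.100.7", "ftp-data"))
# ...and the returning data-channel packet arrives at fw-b.
print(fw_b.permits(("192.0.2.10", "198.51.100.7", "ftp-data")))  # True
```

Without the shared table, fw-b would see the returning connection as unsolicited and drop it, which is exactly the failure mode state synchronization prevents.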
Another way to support stateful failover with redundant firewalls is to ensure that a given network session always goes through the same device in a pair of firewalls that are independent of each other, but act as if they are in a cluster. Before products that allow state table sharing were available, the only way to achieve redundancy with two active firewalls was to create a "firewall sandwich," which refers to a design where two independent firewall devices are sandwiched between two sets of load balancers. In this case, the load balancers are responsible for making sure that a single session always flows through the same firewall device.
Whether you're using a firewall sandwich or a firewall solution that allows the sharing of state information, firewall redundancy is an important way to ensure availability in highly critical network environments.
When planning for intrasite redundancy, also consider how the availability of your systems will be impacted if one of the internal switches goes down. Some organizations address this risk by purchasing a standby switch that they can use to manually replace the failed device. If the delay to perform this manually is too costly for your business, you may decide to invest in an automatic failover mechanism. High-end Cisco switches such as the Catalyst 6500 series can help to achieve redundancy in the switching fabric, although at a significant expense. They can be set up with redundant power supplies and supervisor modules, which provide the intelligence for the switch, in a single chassis.
For additional switch redundancy, consider adding an extra switch chassis to the design. In this scenario, the switches can be interconnected using trunks, and the hosts can be connected to both switches using network card teaming, a function of the network card software that allows several network cards to be virtually "linked" so that if one of the cards fails, the other can seamlessly take over communications. With each of the teamed network cards connected to one of the clustered switches, multiple failures need to occur to cause a connectivity outage. Clustering switches in this manner is a strong precautionary measure that may be appropriate for mission-critical infrastructure components.
Sometimes, achieving intrasite redundancy is not sufficient to mitigate risks of system unavailability. For example, a well-tuned DoS attack against the border routers or the ISP's network equipment might prevent legitimate traffic from reaching the site. Additionally, a disaster might affect the building where the systems reside, damaging or temporarily disabling the equipment. To mitigate such risks, consider creating a copy of your data center in a geographically distinct location.
Much like clusters, the secondary data center could be always active, sharing the load with the primary site. Alternatively, it could be passive, activated either manually or automatically when it is needed. In addition to considering the costs associated with setting up the secondary data center, also look at the administrative efforts involved in supporting and monitoring additional systems. You might also need to accommodate data sharing between the two sites so that users have access to the same information no matter which site services their requests.