Basic Firewall Troubleshooting | Firewall Fundamentals

Three predominant situations with firewalls require some form of troubleshooting:

Access to protected resources from unprotected networks is not functioning correctly.
Access to unprotected resources from protected networks is not functioning correctly.
Access to the firewall itself is not functioning correctly.

Understanding this, you can further narrow down the process to two things:

Traffic going through the firewall
Traffic going to the firewall

To assist in troubleshooting these situations, implement your firewall troubleshooting checklist as it applies to the scenario in question.

Troubleshooting Connectivity Through the Firewall

No matter how well planned, tested, and implemented a firewall is, sooner or later you will run into problems accessing resources through it. There are any number of reasons for this, but the most common involve problems with the firewall ruleset, the firewall translation tables, Network Address Translation (NAT), or how the application communicates over the network. A good approach to troubleshooting connectivity through the firewall is to use the flowchart in Figure 13-2, which is based on the general troubleshooting checklist but has been modified for this specific situation.

Figure 13-2. Troubleshooting Connectivity Through the Firewall

This section covers each step from the flowchart in turn as follows:

Step 1.	Verify the problem.
Step 2.	Test connectivity.
Step 3.	Verify that the remote application is running and accessible locally.
Step 4.	Check for recent changes.
Step 5.	Review the firewall ruleset.
Step 6.	Review the firewall translation configuration.
Step 7.	Check the firewall logs for errors.
Step 8.	Verify the firewall configuration.
Step 9.	Monitor network traffic.

Although many of these steps are the same as those previously discussed, it is important to consider the context of the problem, namely passing traffic through the firewall, as you apply each step in the checklist.

Step 1: Verify the Problem

The first step of troubleshooting is to always verify that the problem being reported is the problem that is occurring, and not merely a symptom of the problem. In troubleshooting traffic through the firewall, this is particularly important because in most cases the user or technician reporting the problem likely has a limited understanding of what role the firewall plays in the communication process with the host on the other side of the firewall. Many times, all they know is that the traffic goes through the firewall (and therefore the firewall must be the cause of the problem). In most cases, it never dawns on them that the application itself may be experiencing problems (for example, if the server is down or the application itself is misconfigured).

Step 2: Test Connectivity

Testing connectivity for traffic passing through the firewall is easier said than done, particularly when troubleshooting traffic destined for a protected host from an unprotected network. The reason for this is simple: To protect the host, the firewall is normally configured to permit only the minimum protocols and services necessary to allow access to the protected resource. In most cases, this means that traffic such as ICMP is going to be blocked by the firewall. Consequently, using tools and utilities such as ping and traceroute to verify connectivity can be difficult if not impossible.

In these situations, it is important to understand the nature of the application or resource you are troubleshooting and to think outside the box about how to test connectivity. For example, if you are having difficulties accessing a website that is protected by a firewall, a good way to verify connectivity is to attempt to telnet to TCP port 80 on the web server. Doing so allows you to verify the fundamental ability to reach the web server. If the connection succeeds, it is a good bet that the problem has to do with the application itself, not the firewall. Another option is to attempt to access the server using a different, but permitted, protocol. For example, if the server in question is both the web server and the FTP server, attempt to establish an FTP connection to it. If that succeeds, you at least know that the host is up and responding to network traffic. You still do not know whether the firewall or the server is the problem, but you can at least rule out basic networking problems as the cause.
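The telnet test described above can also be scripted. The following is a minimal sketch in Python, assuming only the standard library; the function name tcp_port_open and the default timeout are illustrative choices, not part of any firewall tooling.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection completes the TCP handshake, which is the same
        # thing a successful "telnet host port" demonstrates.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling tcp_port_open with the address of the protected web server and port 80 returns True when the connection is accepted. A False result does not by itself say whether the firewall or the server rejected the attempt.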

Step 3: Verify That the Remote Application Is Running and Accessible Locally

One of the most important steps of troubleshooting traffic through the firewall is to remove the firewall from the equation and determine whether you can successfully access the resource. I cannot stress enough how quickly people will blame the firewall when it is often the application itself that is having problems. Verifying local access is the easiest method of ruling the firewall in or out.

For example, if you have a web server in the DMZ that is not accessible from the Internet, attempt to access the website from a host on the same DMZ segment. If you cannot do that, a good chance exists that the problem has nothing to do with the firewall (because local traffic will not go through the firewall). Conversely, if you can access the resource from a local host, a good chance exists that the firewall is part of the problem in some way, which allows you to then start focusing your resources on the firewall itself.

Step 4: Check for Recent Changes

For the same reasons previously mentioned, it is a good idea to determine whether any recent changes have been made to the firewall. The same cautionary statement applies, however: If the changes that were made do not logically relate to the problem at hand, do not focus on them. Keep them in mind, but move on to other, more likely causes. Apply the principle of Occam's razor, which states that, all things being equal, the simplest explanation is usually the correct one.

For example, if access to a web server is not functioning properly and the last change that was made was to the VPN configuration of the firewall, those changes probably do not have anything to do with the problem at hand, and you should pursue other more plausible explanations. Conversely, if the last change was an update to the static translation statements, that represents a much more likely source of the problem and should be investigated accordingly.

Step 5: Review the Firewall Ruleset

If the firewall is properly configured, all traffic passing through it is processed by the firewall ruleset. Accordingly, the firewall ruleset is one of the most common causes of problems for traffic passing through the firewall. When reviewing the firewall ruleset, pay particular attention to the following elements:

Ensure that there are no typos in the rules.
Verify the protocol and port numbers referenced by the rule.
Verify the source and destination addresses referenced by the rule.
Verify the processing order of the rules to ensure that the rules are not being applied out of order.
Have someone else review the rules to ensure that the rule says what it is supposed to say, not what you think it says. If you have ever seen that e-mail where the entire e-mail is spelled incorrectly to demonstrate how your mind will read what it expects to read, that is what happens here. You know that access to the web server is supposed to occur over TCP port 80, which means that it is easy to look at a rule permitting traffic to UDP port 80 and not catch the problem with that because you expect (and read) TCP.
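To see how rule order and a protocol typo interact, the sketch below models a ruleset in Python. It assumes first-match processing with an implicit deny at the end; check how your firewall actually evaluates rules. The Rule class, the first_match function, and all addresses are hypothetical and do not reflect any vendor's syntax.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    action: str       # "permit" or "deny"
    protocol: str     # "tcp" or "udp"
    source: str       # CIDR block
    destination: str  # CIDR block
    port: int         # destination port


def first_match(rules, protocol, src, dst, port):
    """Return the first rule that matches the packet, or None (implicit deny)."""
    for rule in rules:
        if (rule.protocol == protocol
                and rule.port == port
                and ip_address(src) in ip_network(rule.source)
                and ip_address(dst) in ip_network(rule.destination)):
            return rule
    return None


# Hypothetical ruleset containing the UDP-for-TCP typo described above.
rules = [
    Rule("permit", "udp", "0.0.0.0/0", "192.0.2.10/32", 80),  # meant to be tcp
    Rule("deny", "tcp", "0.0.0.0/0", "192.0.2.0/24", 80),
]

match = first_match(rules, "tcp", "198.51.100.7", "192.0.2.10", 80)
print(match.action if match else "implicit deny")
```

Because the permit rule specifies UDP, the TCP request to the web server falls through to the broader deny rule. Testing the exact packet against the rules catches what a visual read of the ruleset can miss.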

Although most firewalls today offer stateful inspection, allowing return traffic to be automatically permitted by the firewall, some applications such as the X Window System return traffic on a port other than the one the original traffic was sent from. These applications can be particularly difficult to troubleshoot because the ruleset might appear to have what is necessary, when in fact you need an additional rule to explicitly permit the return traffic. In cases such as this, checking the error logs and monitoring the network traffic can quickly reveal the problem.

Step 6: Review the Firewall Translation Configuration

Because of the prevalence of NAT in most firewall implementations, reviewing the configuration of the NAT statements can be as critical as verifying the ruleset. After all, if the firewall does not know which systems it should be translating traffic to and from, it does not matter what the ruleset specifies; the traffic will not be able to reach its destination. Review the translation rules in the same way as the firewall ruleset, paying particular attention to the following:

Ensure that there are no typos in the translation statement.
Verify the protocol and port numbers referenced by the translation statement.
Verify the source and destination addresses referenced by the translation statement.

Another item to check when reviewing the translation rules, especially for outbound connections, is that the translation pool has an adequate number of addresses for the number of hosts attempting to establish outbound connections. If the translation pool is too small, some hosts will be unable to obtain a translated IP address and therefore cannot establish connections to external hosts.
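The pool-size check is simple arithmetic. The sketch below assumes one-to-one dynamic NAT, where each concurrent host consumes one pool address; it does not model port address translation. The function names and address range are hypothetical.

```python
from ipaddress import ip_address


def pool_size(first, last):
    """Number of addresses in an inclusive translation pool range."""
    return int(ip_address(last)) - int(ip_address(first)) + 1


def pool_shortfall(first, last, concurrent_hosts):
    """Hosts left without a translated address (0 if the pool is big enough)."""
    return max(0, concurrent_hosts - pool_size(first, last))


# Hypothetical pool of 10 addresses serving 25 concurrent inside hosts.
print(pool_size("203.0.113.10", "203.0.113.19"))
print(pool_shortfall("203.0.113.10", "203.0.113.19", 25))
```

A nonzero shortfall suggests enlarging the pool or reviewing how outbound translation is configured.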

Step 7: Check the Firewall Logs for Errors

As with generic firewall troubleshooting, the firewall logs can provide a wealth of information for you when troubleshooting connectivity through the firewall, allowing you to identify problems with the firewall ruleset, translation statements, firewall configuration, or hardware. Therefore, review the firewall logs for the following:

Look for state errors. State errors can be indicators of problems with the firewall translation tables (for example, if the Cisco Secure PIX Firewall has an incorrectly configured static translation value).
Look for denied traffic. Denied traffic is the classic indicator of an incorrectly configured ruleset. Although virtually all firewalls include an implicit deny statement at the end of the firewall ruleset, to assist in troubleshooting it can be helpful to include an explicit deny and log statement to ensure that the denied traffic is logged accordingly.
Look for configuration errors. Often, configuration errors are reported in the firewall logs as error events, allowing you to rapidly identify a configuration error without needing to review the configuration line by line.
Look for hardware errors. Event logs are one of the best sources for discovering hardware-related errors because most firewall vendors log hardware error events in the firewall logs.
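When the logs are large, a short script can summarize denied traffic. The sketch below assumes a made-up log line format, shown in the comment, that is not any vendor's real syntax; adapt the regular expression to what your firewall actually writes. The name count_denies and the sample lines are illustrative only.

```python
import re
from collections import Counter

# Hypothetical log format, NOT a real vendor format:
#   <timestamp> <ACTION> <proto> <src_ip>:<src_port> -> <dst_ip>:<dst_port>
LINE_RE = re.compile(
    r"^\S+ (?P<action>[A-Z]+) (?P<proto>\w+) "
    r"(?P<src>[\d.]+):\d+ -> (?P<dst>[\d.]+):(?P<dport>\d+)$"
)


def count_denies(lines):
    """Count denied flows keyed by (protocol, destination, destination port)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("action") == "DENY":
            key = (m.group("proto"), m.group("dst"), int(m.group("dport")))
            counts[key] += 1
    return counts


# Invented sample entries for illustration.
sample = [
    "2024-05-01T10:00:01 DENY tcp 198.51.100.7:51000 -> 192.0.2.10:80",
    "2024-05-01T10:00:02 PERMIT udp 198.51.100.7:51001 -> 192.0.2.10:53",
    "2024-05-01T10:00:03 DENY tcp 198.51.100.8:51002 -> 192.0.2.10:80",
]
for key, n in count_denies(sample).most_common():
    print(key, n)
```

Grouping by destination and port makes a misconfigured rule stand out, because the same service is denied repeatedly from different sources.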

Step 8: Verify the Firewall Configuration

It is always a good idea when troubleshooting traffic passing through the firewall to look at the firewall configuration and confirm that everything is configured correctly. For example, if the firewall is not configured to route traffic properly, that could prevent traffic passing through the firewall from reaching the intended destination.

Apply the same logic to verifying the configuration as was previously discussed, comparing the current configuration to a known good configuration and verifying that the firewall configuration is accurate with no typos or other errors.
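Comparing the current configuration to a known good one is easy to automate once both are saved as text. The sketch below uses Python's standard difflib module; the function name config_diff and the configuration lines are invented placeholders, not real firewall syntax.

```python
import difflib


def config_diff(known_good, current):
    """Return unified-diff lines between a known good and a current config."""
    return list(difflib.unified_diff(
        known_good.splitlines(),
        current.splitlines(),
        fromfile="known_good",
        tofile="current",
        lineterm="",
    ))


# Placeholder configuration text for illustration only.
known_good = "hostname fw1\ndefault-route via 203.0.113.1\n"
current = "hostname fw1\ndefault-route via 203.0.113.11\n"
for line in config_diff(known_good, current):
    print(line)
```

Lines prefixed with "-" exist only in the known good configuration and lines prefixed with "+" exist only in the current one, so a mistyped address shows up as a removed and added pair.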

Step 9: Monitor Network Traffic

If you still cannot determine the cause of the problem you are experiencing with the traffic passing through the firewall, the next logical step is to use a sniffer to monitor the network traffic to ensure that the traffic is behaving exactly as you expect. For example, you can use the sniffer to verify that the traffic is actually using the ports that your firewall ruleset is configured to permit.

Monitoring the network traffic can also provide evidence that the firewall is indeed passing traffic between the hosts, in which case any problems accessing the application on the host are likely to be application problems.
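Once a capture has been reduced to a list of protocol and port pairs, comparing it with what the ruleset permits is a set difference. The sketch below assumes you have already exported those pairs from your sniffer by some means; the data and the function name unpermitted_flows are hypothetical.

```python
def unpermitted_flows(observed, permitted):
    """Return observed (protocol, port) pairs that no rule permits, sorted."""
    return sorted(set(observed) - set(permitted))


# Hypothetical data: pairs seen in a capture and pairs the ruleset allows.
observed = [("tcp", 80), ("tcp", 6000), ("tcp", 80), ("udp", 53)]
permitted = [("tcp", 80), ("udp", 53)]
print(unpermitted_flows(observed, permitted))
```

Any pair in the result is traffic the application really uses but the ruleset does not allow, which is a common sign that an application communicates on a port you did not expect.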

Troubleshooting Connectivity to the Firewall

Troubleshooting connectivity to the firewall uses the same processes that have been detailed in this chapter; the difference is the destination of the traffic. One difference to be mindful of is that, unlike traffic passing through the firewall, which typically has a destination that is designed and intended to be accessible, the firewall itself is not always designed to be accessible. This is particularly true of the external interface of the firewall, which in most cases should not be configured to accept any traffic destined for the interface itself. Consequently, it can be difficult to determine whether the firewall is accessible using conventional means. By the same token, however, if you can access a resource on the other side of the firewall, that success shows that the firewall is online and operational. Beyond these minor differences, the troubleshooting process is no different from the process detailed previously in this chapter and in Figures 13-1 and 13-2.