Three predominant situations with firewalls require some form of troubleshooting:
Understanding this, you can further narrow down the process to two things:
To assist in troubleshooting these situations, apply your firewall troubleshooting checklist to the scenario in question.
Troubleshooting Connectivity Through the Firewall
No matter how well the firewall is planned, tested, and implemented, sooner or later you will run into problems accessing resources through it. There are any number of reasons for this, but the most common involve problems with the firewall ruleset, the firewall translation tables, Network Address Translation (NAT), or how the application communicates over the network. A good approach to troubleshooting connectivity through the firewall is to use the flowchart in Figure 13-2, which is based on the general troubleshooting checklist but has been modified for this specific situation.
Figure 13-2. Troubleshooting Connectivity Through the Firewall
This section covers each step from the flowchart in turn as follows:
Although many of these steps may seem the same as those previously discussed, it is important to consider the context of the problem, namely passing traffic through the firewall, as you apply each step in the checklist.
Step 1: Verify the Problem
The first step of troubleshooting is always to verify that the problem being reported is the problem that is occurring, and not merely a symptom of the problem. In troubleshooting traffic through the firewall, this is particularly important because in most cases the user or technician reporting the problem has a limited understanding of the role the firewall plays in communicating with the host on the other side of the firewall. Many times, all they know is that the traffic goes through the firewall (and therefore the firewall must be the cause of the problem). In most cases, it never dawns on them that the application itself may be experiencing problems (for example, the server may be down or the application may be misconfigured).
Step 2: Test Connectivity
Testing connectivity for traffic passing through the firewall is easier said than done, particularly when troubleshooting traffic destined for a protected host from an unprotected network. The reason for this is simple: to protect the host, the firewall is configured to permit only the minimum protocols and services required to access the protected resource. In most cases, this means that traffic such as ICMP is going to be blocked by the firewall. Consequently, using tools and utilities such as ping and traceroute to verify connectivity can be difficult, if not impossible. In these situations, it is important to understand the nature of the application or resource being troubleshot and to think outside the box about how to test connectivity.
For example, if you are having difficulties accessing a website that is protected by a firewall, a good way to verify connectivity is to attempt to telnet to TCP port 80 on the web server. Doing so allows you to verify the fundamental ability to access the web server. If the connection succeeds, it is a good bet that the problem has to do with the application itself, not the firewall. Another option is to attempt to access the server using a different, but permitted, protocol. For example, if the server in question is not only the web server but also the FTP server, attempt to establish an FTP connection to the server. If that succeeds, you at least know that the host is up and responding to network traffic. You still do not know whether the firewall or the server is the problem, but you can at least rule out basic networking problems as the cause.
Step 3: Verify That the Remote Application Is Running and Accessible Locally
One of the most important steps of troubleshooting traffic through the firewall is to remove the firewall from the equation and determine whether you can successfully access the resource. I cannot stress enough how quickly folks will blame the firewall when, often, the application itself is having problems. Verifying local access is the easiest method of ruling the firewall in or out. For example, if you have a web server in the DMZ that is not accessible from the Internet, attempt to access the website from a host on the same DMZ segment. If you cannot do that, a good chance exists that the problem has nothing to do with the firewall (because local traffic does not go through the firewall). Conversely, if you can access the resource from a local host, a good chance exists that the firewall is part of the problem in some way, which allows you to start focusing your resources on the firewall itself.
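The TCP checks described in Steps 2 and 3 can be sketched in Python as follows. This is only a minimal illustration, not a replacement for telnet or a proper test tool: the function names `tcp_port_open` and `interpret_probes` are hypothetical, and the interpretation strings simply restate the reasoning from the text. The probe completes a TCP handshake and nothing more, so a successful result says the port is reachable, not that the application behind it is healthy.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    This is roughly what "telnet host 80" verifies: that the three-way
    handshake completes. Refusals, timeouts, and name-resolution failures
    all count as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def interpret_probes(reachable_through_firewall: bool, reachable_locally: bool) -> str:
    """Map the two probe results onto the reasoning in Steps 2 and 3."""
    if reachable_through_firewall:
        # The port answers from the far side of the firewall.
        return "suspect the application"
    if reachable_locally:
        # Works from the same segment, fails through the firewall.
        return "suspect the firewall"
    # Fails even without the firewall in the path.
    return "suspect the server"
```

To use it, run `tcp_port_open` once from a host outside the firewall and once from a host on the same segment as the server (for example, against port 80 of the web server), and pass the two results to `interpret_probes`. The outcome is a hint about where to look next, not a diagnosis.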
Step 4: Check for Recent Changes
For the same reasons previously mentioned, it is a good idea to determine whether any recent changes have been made to the firewall. The same cautionary statement applies, however: if the changes that were made do not logically relate to the problem at hand, do not focus on those changes. Keep them in mind, but move on to other more likely causes. Apply the principle of Occam's razor, which states that all things being equal, the simplest explanation is usually the correct one. For example, if access to a web server is not functioning properly and the last change made was to the VPN configuration of the firewall, that change probably has nothing to do with the problem at hand, and you should pursue other more plausible explanations. Conversely, if the last change was an update to the static translation statements, that represents a much more likely source of the problem and should be investigated accordingly.
Step 5: Review the Firewall Ruleset
If the firewall is properly configured, all traffic passing through it should be processed by the firewall ruleset. Accordingly, the firewall ruleset is one of the most common causes of problems for traffic passing through the firewall. When reviewing the firewall ruleset, pay particular attention to the following elements:
Although most firewalls today offer stateful inspection, allowing return traffic to be automatically permitted by the firewall, some applications such as X Windows send return traffic on a port other than the one the original traffic used. These applications can be particularly difficult to troubleshoot because the ruleset might appear to have everything that is necessary, only for you to find that an additional rule is needed to explicitly permit the return traffic. In cases such as this, checking the error logs and monitoring the network traffic can quickly expose the problem.
Step 6: Review the Firewall Translation Configuration
Because of the prevalence of NAT in most firewall implementations, reviewing the configuration of the NAT translation statements can be as critical as verifying the ruleset. After all, if the firewall does not know which systems it should be translating traffic to and from, it does not matter what the ruleset specifies: the traffic will not be able to reach its destination. Review the translation rules in the same way as the firewall ruleset, paying particular attention to the following:
Another thing to check when reviewing the translation rules, especially for outbound connections, is that the translation pool has an adequate number of addresses for the number of hosts attempting to establish outbound connections. If the translation pool is too small, hosts will be unable to obtain an IP address that they can use to establish connections to external hosts.
Step 7: Check the Firewall Logs for Errors
As with generic firewall troubleshooting, the firewall logs can provide a wealth of information when you are troubleshooting connectivity through the firewall, allowing you to identify problems with the firewall ruleset, translation statements, firewall configuration, or hardware. Therefore, review the firewall logs for the following:
Step 8: Verify the Firewall Configuration
When troubleshooting traffic passing through the firewall, it is always a good idea to look at the firewall configuration and confirm that everything is configured as intended. For example, if the firewall is not configured to route traffic properly, that could prevent traffic passing through the firewall from reaching the intended destination. Apply the same logic to verifying the configuration as was previously discussed: compare the current configuration to a known good configuration, and verify that the firewall configuration is accurate, with no typos or other errors.
Step 9: Monitor Network Traffic
If you still cannot determine the cause of the problem with the traffic passing through the firewall, the next logical step is to use a sniffer to monitor the network traffic and confirm that it is behaving exactly as you expect. For example, you can use the sniffer to verify that the traffic is actually using the ports that your firewall ruleset is configured to permit. Monitoring the network traffic can also provide evidence that the firewall is indeed passing traffic between the hosts, in which case any problems accessing the application on the host are likely to be application problems.
Troubleshooting Connectivity to the Firewall
Troubleshooting connectivity to the firewall uses the same processes that have been detailed in this chapter; the difference is the destination of the traffic. One thing to be mindful of is that, unlike traffic passing through the firewall, which typically has a destination that is designed and intended to be accessible, the firewall itself is not always designed to be accessible.
This is particularly true of the external interface of the firewall, which in most cases should not be configured to accept any traffic destined for the interface itself. Consequently, it can be difficult to determine whether the firewall is accessible using conventional means. By the same token, however, if you can access a resource on the other side of the firewall, that success shows that the firewall is online and operational. Beyond these minor differences, the troubleshooting process is no different from the process detailed previously in this chapter and in Figures 13-1 and 13-2.
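As a rough illustration of the ruleset review in Step 5, the following Python sketch evaluates a flow against an ordered list of rules. Everything in it is an assumption made for the example: the `Rule` model, the `first_match` and `decide` helpers, first-match evaluation with an implicit deny at the end, and the addresses (taken from documentation ranges). Real firewalls differ in rule syntax and evaluation order, so treat this as a way to reason about rule ordering, not as a model of any particular product.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    action: str          # "permit" or "deny"
    protocol: str        # "tcp", "udp", or "any"
    source: str          # source network in CIDR notation
    destination: str     # destination network in CIDR notation
    port: Optional[int]  # destination port; None matches any port


def first_match(rules: List[Rule], protocol: str, src: str, dst: str,
                port: int) -> Optional[Tuple[int, Rule]]:
    """Return (index, rule) for the first rule matching the flow, else None."""
    for index, rule in enumerate(rules):
        if rule.protocol not in ("any", protocol):
            continue
        if ip_address(src) not in ip_network(rule.source):
            continue
        if ip_address(dst) not in ip_network(rule.destination):
            continue
        if rule.port is not None and rule.port != port:
            continue
        return index, rule
    return None


def decide(rules: List[Rule], protocol: str, src: str, dst: str, port: int) -> str:
    """Return the action for the flow; assumes an implicit deny when nothing matches."""
    match = first_match(rules, protocol, src, dst, port)
    return match[1].action if match else "deny"


# Hypothetical ruleset: permit web traffic to one server, deny everything else.
RULES = [
    Rule("permit", "tcp", "0.0.0.0/0", "192.0.2.10/32", 80),
    Rule("deny", "any", "0.0.0.0/0", "0.0.0.0/0", None),
]

# The same rules in the wrong order: the broad deny shadows the permit.
SHADOWED = list(reversed(RULES))
```

With these definitions, `decide(RULES, "tcp", "198.51.100.7", "192.0.2.10", 80)` evaluates to `"permit"`, while the same flow against `SHADOWED` evaluates to `"deny"` because the broad deny rule is reached first. Under first-match semantics, a ruleset can contain the right permit rule and still block the traffic.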