Using a Sniffer to Diagnose Firewall Problems


Many problems can be isolated by running a packet sniffer on your firewall. Our favorite is tethereal, part of the ethereal package, because it presents packets in a more readable form than tcpdump, which is another good option. tethereal is also handy for command-line diagnosis because it works without the fuss of a GUI and without the "voodoo" of a lower-level sniffer such as tcpdump. tethereal is slower than tcpdump, however. Here is an example of a connection from behind our firewall to one of our hosts on the Internet:


tethereal -i eth1 host liberty.gmsociety.org and port 22
Capturing on eth1
0.000000 68.100.73.75 -> 216.218.240.134 TCP 47104 > ssh [SYN, ECN, CWR] Seq=0 Ack=0 Win=32440 Len=0 MSS=16220 TSV=31410938 TSER=0 WS=0
0.128939 216.218.240.134 -> 68.100.73.75 TCP ssh > 47104 [SYN, ACK, ECN] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=523459383 TSER=31410938 WS=0
0.135917 68.100.73.75 -> 216.218.240.134 TCP 47104 > ssh [ACK] Seq=1 Ack=1 Win=32440 Len=0 TSV=31411052 TSER=523459383
0.243449 216.218.240.134 -> 68.100.73.75 SSH Server Protocol: SSH-2.0-OpenSSH_3.8p1
0.246039 68.100.73.75 -> 216.218.240.134 TCP 47104 > ssh [ACK] Seq=1 Ack=23 Win=32440 Len=0 TSV=31411161 TSER=523459509
0.246187 68.100.73.75 -> 216.218.240.134 SSH Client Protocol: SSH-2.0-NOYB
0.359056 216.218.240.134 -> 68.100.73.75 TCP ssh > 47104 [ACK] Seq=23 Ack=14 Win=5792 Len=0 TSV=523459627 TSER=31411162
0.361615 68.100.73.75 -> 216.218.240.134 SSHv2 Client: Key Exchange Init
0.363396 216.218.240.134 -> 68.100.73.75 SSHv2 Server: Key Exchange Init

This example shows that the sniffer caught the entire session and put it into a format that is easier to read for someone not familiar with raw packets. The three-way handshake is visible in the first three packets of the trace, and tethereal was even kind enough to decode the SSH protocol, so you can see that the SSH connection is working correctly, what protocol version it is using, and even what step in the SSH process is occurring. The observant reader might have also noticed the ECN flag, which is the explicit congestion notification flag. We leave ECN enabled on that system because, for that host, we do not have to worry about connectivity problems with systems that do not understand ECN. The following example shows what such a problem looks like:


0.000000 68.100.73.75 -> 62.172.198.77 TCP 39114 > http [SYN, ECN, CWR] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=19362375 TSER=0 WS=0
0.086282 62.172.198.77 -> 68.100.73.75 TCP http > 39114 [RST, ACK] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=19362375 TSER=0 WS=0
0.087204 68.100.73.75 -> 62.172.198.77 TCP 34000 > http [SYN, ECN, CWR] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=19362384 TSER=0 WS=0
0.175656 62.172.198.77 -> 68.100.73.75 TCP http > 34000 [RST, ACK] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=19362384 TSER=0 WS=0
0.176618 68.100.73.75 -> 62.172.198.77 TCP 40151 > http [SYN, ECN, CWR] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=19362393 TSER=0 WS=0
0.263994 62.172.198.77 -> 68.100.73.75 TCP http > 40151 [RST, ACK] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=19362393 TSER=0 WS=0

This connection starts off innocently enough, with a standard SYN request, but with the Explicit Congestion Notification (ECN) Echo flag (ECE) and the Congestion Window Reduced flag (CWR) set. There is some debate about what an IP stack that does not understand ECN should do: ignore the flags and carry on or, for the paranoid, drop the packet? Unfortunately for ECN, some vendors chose to do the latter. In our example, you can see that each connection attempt was immediately reset (RST) by the destination. Without a sniffer, it would have been difficult to see what was going on here.

What happens when we turn ECN off? The connection goes through without a hitch.


0.000000 68.100.73.75 -> 62.172.198.77 TCP 60276 > http [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=19370810 TSER=0 WS=0
0.094593 62.172.198.77 -> 68.100.73.75 TCP http > 60276 [SYN, ACK] Seq=0 Ack=1 Win=24616 Len=0 TSV=1381457590 TSER=19370810 WS=0 MSS=1460
0.095037 68.100.73.75 -> 62.172.198.77 TCP 60276 > http [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=19370819 TSER=1381457590
0.096459 68.100.73.75 -> 62.172.198.77 HTTP GET / HTTP/1.0
0.194562 62.172.198.77 -> 68.100.73.75 TCP http > 60276 [ACK] Seq=1 Ack=395 Win=24616 Len=0 TSV=1381457600 TSER=19370819
0.198269 62.172.198.77 -> 68.100.73.75 HTTP HTTP/1.1 200 OK (text/html)
0.198597 68.100.73.75 -> 62.172.198.77 TCP 60276 > http [ACK] Seq=395 Ack=384 Win=6432 Len=0 TSV=19370830 TSER=1381457600
0.452280 68.100.73.75 -> 62.172.198.77 HTTP GET /index.jsp HTTP/1.0
0.633635 62.172.198.77 -> 68.100.73.75 TCP http > 60276 [ACK] Seq=384 Ack=798 Win=24616 Len=0 TSV=1381457644 TSER=19370855
1.079874 62.172.198.77 -> 68.100.73.75 HTTP HTTP/1.1 200 OK (text/html)
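The text does not show how ECN was turned off for this trace. On a Linux host the setting is normally exposed as the net.ipv4.tcp_ecn sysctl, so a minimal sketch for inspecting it might look like the following. The helper name describe_ecn is ours, and the meaning given for the value 2 applies to newer kernels only; treat both as assumptions rather than as the authors' procedure.

```shell
#!/bin/sh
# Sketch: report how the local TCP stack is configured for ECN.
# Assumption: a Linux host that exposes /proc/sys/net/ipv4/tcp_ecn.

ECN_FILE=/proc/sys/net/ipv4/tcp_ecn

# describe_ecn is a hypothetical helper that maps the sysctl value to text.
describe_ecn() {
    case "$1" in
        0) echo "off" ;;
        1) echo "on: requested on outgoing connections" ;;
        2) echo "passive: accepted when a peer requests it" ;;
        *) echo "unknown" ;;
    esac
}

# Only report when the file is present and readable.
if [ -r "$ECN_FILE" ]; then
    echo "tcp_ecn is $(describe_ecn "$(cat "$ECN_FILE")")"
fi

# To turn ECN off until the next reboot, run as root:
#   sysctl -w net.ipv4.tcp_ecn=0
```

The change itself is left as a comment so that the script can be run safely on any machine before you decide to alter the setting.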

The moral of the story is to look closely at a connection that is not working and check for odd flags in the packets, such as ECN, before moving up the OSI model. Always rule out lower-level problems before hypothesizing about the root cause of your problem.
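When a capture is long, spotting this pattern by eye gets tedious. As a rough sketch, the filter below scans saved tethereal output for a SYN carrying ECN flags that is answered by a reset from the same peer and port. The function name find_ecn_resets is hypothetical, and the awk field positions assume the one-line-per-packet layout shown in the traces above.

```shell
#!/bin/sh
# Sketch: flag ECN-negotiating SYNs that the remote end answers with a RST.
# Assumed layout: time src -> dst PROTO sport > dport [flags] ...

find_ecn_resets() {
    awk '
        # A SYN that requests ECN (a SYN, ACK reply does not match here):
        # remember it by source, source port, destination, destination port.
        /\[SYN, ECN/ { pending[$2 " " $6 " " $4 " " $8] = 1; next }
        # A reset: see whether it answers a remembered SYN in reverse.
        /\[RST/ {
            key = $4 " " $8 " " $2 " " $6
            if (key in pending) {
                print "ECN SYN from " $4 ":" $8 " reset by " $2 ":" $6
                delete pending[key]
            }
        }
    '
}

# Demo using the first two packets of the failing HTTP trace.
find_ecn_resets <<EOF
0.000000 68.100.73.75 -> 62.172.198.77 TCP 39114 > http [SYN, ECN, CWR] Seq=0 Ack=0 Win=5840 Len=0
0.086282 62.172.198.77 -> 68.100.73.75 TCP http > 39114 [RST, ACK] Seq=0 Ack=0 Win=5840 Len=0
EOF
# prints: ECN SYN from 68.100.73.75:39114 reset by 62.172.198.77:http
```

A reset after an ECN SYN is only a hint, not proof: a closed port answers the same way, so compare against a capture taken with ECN turned off before drawing a conclusion.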

The example above was not meant to single out ECN as a problematic extension to IP; we use ECN on all of our servers. What we're showing is how a simple change to the session can cause what appears to be a higher-level problem, why you need to truly isolate the root cause of a problem, and which tools help you do so.

We like to use ECN on our internal and Internet-reachable machines, but not as often on our firewalls, because there are still too many sites out there that do not support ECN correctly. The good news is that you can run ECN to your heart's content behind your firewall, turn it off on your firewall, and filter out ECN packets on your outbound interfaces without causing any problems, while gaining the advantages of better congestion control. If you use squid as an HTTP proxy on your firewall, either transparently or not, the packets it sends will also lack the ECN flag.
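The text does not give the rule used to filter ECN on the way out. One way this is commonly done with iptables is the ECN target in the mangle table, which clears the ECN bits from the TCP header rather than dropping the packet. The sketch below only prints such a rule for review. The helper name ecn_strip_rule and the interface name eth1 are assumptions, and the ECN target must be available in your kernel and iptables build.

```shell
#!/bin/sh
# Sketch: build (but do not apply) a mangle-table rule that clears the ECN
# bits from the TCP header of packets leaving one interface.
# Assumption: the iptables ECN target is available on this system.

# ecn_strip_rule is a hypothetical helper; $1 is the outbound interface.
ecn_strip_rule() {
    iface="$1"
    echo "iptables -t mangle -A POSTROUTING -o $iface -p tcp -j ECN --ecn-tcp-remove"
}

# eth1 is a placeholder for your Internet-facing interface. Review the
# printed rule, then run it as root to apply it.
ecn_strip_rule eth1
```

Printing the rule instead of running it keeps the example free of side effects. After applying the rule, a sniffer on the outbound interface should show plain [SYN] packets even for internal hosts that request ECN.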

Let's take a look at another example. We have a firewall with two outbound Internet interfaces connected to two different ISPs. We want SMTP traffic to go over one of the interfaces, while the rest of the traffic goes over the other interface. To accomplish this we use the ROUTE patch-o-matic netfilter/iptables patch to give us the ability to route arbitrarily by source, destination, source or destination port, MAC address, and so on. The following is what the pertinent rules look like:

 iptables -A POSTROUTING -t mangle -s 192.168.10.0/24 \
     -p tcp --dport 25 -j ROUTE --gw 1.2.3.4 --oif eth2 \
     --continue

--gw is the upstream gateway for the second ISP connection on this firewall. When we try to connect out to port 25 on one of our servers, the connection does not go through. When we use a sniffer to look at the connection, we can see that the packets are not being NATed: they leave eth2 with their private 192.168.10.12 source address.


tethereal -i eth2 host plesk.shinn.net
Capturing on eth2
0.000000 192.168.10.12 -> 205.241.45.98 TCP 39945 > smtp [SYN, ECN, CWR] Seq=0 Ack=0 Win=32440 Len=0 MSS=16220 TSV=32726418 TSER=0 WS=0

What we need to add to our rules is a specific NAT rule for this route:

 iptables -t nat -A POSTROUTING -o eth1 -p tcp \
     -s 192.168.10.0/24 --dport 25 -j SNAT --to-source 1.2.3.5

This then tells the firewall to rewrite the packet so that its source address is 1.2.3.5, which is an address our remote server can locate on the Internet. Again, the intent is to move on to the sniffer after you have moved up the OSI model and have ruled out lower-level problems.
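To confirm a fix like this, you can capture on the outbound interface again and check that no packet still carries a private source address. The filter below is a rough sketch of that check for saved tethereal output. The function name find_unnatted is hypothetical, and it assumes the same one-line-per-packet layout as the traces in this section.

```shell
#!/bin/sh
# Sketch: list captured packets whose source is still an RFC 1918 address,
# which on an Internet-facing interface suggests that NAT was not applied.
# Assumed layout: time src -> dst PROTO ...

find_unnatted() {
    awk '
        $3 == "->" && $2 ~ /^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)/ {
            print $2 " -> " $4 " still has a private source address"
        }
    '
}

# Demo: the un-NATed SMTP packet from the trace above, followed by an
# illustrative packet that already carries the SNAT address 1.2.3.5.
find_unnatted <<EOF
0.000000 192.168.10.12 -> 205.241.45.98 TCP 39945 > smtp [SYN, ECN, CWR] Seq=0 Ack=0 Win=32440 Len=0
0.100000 1.2.3.5 -> 205.241.45.98 TCP 39946 > smtp [SYN] Seq=0 Ack=0 Win=32440 Len=0
EOF
# prints: 192.168.10.12 -> 205.241.45.98 still has a private source address
```

An empty result after the SNAT rule is in place indicates that outgoing packets on that interface are being rewritten as intended.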



    Troubleshooting Linux Firewalls
    ISBN: 321227239
    EAN: N/A
    Year: 2004
    Pages: 169