How netfilter Works


With this bit out of the way, we can move on to a discussion of how netfilter works. Starting with version 2.4 of the Linux kernel, the "firewalling" code was completely rewritten. Because of this, 2.4 and later kernels have the ability to do stateful packet management. This means no more strange rules to prevent SYN packets from going the wrong way; the kernel maintains state on outbound (or inbound, for that matter) connections, opens external ports for you as needed, and only allows back the packets that belong to that specific connection. From this comes the term "connection tracking," which you will hear about frequently in later chapters.

It also means the userland management tools completely changed, from the old tool, ipchains, to the new standard tool, iptables.

First, let's briefly discuss the differences between ipchains and iptables. ipchains is the userland tool used under 2.0 and 2.2 Linux kernels to manage firewall rules. It is not stateful but still allows one to set up some fairly strong firewall rules. It's important to note that without the ability to manage state, the 2.0 and 2.2 firewalling code is limited in terms of the traffic it can adequately and safely control. It is for this reason that we strongly advise that you use the newer netfilter firewalling capabilities in the 2.4, 2.5, and 2.6 kernels. netfilter is such a fantastic improvement that we can't recommend it enough.

Because the internal firewalling code between 2.0 and 2.2 kernels is different from 2.4 and 2.6 kernels, the use of ipchains on 2.4 and 2.6 kernels is largely an illusion. When ipchains is used with 2.4 and 2.6 kernels, it is through a compatibility layer that basically attempts to translate ipchains userland rules into the netfilter format. This doesn't always work correctly, which is yet another reason to not use ipchains. Even though the internals are different, once you get a feel for the new features in iptables, it is not difficult to convert ipchains rules to iptables. Also given some of the powerful new features, there is all the more reason to make the switch.

How netfilter Parses Rules

The value of understanding how netfilter parses rules can't be overstated. Rules are parsed in a different order depending on which chain they are in. For instance, the NAT rules (DNAT in PREROUTING, SNAT in POSTROUTING) and the FORWARD rules are part of different paths in the kernel. Traffic destined for the firewall itself and traffic routed through the firewall take two independent paths through the netfilter decision tree.

This last point is really the most critical one with iptables. INPUT rules and FORWARD rules are on two totally different paths, and they make up the key split in what rules will be triggered as a packet moves through your firewall. Let's take a look at some examples.

Our first case is what happens when a packet is sent from some remote host to some service running on our firewall. This is different from what happens when a packet is sent through the firewall and then sent on to some different system for processing.

Packet Sent to Service Running on Firewall from Remote Host (INPUT)

Steps:

1. Packet is sent from remote client.

2. Packet is received by physical interface(s).

3. Packet is processed by the kernel driver.

4. Packet is processed by the networking protocol layer(s).

5. Packet is processed by netfilter in this order:

6. PREROUTING mangle (TOS, etc.)

7. PREROUTING NAT

8. Routing decision: Is the packet for the local box, or is it to be forwarded to another system by us?

9. INPUT mangle

10. INPUT filter

11. Application on our system processes the packet.

Figure 6.1. Packet sent to local process on firewall.
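The INPUT path above can be illustrated with a single rule. This is a sketch, not a complete policy: the interface name (eth0) and the service (SSH on tcp/22) are assumptions for illustration, and IPTABLES is set to echo so the commands print rather than execute (drop the echo and run as root to load them for real).

```shell
# Dry-run sketch: IPTABLES is set to "echo iptables" so each rule is
# printed rather than applied.
IPTABLES="echo iptables"

# Hypothetical example: allow SSH (tcp/22, an assumed service) to the
# firewall itself. Because the routing decision sends locally destined
# packets down the INPUT path, this belongs on INPUT, not FORWARD.
$IPTABLES -A INPUT -i eth0 -p tcp --dport 22 -j ACCEPT
```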


Packet Sent by Firewall from a Local Process to a Remote System (OUTPUT)

The next case covers what happens when a local process on the firewall tries to send a packet out. Examples might be a DNS server or a proxy server running on the firewall. When a packet is generated by a local process, it follows a different, roughly inverted path from the previous example.

Steps:

1. Local program/process generates packet.

2. The kernel makes a routing decision: where does this packet go?

3. The netfilter OUTPUT mangle chain is triggered.

4. OUTPUT NAT

5. OUTPUT filter

6. POSTROUTING mangle

7. POSTROUTING NAT

8. Goes out firewall interface.

9. On the wire (might not get to destination).
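A pair of rules for the name-server example can sketch this path. The rules are illustrative assumptions, not a complete ruleset, and IPTABLES is set to echo so they print rather than execute.

```shell
# Dry-run sketch: "echo iptables" prints the rules instead of applying
# them; remove the echo (and run as root) to use them for real.
IPTABLES="echo iptables"

# Hypothetical example: a name server on the firewall querying remote
# DNS servers. The outbound query is locally generated, so it travels
# the OUTPUT path...
$IPTABLES -A OUTPUT -p udp --dport 53 -j ACCEPT
# ...and the reply, addressed to the firewall itself, arrives on the
# INPUT path as part of the tracked session.
$IPTABLES -A INPUT -p udp --sport 53 -m state --state ESTABLISHED -j ACCEPT
```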

Packet Our Firewall Is Forwarding for Some Other Host to Some Host (FORWARD)

Finally, packets that are only being forwarded by the firewall are handled through yet another branch in the packet processing tree. In this case, the firewall does not trip the INPUT or OUTPUT rules and only trips the FORWARD rules.

Steps:

1. Packet is sent by source to firewall.

2. Packet is received by the physical interface(s) on firewall.

3. Packet is processed by the kernel driver on firewall.

4. Packet is processed by the networking protocol layer(s) on firewall.

5. Packet is processed by netfilter on firewall.

6. Any PREROUTING mangle rules are applied. This might change the TTL or TOS of the packet.

7. Any PREROUTING NAT rules are applied. This might change the source and/or destination of the packet.

8. Kernel makes routing decision: where does this packet go? In this example, it's destined not for the firewall, but for some other system the firewall forwards packets for.

9. Any FORWARD mangle rules are applied.

10. Any FORWARD filtering rules are applied.

11. Any POSTROUTING mangle rules are applied; this might change TOS or TTL values in the packet.

12. Final netfilter step: any POSTROUTING NAT rules are applied.

13. Sent out of our interface(s) to destination.

14. On wire to destination (might not arrive).

Figure 6.2. Packet from local process on firewall out to remote host.


Figure 6.3. Packet forwarding procedure.


Putting It All Together

Figure 6.4 shows what all of this looks like when put together. The key point in the kernel's decision-making process is the routing decision that happens before either the INPUT or FORWARD rules are consulted. Remember, INPUT rules do not affect packets that the firewall is forwarding to another host. If you deny a specific type of traffic only in the INPUT rules, that traffic will continue to pass through your firewall to other hosts. When in doubt, do it twice: once for the INPUT rules and once for the FORWARD rules.
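The "do it twice" advice can be sketched in two lines. The choice of telnet (tcp/23) is an arbitrary assumption for illustration, and IPTABLES is set to echo so the rules print rather than execute.

```shell
# Dry-run sketch: "echo iptables" prints rather than applies the rules.
IPTABLES="echo iptables"

# Hypothetical example: blocking telnet (tcp/23). Denying it only on
# INPUT would still let it pass *through* the firewall to other hosts,
# so the same match is applied on both chains.
$IPTABLES -A INPUT   -p tcp --dport 23 -j DROP
$IPTABLES -A FORWARD -p tcp --dport 23 -j DROP
```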

Figure 6.4. netfilter packet path.


So to repeat: if a packet is being forwarded by our firewall, it bypasses the INPUT and OUTPUT rules. Remember this when you are constructing your rules and diagnosing problems with them. It also helps, when adding logging rules, to record whether the rule tripped was an INPUT, OUTPUT, or FORWARD rule. This is also where the PREROUTING chains come into play. For instance, a packet might arrive at the firewall's external interface but actually be meant for a host behind the firewall, and the PREROUTING rules change the packet's destination. NAT is one such case. Here the packet is rewritten and sent down the FORWARD chains, because the kernel makes a routing decision (this packet is not for us; it's for another system) and then tries to route the packet. The netfilter code treats it as a forwarding event and processes only FORWARD rules against it.
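A classic example of this is port forwarding. The addresses below are assumptions for illustration (1.2.3.4 as the firewall's external IP, 192.168.1.10 as an internal web server), and IPTABLES is set to echo so the rules print rather than execute.

```shell
# Dry-run sketch: "echo iptables" prints rather than applies the rules.
IPTABLES="echo iptables"

# Hypothetical addresses: 1.2.3.4 is the firewall's external IP and
# 192.168.1.10 an internal web server. PREROUTING rewrites the
# destination before the routing decision, so the kernel then routes
# the packet to the inside host, and only FORWARD rules see it.
$IPTABLES -t nat -A PREROUTING -d 1.2.3.4 -p tcp --dport 80 \
    -j DNAT --to-destination 192.168.1.10
$IPTABLES -A FORWARD -d 192.168.1.10 -p tcp --dport 80 -j ACCEPT
```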

Another place you will see the opposite behavior is with application proxies. For instance, if you are using squid on your firewall, the FORWARD rules will not be called, even though your firewall is, in effect, forwarding traffic. That's because it's the application, in this case squid, that is doing the actual forwarding. The kernel has no way of knowing this, so only the INPUT and OUTPUT rules are tripped: squid is just a local process on the firewall.

netfilter Maintains State on Packets

netfilter, unlike previous firewalling systems in Linux, has the ability to track connections through their life cycles. It can automatically open ports on the firewall (if configured correctly) as needed to allow back dynamic protocols, such as FTP, without complex and potentially dangerous rule sets that keep a "range" of ports open for such needs. It can track a session to ensure that an attacker cannot trick the firewall into passing traffic to a system. The connection tracking system can help to identify bogus or unclean packets that might be the result of an attack, and in general, the new statefulness of the Linux firewalling system can make your life much easier.

As the netfilter designers have been so kind to point out, you can refer to the elements of the Linux kernel that give us the ability to do this as either a connection tracking engine or a state engine. We, like the netfilter developers, will use both terms interchangeably.

The bottom line is that stateful firewalls are safer than non-stateful firewalls because we can write simpler and generally stricter rules. We don't have to open as much to get things to work. The firewall understands the lower-level protocols and, with the assistance of what are known as "helpers," can understand higher-level protocols, such as FTP, IRC, and others. This allows the firewall to make decisions on the fly about its configuration based on its needs at the time. It also gives you greater control over the traffic moving through your firewall: instead of just trying to control things based on ports, you can now control them based on the protocol. For instance, if you wanted to block certain P2P traffic through your firewall, you could do so using a helper module.

To accomplish all of these things, we need to understand the four states that the netfilter state engine recognizes. These are the states a packet is in based on the kernel's connection tracking capabilities. The four states are: NEW, INVALID, RELATED, and ESTABLISHED. At any given time a packet, when under the control of the state engine, is in one of these four states.

Netfilter States

  • NEW: A new connection. Only the first packet in a connection will match this state. All subsequent packets for that session will not be considered NEW.

  • ESTABLISHED: A session that has been established and is being tracked by the state engine. All packets in a session that follow the packet labeled NEW fall into this state.

  • RELATED: A special state: a separate connection related to an ESTABLISHED session. This can happen when an ESTABLISHED connection spawns a new connection as part of its data communication process. An example of this is the FTP-DATA connection spawned by an FTP control connection. These types of connections are complicated and almost always require a "helper" module written to understand the underlying protocol. If there is no helper module to analyze a complex protocol that requires a RELATED connection to function, then that connection will not work properly.

  • INVALID: A packet that cannot be identified as belonging to any of the other states. It is always a good idea to DROP, not REJECT, such packets.

A connection's state can change based on what happens with subsequent packets in that connection. An example of this would be a new telnet connection. When the first packet comes in, it is in the NEW state; this creates a new connection for the state engine to manage. The packets that follow and complete the session move the connection from the NEW state to the ESTABLISHED state.

The specific piece of the netfilter code that handles all of this magic is called conntrack. Its job is to watch each session and determine if the packets are part of an existing session or a related one and to enforce the rules supplied by the userland tools around that information.

All of this magic occurs in the PREROUTING chain. All connection tracking happens there, except for packets generated by the firewall itself. For instance, if you are running a name server on your firewall and the firewall needs to look up a host name, that packet's state is handled in the OUTPUT chain.

For example, when a packet passes through the firewall, its initial state is set to NEW in the PREROUTING chain; when the return packets arrive back at the firewall, the PREROUTING chain changes the connection's state from NEW to ESTABLISHED. This state is then passed along to the other chains. So if you had a rule on your FORWARD chain that allowed ESTABLISHED packets through, the return packets would be allowed back to the client that initiated the connection.
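A minimal stateful FORWARD policy built from these states might look like the following sketch. The inside interface name (eth1) is an assumption, and IPTABLES is set to echo so the rules print rather than execute.

```shell
# Dry-run sketch: "echo iptables" prints rather than applies the rules.
IPTABLES="echo iptables"

# Minimal stateful FORWARD policy (eth1 = inside interface, an
# assumption): inside hosts may open NEW connections outward, only
# tracked return traffic is allowed back, and packets the state
# engine cannot classify are dropped.
$IPTABLES -A FORWARD -i eth1 -m state --state NEW -j ACCEPT
$IPTABLES -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
$IPTABLES -A FORWARD -m state --state INVALID -j DROP
```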

What about Fragmentation?

netfilter and iptables have removed the need to muck around with turning defragmentation on or off, as with the 2.0 and 2.2 kernels. Defragmentation is always performed when the state engine is used. The brief explanation is that the state engine could not work otherwise: an individual fragment simply doesn't carry enough information to tell the kernel which connection it belongs to. The fact that the kernel does this for us is a good thing. It makes the firewall more secure, the rules can be that much simpler because we don't have to worry about fragments anymore, and we can detect other problems with a packet before passing it on to a protected resource.

But because we are paranoid people and we like to be safe, it never hurts to add in a rule to catch any fragments and drop them if something were to go horribly wrong with all of this. We've never seen this happen, but it's such a trivially easy thing to add to your firewall rules that we don't see why it would hurt.

To catch any fragments that slip through, which, again, shouldn't occur if you are using the state engine, simply add these rules to the top of your rule sets:

 $IPTABLES -N NOFRAGS
 $IPTABLES -A OUTPUT -p ip -f -j NOFRAGS
 $IPTABLES -A INPUT -p ip -f -j NOFRAGS
 $IPTABLES -A FORWARD -p ip -f -j NOFRAGS
 $IPTABLES -A NOFRAGS -m limit --limit 1/second -j LOG \
     --log-level info --log-prefix "Fragment -- DROP " \
     --log-tcp-sequence --log-tcp-options --log-ip-options
 $IPTABLES -A NOFRAGS -j DROP

Just to be clear, defragmentation happens automatically only when conntrack is used. If you are not using the state engine, then you have fragments aplenty to deal with. You also cannot turn this off if you are using conntrack. If you want fragments, then you cannot use the state engine.

Taking a Closer Look at the State Engine

Now that we have a basic understanding of how the state engine works, let's take a look at what it's tracking. The first place to look is the most powerful file system on a Linux system: /proc. In case you are not aware of what /proc is, it's a virtual file system that lets you look at the devices in your system, reconfigure some of them, and even modify the kernel's and network stack's behavior on the fly. This is an extraordinarily powerful feature of Linux, and we highly recommend reading up on it if you are not already familiar with all the things you can do under /proc. It's beyond the scope of this book to spend too much time on the rest of /proc, so we will limit ourselves to the parts that relate to firewalls and, for now, just the state engine.

As to the state engine, let's take a look at /proc/net/ip_conntrack. This lists all the current connections the kernel is tracking and any information the system has on those connections. A typical entry might look like this:

 tcp      6 431890 ESTABLISHED src=68.100.73.75 dst=207.126.99.62 sport=47037 dport=80 src=207.126.99.62 dst=68.100.73.75 sport=80 dport=47037 use=1 bytes=3041 

All the information needed to determine the current state of this connection is provided by looking inside this /proc entry; Table 6.1 contains a breakdown of all the fields. You can dump the real-time contents by simply typing this command:

 cat /proc/net/ip_conntrack 
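If you just want the interesting columns, a short awk one-liner will do. The sample line below is copied from the entry shown earlier in the text; to watch live entries, feed /proc/net/ip_conntrack through the same awk program.

```shell
# Pull the protocol, timeout, state, and original src/dst out of an
# ip_conntrack entry. The sample line is the entry quoted in the text.
line='tcp      6 431890 ESTABLISHED src=68.100.73.75 dst=207.126.99.62 sport=47037 dport=80 src=207.126.99.62 dst=68.100.73.75 sport=80 dport=47037 use=1 bytes=3041'
echo "$line" | awk '{print "proto=" $1, "ttl=" $3, "state=" $4, $5, $6}'
# prints: proto=tcp ttl=431890 state=ESTABLISHED src=68.100.73.75 dst=207.126.99.62
```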

Table 6.1. Connection Tracking Fields

  • Protocol

  • Decimal value for protocol type

  • Time the connection has to live in the connection engine (Note 1)

  • Current state of this connection

  • Source IP

  • Destination IP

  • Expected source IP (Note 2)

  • Expected destination IP (Note 2)

  • Expected source port (Note 3)

  • Expected destination port (Note 3)


Note 1

This timer is reset to the default value for the connection type whenever a new packet relevant to the connection arrives.


Note 2

Depending on the direction of the connection (be it SNAT, MASQ, or DNAT), one of these may be modified to the NAT-ed IP.


Note 3

Same as Note 2, except with port numbers, which might differ depending on how your firewall is configured.


Breaking Down Some Examples

To get a sense of the wide range of connections you might see in /proc/net/ip_conntrack, we will review some examples. In the next chapter we explore this system in even greater detail; the intent of this chapter is to cover broad concepts, so we will stick to the two most common protocols, TCP and UDP.

The first example that follows is what an established TCP session for an HTTP GET request might look like. The packet has also been mangled to allow the client, 172.31.254.2, to NAT its traffic through the firewall. The second set of src/dst pairs is the expected source and destination for any return packets.

 tcp      6 271 ESTABLISHED src=172.31.254.2 dst=1.2.3.4 sport=2503 dport=80 src=1.2.3.4 dst=EXTERNAL_IP_FOR_NAT_CLIENT sport=80 dport=2503 use=1 bytes=118740288 

This is what a fully tracked, NAT-ed TCP connection might look like. Before a TCP connection reaches the ESTABLISHED state, it has to go through the three-way handshake explained in Chapter 5. Within the netfilter state engine, this is reflected in three states: SYN_SENT, SYN_RECV, and ESTABLISHED. For example:

Step 1:

 tcp      6 39 SYN_SENT src=172.31.254.2 dst=172.140.27.241 sport=1848 dport=3855 [UNREPLIED] src=172.140.27.241 dst=EXTERNAL_IP_FOR_NAT_CLIENT sport=3855 dport=1848 use=1 bytes=144 

Step 2:

 tcp      6 29 SYN_RECV src=1.2.3.4 dst=1.2.3.5 sport=1855 dport=25 src=1.2.3.5 dst=1.2.3.4 sport=25 dport=1855 use=1 bytes=192 

Step 3:

 tcp      6 271 ESTABLISHED src=172.31.254.2 dst=1.2.3.4 sport=2503 dport=80 src=1.2.3.4 dst=EXTERNAL_IP_FOR_NAT_CLIENT sport=80 dport=2503 use=1 bytes=118740288 

For UDP, the process looks a little different...

Step 1:

 udp      17 26 src=10.10.10.8 dst=10.10.10.253 sport=40082 dport=53 src=10.10.10.253 dst=10.10.10.8 sport=53 dport=40082 use=1 bytes=255 

Step 2:

 udp      17 26 src=10.10.10.8 dst=10.10.10.253 sport=40082 dport=53 src=10.10.10.253 dst=10.10.10.8 sport=53 dport=40082 use=1 bytes=255 

Step 3:

 udp      17 0 src=10.10.10.8 dst=10.10.10.253 sport=40005 dport=53 src=10.10.10.253 dst=10.10.10.8 sport=53 dport=40005 [ASSURED] use=1 bytes=761 

For a closed session, we will see a TIME_WAIT state. This is a post-close state for the session. After a session has been terminated, the kernel waits, by default, for 180 seconds to ensure that any out-of-order packets that might have been part of the connection are properly sorted out and allowed to reach their destination. Once the timer reaches zero, the connection is removed from the state engine entirely. In this example, the connection has 32 seconds left.

 tcp      6 32 TIME_WAIT src=172.31.254.2 dst=80.57.62.61 sport=1849 dport=2794 src=80.57.62.61 dst=EXTERNAL_IP_FOR_NAT_CLIENT sport=2794 dport=1849 use=1 bytes=956 

You might notice that the states listed in these examples sometimes differ from the ones we explained in the first part of the chapter. This is because netfilter and iptables are different parts of the whole: iptables is a user interface to the netfilter code. Many of the internal netfilter states are not something the typical user will need to manipulate when using the state engine. For instance, when using iptables, we do not have to concern ourselves with the SYN_SENT and SYN_RECV states. They don't matter to us; we only care about the beginning or the end of that process. That is, a NEW, ESTABLISHED, or RELATED connection. For almost every case, those three states are the ones you will use, but if you want to manipulate packets at some other point along the way, you can use some of the more advanced features of iptables to grab packets as they move through these intermediate states in the netfilter code. More about this subject is covered in Chapter 12, "NAT (Network Address Translation) and IP Forwarding."

There are also extension modules for conntrack that track even more states depending on the protocol. What has been covered here is only a partial list of all the internal states that can exist for the conntrack engine. The important thing to remember here is that for the most part, you don't need to worry about these netfilter internal states. The NEW, RELATED, and ESTABLISHED states will get the work done for you through iptables.

There is one other state that we only briefly touched on: the INVALID state. This state is supposed to catch all packets that don't easily fall into the other states; for instance, packets that are out of order or mangled somehow are supposed to be flagged as INVALID. For the most part this is true, but in practice it depends on the version of the kernel and the connection tracking code being used. To clarify, there are two different connection-tracking "engines" available for the netfilter subsystem in the Linux kernel. Most users will only use the standard engine, which comes by default in all Linux kernels. There is also an enhanced connection-tracking engine that implements and extends TCP connection tracking based on the article "Real Stateful TCP Packet Filtering in IP Filter," by Guido van Rooij. This engine is also referred to as the tcp-window-tracking patch. We bring this up because it is possible for a packet to trick the default state engine into thinking it's valid, which is why the newer state-tracking engine is available as a patch for the kernel. If you are interested in using this patch, you will find information about how to patch your kernel in later chapters. If you have trouble patching your kernel, you can also visit our Web site (www.gotroot.com) for assistance.

On the surface, it appears that the NEW state should do some sanity checking of the packet to make sure it's a proper "new" connection. However, iptables has some behaviors that make this more complicated than it seems. NEW does not check that the packet is actually the start of a new TCP or UDP connection; instead, it simply assumes that any packet not assigned to some other state is NEW. This means that, for the unwary firewall administrator, the kernel will happily accept as NEW a packet that is neither the beginning of a three-way handshake (a SYN packet) nor part of an existing session. For the average firewall, this is a bad thing. We don't want any old packet coming along and getting passed through as a NEW connection. We usually want our firewall to see a three-way handshake before trusting the "newness" of a connection.

This is actually a feature of the firewall that has a practical use. And, no, this is not a feature in the "it's a bug, but let's call it a feature" sense; it's a darn useful set of behaviors to have in a high-end firewall. For example, assume you have two firewalls for your network and one of them goes down. The one that went down is immediately replaced by the backup or secondary firewall, which simply takes over its IPs and spoofs its MAC address to appear to be the firewall that died. This is called failover, and it's very handy. If iptables required every NEW session to fully negotiate a handshake, all the existing connections from the former, now-dead firewall would die; the new firewall would drop them as not being NEW, but something else. So, to give you the ability to do fancy things such as failover, the default behavior is to pick up sessions midstream, as it were, and continue on as if nothing happened.

However, if you don't have this sort of setup (and most people do not), and you are concerned about the sorts of packets that might slip through your firewall due to this feature (SYN-ACK packets, for instance), then you will want to turn this behavior off. It turns out that it's a relatively simple thing to do. Just add these rules to the top of your firewall rules, before any ACCEPT rules, and you're set.

 $IPTABLES -N SYN_ONLY
 $IPTABLES -A INPUT -p tcp ! --syn -m conntrack \
     --ctstate NEW -j SYN_ONLY
 $IPTABLES -A FORWARD -p tcp ! --syn -m conntrack \
     --ctstate NEW -j SYN_ONLY
 $IPTABLES -A SYN_ONLY -m limit --limit 2/second -j LOG \
     --log-level info --log-prefix "SYN ONLY -- DROP " \
     --log-tcp-sequence --log-tcp-options --log-ip-options
 $IPTABLES -A SYN_ONLY -j DROP

The behavior of the state-tracking engine can be changed by installing a patch from the netfilter patch-o-matic archives called the tcp-window-tracking extension. This change causes the tracking engine to maintain state by also examining the TCP window settings of each packet.

The tcp-window-tracking modification to netfilter/iptables extends TCP connection tracking as described in the paper at http://www.iae.nl/users/guido/papers/tcp_filtering.ps.gz by Guido van Rooij. In addition to the extended TCP connection tracking capabilities, it also supports TCP window scaling and SACK.

The next chapter will explore these behaviors and how to configure iptables in more depth.



    Troubleshooting Linux Firewalls
    ISBN: 321227239
    Year: 2004
    Pages: 169