As presented earlier, iptables uses the concept of separate rule tables for different packet processing functionality. Nondefault tables are specified by a command-line option. Three tables are available:
filter Table Commands

The filter table commands are provided by the ip_tables module. The functionality is enabled by loading the module, which happens automatically on the first invocation of the iptables command. Alternatively, the functionality can be compiled into the kernel itself, in which case you don't need to worry about modules being loaded at all.

filter TABLE OPERATIONS ON ENTIRE CHAINS

Table 3.2 shows the iptables operations on entire chains.
The -h help command is obviously not an operation on a chain, nor is --modprobe=<command>, but I didn't know where else to list these commands. The list command takes additional options, as shown in Table 3.3.
filter TABLE OPERATIONS ON A RULE

The most frequently used commands to create or delete rules within a chain are shown in Table 3.4.

BASIC filter TABLE MATCH OPERATIONS

The basic filter match operations supported in the default iptables filter table are listed in Table 3.5.

tcp filter TABLE MATCH OPERATIONS

TCP header match options are listed in Table 3.6.

udp filter TABLE MATCH OPERATIONS

UDP header match options are listed in Table 3.7.

icmp filter TABLE MATCH OPERATIONS

ICMP header match options are listed in Table 3.8.
The major supported ICMP type names and numeric values are the following:
filter Table Target Extensions

The filter table target extensions include logging functionality and the capability to reject a packet rather than dropping it. Table 3.9 lists the options available to the LOG target. Table 3.10 lists the single option available to the REJECT target.

THE ULOG TABLE TARGET EXTENSION

Related to the LOG target is the ULOG target, which sends the log message to a userspace program for logging. Behind the scenes for ULOG, the packet gets multicast by the kernel through a netlink socket of your choosing (the default is socket 1). The userspace daemon would then read the message from the socket and do with it what it pleases. The ULOG target is typically used to provide more extensive logging than is possible with the standard LOG target. As with the LOG target, processing continues after matches on a ULOG-targeted rule. The ULOG target has four configuration options, as described in Table 3.11.

filter Table Match Extensions

The filter table match extensions provide access to the fields in the TCP, UDP, and ICMP headers, as well as the match features available in iptables, such as maintaining connection state, port lists, access to the hardware MAC source address, and access to the IP TOS field.
multiport filter TABLE MATCH EXTENSION

multiport port lists can include up to 15 ports per list. Whitespace isn't allowed: there can be no blank spaces between the commas and the port values. Port ranges cannot be interspersed in the list. Also, the -m multiport command must exactly follow the -p <protocol> specifier. Table 3.12 lists the options available to the multiport match extension.
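The list constraints above (at most 15 ports, no whitespace, no ranges) can be checked before a rule is ever handed to iptables. The following is a minimal sketch in plain shell; the function name valid_multiport_list is hypothetical and is not part of iptables, and the checks reflect only the constraints stated in this section.

```shell
# Check a port list against the multiport constraints described in the text.
# Prints "ok" on success or a short reason on failure.
# valid_multiport_list is a hypothetical helper, not an iptables command.
valid_multiport_list() {
    list=$1
    case $list in
        "")          echo "empty list"; return 1 ;;
        *[!0-9,]*)   echo "only digits and commas allowed"; return 1 ;;  # rejects spaces and ranges
        ,*|*,|*,,*)  echo "empty list element"; return 1 ;;
    esac
    count=0
    old_ifs=$IFS
    IFS=,
    for port in $list; do
        count=$((count + 1))
        if [ "$port" -gt 65535 ]; then
            IFS=$old_ifs
            echo "port out of range: $port"
            return 1
        fi
    done
    IFS=$old_ifs
    if [ "$count" -gt 15 ]; then
        echo "too many ports: $count"
        return 1
    fi
    echo "ok"
}
```

A script that generates rules might call the helper first and emit the iptables command only when it prints ok.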
The multiport syntax can be a bit tricky. Some examples and cautions are included here. The following rule blocks incoming packets arriving on interface eth0 destined for the UDP ports associated with NetBIOS and SMB, common ports that are exploited on Microsoft Windows computers and targets for worms:

    iptables -A INPUT -i eth0 -p udp \
             -m multiport --destination-port 135,136,137,138,139 -j DROP

The next rule blocks outgoing connection requests sent through the eth0 interface to high ports associated with the TCP services NFS, socks, and squid:

    iptables -A OUTPUT -o eth0 -p tcp \
             -m multiport --destination-port 2049,1080,3128 --syn -j REJECT

What is important to note in this example is that the multiport command must exactly follow the protocol specification. A syntax error would have resulted if the --syn were placed between the -p tcp and the -m multiport.

To show a similar example of --syn placement, the following is correct:

    iptables -A INPUT -i <interface> -p tcp \
             -m multiport --source-port 80,443 ! --syn -j ACCEPT

However, this causes a syntax error:

    iptables -A INPUT -i <interface> -p tcp ! --syn \
             -m multiport --source-port 80,443 -j ACCEPT

Furthermore, the placement of source and destination parameters is not obvious. The following two variations are correct:

    iptables -A INPUT -i <interface> -p tcp -m multiport \
             --source-port 80,443 \
             ! --syn -d $IPADDR --dport 1024:65535 -j ACCEPT

and

    iptables -A INPUT -i <interface> -p tcp -m multiport \
             --source-port 80,443 \
             -d $IPADDR ! --syn --dport 1024:65535 -j ACCEPT

However, this causes a syntax error:

    iptables -A INPUT -i <interface> -p tcp -m multiport \
             --source-port 80,443 \
             -d $IPADDR --dport 1024:65535 ! --syn -j ACCEPT

This module has some surprising syntax side effects. Either of the two preceding correct rules produces a syntax error if the reference to the SYN flag is removed:

    iptables -A INPUT -i <interface> -p tcp -m multiport \
             --source-port 80,443 \
             -d $IPADDR --dport 1024:65535 -j ACCEPT

The following pair of rules, however, does not:

    iptables -A OUTPUT -o <interface> \
             -p tcp -m multiport --destination-port 80,443 \
             ! --syn -s $IPADDR --sport 1024:65535 -j ACCEPT

    iptables -A OUTPUT -o <interface> \
             -p tcp -m multiport --destination-port 80,443 \
             --syn -s $IPADDR --sport 1024:65535 -j ACCEPT

Note that the --destination-port argument to the multiport module is not the same as the --destination-port or --dport argument to the module that performs matching for the -p tcp arguments.

limit filter TABLE MATCH EXTENSION

Rate-limited matching is useful for choking back the number of log messages that would be generated during a flood of logged packets. Table 3.13 lists the options available to the limit match extension.
The burst rate defines the number of initial matches to be accepted. The default value is five matches. When the limit has been reached, further matches are limited to the rate limit. The default limit is three matches per hour. Optional time frame specifiers include /second, /minute, /hour, and /day. In other words, by default, when the initial burst rate of five matches is reached within the time limit, at most three more packets will match over the next hour, one every 20 minutes, regardless of how many packets are received. If a match doesn't occur within the rate limit, the burst is recharged by one.

It's easier to demonstrate rate-limited matching than it is to describe it in words. The following rule will limit logging of incoming ping message matches to one per second when an initial five echo-requests are received within a given second:

    iptables -A INPUT -i eth0 \
             -p icmp --icmp-type echo-request \
             -m limit --limit 1/second -j LOG

It's also possible to do rate-limited packet acceptance. The following two rules, in combination, will limit acceptance of incoming ping messages to one per second when an initial five echo-requests are received within a given second:

    iptables -A INPUT -i eth0 \
             -p icmp --icmp-type echo-request \
             -m limit --limit 1/second -j ACCEPT

    iptables -A INPUT -i eth0 \
             -p icmp --icmp-type echo-request -j DROP

The next rule limits the number of log messages generated in response to dropped ICMP redirect messages. When an initial five messages have been logged within a 20-minute time frame, at most three more log messages will be generated over the next hour, one every 20 minutes:

    iptables -A INPUT -i eth0 \
             -p icmp --icmp-type redirect \
             -m limit -j LOG

The assumption in the final example is that the packet and any additional unmatched redirect packets are silently dropped by the default DROP policy for the INPUT chain.
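The burst-and-recharge behavior described in this section can be pictured as a token bucket: the bucket starts with as many tokens as the burst value, each match spends one, and one token is returned per rate interval. The sketch below is a simplified model in plain shell with whole-second timestamps. It is an illustration only, not the kernel's actual credit arithmetic, and the function name simulate_limit is hypothetical.

```shell
# Simplified token-bucket model of the limit match (illustration only).
# usage: simulate_limit <interval_seconds> <burst> <arrival_time>...
# Arrival times are whole seconds in non-decreasing order.
simulate_limit() {
    interval=$1; burst=$2; shift 2
    tokens=$burst          # the bucket starts full
    last=0                 # time up to which refills have been credited
    for t in "$@"; do
        # Credit one token for every full interval that has elapsed.
        refill=$(( (t - last) / interval ))
        tokens=$((tokens + refill))
        last=$((last + refill * interval))
        # A full bucket cannot store extra credit.
        if [ "$tokens" -ge "$burst" ]; then
            tokens=$burst
            last=$t
        fi
        if [ "$tokens" -gt 0 ]; then
            tokens=$((tokens - 1))
            echo "t=$t match"
        else
            echo "t=$t no-match"
        fi
    done
}

# Default-like settings: 3/hour (one token per 1200 seconds), burst of 5.
# Six packets arrive at once, then one more 20 minutes later.
simulate_limit 1200 5 0 0 0 0 0 0 1200
```

In this model the first five packets match, the sixth does not, and the packet arriving 20 minutes later matches again because one token has been recharged.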
dstlimit filter TABLE MATCH EXTENSION

The dstlimit match extension enables rate limiting on a per-destination basis, whether per IP address or per port. Note the difference between the dstlimit match extension and the limit match extension, which has one limit for packets of a certain type. Table 3.14 lists the options for the dstlimit match extension.
state filter TABLE MATCH EXTENSION

Static filters look at traffic on a packet-by-packet basis alone. Each packet's particular combination of source and destination addresses and ports, the transport protocol, and the current TCP state flag combination is examined without reference to any ongoing context. ICMP messages are treated as unrelated, out-of-band IP Layer 3 events.

The state extension provides additional monitoring and recording technology to augment the stateless, static packet-filter technology. State information is recorded when a TCP connection or UDP exchange is initiated. Subsequent packets are examined not only based on the static tuple information, but also within the context of the ongoing exchange. In other words, some of the contextual knowledge usually associated with the upper TCP Transport layer, or the UDP Application layer, is brought down to the filter layer. After the exchange is initiated and accepted, subsequent packets are identified as part of the established exchange. Associated ICMP messages are identified as being related to a particular exchange.

(In computer terminology, a collection of values or attributes that together uniquely identify an event or object is called a tuple. A UDP or TCP packet is uniquely identified by the tuple combination of its protocol, UDP or TCP, the source and destination addresses, and the source and destination ports.)

For session monitoring, the advantages of maintaining state information are less obvious for TCP because TCP maintains state information by definition. For UDP, the immediate advantage is the capability to distinguish responses from other datagrams. In the case of an outgoing DNS request, which represents a new UDP exchange, the concept of an established session allows an incoming UDP response datagram from the host and port the original message was sent to, within a certain time-limited window. Incoming UDP datagrams from other hosts or ports are not allowed. They are not part of the established state for this particular exchange. When applied to TCP and UDP, ICMP error messages are accepted if the error message is related to the particular session.

In considering packet flow performance and firewall complexity, the advantages are more obvious for TCP flows. Flows are primarily a firewall performance and optimization technology. The main goal of flows is to allow bypassing the firewall inspection path for a packet. Much faster TCP packet handling is obtained in some cases because the remaining firewall filters can be skipped if the TCP packet is immediately recognized as part of an allowed, ongoing connection. For TCP connections, flow state can be a major win in terms of filtering performance. Also, standard TCP application protocol rules can be collapsed into a single initial allow rule. The number of filter rules is reduced (theoretically, but not necessarily in practice, as you'll see later in the book).

The main disadvantage is that maintaining a state table requires more memory than standard firewall rules alone. Routers with 70,000 simultaneous connections, for example, would require tremendous amounts of memory to maintain state table entries for each connection. State maintenance is often done in hardware for performance reasons, where associative table lookups can be done simultaneously or in parallel. Whether implemented in hardware or software, state engines must be capable of reverting a packet to the traditional path if memory isn't available for the state table entry.

Also, table creation, lookup, and teardown take time in software. The additional processing overhead is a loss in many cases. State maintenance is a win for ongoing exchanges such as an FTP transfer or a UDP streaming multimedia session. Both types of data flow represent potentially large numbers of packets (and filter rule match tests).
State maintenance is not a firewall performance win for a simple DNS or NTP client/server exchange, however. State buildup and teardown can easily require as much processing, and more memory, than simply traversing the filter rules for these packets. The advantages are also questionable for firewalls that filter primarily web traffic. Web client/server exchanges tend to be brief and ephemeral.

Telnet and SSH sessions are in a gray area. On heavily trafficked routers with many such sessions, the state maintenance overhead may be a win by bypassing the firewall inspection. For fairly quiescent sessions, however, it's likely that the connection state entry will time out and be thrown away. The state table entry will be re-created when the next packet comes along, after it has passed the traditional firewall rules.

Table 3.15 lists the options available to the state match extension.
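The tuple lookup and timeout behavior described in this section can be sketched as a toy state table. The following plain-shell model is an illustration only and does not reflect how the kernel's connection tracking is implemented; the names STATE_TABLE, TIMEOUT, track_outgoing, and classify_incoming are hypothetical, as is the 30-second timeout. An outgoing packet records a tuple with an expiry time, and an incoming packet is treated as ESTABLISHED only if it is the exact reverse of a recorded tuple that has not yet expired.

```shell
# Toy state table: one "expiry proto src sport dst dport" entry per line.
STATE_TABLE=""
TIMEOUT=30   # assumed idle timeout in seconds, chosen only for illustration

# Record an outgoing packet seen at time $1 as the start of an exchange.
# usage: track_outgoing <now> <proto> <src> <sport> <dst> <dport>
track_outgoing() {
    STATE_TABLE="${STATE_TABLE}$(( $1 + TIMEOUT )) $2 $3 $4 $5 $6
"
}

# Classify an incoming packet seen at time $1.
# usage: classify_incoming <now> <proto> <src> <sport> <dst> <dport>
classify_incoming() {
    now=$1
    want="$2 $5 $6 $3 $4"    # the reverse direction of the recorded tuple
    result=NEW
    while read -r expiry tuple; do
        [ -n "$expiry" ] || continue
        if [ "$tuple" = "$want" ] && [ "$now" -le "$expiry" ]; then
            result=ESTABLISHED
        fi
    done <<EOF
$STATE_TABLE
EOF
    echo "$result"
}

# An outgoing DNS query at time 0, then candidate replies.
track_outgoing 0 udp 10.0.0.2 1025 192.0.2.53 53
classify_incoming 5   udp 192.0.2.53 53 10.0.0.2 1025   # reply inside the window
classify_incoming 5   udp 192.0.2.99 53 10.0.0.2 1025   # datagram from another host
classify_incoming 100 udp 192.0.2.53 53 10.0.0.2 1025   # reply after the entry expired
```

Only the first reply is classified as ESTABLISHED. The other two fall back to NEW, which corresponds to a packet that must pass the ordinary filter rules on its own.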
TCP connection state and ongoing UDP exchange information can be maintained, allowing network exchanges to be filtered as NEW, ESTABLISHED, RELATED, or INVALID:
Ideally, using the ESTABLISHED match allows the firewall rule pair for a service to be collapsed into a single rule that allows the first request packet. For example, using the ESTABLISHED match, a web client rule requires allowing only the initial outgoing SYN request. A DNS client request requires only the rule allowing the initial UDP outgoing request packet.

With a deny-by-default input policy, connection tracking can be used (theoretically) to replace all protocol-specific filters with two general rules that allow incoming and outgoing packets that are part of an established connection, or packets related to the connection. Application-specific rules are required for the initial packet alone. Although such a firewall setup might very well work for a small or residential site in most cases, it is unlikely to perform adequately for a larger site or a firewall that handles many connections simultaneously.

The reason goes back to the case of state table entry timeouts, in which a state entry for a quiescent connection is replaced because of table size and memory constraints. The next packet that would have been accepted by the deleted state entry requires a rule to allow the packet, and the state table entry must be rebuilt.

A simple example of this is a rule pair for a local DNS server operating as a cache-and-forward name server. A DNS forwarding name server uses server-to-server communication. DNS traffic is exchanged between source and destination ports 53 on both hosts. The UDP client/server relationship can be made explicit. The following rules explicitly allow outgoing (NEW) requests, incoming (ESTABLISHED) responses, and any (RELATED) ICMP error messages:

    iptables -A INPUT -m state \
             --state ESTABLISHED,RELATED -j ACCEPT

    iptables -A OUTPUT --out-interface <interface> -p udp \
             -s $IPADDR --source-port 53 -d $NAME_SERVER --destination-port 53 \
             -m state --state NEW,RELATED -j ACCEPT

DNS uses a simple query-and-response protocol. But what about an application that can maintain an ongoing connection for extended periods, such as an FTP control session or a telnet or SSH session? If the state table entry is cleared out prematurely for some reason, future packets won't have a state entry to be matched against to be identified as part of an ESTABLISHED exchange.

The following rules for an SSH connection allow for that possibility:

    iptables -A INPUT -m state \
             --state ESTABLISHED,RELATED -j ACCEPT

    iptables -A OUTPUT -m state \
             --state ESTABLISHED,RELATED -j ACCEPT

    iptables -A OUTPUT --out-interface <interface> -p tcp \
             -s $IPADDR --source-port $UNPRIVPORTS \
             -d $REMOTE_SSH_SERVER --destination-port 22 \
             -m state --state NEW -j ACCEPT

    iptables -A OUTPUT --out-interface <interface> -p tcp ! --syn \
             -s $IPADDR --source-port $UNPRIVPORTS \
             -d $REMOTE_SSH_SERVER --destination-port 22 \
             -j ACCEPT

    iptables -A INPUT --in-interface <interface> -p tcp ! --syn \
             -s $REMOTE_SSH_SERVER --source-port 22 \
             -d $IPADDR --destination-port $UNPRIVPORTS \
             -j ACCEPT

mac filter TABLE MATCH EXTENSION

Table 3.16 lists the options available to the mac match extension.
Remember that MAC addresses do not cross router borders (or network segments). Also remember that only source addresses can be specified. The mac extension can be used only on chains that see an incoming interface, namely the INPUT, PREROUTING, and FORWARD chains. The following rule allows incoming SSH connections from a single local host:

    iptables -A INPUT -i <local interface> -p tcp \
             -m mac --mac-source xx:xx:xx:xx:xx:xx \
             --source-port 1024:65535 \
             -d <IPADDR> --dport 22 -j ACCEPT

owner filter TABLE MATCH EXTENSION

Table 3.17 lists the options available to the owner match extension.
The match refers to the packet's creator. The extension can be used on the OUTPUT chain only. These match options don't make much sense on a firewall router; they make more sense on an end host.

So, let's say that you have a firewall gateway with a monitor, perhaps, but no keyboard. Administration is done from a local, multiuser host. A single user account is allowed to log in to the firewall from this host. On the multiuser host, administrative access to the firewall could be locally filtered as shown here:

    iptables -A OUTPUT -o eth0 -p tcp \
             -s <IPADDR> --sport 1024:65535 \
             -d <fw IPADDR> --dport 22 \
             -m owner --uid-owner <admin userid> \
             --gid-owner <admin groupid> -j ACCEPT

mark filter TABLE MATCH EXTENSION

Table 3.18 lists the options available to the mark match extension.
The mark value and the mask are unsigned long values. If a mask is specified, the value and the mask are ANDed together. In the example, assume that an incoming telnet client packet between a specific source and destination had been marked previously:

    iptables -A FORWARD -i eth0 -o eth1 -p tcp \
             -s <some src address> --sport 1024:65535 \
             -d <some destination address> --dport 23 \
             -m mark --mark 0x00010070 \
             -j ACCEPT

The mark value being tested for here was set at some earlier point in the packet processing. The mark value is a flag indicating that this packet is to be handled differently from other packets.

tos filter TABLE MATCH EXTENSION

Table 3.19 lists the options available to the tos match extension.
The tos value can be one of either the string or numeric values:
The TOS field has been redefined as the Differentiated Services (DS) field, which carries the Differentiated Services Code Point (DSCP). For more information on Differentiated Services, see these sources:
unclean filter TABLE MATCH EXTENSION

The specific packet-validity checks performed by the unclean module are not documented. The module is considered to be experimental, and the iptables authors recommend against its use for now. The following line shows the unclean module syntax. The module takes no arguments:

    -m | --match unclean

The unclean extension might be "blessed" by the time this book is published. In the meantime, the module lends itself to an example of the LOG options:

    iptables -A INPUT -p ! tcp -m unclean \
             -j LOG --log-prefix "UNCLEAN packet: " \
             --log-ip-options

    iptables -A INPUT -p tcp -m unclean \
             -j LOG --log-prefix "UNCLEAN TCP: " \
             --log-ip-options \
             --log-tcp-sequence --log-tcp-options

    iptables -A INPUT -m unclean -j DROP

addrtype filter TABLE MATCH EXTENSION

The addrtype match extension is used to match packets based on the type of address used, such as unicast, broadcast, and multicast. The types of addresses include those listed in Table 3.20.
Two commands are used with the addrtype match, as listed in Table 3.21.
iprange filter TABLE MATCH

Sometimes defining a range of IP addresses using CIDR notation is insufficient for your needs. For example, if you need to match a range of IP addresses that doesn't fall on a subnet boundary, or that crosses a boundary by only a couple of addresses, the iprange match type will do the job. Using the iprange match, you specify an arbitrary range of IP addresses for the match to take effect. The iprange match can also be negated. Table 3.22 lists the commands for the iprange match.
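To see why an arbitrary range is different from a CIDR block, it helps to treat each address as a 32-bit number and compare it against the two endpoints. The following is a minimal plain-shell sketch of that comparison. The helper names ip_to_int and ip_in_range are hypothetical, and the sketch assumes well-formed dotted-quad input without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to its 32-bit integer value.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies within the inclusive range $2 through $3.
ip_in_range() {
    addr=$(ip_to_int "$1")
    low=$(ip_to_int "$2")
    high=$(ip_to_int "$3")
    [ "$addr" -ge "$low" ] && [ "$addr" -le "$high" ]
}

# 192.168.1.126 through 192.168.1.131 straddles the boundary between
# 192.168.1.0/25 and 192.168.1.128/25, so no single CIDR block covers
# exactly these six addresses.
if ip_in_range 192.168.1.130 192.168.1.126 192.168.1.131; then
    echo "in range"
fi
```

The same inclusive comparison is what a range given as a first and last address expresses, regardless of where subnet boundaries fall.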
length filter TABLE MATCH

The length filter table match examines the length of the packet. If the packet's length matches the value given or optionally falls within the range given, the rule is invoked. Table 3.23 lists the one and only command related to the length match.
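The value-or-range test described above amounts to a single inclusive comparison. The following plain-shell sketch illustrates it; the function name length_matches and the N or MIN:MAX spelling of the specification are assumptions made for this example.

```shell
# Succeed if packet length $1 satisfies spec $2.
# The spec is either a single value "N" or an inclusive range "MIN:MAX".
length_matches() {
    len=$1
    spec=$2
    case $spec in
        *:*) min=${spec%%:*}; max=${spec#*:} ;;   # range form
        *)   min=$spec; max=$spec ;;              # single value
    esac
    [ "$len" -ge "$min" ] && [ "$len" -le "$max" ]
}

# A 64-byte packet tested against a range of 0 to 100 bytes.
if length_matches 64 0:100; then
    echo "length matches"
fi
```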
NAT Table Target Extensions

As mentioned earlier, iptables supports four general kinds of NAT: source NAT (SNAT); destination NAT (DNAT); masquerading (MASQUERADE), which is a specialized case of the SNAT implementation; and local port direction (REDIRECT) to the local host. As part of the NAT table, each of these targets is available when a rule specifies the nat table by using the -t nat table specifier.

SNAT NAT TABLE TARGET EXTENSION

Source Address and Port Translation (NAPT) is the kind of NAT people are most commonly familiar with. As shown in Figure 3.5, Source Address Translation is done after the routing decision is made. SNAT is a legal target only in the POSTROUTING chain. Because SNAT is applied immediately before the packet is sent out, only an outgoing interface can be specified.

Figure 3.5. NAT packet traversal.

Some documents refer to this form of source NAT (the most common form) as NAPT, to acknowledge the port number modification. The other form of traditional, unidirectional NAT is basic NAT, which doesn't touch the source port. That form is used when you are translating between the private LAN and a pool of public addresses.

NAPT is used when you have a single public address. The source port is changed to a free port on the firewall/NAT machine because it's translating for any number of internal computers, and the port that the internal machine is using might already be in use by the NAT machine. When the responses come back, the port is all that the NAT machine has to determine that the packet is really meant for an internal computer rather than itself and then to determine which internal computer the packet is meant for.

The general syntax for SNAT is as follows:

    iptables -t nat -A POSTROUTING --out-interface <interface> ... \
             -j SNAT --to-source <address>[-<address>][:<port>-<port>]

The source address can be mapped to a range of possible IP addresses, if more than one is available. The source port can be mapped to a specific range of source ports on the router.

MASQUERADE NAT TABLE TARGET EXTENSION

Source Address Translation has been implemented in two different ways in iptables, as SNAT and as MASQUERADE. The difference is that the MASQUERADE target extension is intended for use with connections on interfaces with dynamically assigned IP addresses, particularly in the case in which the connection is temporary and the IP address assignment is likely to be different at each new connection. As discussed previously, in the section "NAT Table Features," MASQUERADE can be useful for phone dial-up connections in particular.

Because masquerading is a specialized case of SNAT, it is likewise a legal target only in the POSTROUTING chain, and the rule can refer to the outgoing interface only. Unlike the more generalized SNAT, MASQUERADE does not take an argument specifying the source address to apply to the packet. The IP address of the outgoing interface is used automatically.

The general syntax for MASQUERADE is as follows:

    iptables -t nat -A POSTROUTING --out-interface <interface> ... \
             -j MASQUERADE [--to-ports <port>[-<port>]]

The source port can be mapped to a specific range of source ports on the router.

DNAT NAT TABLE TARGET EXTENSION

Destination Address and Port Translation is a highly specialized form of NAT. A residential or small business site is most likely to find this feature useful if its public IP address is dynamically assigned or if the site has a single IP address, and the site administrator wants to forward incoming connections to internal servers that aren't publicly visible. In other words, the DNAT features can be used to replace the previously required third-party port-forwarding software, such as ipmasqadm.

Referring back to Figure 3.5, Destination Address and Port Translation is done before the routing decision is made. DNAT is a legal target in the PREROUTING and OUTPUT chains. On the PREROUTING chain, DNAT can be a target when the incoming interface is specified. On the OUTPUT chain, DNAT can be a target when the outgoing interface is specified.

The general syntax for DNAT is as follows:

    iptables -t nat -A PREROUTING --in-interface <interface> ... \
             -j DNAT --to-destination <address>[-<address>][:<port>-<port>]

    iptables -t nat -A OUTPUT --out-interface <interface> ... \
             -j DNAT --to-destination <address>[-<address>][:<port>-<port>]

The destination address can be mapped to a range of possible IP addresses, if more than one is available. The destination port can be mapped to a specific range of alternate ports on the destination host.

REDIRECT NAT TABLE TARGET EXTENSION

Port redirection is a specialized case of DNAT. The packet is redirected to a port on the local host. Incoming packets that would otherwise be forwarded on are redirected to the incoming interface's INPUT chain. Outgoing packets generated by the local host are redirected to a port on the local host's loopback interface. REDIRECT is simply an alias, a convenience, for the specialized case of redirecting a packet to this host. It offers no additional functional value. DNAT could just as easily be used to cause the same effect.

REDIRECT is likewise a legal target only in the PREROUTING and OUTPUT chains. On the PREROUTING chain, REDIRECT can be a target when the incoming interface is specified. On the OUTPUT chain, REDIRECT can be a target when the outgoing interface is specified.

The general syntax for REDIRECT is as follows:

    iptables -t nat -A PREROUTING --in-interface <interface> ... \
             -j REDIRECT [--to-ports <port>[-<port>]]

    iptables -t nat -A OUTPUT --out-interface <interface> ... \
             -j REDIRECT [--to-ports <port>[-<port>]]

The destination port can be mapped to a different port or to a specific range of alternate ports on the local host.

BALANCE NAT TABLE TARGET EXTENSION

The BALANCE target enables a round-robin method of sending connections to more than one target host.
The BALANCE target uses a range of addresses for this purpose and thus provides rudimentary load balancing. The general syntax for BALANCE is as follows:

    iptables -t nat -A PREROUTING -p tcp -j BALANCE \
             --to-destination <ip address>-<ip address>

The CLUSTERIP target also provides some of these same options.

mangle Table Commands

The mangle table targets and extensions apply to the OUTPUT and PREROUTING chains. Remember, the filter table is implied by default. To use the mangle table features, you must specify the mangle table with the -t mangle directive.

mark mangle TABLE TARGET EXTENSION

Table 3.24 lists the target extensions available to the mangle table.

There are two mangle table target extensions: MARK and TOS. MARK contains the functionality to set the unsigned long mark value for the packet maintained by the iptables mangle table. An example of usage follows:

    iptables -t mangle -A PREROUTING --in-interface eth0 -p tcp \
             -s <some src address> --sport 1024:65535 \
             -d <some destination address> --dport 23 \
             -j MARK --set-mark 0x00010070

TOS contains the functionality to set the TOS bits in the IP header. An example of usage follows:

    iptables -t mangle -A OUTPUT ... -j TOS --set-tos <tos>

The possible tos values are the same values available in the filter table's TOS match extension module.