21.3 Implementing the NAT Module

   


This section first introduces important data structures to manage session flows, allocations, and address bindings. Subsequently, it will explain the functions used to establish and tear down address bindings, to actually translate addresses, and to handle ICMP error messages.

21.3.1 Important Data Structures

All session flows are completely managed by the connection-tracking module described in Chapter 20. A structure of the type ip_conntrack is stored for each session flow. (See Section 20.2.2.) This structure includes two data structures of the type ip_conntrack_tuple_hash, representing the forward and reverse directions of a session flow. If a session flow is translated by the NAT module, then the ip_conntrack_tuple_hash structure for the reverse direction is adapted so that reply packets can be allocated to it properly.

In the example discussed in Section 21.1.6, where the internal address 192.168.1.1 is translated into the global address 199.10.42.1, a connection from port 1200 to port 80 in the WWW server 100.1.1.1 would be represented by the following entries:

  • Forward: 192.168.1.1:1200 -> 100.1.1.1:80

  • Reverse: 100.1.1.1:80 -> 199.10.42.1:1200
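The relationship between the two entries can be illustrated with a small user-space sketch. It is not the kernel's code: struct tuple, ipv4(), invert_tuple(), and snat_reply() are hypothetical, simplified stand-ins for ip_conntrack_tuple and for the adaptation of the reverse direction, and addresses are kept in host byte order for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for ip_conntrack_tuple (hypothetical layout). */
struct tuple {
    uint32_t src_ip, dst_ip;        /* host byte order in this sketch */
    uint16_t src_port, dst_port;
};

/* Build an IPv4 address from its four dotted-quad components. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Without NAT, the reply tuple is the forward tuple with both ends swapped. */
struct tuple invert_tuple(const struct tuple *fwd)
{
    struct tuple rev = { fwd->dst_ip, fwd->src_ip, fwd->dst_port, fwd->src_port };
    return rev;
}

/* Source NAT: replies are addressed to the translated source, so only the
 * destination part of the reply tuple changes. */
struct tuple snat_reply(const struct tuple *fwd, uint32_t new_src_ip,
                        uint16_t new_src_port)
{
    struct tuple rev;

    assert(fwd != NULL);
    rev = invert_tuple(fwd);
    rev.dst_ip = new_src_ip;
    rev.dst_port = new_src_port;
    return rev;
}
```

Applied to the example, the forward tuple stays untouched, and only the reply tuple carries the global address 199.10.42.1.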

The connection-tracking module stores a pointer to the relevant data structure of the type ip_conntrack in the sk_buff of each packet. If the NAT module wants to allocate an IP packet to a session flow, it invokes the ip_conntrack_get() function of the connection-tracking module, which returns the matching ip_conntrack structure.

struct ip_nat_expect

linux/netfilter_ipv4/ip_nat_rule.h


The NAT module has an ordered list, nat_expect_list, with data structures of the type ip_nat_expect. This list enables protocol-specific NAT modules (e.g., for FTP; see Section 21.1.7) to decide whether a new session flow was expected and therefore requires special handling. Each of these structures consists essentially of a pointer to a function that actually makes that decision:

    struct ip_nat_expect {
            struct list_head list;

            /* Returns 1 (and sets verdict) if it has setup NAT for this
               connection */
            int (*expect)(struct sk_buff **pskb,
                          unsigned int hooknum,
                          struct ip_conntrack *ct,
                          struct ip_nat_info *info,
                          struct ip_conntrack *master,
                          struct ip_nat_info *masterinfo,
                          unsigned int *verdict);
    };

struct ip_nat_multi_range

linux/netfilter_ipv4/ip_nat.h


The ip_nat_multi_range structure is used mainly to specify the set of addresses available for address translation. It contains one or several structures of the type ip_nat_range, each specifying a contiguous IP address range, and a rangesize field that holds the number of contained ip_nat_range structures:

    struct ip_nat_multi_range {
            unsigned int rangesize;

            /* hangs off end. */
            struct ip_nat_range range[1];
    };

struct ip_nat_range

linux/netfilter_ipv4/ip_nat.h


The ip_nat_range structure serves to represent a continuous IP address range; the range boundaries are specified in the min_ip and max_ip fields:

    /* Single range specification. */
    struct ip_nat_range {
            /* Set to OR of flags above. */
            unsigned int flags;

            /* Inclusive: network order. */
            u_int32_t min_ip, max_ip;

            /* Inclusive: network order */
            union ip_conntrack_manip_proto min, max;
    };

If the flags bit vector contains the value IP_NAT_RANGE_PROTO_SPECIFIED, then the min and max fields additionally specify a protocol-specific address range: a port-number range for TCP and UDP, a value range from the ICMP ID field for ICMP.
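As a minimal sketch of how such a set of ranges might be consulted, the following user-space code checks whether an address falls into any of the ranges. The types nat_range and nat_multi_range and the function ip_in_multi_range() are hypothetical simplifications: the protocol-specific part is omitted, and a pointer replaces the array that hangs off the end of the kernel structure. Because the boundaries are stored in network byte order, they are converted before being compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl(), ntohl() */

/* Simplified stand-in for ip_nat_range (protocol part omitted). */
struct nat_range {
    uint32_t min_ip, max_ip;          /* inclusive, network byte order */
};

/* Simplified stand-in for ip_nat_multi_range. */
struct nat_multi_range {
    unsigned int rangesize;           /* number of entries in range[] */
    const struct nat_range *range;    /* assumption: pointer, not trailing array */
};

/* Return 1 if ip (network byte order) lies in any range of mr, else 0. */
int ip_in_multi_range(const struct nat_multi_range *mr, uint32_t ip)
{
    unsigned int i;
    uint32_t host_ip = ntohl(ip);

    assert(mr != NULL);
    for (i = 0; i < mr->rangesize; i++) {
        /* Compare in host order: network-order values do not sort
         * numerically on little-endian machines. */
        if (host_ip >= ntohl(mr->range[i].min_ip) &&
            host_ip <= ntohl(mr->range[i].max_ip))
            return 1;
    }
    return 0;
}
```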

NAT fully relies on the means offered by the netfilter architecture to manage the selection rules needed to specify packets requiring an address translation. For this purpose, two new branch destinations, SNAT and DNAT, are defined with the functions ipt_snat_target() and ipt_dnat_target(), which do the actual address-translation work. A multipurpose parameter, targinfo, is used to pass the address range available for translation to these functions in the form of an ip_nat_multi_range structure.

struct ip_nat_info

linux/netfilter_ipv4/ip_nat.h


As was explained in Chapter 20, the connection-tracking module creates a data structure of the type ip_conntrack for each session flow, to store all relevant information. If the NAT functionality was activated in the Linux kernel (CONFIG_IP_NF_NAT_NEEDED), then this structure additionally contains a nat substructure, and the info field of that substructure contains a structure of the type ip_nat_info:

    struct ip_conntrack {
            ...
    #ifdef CONFIG_IP_NF_NAT_NEEDED
            struct {
                    struct ip_nat_info info;
                    ...
            } nat;
    #endif /* CONFIG_IP_NF_NAT_NEEDED */
    };

The ip_nat_info structure stores the address bindings of a session flow:

    struct ip_nat_info {
            /* Set to zero when conntrack created: bitmask of maniptypes */
            int initialized;

            unsigned int num_manips;

            /* Manipulations to be done on this conntrack. */
            struct ip_nat_info_manip manips[IP_NAT_MAX_MANIPS];

            /* The mapping type which created us (NULL for null mapping). */
            const struct ip_nat_mapping_type *mtype;

            struct ip_nat_hash bysource, byipsproto;

            /* Helper (NULL if none). */
            struct ip_nat_helper *helper;
    };

The initialized bit vector specifies whether the address binding for the source address (bit 0) or the destination address (bit 1), or both, was initialized.
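The bit layout just described can be sketched as follows. The names nat_manip_type, binding_initialized(), and mark_initialized() are hypothetical; the sketch only assumes, as stated above, that bit 0 stands for the source binding and bit 1 for the destination binding.

```c
#include <assert.h>

/* Manipulation types; the numeric values double as bit positions. */
enum nat_manip_type { NAT_MANIP_SRC = 0, NAT_MANIP_DST = 1 };

/* Return nonzero if the binding for the given manipulation type is set up. */
int binding_initialized(int initialized, enum nat_manip_type type)
{
    return (initialized & (1 << type)) != 0;
}

/* Return the bit vector with the given manipulation type marked as set up. */
int mark_initialized(int initialized, enum nat_manip_type type)
{
    assert(type == NAT_MANIP_SRC || type == NAT_MANIP_DST);
    return initialized | (1 << type);
}
```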

The num_manips field specifies the number of executable manipulations stored in the manips vector. Manipulations are counted separately at different hooks and for different directions. Each of these manipulations is represented by a structure of the type ip_nat_info_manip.

The bysource and byipsproto fields include hash values used to sort the structure in the two hash tables described below, and helper is a pointer to an optional helper module. (See Section 21.4.2.)

struct ip_nat_info_manip

linux/netfilter_ipv4/ip_nat.h


    struct ip_nat_info_manip {
            /* The direction. */
            u_int8_t direction;

            /* Which hook the manipulation happens on. */
            u_int8_t hooknum;

            /* The manipulation type. */
            u_int8_t maniptype;

            /* Manipulations to occur at each conntrack in this dirn. */
            struct ip_conntrack_manip manip;
    };

The ip_nat_info_manip structure represents a manipulation or address binding. It contains the direction (IP_CT_DIR_ORIGINAL for the forward direction, IP_CT_DIR_REPLY for the reverse direction), the netfilter hook number, and the address translation type (IP_NAT_MANIP_SRC for source NAT, IP_NAT_MANIP_DST for destination NAT), and its ip_conntrack_manip structure includes the IP address and port number to which the former address should be mapped.

To manage address bindings, the NAT module uses two hash tables, bysource and byipsproto, where collisions are resolved by linear lists. The byipsproto table is used to account for mappings done, to ensure that no two mappings to the same IP address exist. The keys are the transport protocol number and the IP source and destination addresses of the session flow after the address translation.

The keys for the bysource table are the transport protocol, the IP source address, and the source port before the address translation. This table is used by the find_appropriate_src() function to detect existing session flows. (See Section 21.3.4.)
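A hash table with collision chains, keyed like the bysource table, might look as follows in a simplified user-space form. All names here (src_key, src_entry, src_table, hash_by_src(), and so on) are hypothetical, the table size is arbitrary, and the hash function is purely illustrative; it is not the one used by the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAT_HTABLE_SIZE 64u   /* arbitrary size for this sketch */

/* Key of the table: values before the address translation. */
struct src_key {
    uint8_t  proto;
    uint32_t src_ip;
    uint16_t src_port;
};

/* One binding; colliding entries are chained through next. */
struct src_entry {
    struct src_key key;
    uint32_t mapped_ip;
    uint16_t mapped_port;
    struct src_entry *next;
};

struct src_table {
    struct src_entry *bucket[NAT_HTABLE_SIZE];
};

/* Illustrative hash only; not the kernel's hash function. */
unsigned int hash_by_src(const struct src_key *k)
{
    return (k->src_ip + k->src_port + k->proto) % NAT_HTABLE_SIZE;
}

/* Insert an entry at the head of its bucket's chain. */
void src_table_insert(struct src_table *t, struct src_entry *e)
{
    unsigned int h;

    assert(t != NULL && e != NULL);
    h = hash_by_src(&e->key);
    e->next = t->bucket[h];
    t->bucket[h] = e;
}

/* Walk the chain of the key's bucket; return the entry or NULL. */
const struct src_entry *src_table_find(const struct src_table *t,
                                       const struct src_key *k)
{
    const struct src_entry *e;

    for (e = t->bucket[hash_by_src(k)]; e != NULL; e = e->next) {
        if (e->key.proto == k->proto &&
            e->key.src_ip == k->src_ip &&
            e->key.src_port == k->src_port)
            return e;
    }
    return NULL;
}
```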

21.3.2 Initializing and Uninitializing the NAT Module

The two functions init() and fini() are used to initialize and uninitialize the NAT module. In turn, these two functions invoke the init_or_cleanup() function with parameter 1 (initialize) or 0 (uninitialize).

init_or_cleanup()

net/ipv4/netfilter/ip_nat_standalone.c


The init_or_cleanup() function serves to initialize or uninitialize the NAT module, depending on the value of the init parameter.

To initialize the NAT module, the ip_nat_rule_init() function (from net/ipv4/netfilter/ip_nat_rule.c) is invoked first. This function uses ipt_register_table() to create the new netfilter table, nat. Subsequently, it uses ipt_register_target() to register the new branch destinations, SNAT and DNAT, with the handling functions ipt_snat_target() and ipt_dnat_target(). Next, the ip_nat_init() function (from net/ipv4/netfilter/ip_nat_core.c) initializes the standard protocols TCP, UDP, and ICMP and the two hash tables, bysource and byipsproto. Finally, the functions ip_nat_fn(), ip_nat_local_fn(), and ip_nat_out() are registered with the appropriate hooks, and the usage counter of the connection-tracking module is incremented.

To uninitialize the NAT module, the cleanup work is done in reverse order: First, the usage counter of the connection tracking module is decremented; then, the NAT functions are removed from the hooks; next, ip_nat_cleanup() deletes the hash tables and the transport protocol modules; and, finally, ip_nat_rule_cleanup() releases the NAT branch destinations and the NAT table.
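The pattern of one function that either sets everything up or tears it down in reverse order can be sketched as follows. The stub functions and step_log are hypothetical and merely record the order of the steps; the usage counter of the connection-tracking module is left out. Falling through a chain of labels is a common C idiom for this kind of reverse-order cleanup, and it also covers the case in which a setup step fails halfway.

```c
#include <assert.h>
#include <string.h>

/* Log of executed steps, so that the ordering can be inspected. */
char step_log[64];

static void log_step(const char *s)
{
    assert(s != NULL);
    strncat(step_log, s, sizeof(step_log) - strlen(step_log) - 1);
}

/* Hypothetical stubs standing in for the real setup and teardown steps. */
static int  rule_init(void)        { log_step("R"); return 0; }
static int  core_init(void)        { log_step("C"); return 0; }
static int  hooks_register(void)   { log_step("H"); return 0; }
static void hooks_unregister(void) { log_step("h"); }
static void core_cleanup(void)     { log_step("c"); }
static void rule_cleanup(void)     { log_step("r"); }

/* init != 0: set everything up; init == 0: tear down in reverse order. */
int init_or_cleanup(int init)
{
    int ret = 0;

    if (!init)
        goto cleanup;

    ret = rule_init();
    if (ret < 0)
        goto cleanup_nothing;
    ret = core_init();
    if (ret < 0)
        goto cleanup_rule;
    ret = hooks_register();
    if (ret < 0)
        goto cleanup_core;
    return ret;

cleanup:
    hooks_unregister();
cleanup_core:
    core_cleanup();
cleanup_rule:
    rule_cleanup();
cleanup_nothing:
    return ret;
}
```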

21.3.3 How an IP Packet Travels Through the NAT Module

An IP packet that traverses the system is handed over to the NAT module twice, once on its way in (at one of the first two hooks listed below) and once on its way out:

  • When it enters the system at the netfilter hook NF_IP_PRE_ROUTING. At this point, the packet is handled by the ip_nat_fn() function.

  • Alternatively, at the netfilter hook NF_IP_LOCAL_OUT, once it has been created by a local process. Fragmented packets slip past NAT at this hook; ip_nat_fn() is invoked for all other packets.

  • At netfilter hook NF_IP_POST_ROUTING, when the packet leaves the system. The appropriate function is ip_nat_out(); it first reassembles fragmented packets, if present, and then invokes ip_nat_fn(). As was described in Section 20.2.6, although fragmented packets are reassembled by the connection-tracking module (ip_conntrack_in() function), these packets might have been fragmented again by the routing code in the meantime.

ip_nat_fn()

net/ipv4/netfilter/ip_nat_standalone.c


This function is invoked for each packet (not only for packets subject to address translation). The parameters it takes include the number of the netfilter hook where it was invoked and a pointer to the sk_buff structure that holds the packet.

First, the HOOK2MANIP macro is used to select the NAT variant to be used from the hook number. At netfilter hook NF_IP_POST_ROUTING, the source address (IP_NAT_MANIP_SRC) should be changed; otherwise, the destination address (IP_NAT_MANIP_DST) has to be changed. Subsequently, the ip_conntrack_get() function is invoked from the connection-tracking module to discover the associated connection entry and its state. The further approach differs, depending on this state:

  • Expected connection (IP_CT_RELATED): If the packet is an ICMP message, then the function icmp_reply_translation() is invoked, which does the actual address translation. Otherwise, the packet is handled exactly as in the IP_CT_NEW case.

  • New connection (IP_CT_NEW): The ip_nat_info structure in the connection entry is checked to see whether the address allocation has already been initialized. (This can happen, for example, when a connection establishment packet is retransmitted after a timeout.) If this is not the case, then the function ip_nat_rule_find() initiates the nat netfilter table processing. If the new connection requires address translation, netfilter invokes one of the branch destination functions, ipt_snat_target() or ipt_dnat_target(), to initialize a new address binding. Finally, place_in_hashes() adds the new address to the two hash tables, byipsproto and bysource.

  • Other cases: No new address binding has to be created in any other case. Instead, only the ip_nat_info structure is read from the connection entry, to see whether any address binding applies.

Finally, the last step invokes do_bindings() (see Section 21.3.5), which handles the actual address translation; its return value (normally NF_ACCEPT) is passed to the calling function.
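The decisions described for ip_nat_fn() can be condensed into the following sketch. It follows the description in this section rather than the kernel source: hook_to_manip() is a hypothetical function standing in for the HOOK2MANIP macro, classify() and the enum types are invented for illustration, and only the hook numbers mirror the IPv4 netfilter hook numbering.

```c
#include <assert.h>

/* IPv4 netfilter hook numbers. */
enum {
    NF_IP_PRE_ROUTING = 0, NF_IP_LOCAL_IN = 1, NF_IP_FORWARD = 2,
    NF_IP_LOCAL_OUT = 3, NF_IP_POST_ROUTING = 4
};

enum manip_type { MANIP_SRC, MANIP_DST };
enum ct_state   { CT_ESTABLISHED, CT_RELATED, CT_NEW };
enum nat_action { ACT_ICMP_REPLY_TRANSLATION, ACT_RULE_LOOKUP, ACT_USE_EXISTING };

/* Following the text: source NAT at POST_ROUTING, destination NAT elsewhere. */
enum manip_type hook_to_manip(unsigned int hooknum)
{
    return hooknum == NF_IP_POST_ROUTING ? MANIP_SRC : MANIP_DST;
}

/* Decide what has to happen before the bindings are applied. */
enum nat_action classify(enum ct_state state, int is_icmp,
                         int initialized, enum manip_type type)
{
    assert(type == MANIP_SRC || type == MANIP_DST);

    if (state == CT_RELATED && is_icmp)
        return ACT_ICMP_REPLY_TRANSLATION;

    if (state == CT_RELATED || state == CT_NEW) {
        /* Consult the nat table only if this binding is not set up yet. */
        if (!(initialized & (1 << type)))
            return ACT_RULE_LOOKUP;
    }
    return ACT_USE_EXISTING;
}
```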

21.3.4 Initializing an Address-Binding Process

ipt_snat_target(), ipt_dnat_target()

net/ipv4/netfilter/ip_nat_rule.c


The ipt_snat_target() and ipt_dnat_target() functions are registered branch destinations for the netfilter table, nat. This table is processed only when ip_nat_fn() has detected the first packet of a new session flow. The table uses its rules list to identify session flows subject to source NAT or destination NAT.

Initially, either of the two functions uses ip_conntrack_get() to find the corresponding connection entry of the connection-tracking module and then invokes the ip_nat_setup_info() function to do a new address binding. The result of this invocation, the address information of the newly assigned binding, is passed to the calling function.

ip_nat_setup_info()

net/ipv4/netfilter/ip_nat_core.c


This function is invoked by the handling functions for the SNAT and DNAT branch destinations and does essentially three things:

  • It invokes get_unique_tuple() (see below) to search for a free address available for the address translation. If no free address is available, then the value NF_DROP is returned and the packet is dropped.

    To take other address translations into account, the search does not start from the connection-tracking entry for the forward direction of the session flow. Instead, it starts from the entry for the reverse direction, inverted by invert_tuplepr(), because, in contrast to the forward entry, the reverse entry already includes any translated addresses.

  • Invoking the ip_conntrack_alter_reply() function causes the reverse direction of a connection entry to be altered so that reply packets can be allocated properly to the correct connection, despite its translated address.

  • Finally, the new address binding is added to the ip_nat_info substructure of the connection entry. The mapping rules (e.g., transformation of source address and source port) result from comparing the original ip_conntrack_tuple structure with the new structure supplied by get_unique_tuple().

get_unique_tuple()

net/ipv4/netfilter/ip_nat_core.c


This function is invoked by ip_nat_setup_info() to search for a free address within a specified address range (represented by a structure of the type ip_nat_multi_range).

If the function is invoked at the NF_IP_POST_ROUTING hook (i.e., to map the source address), then it invokes the function find_appropriate_src() to check whether the bysource hash table already holds an address binding for the source address of that packet (IP address, protocol number, and protocol-specific part). If this binding is within the specified address range, it is returned as the new address entry.

In all other cases, the combination of IP address and protocol number least used within the specified address range is determined by the function find_best_ips_proto_fast(). This function iterates over all possible IP addresses in the address range and uses the byipsproto hash table to check the number of bindings existing at each IP address. If it finds an appropriate IP address, the function ip_nat_used_tuple() checks whether this address is unique and, if the IP_NAT_RANGE_PROTO_SPECIFIED flag is set, the protocol-specific function proto->in_range() checks whether the protocol-specific part is within the address range. If this is the case, then the tuple is returned. If this is not the case, both the new address and the address range are passed to the protocol-specific function, proto->unique_tuple(). This function attempts to vary the protocol-specific part (e.g., TCP port number or ICMP ID) to find a unique combination. If this attempt is unsuccessful, find_best_ips_proto_fast() is used to select the next best IP address, and the checking process starts over again. Eventually, if no combination can be found, the value 0 is returned, and the packet is dropped.
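The search strategy described above (least-used address first, then variation of the protocol-specific part, then the next best address) can be sketched in simplified form. The sketch makes several assumptions: the types binding and pool and the function pick_unique() are hypothetical, a flat array replaces the byipsproto hash table, the protocol-specific part is reduced to a port number, the shortcut through find_appropriate_src() is omitted, and the address range is limited to 32 addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IPS 32   /* this sketch supports ranges of at most 32 addresses */

struct binding { uint32_t ip; uint16_t port; };

struct pool {
    uint32_t min_ip, max_ip;        /* inclusive, host byte order here */
    uint16_t min_port, max_port;    /* inclusive */
    const struct binding *used;     /* bindings already handed out */
    size_t n_used;
};

static int in_use(const struct pool *p, uint32_t ip, uint16_t port)
{
    size_t i;
    for (i = 0; i < p->n_used; i++)
        if (p->used[i].ip == ip && p->used[i].port == port)
            return 1;
    return 0;
}

static size_t usage(const struct pool *p, uint32_t ip)
{
    size_t i, n = 0;
    for (i = 0; i < p->n_used; i++)
        if (p->used[i].ip == ip)
            n++;
    return n;
}

/* Pick a free (ip, port) pair, preferring the least-used address and the
 * original port. Returns 1 on success, 0 if the pool is exhausted. */
int pick_unique(const struct pool *p, uint16_t orig_port, struct binding *out)
{
    uint32_t full = 0;   /* bit i set: address min_ip + i is exhausted */
    uint32_t n_ips;

    assert(p != NULL && out != NULL);
    assert(p->max_ip >= p->min_ip && p->max_ip - p->min_ip < MAX_IPS);
    n_ips = p->max_ip - p->min_ip + 1;

    for (;;) {
        uint32_t i, port, best = n_ips;

        /* Select the least-used address not yet marked as full. */
        for (i = 0; i < n_ips; i++) {
            if (full & ((uint32_t)1 << i))
                continue;
            if (best == n_ips ||
                usage(p, p->min_ip + i) < usage(p, p->min_ip + best))
                best = i;
        }
        if (best == n_ips)
            return 0;                   /* every address is exhausted */
        out->ip = p->min_ip + best;

        /* First choice: keep the original port if it is in range and free. */
        if (orig_port >= p->min_port && orig_port <= p->max_port &&
            !in_use(p, out->ip, orig_port)) {
            out->port = orig_port;
            return 1;
        }
        /* Otherwise vary the protocol-specific part. */
        for (port = p->min_port; port <= p->max_port; port++) {
            if (!in_use(p, out->ip, (uint16_t)port)) {
                out->port = (uint16_t)port;
                return 1;
            }
        }
        full |= (uint32_t)1 << best;    /* try the next best address */
    }
}
```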

21.3.5 The Actual Address Translation

do_bindings()

net/ipv4/netfilter/ip_nat_core.c


This function applies the address bindings specified in the ip_nat_info structure to a packet. To this end, it searches the info->manips list for all bindings that belong to the matching direction and the matching hook. For each of them, it invokes the manip_pkt() function to do the appropriate transformations. manip_pkt() transforms the IP addresses, recalculates the checksum in the IP header, and invokes the proto->manip_pkt() function for the protocol-specific part, which does the actual translation and computes checksums, if present.

Finally, do_bindings() invokes a helper module, if present (e.g., for correct FTP handling; see Section 21.1.7).
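The selection of the matching bindings can be illustrated by the following sketch. It assumes simplified, hypothetical types (manip_rec, packet) and a hypothetical function apply_bindings(); checksum recalculation and the helper call are deliberately left out, so the sketch shows only how direction and hook number decide which manipulations touch a packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum manip_type { MANIP_SRC, MANIP_DST };

/* Simplified stand-in for ip_nat_info_manip. */
struct manip_rec {
    uint8_t  direction;    /* 0: original direction, 1: reply direction */
    uint8_t  hooknum;      /* hook at which the manipulation applies */
    uint8_t  maniptype;    /* enum manip_type */
    uint32_t ip;           /* address to write into the packet */
    uint16_t port;         /* port to write into the packet */
};

/* Simplified packet: only the fields that NAT rewrites. */
struct packet {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* Apply all manipulations that match direction and hook.
 * Returns the number of manipulations applied. */
unsigned int apply_bindings(const struct manip_rec *manips, size_t n,
                            uint8_t direction, uint8_t hooknum,
                            struct packet *pkt)
{
    size_t i;
    unsigned int applied = 0;

    assert(pkt != NULL);
    for (i = 0; i < n; i++) {
        const struct manip_rec *m = &manips[i];

        if (m->direction != direction || m->hooknum != hooknum)
            continue;
        if (m->maniptype == MANIP_SRC) {
            pkt->src_ip = m->ip;
            pkt->src_port = m->port;
        } else {
            pkt->dst_ip = m->ip;
            pkt->dst_port = m->port;
        }
        applied++;
    }
    return applied;
}
```

For a source NAT binding, one record would rewrite the source of packets in the original direction when they leave the system, and a second record would rewrite the destination of reply packets when they enter it.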


       


    Linux Network Architecture
    ISBN: 131777203
    EAN: N/A
    Year: 2004
    Pages: 187
