16.2 Configuration

This section describes the options available to configure routing in Linux. First, this concerns the kernel configuration, which is used, for example, to determine whether advanced features, such as rule-based routing, should be integrated into the kernel. The options available for this configuration are described in Section 16.2.1. Second, you can also modify some routing parameters while the system is running. The setting options available for this in the proc file system are discussed in Section 16.2.2. Third, you have to add entries to routing tables and rule lists. The ip command, which is described in Section 16.2.3, is a good tool to manage such entries.

16.2.1 Configuring the Kernel

Some routing options can be set when you configure the Linux kernel, before it is compiled. All of them are in the networking options section and are described briefly below. For each option, we give the name of the preprocessor constant that is defined when the option is activated, followed in double quotes by the label shown in the kernel configurator. Some of these options can be activated only if CONFIG_INET ("TCP/IP networking") is enabled; without it, routing makes no sense anyway.

  • CONFIG_NETLINK "Kernel/User netlink socket"

    Rather than directly influencing the routing mechanism, this option activates the bidirectional netlink interface between the kernel and the user-address space, which is implemented with datagram sockets of the new protocol family, PF_NETLINK, and can be used to communicate with different kernel areas. The respective area is selected by an identifier, which is given instead of a protocol when you open the socket. Section 26.3.3 describes more details.

    In connection with routing, the NETLINK_ROUTE "protocol identifier" is important, and it can be used by activating the following option. This option is available only provided that CONFIG_NETLINK is active:

    • CONFIG_RTNETLINK "Routing messages"

      Routing rules and routing tables can be modified by using sockets of the PF_NETLINK protocol family and the NETLINK_ROUTE "protocol." This interface, which will also be called RT netlink interface below, is used in the ip configuration tool described in Section 16.2.3. Besides, by reading an RT netlink socket, you can "eavesdrop" on changes made to routing tables by other processes.

  • CONFIG_IP_ADVANCED_ROUTER "IP: advanced router"

    This option has no direct effect; it is a switch that lets you select a number of additional options, which can be used to obtain much more control over the routing procedure. The options CONFIG_NETLINK and CONFIG_RTNETLINK are activated automatically when you select CONFIG_IP_ADVANCED_ROUTER.

    • CONFIG_IP_MULTIPLE_TABLES "IP: policy routing"

      This option links the file fib_rules.o into the kernel and enables the rule-based routing described in Section 16.1.6. If this option is disabled, then the kernel creates only two routing tables, local and main, and searches them in this order.

      The following additional options are available in connection with rule-based routing:

      • CONFIG_IP_ROUTE_FWMARK "IP: use netfilter MARK value as routing key"

        This option allows you to include the fwmark, which can be added to certain packets by using packet filter rules (see Section 19.3.5), in the forwarding decision (i.e., you can specify different routes for packets with different packet filter marks). For example, you can make the route selection indirectly dependent on transport-protocol attributes (e.g., ports). CONFIG_NETFILTER ("network packet filtering") has to be active to be able to select CONFIG_IP_ROUTE_FWMARK.

      • CONFIG_IP_ROUTE_NAT "IP: fast network address translation"

        When this option is active, you can use special routing entries to translate addresses (Network Address Translation, NAT). This functionality complements the NAT rules mentioned in Section 16.1.6; see [Kuzn99] for a description of how you can configure this rarely used option.

        Activating CONFIG_IP_ROUTE_NAT causes ip_nat_dumb.o to be linked into the kernel.

    • CONFIG_IP_ROUTE_MULTIPATH "IP: equal cost multipath"

      If the routing table includes several equal-ranking entries to a specific destination, then Linux traditionally selects the first. This behavior cannot be used meaningfully, because the order in which the entries are found cannot be seen or influenced from outside of the kernel. You can use the option CONFIG_IP_ROUTE_MULTIPATH to enable special entries that specify several equal routes, and then have one of these routes selected randomly.

    • CONFIG_IP_ROUTE_TOS "IP: use TOS value as routing key"

      When enabled, this option causes the value of the Differentiated Services Codepoint field from the IP packet header to be included in the routing decision. (This field was formerly called Type of Service, which is the reason it is still referred to as the TOS field in the kernel and in this chapter.) You can assign values for this field in routing-table entries, which means that these entries will be used only for packets with matching values in the TOS field.

    • CONFIG_IP_ROUTE_VERBOSE "IP: verbose route monitoring"

      If this option is enabled, then messages are written to the system log when certain error situations occur during the routing process; these are normally caused by attacks or faulty configurations.

    • CONFIG_IP_ROUTE_LARGE_TABLES "IP: large routing tables"

      The hash tables used to manage routing table entries normally have a fixed size. The size of these tables is increased automatically when CONFIG_IP_ROUTE_LARGE_TABLES is activated, so that the access speed doesn't drop when they include many entries.

  • CONFIG_IP_MROUTE "IP: multicast routing"

    This option activates multicast routing and links the ipmr.o file into the kernel. Multicast routing is discussed in Chapter 17.

  • CONFIG_WAN_ROUTER "WAN router"

    This option has no effect on the routing procedure. It includes the general management functionality for special network interfaces used to build Wide Area Networks (WANs). This special hardware allows you to use a Linux computer as WAN router.

  • CONFIG_NET_FASTROUTE "Fast switching"

    If the input and output interfaces of a forwarded packet are different, then you can accelerate the copying process required in some cases by special hardware support directly from network card to network card. CONFIG_NET_FASTROUTE has to be enabled to be able to use this option. The only effect on the routing procedure is that a mark is set in situations suitable for fast copying. This can be handled by the drivers of network cards, if the required hardware is available.

  • CONFIG_NET_SCHED "QoS and/or fair queuing"

    This option allows you to activate the options for traffic control, described in Chapter 18. We include this option here only because routing rules and routing-table entries can be used to classify packets. Notice that this requires the suboption CONFIG_NET_CLS ("Packet classifier API") and its suboption CONFIG_NET_CLS_ROUTE4 to be activated. As a consequence, the symbol CONFIG_NET_CLS_ROUTE is defined additionally. This symbol can be configured nowhere else, and it causes the data structures for routing rules and routing-table entries to be extended by an element required for classification.
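
To check which of these options a given kernel was built with, you can inspect its configuration file. The following is a minimal sketch under stated assumptions: the helper names config_state and routing_options_report are invented for illustration, and the location of the configuration file (.config in the kernel source tree, or /boot/config-$(uname -r) on many distributions) is an assumption that depends on the system.

```shell
# Print the state of one option in a kernel configuration file:
# "y" (built in), "m" (module), or "n" (not set or absent).
# config_state is a hypothetical helper, not part of any standard tool.
config_state() {
    option=$1
    file=$2
    # Enabled options appear as OPTION=y or OPTION=m; disabled ones
    # appear only in a comment line ("# OPTION is not set").
    value=$(sed -n "s/^${option}=//p" "$file" | head -n 1)
    echo "${value:-n}"
}

# Report some of the routing-related options discussed in this section.
routing_options_report() {
    file=$1
    for option in CONFIG_INET CONFIG_NETLINK CONFIG_RTNETLINK \
                  CONFIG_IP_ADVANCED_ROUTER CONFIG_IP_MULTIPLE_TABLES \
                  CONFIG_IP_ROUTE_FWMARK CONFIG_IP_ROUTE_MULTIPATH \
                  CONFIG_IP_MROUTE; do
        printf '%s=%s\n' "$option" "$(config_state "$option" "$file")"
    done
}
```

Calling routing_options_report with the path of a configuration file prints one line per option. An option that does not exist in the kernel version at hand is simply reported as n.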

16.2.2 Files in the proc File System

Some entries in the proc directory tree can be used to probe and manipulate data structures and routing properties. You find such entries in two different directories, /proc/net and /proc/sys/net/ipv4.

The /proc/net Directory

The /proc/net directory includes files that reflect extensive routing-related data structures in the kernel, namely the main routing table in route and the routing cache in rt_cache. The rt_acct file is intended to provide statistics about the number of packets or bytes that used a specific route; however, this facility is not yet used, so the file is always empty. All files mentioned here are read-only.
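
The route file is not easy to read directly, because addresses are printed as hexadecimal numbers. The sketch below decodes them. It rests on assumptions that are not stated in the text above: the column order (Iface, Destination, Gateway, Flags, RefCnt, Use, Metric, Mask, ...) and the byte order of the hexadecimal fields as they appear on a little-endian machine. The function names hex_to_ip and print_routes are made up for this example.

```shell
# Convert an 8-digit hex field from /proc/net/route into dotted-quad
# notation. Assumes a little-endian machine, where the four address
# bytes appear in reverse order.
hex_to_ip() {
    h=$1
    b4=$(printf '%s' "$h" | cut -c1-2)
    b3=$(printf '%s' "$h" | cut -c3-4)
    b2=$(printf '%s' "$h" | cut -c5-6)
    b1=$(printf '%s' "$h" | cut -c7-8)
    printf '%d.%d.%d.%d\n' "0x$b1" "0x$b2" "0x$b3" "0x$b4"
}

# List the entries of a file in /proc/net/route format in readable form.
# Assumed columns: Iface Destination Gateway Flags RefCnt Use Metric Mask ...
print_routes() {
    file=${1:-/proc/net/route}
    # Skip the header line, then decode the address columns.
    tail -n +2 "$file" |
    while read -r iface dest gw flags refcnt use metric mask rest; do
        printf '%s dest=%s gw=%s mask=%s metric=%s\n' \
            "$iface" "$(hex_to_ip "$dest")" "$(hex_to_ip "$gw")" \
            "$(hex_to_ip "$mask")" "$metric"
    done
}
```

Called without an argument, print_routes reads the live table from /proc/net/route; the optional file argument only makes it possible to try the function on a saved copy.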

The /proc/sys/net/ipv4 Directory

The entries underneath /proc/sys are created by a relatively new uniform mechanism. Each of them describes a configurable parameter of the kernel. They can be probed and modified either by reading from or writing to a file, or by using the system call _sysctl() and the sysctl command. Entries for parameters of the IPv4 implementation, some of which are related to routing, are located underneath /proc/sys/net/ipv4:

  • ip_forward: This entry represents a switch for the forwarding functionality; the system acts as a router whenever this entry is set to one. If it is set to zero, then all packets received and not addressed to the local system are discarded.

  • route subdirectory: With one exception, the files in the route subdirectory reflect numeric or Boolean values; the kernel uses them to manage the routing cache, among other things. The directory entries normally have the same names as their variables in the kernel, except that the variables carry an ip_rt_ prefix. The exact meanings of these entries are not discussed here, with a single exception: Writing to the flush entry causes the routing cache to be deleted.

  • conf/device subdirectories: /proc/sys/net/ipv4/conf includes a number of subdirectories: one for each registered network interface (lo, eth0, ...), one named default, and one named all. All directories include the same entries, which refer to the interface with the same name. In addition, the entries in the all directory are global for all interfaces, and the entries in the default directory represent default values for any interfaces registered in the future. The following entries are of interest for the routing mechanism:

    • forwarding: Like the entry in /proc/sys/net/ipv4/ip_forward, the entry in forwarding represents a switch for the forwarding mechanism. The entry in the all directory even reflects exactly the same value. The entries in the interface directories apply only to the forwarding of packets that arrived via specific interfaces. Each time that the switch value (except the default value) is changed, the routing cache is automatically deleted. The all value (and accordingly also the value in /proc/sys/net/ipv4/ip_forward) has particular semantics: When it is written, then all interface entries and the default entry are automatically set to the same new value.

    • log_martians: If the all entry or the entry of an interface is set to 1, then so-called "Martians," that is, packets with illegal address values (e.g., values that are incorrect with respect to the configuration of the interface that received the packet), are shown in the system log.

    • rp_filter: If the all entry and the entry of an interface are active, then packets arriving over this interface are subject to Reverse-Path Filtering. This means that the kernel checks whether a packet with exchanged source and destination addresses would, according to the routing tables, be sent over the interface that actually received the packet. If this test fails, then the packet is discarded. Reverse-Path Filtering is a sensible security measure against packets with forged (or spoofed) source addresses. However, it can sometimes be useful to use different interfaces for different directions intentionally; in such cases this measure causes problems, so it can be disabled.
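
The entries described above can be inspected with ordinary file operations. The following sketch prints the switches that matter for routing; the function name show_routing_switches is hypothetical, and its optional directory argument is an addition of this example so that the function can be tried on a copy of the tree.

```shell
# Print the forwarding-related switches below a sysctl directory.
# show_routing_switches is a hypothetical helper; the optional argument
# exists only so that the function can be tried on a copy of the tree.
show_routing_switches() {
    base=${1:-/proc/sys/net/ipv4}
    printf 'ip_forward=%s\n' "$(cat "$base/ip_forward")"
    for dir in "$base"/conf/*/; do
        name=$(basename "$dir")
        for entry in forwarding log_martians rp_filter; do
            if [ -r "$dir$entry" ]; then
                printf '%s.%s=%s\n' "$name" "$entry" "$(cat "$dir$entry")"
            fi
        done
    done
}

# Changing a value requires root. Both of the following turn forwarding on:
#   echo 1 > /proc/sys/net/ipv4/ip_forward
#   sysctl -w net.ipv4.ip_forward=1
# Writing to the flush entry empties the routing cache:
#   echo 1 > /proc/sys/net/ipv4/route/flush
```

The write commands are shown only as comments, because they change the behavior of the running system.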

16.2.3 Configuration on System Level

Before a Linux system can send IP packets, or act as a router and forward IP packets for other systems, we have to add appropriate entries to routing tables. Unless we are using a routing daemon for automatic routing based on a routing protocol, a capability hardly needed at the "outskirts" of the Internet, the system administrator has to either add static entries manually or use scripts upon system start or when new interfaces are added (e.g., when a PPP connection is established).

The "traditional" Unix command to manage routing tables is route. However, it does not support the relatively new rule-based routing, and all it allows you to do is modify the main table and read from the routing cache. It uses the ioctl() system call to interface to the kernel.

Alexey Kuznetsov, one of the major contributors to the development of the routing implementation in the Linux kernel, also proposed a tool that uses the more recent RT netlink interface to the kernel. It can be used to manipulate not only routing tables, but also a number of other parameters of the network configuration. The command is called ip, and it expects that the first parameter will always be an area to be configured. Table 16-1 shows an overview of all possible areas.

Table 16-1. Variants of the ip command.

  Command        Function
  ip link        Configures network interfaces (see also ifconfig).
  ip address     Manages additional addresses of network interfaces.
  ip neighbour   Manages the ARP table (see arp).
  ip route       Manages routing tables (see route).
  ip rule        Manages the routing rules database.
  ip maddress    Manages multicast address entries in network cards.
  ip mroute      Shows multicast routes.
  ip tunnel      Manages tunnels.
  ip monitor     Monitors the RT netlink interface.


The syntax of this command is relatively uniform for different areas: The area identifier normally is followed by an action identifier (e.g., show, add, delete, help), followed by area-specific parameters, which are denoted by a leading keyword. The action identifier help always supplies a syntax description. For example, ip route help shows the syntax of commands used to manipulate and query routing tables.

The following subsections describe only those two variants of the ip command that are used to manipulate routing rules and routing tables: ip rule, and ip route. More information about the other variants is included in the ip tool documentation [Kuzn99].

The ip rule Command

The ip rule variant of the ip command serves to output, add, or delete routing rules in the kernel database by using ip rule show, ip rule add, and ip rule delete. ip rule show outputs all rules; the other two commands require additional parameters to describe a rule. These parameters are denoted by a leading keyword, as shown in Table 16-2; see also Section 16.1.6 for a description of the meaning of these parameters.

Table 16-2. Parameters for ip rule add and ip rule delete.

  Keyword    Parameter
  type       Rule type (unicast, blackhole, unreachable, prohibit, nat).
  from       Source address prefix (prefix length separated by /).
  to         Destination address prefix.
  iif        Name of the input interface.
  tos        Value in the TOS field of the IP packet.
  fwmark     Value of the fwmark.
  priority   Unique priority value for the rule.
  table      Name or number of a routing table for unicast rules.
  realms     Class identifier of a queuing discipline.
  nat        First address of a NAT source address range for nat rules.


If mandatory parameters are not stated, then default values are used. The type used then is unicast with a reference to the main table, and the rule is assigned the priority value immediately below the smallest number in use (not counting the value 0, which is always present). The priority numbers should be unique, but notice that this is not checked. The names of routing tables are translated to table numbers by using the information from the configuration file /etc/iproute2/rt_tables, so names can also be assigned to tables other than the default tables.
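
The translation from table names to numbers can be illustrated with a short sketch. It assumes that each line of the configuration file holds a table number followed by a name, and that lines beginning with # are comments; the helper name table_number and the table name special are invented for this example.

```shell
# Resolve a routing-table name to its number using a file in the
# rt_tables format (assumed: "<number> <name>" per line, '#' comments).
# table_number is a hypothetical helper; the ip tool does this internally.
table_number() {
    name=$1
    file=${2:-/etc/iproute2/rt_tables}
    awk -v name="$name" '
        $1 !~ /^#/ && $2 == name { print $1; found = 1; exit }
        END { exit !found }
    ' "$file"
}

# As root, a name for table 99 could be registered and then used like this
# ("special" is an invented name):
#   echo "99 special" >> /etc/iproute2/rt_tables
#   ip rule add prio 1000 from 192.168/16 iif eth1 table special
```

The function returns a nonzero exit status when the name is not listed, which makes it usable in shell conditions.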

For example, to use a special routing table (in this case number 99) for IP packets with source address matching the 16-bit prefix 192.168 and received over the eth1 interface, we could use the following command to insert the rule at position 1000:

 root@tux # ip rule add prio 1000 from 192.168/16 iif eth1 table 99 

The rule type (unicast) can be omitted, because it coincides with the default value. Next, we can use ip rule show to output the existing set of rules, together with the default rules:

 root@tux # ip rule show
 0:        from all lookup local
 1000:     from 192.168.0.0/16 iif eth1 lookup 99
 32766:    from all lookup main
 32767:    from all lookup default

Of course, the table mentioned above (99) should also exist; otherwise, the rule would have no effect. We can create a new table implicitly by using ip route add to add entries to it.
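
As a hedged sketch of how the table and the rule fit together, the following script prints the commands that would fill table 99 and install the rule, and executes them only when the variable APPLY is set to 1 (which requires root). The helper names run and populate_table_99, the APPLY switch, and the gateway address 10.0.3.1 on eth0 are assumptions made for illustration; the gateway is assumed to be directly reachable.

```shell
# Print each command; execute it only when APPLY=1 (root required).
# run and APPLY are conventions of this sketch, not features of ip.
run() {
    printf '%s\n' "$*"
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    fi
}

populate_table_99() {
    # Adding the first entry creates table 99 implicitly.
    # The gateway 10.0.3.1 is a made-up example address.
    run ip route add default via 10.0.3.1 dev eth0 table 99
    # The rule from the text directs matching packets to that table.
    run ip rule add prio 1000 from 192.168/16 iif eth1 table 99
    # Display the new table to check the result.
    run ip route show table 99
}
```

With APPLY unset, calling populate_table_99 only lists the three commands, so the sequence can be reviewed before anything is changed.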

The ip route Command

You can use ip route add to add, ip route change or ip route replace to modify, ip route show to output, and ip route delete (for single entries) or ip route flush (for several entries at once) to delete entries in routing tables. In addition, you can use ip route get to simulate a forwarding procedure, where the route found is output and stored in the routing cache.

Routing table entries have a large number of attributes, which can be set with appropriate parameters when you create them. When entries are displayed or deleted, the parameters given act as selectors that limit the set of entries output or deleted. Table 16-3 shows only the most important parameters. We have divided them into three groups, by their meaning:

Table 16-3. The most important parameters for ip route.

  Keyword   Parameter
  table     name/number of the routing table to be manipulated
  to        type and destination address prefix
  tos       value for the TOS field of the IP packet
  metric    route quality (the higher, the worse)
  dev       output interface
  via       address of the next router


  • The table parameter is actually a command attribute rather than a route attribute. It specifies the routing table this command refers to. Here, too, either numbers or the names defined in /etc/iproute2/rt_tables can be used for tables. If the table parameter is not stated, then it is assumed that the command refers to the main table.

    For ip route show, you can also use all or cache together with table to display all tables or the routing cache.

  • The to and tos parameters together form the key used to search a routing table for a forwarding entry: to specifies a network prefix, and tos specifies the TOS value that IP packets must carry to use this entry. For an IP packet to be forwarded, the search algorithm first looks for the entry with the longest matching network prefix and then checks whether the TOS value matches, if one is set in the entry. If the TOS value does not match, then the search continues with shorter prefixes.

    The keyword to does not have to be stated, because the destination network prefix actually represents the default parameter for commands to manage routing entries. For a network prefix with length zero you can state the default keyword.

    Between the to keyword and the network prefix, you can optionally state a type for the entry you look for. Entries describing routes to other networks and end systems are normally of the type unicast, which is also assumed by default. The local table, which is maintained automatically by the kernel, can additionally use the types local and broadcast, which describe addresses of local network interfaces. Such entries are used to see whether incoming packets are meant for the local system. In addition, there are other entry types that virtually never occur. (See [Kuzn99].)

    The quality information, which can be set by the metric parameter, plays a role when several entries exist that otherwise match equally well. In this case, the entry with the smallest metric value is selected.

  • The result of a search in a routing table is essentially a network interface, which should be used to forward the current packet, and the next router, if the destination system is not in the network directly connected over this interface. The dev parameter can be used to specify a network interface. The next router is specified by the via parameter, if required.

    A next router can be specified only provided that it is known how this router can be reached. This means that another entry describing the subnetwork of this router has to exist; naturally, the next router always has to be in a directly connected subnetwork. The network interface used to reach this router can be determined from this entry. For this reason, when using via to specify a router when you create an entry, you don't have to use dev to specify a network interface.
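
The search order described in the list above (longest matching prefix, TOS check, then smallest metric) can be sketched as a small model. This is a simplification for illustration and not the kernel's implementation: the table format (prefix, prefix length, TOS, metric, and device on each line, with TOS 0 meaning "not set") and the function names ip_to_int and route_lookup are invented here, and a tie between entries of equal prefix length and metric is resolved in favor of the first one read, which is an arbitrary choice.

```shell
# Convert a dotted-quad address a.b.c.d to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# route_lookup DEST TOS < table
# Each table line: PREFIX LEN TOS METRIC DEV (TOS 0 means "not set").
# Prints the device of the selected entry; fails if nothing matches.
route_lookup() {
    dest=$(ip_to_int "$1")
    want_tos=$2
    best_len=-1
    best_metric=0
    best_dev=
    while read -r prefix len tos metric dev; do
        if [ "$len" -eq 0 ]; then
            mask=0
        else
            mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
        fi
        net=$(ip_to_int "$prefix")
        # Skip entries whose prefix does not cover the destination.
        [ $(( dest & mask )) -eq $(( net & mask )) ] || continue
        # An entry with a TOS value matches only packets carrying that value.
        if [ "$tos" -ne 0 ] && [ "$tos" -ne "$want_tos" ]; then
            continue
        fi
        # Prefer longer prefixes; among equal prefixes, the smaller metric.
        if [ "$len" -gt "$best_len" ] ||
           { [ "$len" -eq "$best_len" ] && [ "$metric" -lt "$best_metric" ]; }; then
            best_len=$len
            best_metric=$metric
            best_dev=$dev
        fi
    done
    [ -n "$best_dev" ] && echo "$best_dev"
}
```

Skipping entries with a mismatched TOS value before comparing prefix lengths gives the same result as searching the longest prefix first and falling back to shorter prefixes when the TOS check fails.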

There are a number of additional attributes you can assign to a routing entry (e.g., a set of TCP parameters; see Chapter 24), which will be used when the entry is assigned to a TCP connection.[1]

[1] For efficiency reasons, rather than doing a routing request for each single IP packet created by TCP, there is only one single routing request when a connection is established.

The following example shows the commands used to build the main routing table for router B from Figure 16-4:

 root@tux # ip route add 10.0.3/24 dev eth0
 root@tux # ip route add 10.0.4/24 dev eth1
 root@tux # ip route add 10.0.5/24 via 10.0.4.3
 root@tux # ip route add 10.0.2.1 dev ppp0
 root@tux # ip route add default via 10.0.2.1

However, the last two entries would normally be created not manually, but automatically by the PPP daemon as soon as a PPP connection is established. ip route show can be used to obtain the table shown in Figure 16-5 (in a slightly different format):

 root@tux # ip route show
 10.0.2.1 dev ppp0 scope link
 10.0.4.0/24 dev eth1 scope link
 10.0.5.0/24 via 10.0.4.3 dev eth1
 10.0.3.0/24 dev eth0 scope link
 default via 10.0.2.1 dev ppp0


       


    Linux Network Architecture
    ISBN: 131777203
    EAN: N/A
    Year: 2004
    Pages: 187
