ip route: Routing Table Management

Abbreviations: route, ro, r

This command manages the route entries within the kernel routing tables. The kernel routing tables keep information about protocol paths to other networked nodes.

As you saw in Chapter 2, "Policy Routing Theory," there are two parts to the implementation of the RPDB (Routing Policy Data Base). The ip route object allows specification and definition of the routing information base part of the RPDB.

To understand the massive amount of information in the following section you will want to study the syntax and command flow in the ip route help listing. When you understand the command syntax flow, you will realize that the rest of these sections essentially walk through the command parts piece by piece. Here is the output for ip route help:

 
 Usage: ip route { list | flush } SELECTOR
        ip route get ADDRESS [ from ADDRESS iif STRING ]
                             [ oif STRING ]  [ tos TOS ]
        ip route { add | del | change | append | replace | monitor } ROUTE
 SELECTOR := [ root PREFIX ] [ match PREFIX ] [ exact PREFIX ]
             [ table TABLE_ID ] [ proto RTPROTO ]
             [ type TYPE ] [ scope SCOPE ]
 ROUTE := NODE_SPEC [ INFO_SPEC ]
 NODE_SPEC := [ TYPE ] PREFIX [ tos TOS ]
              [ table TABLE_ID ] [ proto RTPROTO ]
              [ scope SCOPE ] [ metric METRIC ]
 INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...
 NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS
 OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]
            [ rtt NUMBER ] [ rttvar NUMBER ]
            [ window NUMBER] [ cwnd NUMBER ] [ ssthresh REALM ]
            [ realms REALM ]
 TYPE := [ unicast | local | broadcast | multicast | throw |
           unreachable | prohibit | blackhole | nat ]
 TABLE_ID := [ local | main | default | all | NUMBER ]
 SCOPE := [ host | link | global | NUMBER ]
 FLAGS := [ equalize ]
 NHFLAGS := [ onlink | pervasive ]
 RTPROTO := [ kernel | boot | static | NUMBER ]
 

ip route { add | change | replace }

This command adds, changes, or replaces routes in the routing tables.

  • ip route add: Add a new route

  • ip route change: Change a route

  • ip route replace: Change a route or add a new one

Abbreviations: add, a; change, chg; replace, repl

Arguments

to PREFIX or to TYPE PREFIX (default): The destination prefix of the route. If TYPE is omitted, ip assumes type unicast. Other values of TYPE are listed in Chapter 2 and are summarized here:

  • unicast: The route entry describes real paths to the destinations covered by the route prefix.

  • unreachable: These destinations are unreachable; packets are discarded and the ICMP message host unreachable (ICMP Type 3 Code 1) is generated. The local senders get error EHOSTUNREACH.

  • blackhole: These destinations are unreachable; packets are silently discarded. The local senders get error EINVAL.

  • prohibit: These destinations are unreachable; packets are discarded and the ICMP message communication administratively prohibited (ICMP Type 3 Code 13) is generated. The local senders get error EACCES.

  • local: The destinations are assigned to this host; the packets are looped back and delivered locally.

  • broadcast: The destinations are broadcast addresses; the packets are sent as link broadcasts.

  • throw: Special control route used together with policy rules. If a throw route is selected, lookup in this particular table is terminated, pretending that no route was found. Without any Policy Routing, it is equivalent to the absence of the route in the routing table: the packets are dropped and the ICMP message net unreachable (ICMP Type 3 Code 0) is generated. The local senders get error ENETUNREACH.

  • nat: Special NAT (Network Address Translation; see Chapter 8) route. Destinations covered by the prefix are considered dummy (or external) addresses, which require translation to real (or internal) ones before forwarding. The addresses to translate to are selected with the attribute via.

  • anycast (not implemented currently): The destinations are anycast addresses assigned to this host. They are mainly equivalent to local addresses, with the difference that such addresses are invalid as the source address of any packet.

  • multicast: Special type, used for multicast routing. It is not present in normal routing tables.
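The rejecting route types above differ only in the ICMP message sent to remote senders and the error returned to local senders. As a quick reference, that mapping can be written down as a small Python table. This is only a restatement of the list above; the names REJECT_ROUTE_TYPES and describe_reject are made up for this sketch and are not part of ip or the kernel.

```python
import errno

# Route types that reject traffic, as summarized in the list above:
# type -> (ICMP type, ICMP code, errno reported to local senders).
# blackhole discards silently, so it carries no ICMP message.
# For throw, the ICMP/errno values apply only when no other table
# supplies a route (that is, without any Policy Routing).
REJECT_ROUTE_TYPES = {
    "unreachable": (3, 1, errno.EHOSTUNREACH),
    "prohibit": (3, 13, errno.EACCES),
    "throw": (3, 0, errno.ENETUNREACH),
    "blackhole": (None, None, errno.EINVAL),
}


def describe_reject(route_type):
    """Return a one-line description of a rejecting route type (hypothetical helper)."""
    icmp_type, icmp_code, err = REJECT_ROUTE_TYPES[route_type]
    name = errno.errorcode[err]
    if icmp_type is None:
        return f"{route_type}: silent discard, local error {name}"
    return f"{route_type}: ICMP type {icmp_type} code {icmp_code}, local error {name}"
```

For instance, describe_reject("prohibit") reports ICMP type 3 code 13 and the local error EACCES.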

PREFIX is an IPv4 or IPv6 address optionally followed by a slash and prefix length. If the prefix length is missing, ip assumes a full-length host route. There is also one special PREFIX, default, that is equivalent to IP 0/0 or to IPv6 ::/0.
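The examples in this chapter write IPv4 prefixes in a shortened form such as 10.0.0/24 or 0/0. The following Python sketch shows one way to read that notation. It assumes that missing trailing octets are zero (consistent with the listing later in this section, where 193.233.7/24 is printed as 193.233.7.0/24) and it handles IPv4 only. The function name parse_ipv4_prefix is invented for the illustration and is not how ip itself is implemented.

```python
import ipaddress


def parse_ipv4_prefix(text):
    """Expand an abbreviated IPv4 PREFIX into an ipaddress.IPv4Network.

    Assumptions of this sketch: missing trailing octets are zero,
    a missing length means a /32 host route, and "default" means 0/0.
    """
    if text == "default":
        return ipaddress.ip_network("0.0.0.0/0")
    addr, _, length = text.partition("/")
    octets = addr.split(".")
    octets += ["0"] * (4 - len(octets))  # pad 10.0.0 -> 10.0.0.0
    if length == "":
        length = "32"  # no length given: full-length host route
    # strict=False masks any host bits instead of raising an error.
    return ipaddress.ip_network(".".join(octets) + "/" + length, strict=False)
```

With these assumptions, parse_ipv4_prefix("10.0.0/24") yields the network 10.0.0.0/24, and a bare address such as 193.233.7.82 becomes a /32 host route.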

  • tos TOS or dsfield TOS: Type of Service (TOS) key. This key has no associated mask, and the longest match is understood as follows: first the TOS of the route and of the packet are compared; if they are not equal, the packet may still match a route with zero TOS. TOS is either an 8-bit hexadecimal number or an identifier from /etc/iproute2/rt_dsfield.

  • metric NUMBER or preference NUMBER: Preference value of the route. NUMBER is an arbitrary 32-bit number.

  • table TABLEID: The table to add this route to. TABLEID may be a number or a string from the file /etc/iproute2/rt_tables. If this parameter is omitted, ip assumes table main, with the exception of local, broadcast, and nat routes, which are put into table local by default.

  • dev NAME: The output device name.

  • via ADDRESS: The address of the nexthop router. The meaning of this field depends on the route type. For normal unicast routes it is either a true nexthop router or, if it is a direct route installed in BSD compatibility mode, a local address of the interface. For nat routes it is the first address of the block of translated IP destinations.

  • src ADDRESS: The source address to prefer when sending to the destinations covered by the route prefix. This address must be defined on a local machine interface. This preference comes into play when routes and rules are combined with Masquerade and NAT functions as provided by other utilities.

  • realm REALMID: The realm this route is assigned to. REALMID may be a number or a string from the file /etc/iproute2/rt_realms.

  • mtu MTU or mtu lock MTU: The MTU along the path to the destination. If the lock modifier is not used, the MTU may be updated by the kernel due to path MTU discovery. If the lock modifier is used, no path MTU discovery is performed, and all packets are sent without the DF bit set for the IPv4 case or fragmented to the MTU for the IPv6 case.

  • window NUMBER: The maximum advertised window for TCP to these destinations, measured in bytes. This parameter limits the maximum data bursts your TCP peers are allowed to send to you.

  • rtt NUMBER: The initial RTT (Round Trip Time) estimate.

    Actually, in Linux 2.2 and 2.0 it is not the RTT but the initial TCP retransmission timeout. The kernel forgets it as soon as it receives the first valid ACK from a peer. Alas, this means that this attribute affects only the connection retry rate and is hence useless.

  • nexthop NEXTHOP: The nexthop of a multipath route. NEXTHOP is a complex value with its own syntax, as follows:

    via ADDRESS is the nexthop router.

    dev NAME is the output device.

    weight NUMBER is the weight of this element of the multipath route, reflecting its relative bandwidth or quality.

  • scope SCOPE_VAL: The scope of the destinations covered by the route prefix. SCOPE_VAL may be a number or a string from the file /etc/iproute2/rt_scopes. If this parameter is omitted, ip assumes scope global for all gatewayed unicast routes, scope link for direct unicast routes and broadcasts, and scope host for local routes.

  • protocol RTPROTO: The routing protocol identifier of this route. RTPROTO may be a number or a string from the file /etc/iproute2/rt_protos. If the routing protocol ID is not given, ip assumes the protocol is boot; in other words, "this route has been added by someone who does not understand what he is doing." Several of these protocol values have a fixed interpretation, as in the following list:

    • redirect: Route was installed due to an ICMP redirect.

    • kernel: Route was installed by the kernel during autoconfiguration.

    • boot: Route was installed during the bootup sequence. If a routing daemon starts, it purges all such routes. This is the value assigned to manually inserted routes that do not have a protocol specified.

    • static: Route was installed by the administrator to override dynamic routing. Routing daemons will respect such routes and advertise them if so configured.

    • ra: Route was installed by the Router Discovery protocol.

      Note that the rest of the values of RTPROTO are not reserved, and the administrator is free to assign or not assign protocol tags. Routing daemons should at least take care to set unique protocol values for themselves, such as those assigned in rtnetlink.h or in the rt_protos database.

  • onlink: Pretend that the nexthop is directly attached to this link, even if it does not match any interface prefix. One application of this option may be found in IP tunnels between dissimilar addresses.

  • equalize: Allow packet-by-packet randomization on multipath routes. Without this modifier, the route is frozen to one selected nexthop, so that load splitting occurs only on a per-flow basis. equalize works only if the appropriate kernel configuration option is chosen or if the kernel is patched. Note that the presence or absence of this modifier determines how load balancing is performed and also how traffic flows are policy routed in some situations.
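Several of the arguments above have defaults that depend on the route type. The following Python sketch collects the defaults stated for table, scope, and protocol in one place. It models only what the text says; the function name fill_defaults and the dictionary layout are assumptions made for this illustration, and route types whose default scope the text does not mention are simply left without one.

```python
def fill_defaults(route):
    """Return a copy of `route` with the defaults described above filled in.

    `route` is a plain dict such as {"prefix": "10.0.0/24", "via": "193.233.7.65"}
    (a hypothetical layout chosen for this sketch).
    """
    filled = dict(route)
    route_type = filled.setdefault("type", "unicast")

    # local, broadcast, and nat routes go to table local; everything else to main.
    if "table" not in filled:
        if route_type in ("local", "broadcast", "nat"):
            filled["table"] = "local"
        else:
            filled["table"] = "main"

    if "scope" not in filled:
        if route_type == "local":
            filled["scope"] = "host"
        elif route_type == "broadcast":
            filled["scope"] = "link"
        elif route_type == "unicast":
            # Gatewayed routes (those with via) are global, direct ones are link.
            filled["scope"] = "global" if "via" in filled else "link"
        # Other route types: the text states no default, so none is set here.

    # Routes added without a protocol are tagged boot.
    filled.setdefault("protocol", "boot")
    return filled
```

Under this model a route given only a dev and no via is treated as a direct route with scope link.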

Two more commands, prepend and append, exist. prepend does the same thing as the classic route add command: it adds the route even if another route to the same destination already exists. Its opposite is append, which adds the route to the end of the list. I strongly recommend that you avoid using these commands.

Unfortunately, IPv6 currently understands only the append command correctly, with all the rest of the command set translating to append . Certainly, this will change in the future.

ip route add Examples

To add a plain route to network 10.0.0/24 via gateway 193.233.7.65:

 
  ip route add 10.0.0/24 via 193.233.7.65  
 

To change it to a direct route via device dummy:

 
  ip ro chg 10.0.0/24 dev dummy  
 

To add a default multipath route, splitting the load between ppp0 and ppp1:

 
  ip route add default scope global nexthop dev ppp0 nexthop dev ppp1  
 

Note the scope value, which is not necessary but hints to the kernel that this route is gatewayed rather than direct. If you know the addresses of the remote endpoints, it would be better to specify them using the parameter via.
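To make the idea of per-flow load splitting concrete, here is a small Python sketch of choosing among weighted nexthops so that one flow always uses the same nexthop. It is a teaching model only and is not the kernel's algorithm: the CRC32 hash, the flow tuple, and the function names are all choices made for this illustration, and the weights of 1 in the usage example are an assumption because the command above specifies none.

```python
import zlib


def nexthop_for_slot(nexthops, slot):
    """Map a slot number in [0, total weight) to a nexthop name.

    `nexthops` is a list of (name, weight) pairs; a nexthop with
    weight 3 owns three consecutive slots.
    """
    for name, weight in nexthops:
        if slot < weight:
            return name
        slot -= weight
    raise ValueError("slot is outside the total weight")


def pick_nexthop(nexthops, flow):
    """Pick a nexthop for `flow`, a tuple such as (src, dst).

    The flow is hashed so that the same flow always lands on the
    same nexthop (per-flow splitting, as without equalize).
    """
    total = sum(weight for _, weight in nexthops)
    slot = zlib.crc32(repr(flow).encode()) % total
    return nexthop_for_slot(nexthops, slot)
```

For example, pick_nexthop([("ppp0", 1), ("ppp1", 1)], ("193.233.7.65", "10.0.0.1")) returns either ppp0 or ppp1, and returns the same answer every time for that flow.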

To nat the address 192.203.80.142 to 193.233.7.83 before forwarding:

 
  ip route add nat 192.203.80.142 via 193.233.7.83  

Note that the reverse nat translation is set up with policy rules, as described in the ip rule Policy Routing section.

ip route delete

Abbreviations: delete, del, d

ip route del has the same arguments as ip route add , but their semantics are a bit different.

Key values (dest, tos, preference, and table) select the route to delete. If any optional attributes are present, ip verifies that they coincide with the attributes of the route to delete. If no route with the given key and attributes is found, ip route del fails.
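The selection rule for deletion can be modeled in a few lines of Python. This is a sketch of the behavior as described in the paragraph above, not of the kernel code; routes are represented as plain dicts, and the names KEY_FIELDS and find_route_to_delete are invented for the example.

```python
# The key values that select the route to delete.
KEY_FIELDS = ("dest", "tos", "preference", "table")


def find_route_to_delete(routes, request):
    """Return the first route matching `request`, or None if the delete would fail.

    Key fields must be equal; any other attribute present in the
    request (for example via or dev) must coincide with the route's.
    """
    for route in routes:
        keys_match = all(route.get(k) == request.get(k) for k in KEY_FIELDS)
        attrs_match = all(
            route.get(k) == v for k, v in request.items() if k not in KEY_FIELDS
        )
        if keys_match and attrs_match:
            return route
    return None
```

A request that names the right key but a different via attribute finds nothing, which corresponds to ip route del failing.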

Linux kernel 2.0 could delete a route selected only by its prefix address while ignoring its netmask. This option no longer exists because the selection was ambiguous. If you want such functionality, look at the ip route flush command, which provides a richer set of capabilities.

ip route delete Examples

To delete the multipath route created by the add example previously:

 
  ip route del default scope global nexthop dev ppp0 nexthop dev ppp1  
 
ip route show

This format of the command allows viewing the routing table contents and looking at route(s) as selected by some criteria.

Abbreviations: show, list, sh, ls, l

Arguments

These are the selection arguments that allow you to select routes to show:

  • to SELECTOR (default): Select routes only from the given range of destinations. SELECTOR has optional modifiers (root, match, and exact) and a prefix.

  • root PREFIX: Selects routes with prefixes not shorter than PREFIX. For example, root 0/0 selects the entire routing table.

  • match PREFIX: Selects routes with prefixes not longer than PREFIX. match 10.0/16 selects 10.0/16, 10/8, and 0/0, but it does not select 10.1/16 and 10.0.0/24.

  • exact PREFIX (or just PREFIX): Selects routes with exactly this prefix.
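The three selector modifiers are easy to confuse, so a short Python sketch may help. It uses the standard ipaddress module and reads root as "the route lies inside PREFIX" and match as "the route contains PREFIX", which is how the descriptions and the match 10.0/16 example above work out. The function name selects is an assumption of this sketch, and prefixes must be written in full because ipaddress does not accept the abbreviated 10.0/16 form.

```python
import ipaddress


def selects(modifier, selector, route):
    """Return True if `modifier selector` would select the route prefix `route`.

    Both prefixes are full CIDR strings of the same address family,
    for example "10.0.0.0/16".
    """
    sel = ipaddress.ip_network(selector)
    rt = ipaddress.ip_network(route)
    if modifier == "root":
        # Route prefix is not shorter than the selector and lies within it.
        return rt.subnet_of(sel)
    if modifier == "match":
        # Route prefix is not longer than the selector and covers it.
        return sel.subnet_of(rt)
    if modifier == "exact":
        return rt == sel
    raise ValueError("unknown modifier: " + modifier)
```

Calling selects("match", "10.0.0.0/16", "10.0.0.0/8") gives True, while the same call with the route 10.0.0.0/24 gives False, mirroring the example in the list.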

Note that if none of these options are present, the ip command assumes root 0/0, which lists the entire table. The rest of the selection arguments are:

  • tos TOS or dsfield TOS: Select only routes with the given TOS.

  • table TABLEID: Show routes from this table or tables. The default is to show table main (ID 254). TABLEID may be either the ID of a real table or one of the special values:

    all: List all the tables.

    cache: Dump the routing cache.

Note that IPv6 has only a single route table. However, splitting into main, local, and cache is emulated by the ip utility.

  • cloned or cached: List cloned routes that are dynamically forked off of other routes because some route attribute (like MTU) was updated. It is equivalent to table cache.

  • from SELECTOR: The same syntax as to SELECTOR, but it bounds the source address range rather than the destination. Note that the from option works only with cloned routes.

  • protocol RTPROTO: List only routes of this protocol.

  • scope SCOPE_VAL: List only routes with this scope.

  • type TYPE: List only routes of this type.

  • dev NAME: List only routes going via this device.

  • via PREFIX: List only routes going via the nexthop routers selected by PREFIX.

  • src PREFIX: List only routes with preferred source addresses selected by PREFIX.

  • realm REALMID or realms FROMREALM/TOREALM: List only routes with these realms.

Using this command is best explained by running through an example.

Example

First you need to count the routes of protocol gated/bgp on a router.

 
 kuznet@amber~ $ ip route list proto gated/bgp | wc
    1413    9891   79010
 kuznet@amber~ $
 

To count the size of the routing cache, you have to use option -o , because cached attributes can take more than one line of output.

 
 kuznet@amber~ $ ip -o route list cloned | wc
     159    2543   18707
 kuznet@amber~ $
 

The output of this command consists of per-route records separated by line feeds. However, some records may consist of more than one line, particularly when the route is cloned or you have requested additional statistics. If the option -o is given, the line feeds separating lines inside a record are replaced with backslash signs.
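The effect of -o can be imitated on already captured output with a few lines of Python, which also shows why counting lines only counts routes once each record sits on a single line. The sketch assumes that continuation lines of a record begin with whitespace, as they do in the listings in this section; the function name to_oneline is made up for the example, and the result is an approximation of what ip -o prints rather than a byte-exact copy.

```python
def to_oneline(output):
    """Collapse multi-line route records into one string per record.

    Assumption of this sketch: a line that starts with whitespace
    continues the previous record. The line feed inside a record is
    replaced with a backslash, as described for the -o option.
    """
    records = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip empty lines
        if line[:1].isspace() and records:
            records[-1] += "\\" + line
        else:
            records.append(line)
    return records


sample = (
    "193.233.7.82 from 193.233.7.82 dev eth0  src 193.233.7.65 realms inr.ac/inr.ac\n"
    "    cache <src-direct,redirect>  mtu 1500 rtt 300 iif eth0\n"
    "193.233.7.82 dev eth0  src 193.233.7.65 realms inr.ac\n"
    "    cache  mtu 1500 rtt 300\n"
)
print(len(sample.splitlines()), "lines,", len(to_oneline(sample)), "records")
```

The sample holds four lines of text but only two route records, which is the difference the -o option removes before the output is piped to wc.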

The output has the same syntax as arguments given to ip route add , so it can be understood easily.

 
 kuznet@amber~ $ ip route list 193.233.7/24
 193.233.7.0/24 dev eth0  proto gated/conn  scope link \
     src 193.233.7.65 realms inr.ac
 kuznet@amber~ $
 

If you list cloned entries, the output contains other attributes, which are evaluated during route calculation and updated during the route lifetime. An example of the output is:

 
 kuznet@amber~ $ ip route list 193.233.7.82 table cache
 193.233.7.82 from 193.233.7.82 dev eth0  src 193.233.7.65 \
   realms inr.ac/inr.ac
     cache <src-direct,redirect>  mtu 1500 rtt 300 iif eth0
 193.233.7.82 dev eth0  src 193.233.7.65 realms inr.ac
     cache  mtu 1500 rtt 300
 kuznet@amber~ $
 

This route looks a bit strange, doesn't it? Did you notice that this is the path from 193.233.7.82 back to 193.233.7.82? In the section on ip route get, you will see how this route is created.

The second line, which starts with the word cache, shows the additional attributes that normal routes do not possess. The cache flags contained within the angle brackets are:

  • local: Packets are delivered locally. It stands for loopback unicast routes, for broadcast routes, and for multicast routes if this host is a member of the corresponding group.

  • reject: The path is bad. Any attempt to use it results in an error. See the error attribute below.

  • mc: The destination is multicast.

  • brd: The destination is broadcast.

  • src-direct: The source is on a directly connected interface.

  • redirected: The route was created by an ICMP Redirect.

  • redirect: Packets going via this route will trigger an ICMP redirect.

  • fastroute: The route is eligible to be used for fastroute.

  • equalize: Make packet-by-packet randomization along this path.

  • dst-nat: Destination address requires translation.

  • src-nat: Source address requires translation.

  • masq: Source address requires masquerading.

  • notify (not implemented): A change or deletion of this route will trigger an RTNETLINK notification.

The following are optional attributes that may be present:

  • error: On reject routes this is the error code returned to local senders when they try to use this route. These error codes are translated to ICMP error codes sent to remote senders according to the rules described in the section on route types.

  • expires: This entry will expire after this timeout.

  • iif: The packets for this path are expected to arrive on this interface.

The option -statistics will show further information about this route:

  • users: Number of users of this entry.

  • age: Shows when this route was last used.

  • used: Number of lookups of this route since its creation.

ip route flush: Allows Group Deletion of Routes

This command allows flushing routes as selected by some criteria. The arguments have the same syntax and semantics as the arguments of ip route show , but the routing tables are purged rather than listed. The only difference is the default action performed. Where the ip route show command dumps the main IP routing table, ip route flush prints the help page.

With the option -statistics , the command becomes verbose and prints out the number of deleted routes and the number of rounds needed to flush the routing table. If the option is given twice, ip route flush also dumps all deleted routes in the format described in the previous subsection.

Abbreviations: flush, f

ip route flush Examples

The first example flushes all the gatewayed routes from table main , such as after a routing daemon crash.

 
 netadm@amber~ #  ip -4 ro flush scope global type unicast  
 

This command is useful enough to have been packaged as the scriptlet routef, available within the IPROUTE2 utility distribution. A flush option of this kind was described in the route(8) man page, which was borrowed from BSD, but it was never implemented in Linux.

The second example is flushing all IPv6 cloned routes:

 
 netadm@amber~ # ip -6 -s -s ro flush cache
 3ffe:2400::220:afff:fef4:c5d1 via 3ffe:2400::220:afff:fef4:c5d1 \
   dev eth0  metric 0
     cache  used 2 age 12sec mtu 1500 rtt 300
 3ffe:2400::280:adff:feb7:8034 via 3ffe:2400::280:adff:feb7:8034 \
   dev eth0  metric 0
     cache  used 2 age 15sec mtu 1500 rtt 300
 3ffe:2400::280:c8ff:fe59:5bcc via 3ffe:2400::280:c8ff:fe59:5bcc \
   dev eth0  metric 0
     cache  users 1 used 1 age 23sec mtu 1500 rtt 300
 3ffe:2400:0:1:2a0:ccff:fe66:1878 via 3ffe:2400:0:1:2a0:ccff:fe66:1878 \
   dev eth1  metric 0
     cache  used 2 age 20sec mtu 1500 rtt 300
 3ffe:2400:0:1:a00:20ff:fe71:fb30 via 3ffe:2400:0:1:a00:20ff:fe71:fb30 \
   dev eth1  metric 0
     cache  used 2 age 33sec mtu 1500 rtt 300
 ff02::1 via ff02::1 dev eth1  metric 0
     cache  users 1 used 1 age 45sec mtu 1500 rtt 300
 ***Round 1, deleting 6 entries***
 ***Flush is complete after 1 round***
 netadm@amber~ # ip -6 -s -s ro flush cache
 Nothing to flush.
 

The third example is flushing BGP routing tables after gated death.

 
 netadm@amber~ # ip ro ls proto gated/bgp | wc
    1408    9856   78730
 netadm@amber~ # ip -s ro f proto gated/bgp
 ***Round 1, deleting 1408 entries***
 ***Flush is complete after 1 round***
 netadm@amber~ # ip ro f proto gated/bgp
 Nothing to flush.
 netadm@amber~ #
 

Note that there is one usage of ip route flush you will become very familiar with and it is worth mentioning now:

 
  ip route flush cache  
 

This command flushes the routing cache and should be run whenever you manually manipulate the routing table on a running machine. If you do not flush the cache after adding, deleting, or changing a route, there may be a delay of up to several minutes before the change takes effect. The delay exists because route lookups are cached for speed: once a datastream has caused a routing decision to be made, that decision is cached for the life of the stream plus a timeout period. Thus, for a route manipulation to take effect immediately, you will want to flush the cache. At this time there is no mechanism for flushing a specific part of the routing cache, but in most situations a complete cache flush carries little penalty.

ip route get: Obtain Route Pathing

This command gets a single route to a destination and prints its contents exactly as the kernel sees it. This is not the same as a physical traceroute style lookup nor is it equivalent to ip route show. ip route show shows the existing routes; ip route get resolves them and creates new clones if necessary.

Essentially, ip route get is equivalent to actually sending a packet along this path. If the argument iif is not given, the kernel creates a route to output packets toward the requested destination. This is equivalent to pinging the destination then running ip route list cache . However, in the case of ip route get , no packets are actually sent. With the argument iif present, the kernel pretends that a packet has arrived from this interface and searches for a path to forward the packet. This command outputs routes in the same format as ip route ls .

Abbreviations: get, g

Arguments

These are the options to define what route to get:

  • to ADDRESS (default): The destination address.

  • from ADDRESS: The source address.

  • tos TOS or dsfield TOS: Type of Service.

  • iif NAME: The device this packet is expected to arrive from.

  • oif NAME: Enforce the output device on which this packet will be routed out.

  • connected: If no source address (option from) was given, look up the route again, with the source address set to the preferred address received from the first lookup. If Policy Routing is used, this may be a different route.

ip route get Examples

To find a route to output packets to 193.233.7.82:

 
 kuznet@amber~ $ ip route get 193.233.7.82
 193.233.7.82 dev eth0  src 193.233.7.65 realms inr.ac
     cache  mtu 1500 rtt 300
 kuznet@amber~ $
 

To find a route to forward packets arriving on eth0 from 193.233.7.82 and destined to 193.233.7.82:

 
 kuznet@amber~ $ ip route get 193.233.7.82 from 193.233.7.82 iif eth0
 193.233.7.82 from 193.233.7.82 dev eth0  src 193.233.7.65 \
   realms inr.ac/inr.ac
     cache <src-direct,redirect>  mtu 1500 rtt 300 iif eth0
 kuznet@amber~ $
 

This is the operation that created the funny route in the examples for ip route list, with 193.233.7.82 looped back to 193.233.7.82. Note the redirect flag present in the output.

To find a multicast route for packets arriving on eth0 from host 193.233.7.82 and destined to multicast group 224.2.127.254, assuming that a multicast routing daemon is running (in this case pimd):

 
 kuznet@amber~ $ ip route get 224.2.127.254 from 193.233.7.82 iif eth0
 multicast 224.2.127.254 from 193.233.7.82 dev lo  \
   src 193.233.7.65 realms inr.ac/cosmos
     cache <mc> iif eth0 Oifs eth1 pimreg
 kuznet@amber~ $
 

This route differs from the ones seen before. It contains a normal part and a multicast part. The normal part is used to deliver or not deliver the packet to local IP listeners. In this case the router is not acting as a member of the multicast group, so the route has no local flag and only forwards packets. The output device for such entries is always loopback. The multicast part consists of an additional Oifs list showing the output interfaces.

Now it is time for a more complicated example: adding an invalid gatewayed route for a destination that is really directly connected

 
 netadm@alisa~ # ip route add 193.233.7.98 via 193.233.7.254
 netadm@alisa~ # ip route get 193.233.7.98
 193.233.7.98 via 193.233.7.254 dev eth0  src 193.233.7.90
     cache  mtu 1500 rtt 3072
 

and probing it with ping:

 
 netadm@alisa~ # ping -n 193.233.7.98
 PING 193.233.7.98 (193.233.7.98) from 193.233.7.90  56 data bytes
 From 193.233.7.254 Redirect Host(New nexthop 193.233.7.98)
 64 bytes from 193.233.7.98 icmp_seq=0 ttl=255 time=3.5 ms
 From 193.233.7.254 Redirect Host(New nexthop 193.233.7.98)
 64 bytes from 193.233.7.98 icmp_seq=1 ttl=255 time=2.2 ms
 64 bytes from 193.233.7.98 icmp_seq=2 ttl=255 time=0.4 ms
 64 bytes from 193.233.7.98 icmp_seq=3 ttl=255 time=0.4 ms
 64 bytes from 193.233.7.98 icmp_seq=4 ttl=255 time=0.4 ms
 ^C
 --- 193.233.7.98 ping statistics ---
 5 packets transmitted, 5 packets received, 0% packet loss
 round-trip min/avg/max = 0.4/1.3/3.5 ms
 

What occurred? The router at 193.233.7.254 understood that you have a much better path to the destination and sent an ICMP redirect message. Now retry ip route get to see what you have in your routing tables.

 
 netadm@alisa~ # ip route get 193.233.7.98
 193.233.7.98 dev eth0  src 193.233.7.90
     cache <redirected>  mtu 1500 rtt 3072
 

   

Policy Routing Using Linux
ISBN: B000C4SRVI
EAN: N/A
Year: 2000
Pages: 105
