14.3 IP Options

   


When a packet is passed to the IP layer, its protocol header normally includes all required information. However, there are times when a packet requires additional information in the protocol header (for example, for diagnostic purposes, or if the packet's path across the Internet is specified before it is sent). For these purposes, a variable-length Option field can be added to each IP packet header. All guidelines for these IP options are described in [Post81c].

14.3.1 Standardized IP Packet Options

Figure 14-9 shows that the IP packet options are appended to the end of an IP header. The length of the Option field is variable, and the end of a packet header has to be aligned to a 32-bit boundary, so an additional padding field of the appropriate length is added (and set to 0 by default). In this case, "variable" also means that the packet options can be left out, if they are not required. The Option field can take one or several packet options, where an option can be given in either of two formats:

  • One single byte describes only the option type. The length of these options is always exactly one byte.

  • The first byte includes the option type, and the second byte contains the length of this packet option. The following bytes include the actual data of that option.

Figure 14-9. The IP packet header.

graphics/14fig09.gif


In the second case, the length byte states the total length of the packet option: It counts the type byte and the length byte themselves as well as the data bytes. The option type in the first byte is composed as follows:

Copy Flag (1 bit) | Option Class (2 bits) | Option Number (5 bits)


The (1-bit) copy flag is required for packet fragmentation. If a packet has to be fragmented, then this bit states whether this packet option has to appear in all fragments or may be set in the first fragment only.

The option class is represented by 2 bits. The (5-bit) option number shows the length of a packet option implicitly (i.e., we can see whether the next byte also belongs to this packet option or already belongs to the next packet option). Table 14-1 lists all IP packet options defined in RFC 791, including their lengths and their defined option numbers and option classes. There are four option classes in total, but only two are currently used. Option class 0 includes packet options for control and management; option class 2 includes debugging and measurement options. The option classes 1 and 3 are reserved for future IP packet-option classes.
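The layout of the option-type byte described above can be sketched in a few lines of C. The function names are hypothetical and chosen for this illustration only; they are not taken from the kernel sources.

```c
#include <stdint.h>

/* Hypothetical helpers for the option-type byte:
 * bit 7 = copy flag, bits 6-5 = option class, bits 4-0 = option number. */

unsigned ipopt_copied(uint8_t type)
{
    return (type >> 7) & 0x01u;
}

unsigned ipopt_class(uint8_t type)
{
    return (type >> 5) & 0x03u;
}

unsigned ipopt_number(uint8_t type)
{
    return type & 0x1fu;
}

/* Assemble an option-type byte from its three fields. */
uint8_t ipopt_make(unsigned copy, unsigned cls, unsigned number)
{
    return (uint8_t)(((copy & 0x01u) << 7) | ((cls & 0x03u) << 5) | (number & 0x1fu));
}
```

For example, the type byte 10000011 (0x83) decodes to copy flag 1, class 0, and number 3, and the type byte 01000100 (0x44) decodes to copy flag 0, class 2, and number 4.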

Table 14-1. Defined IP packet options.

Class   Number   Length   Name
0       0        -        End of Option List
0       1        -        No Operation
0       2        11       Security
0       3        var      Loose Source Routing
0       9        var      Strict Source Routing
0       7        var      Record Route
0       8        4        Stream ID
2       4        var      Internet Timestamp


We will discuss each of these IP packet options in the following sections.

End of Option List

Bit sequence:

?0000000


This packet option marks the end of a series of options; it is appended after the last packet option and must never appear between two other options. The End-of-Option-List packet option is superfluous if the option list already ends at a 32-bit boundary. (See Figure 14-9.) The question mark at the beginning of this bit sequence corresponds to the copy flag, which was described above. If fragmentation is necessary, then this option can be copied, inserted, or deleted, depending on the number of packet options in the fragments of an IP datagram. For example, it has to be inserted into a fragment if only some of the original options are copied and the end of the new option list no longer falls on a 32-bit boundary.

No Operation

Bit sequence:

?0000001


No Operation can be between any two packet options, for example to let the second option begin at a 32-bit boundary. If fragmentation is necessary, then this option can be copied, inserted or deleted. Like the End-of-Option List option, this option can always be inserted into a fragment, if only some of the original options are copied and if a packet option must begin at a 32-bit boundary.
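The two option formats, together with End of Option List and No Operation, are enough to walk through an option list. The following is a minimal sketch, assuming (as in RFC 791) that the length byte counts the type and length bytes as well as the data; the function name count_ip_options() is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_EOL 0   /* End of Option List */
#define OPT_NOP 1   /* No Operation */

/* Walk an option field of optlen bytes and count the options that use the
 * type/length/data format. Returns -1 if the list is malformed. */
int count_ip_options(const uint8_t *opts, size_t optlen)
{
    size_t i = 0;
    int count = 0;

    while (i < optlen) {
        uint8_t type = opts[i];
        uint8_t len;

        if (type == OPT_EOL)
            break;                  /* only padding may follow */
        if (type == OPT_NOP) {
            i++;                    /* single-byte option, used for alignment */
            continue;
        }
        if (i + 1 >= optlen)
            return -1;              /* length byte is missing */
        len = opts[i + 1];
        if (len < 2 || i + len > optlen)
            return -1;              /* option would overrun the option field */
        count++;
        i += len;                   /* skip type, length, and data bytes */
    }
    return count;
}
```

An option field that consists of one No Operation followed by a 7-byte Record Route option yields a count of one; a length byte that points beyond the option field is reported as an error.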

Security

Bit sequence:

10000010 | 00001011 | Security | Compartments | Handling Restrictions | Transmission Control Code

The Security option is used primarily in military networks; it comprises a total of 11 bytes. It allows end systems to send security parameters or to define their own (controllable) groups of communication partners that want to exchange IP packets "in isolation" from all other traffic. The two-byte Security field can be used to state 16 security levels for an IP packet; of these, the original RFC 791 defines eight levels, including Unclassified, Confidential, Restricted, Secret, and Top Secret. The other security levels are reserved for future use. As the 1 in the first bit (the Copy Flag) of this packet option indicates, this packet option has to be set in each fragment if IP packets are fragmented.

These fields are primarily specified by the Defense Intelligence Agency. For this reason, the current implementation in Linux does not support the Security option.

Loose Source Routing

Bit sequence:

10000011 | Length | Pointer | Route Data


This option is used to specify routers that an IP packet has to visit on its way across the network; between two listed routers, the packet may take an arbitrary path. In addition, the option records data about the packet's path. The third byte includes a pointer to the address of the next router that the packet has to pass. This pointer is relative to the beginning of this option; the smallest possible value is four. If the pointer points to a byte that, according to the length byte, no longer belongs to this option, then the packet can be sent over an arbitrary path to the actual destination address. In contrast, if the packet has reached the address specified in the destination address field, yet the pointer still points to another valid address, then the destination address field is overwritten with this address, and the list entry just used is overwritten with the address of the forwarding node. The pointer is then incremented by the length of an IP address, 4 bytes. The consequence of this replacement strategy is that the protocol header of the IP packet maintains a constant length all the time. In contrast to the Record Route packet option, the addresses to be visited are defined exclusively by the sender; intermediate systems never add entries to the list.

If the packet has to be fragmented, then this packet option has to be copied to each packet fragment, because the fragments are forwarded independently of one another, which means that they can reach the receiver over different paths across the Internet.
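The pointer handling of the source routing options can be sketched as follows. This is an illustration under the rules of RFC 791, where the list entry that has just been used is overwritten with the address of the forwarding node; the function name source_route_step() and its return codes are assumptions made for this example and do not mirror the kernel implementation.

```c
#include <stdint.h>
#include <string.h>

/* opt points to the option: opt[0] = type, opt[1] = length,
 * opt[2] = pointer (counted from 1, smallest value 4), opt[3..] = route data.
 * dst is the destination address field of the IP header, self the address
 * of the node that processes the packet.
 * Returns 1 if a new destination was taken from the list, 0 if the source
 * route is used up, and -1 if the option is malformed. */
int source_route_step(uint8_t *opt, uint8_t dst[4], const uint8_t self[4])
{
    uint8_t len = opt[1];
    uint8_t ptr = opt[2];
    uint8_t *slot;

    if (ptr < 4)
        return -1;              /* pointer must not point into the fixed part */
    if (ptr > len)
        return 0;               /* list exhausted: route normally to dst */
    if (ptr + 3 > len)
        return -1;              /* no room for a complete address */

    slot = opt + ptr - 1;       /* pointer is 1-based */
    memcpy(dst, slot, 4);       /* next listed address becomes the destination */
    memcpy(slot, self, 4);      /* used entry is replaced, header length is unchanged */
    opt[2] = (uint8_t)(ptr + 4);
    return 1;
}
```

With a list of two addresses (option length 11), the first two calls return 1 and move the pointer from 4 to 8 and then to 12; the third call returns 0, because the pointer now exceeds the option length.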

Strict Source Routing

Bit sequence:

10001001 | Length | Pointer | Route Data


The Strict Source Routing option differs in only one point from the Loose Source Routing option: The packet may pass only those routers specified in the Route Data list. If a packet arrives at a router that is not explicitly present in this list, then an ICMP message has to be generated and returned to the sender. Section 14.4 describes the Internet Control Message Protocol (ICMP).

As with the previous option, if fragmentation is required, then the Strict Source Routing option has to be copied to every fragment, which is why the first bit (the Copy Flag) of this option is set to 1.

Record Route

Bit sequence:

00000111 | Length | Pointer | Route Data


The Record Route option can be used to register the addresses of all intermediate systems an IP packet passes on its way to the destination. The third byte includes a pointer to the field that is to accept the next address. The length of this option never changes; the sender reserves the available space in advance and initially fills it with zeros. These zeros are not treated as an End of Option List, because the length byte, Length, states the option's length. Each Internet node adds its address in the field provided for this purpose and increases the pointer by four (corresponding to the length of an IP address in bytes). If no more space is available, then the IP packet is forwarded without storing the address. In this case, an ICMP message can be returned to the sender.

In contrast to the two previous packet options (i.e., Loose Source Routing and Strict Source Routing), this option appears only in the first fragment, if an IP packet has to be fragmented.
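How a node enters its address into a Record Route option can be sketched as follows. The function name record_route_add() and the return codes are hypothetical; the sketch only follows the pointer rules described above and is not the kernel code.

```c
#include <stdint.h>
#include <string.h>

/* opt[0] = type (7), opt[1] = length, opt[2] = pointer (1-based, minimum 4),
 * opt[3..] = route data, preset to zero by the sender.
 * Returns 1 if the address was recorded, 0 if the list is full (the packet
 * is forwarded anyway), and -1 if the option is malformed. */
int record_route_add(uint8_t *opt, const uint8_t addr[4])
{
    uint8_t len = opt[1];
    uint8_t ptr = opt[2];

    if (ptr < 4)
        return -1;
    if (ptr > len)
        return 0;                       /* no free field left */
    if (ptr + 3 > len)
        return -1;                      /* room for only part of an address */

    memcpy(opt + ptr - 1, addr, 4);     /* fill the next free field */
    opt[2] = (uint8_t)(ptr + 4);        /* advance by one IP address */
    return 1;
}
```

The length byte is never modified, so the option keeps the size chosen by the sender regardless of how many nodes record their address.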

Stream Identifier

Bit sequence:

10001000 | 00000100 | Stream ID


This option enables the transport of SATNET Stream Identifiers across the Internet. The Stream Identifier packet option is always 4 bytes long and has to be copied to all fragments, if fragmentation is used. However, this option currently has no practical use, and we list it here only for the sake of completeness.

Internet Timestamp

Bit sequence:

01000100 | Length | Pointer | Counter | Flag | Address | Timestamp


The original RFC 791 includes the Internet Timestamp option as the only packet option of class 2 (i.e., debugging or measurement options). This option can be used to store time stamps of selected or all network nodes. A 4-bit flag determines the data to be stored here, and it can take one of the following values:

  • 0 The option stores time stamps only.

  • 1 The option stores all time stamps and addresses.

  • 3 A router enters its time stamp only if its address is already listed in this option (i.e., the sender specifies the addresses in advance).

Notice that the size of the Internet Timestamp option does not change, because the sender fixes it in advance in the length field. For this reason, there is an additional (4-bit) Counter field, which counts the routers that could not enter a time stamp because no space was left in the data field. The maximum length of this option is 40 bytes. The third byte points to the next four or eight bytes to be filled with an entry.

If fragmentation is required, this option appears in the first fragment only, and so the Copy Flag is set to 0.
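The interplay of the pointer, the flag, and the Counter field can be sketched as follows for the flag values 0 and 1. The function name timestamp_add() is hypothetical. The sketch assumes the layout of RFC 791, in which the Counter occupies the upper and the flag the lower four bits of the fourth byte and the smallest pointer value is 5; the time stamp is stored in network byte order.

```c
#include <stdint.h>
#include <string.h>

#define TS_ONLY     0   /* time stamps only */
#define TS_AND_ADDR 1   /* address followed by time stamp */

/* opt[0] = type (0x44), opt[1] = length, opt[2] = pointer (1-based),
 * opt[3] = counter (upper 4 bits) and flag (lower 4 bits).
 * Returns 1 if an entry was written, 0 if there was no room and the
 * counter was incremented, and -1 on errors or unsupported flags. */
int timestamp_add(uint8_t *opt, const uint8_t addr[4], uint32_t timestamp)
{
    uint8_t len = opt[1];
    uint8_t ptr = opt[2];
    unsigned flag = opt[3] & 0x0fu;
    unsigned need = (flag == TS_ONLY) ? 4u : 8u;
    uint8_t *p;

    if (ptr < 5 || flag > TS_AND_ADDR)
        return -1;                      /* flag 3 is not handled in this sketch */

    if (ptr + need - 1 > len) {         /* entry does not fit any more */
        unsigned counter = opt[3] >> 4;
        if (counter == 15)
            return -1;                  /* counter itself would overflow */
        opt[3] = (uint8_t)(((counter + 1) << 4) | flag);
        return 0;
    }

    p = opt + ptr - 1;
    if (flag == TS_AND_ADDR) {
        memcpy(p, addr, 4);
        p += 4;
    }
    p[0] = (uint8_t)(timestamp >> 24);
    p[1] = (uint8_t)(timestamp >> 16);
    p[2] = (uint8_t)(timestamp >> 8);
    p[3] = (uint8_t)timestamp;
    opt[2] = (uint8_t)(ptr + need);
    return 1;
}
```

With flag 1 and an option length of 12 bytes, exactly one entry of eight bytes fits; a second node finds no room and only increments the Counter field.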

14.3.2 Configuration

User Access

Each Linux user can use the traceroute command to track an IP packet on its way across the Internet to the destination node. This might suggest that the Record Route IP packet option is used in this case. Actually, this is not so; the traceroute command uses another method, for several reasons:

  • Formerly, not all routers supported the Record Route packet option, which means that they wouldn't have been available for use.

  • Record Route is normally intended for one-way use only; the receiver has to return an echo of the IP packet it received to the sender. This means that the recorded addresses would have to be duplicated.

  • However, the main reason is lack of space: A maximum of nine IP addresses fits into the address list of the Option field. Formerly, this might have been sufficient, but today the average number of intermediate systems for a connection across the Internet is much higher.
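The limit of nine addresses follows from simple arithmetic. The sketch below assumes the usual IP header limits, which are not stated in this section: the header length field counts 32-bit words and allows at most 15 words (60 bytes), of which 20 bytes are taken by the fixed part of the header. The constant and function names are chosen for this example.

```c
#define IP_MAX_HEADER_LEN 60    /* 15 words of 4 bytes */
#define IP_MIN_HEADER_LEN 20    /* header without options */
#define RR_FIXED_BYTES     3    /* type, length, and pointer byte */
#define IP_ADDR_LEN        4

/* Largest number of addresses a Record Route option can hold. */
int max_record_route_addresses(void)
{
    int option_space = IP_MAX_HEADER_LEN - IP_MIN_HEADER_LEN;   /* 40 bytes */

    return (option_space - RR_FIXED_BYTES) / IP_ADDR_LEN;       /* 37 / 4 = 9 */
}
```

The one byte that remains is filled with padding or an End of Option List.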

For these reasons, traceroute uses the Internet Control Message Protocol (ICMP; see Section 14.4) and the Time-to-Live (TTL) field of the IP header, which stores the remaining lifetime of the packet. It sends consecutive UDP packets with the same destination address and increments the value in the TTL field at each step. The first packet gets a lifetime of one (i.e., the first Internet node returns an ICMP message to the sender as soon as it receives the packet). The sender also receives an ICMP message from each of the subsequent nodes, so that it can follow the path to the destination address. However, a trick has to be used at the destination address, because a node looks at the lifetime only if the packet is not delivered locally. For this purpose, the UDP port number is set to a meaningless value to cause the receiver to return the ICMP message Port Unreachable.
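The TTL step of this method can be sketched with the standard socket interface: Before each probe is sent, the lifetime of outgoing packets is set with setsockopt() and the IP_TTL option. The helper names set_probe_ttl() and get_probe_ttl() are hypothetical; sending the UDP probes and receiving the ICMP replies (which normally requires a raw socket) are left out of this sketch.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Set the TTL of all packets sent over sock from now on.
 * Returns 0 on success and -1 on failure, like setsockopt(). */
int set_probe_ttl(int sock, int ttl)
{
    return setsockopt(sock, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
}

/* Read the TTL currently configured for sock; returns -1 on failure. */
int get_probe_ttl(int sock)
{
    int ttl = 0;
    socklen_t len = sizeof(ttl);

    if (getsockopt(sock, IPPROTO_IP, IP_TTL, &ttl, &len) < 0)
        return -1;
    return ttl;
}
```

A traceroute-like loop would call set_probe_ttl() with the values 1, 2, 3, and so on, send one probe to an unused UDP port after each call, and stop as soon as a Port Unreachable message arrives from the destination.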

Notice, however, that this method works only because all IP packets from a sender take the same path through the Internet to the receiver in most cases. Originally, a user was also meant to be able to use the traceroute command to access the packet options Strict Source Routing and Loose Source Routing. When the first version of traceroute included this feature, many system administrators found that it resulted in an excessive load on most routers. Consequently, to use these packet options today, we need a corresponding patch.

The following example uses the Loose Source Routing option:

 # traceroute -g 129.13.92.254 rzstud1.rz.uni-karlsruhe.de
 traceroute to rzstud1.rz.uni-karlsruhe.de (129.13.197.1), 30 hops max, 40 byte packets
  1 rzasc01.rz.uni-karlsruhe.de (129.13.92.1) 20 ms 20 ms 20 ms
  2 r-ascend-netz.rz.uni-karlsruhe.de (129.13.92.254) 20 ms 20 ms 20 ms
  3 rzstud1.rz.uni-karlsruhe.de (129.13.197.1) 213 ms 22 ms 24 ms

Because the traceroute command normally installed in Linux does not let a user access the IP packet options, another way to use them is the ping command. ping is intended to verify that a host is reachable. For this purpose, it continually sends ICMP requests to the destination computer and expects a reply in the form of an ICMP message. Today, there are still ping implementations that allow you to use the packet options Source Routing and Internet Timestamp. For the same reasons as with traceroute, the Internet Timestamp option was removed from most implementations, which means that only the Record Route option remained. The following example shows how you can use ping with the Record Route packet option set. The route is output after the first request.

 # ping -R rzstud1.rz.uni-karlsruhe.de
 PING rzstud1.rz.uni-karlsruhe.de (129.13.197.1): 56 data bytes
 64 bytes from 129.13.197.1: icmp_seq=0 ttl=253 time=235.977 ms
 RR: isdn216-10.rz.uni-karlsruhe.de (129.13.216.10)
       rzasc01.rz.uni-karlsruhe.de (129.13.92.1)
       129.13.197.62
       rzstud1.rz.uni-karlsruhe.de (129.13.197.1)
       r-ascend-netz.rz.uni-karlsruhe.de (129.13.92.254)
       rzasc01.rz.uni-karlsruhe.de (129.13.92.1)
       isdn216-10.rz.uni-karlsruhe.de (129.13.216.10)
 64 bytes from 129.13.197.1: icmp_seq=1 ttl=253 time=47.171 ms (same route)
 64 bytes from 129.13.197.1: icmp_seq=2 ttl=253 time=48.728 ms (same route)
 --- rzstud1.rz.uni-karlsruhe.de ping statistics ---
 9 packets transmitted, 9 packets received, 0% packet loss
 round-trip min/avg/max = 45.100/70.545/235.977 ms

If a user at the local computer has root rights, then the verbose mode of tcpdump lets the user additionally view the packet options of all IP packets. tcpdump monitors the data traffic at a network adapter. The following example uses tcpdump in verbose mode to monitor the previous ping example.

 # tcpdump -v
 User level filter, protocol ALL, datagram packet socket
 tcpdump: listening on ippp0
 15:37:56.025267 isdn216-10.rz.uni-karlsruhe.de > rzstud1.rz.uni-karlsruhe.de: icmp: echo request (ttl 64, id 1284, optlen=40 RR{isdn216-10.rz.uni-karlsruhe.de#0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0} EOL)
 15:37:56.261172 rzstud1.rz.uni-karlsruhe.de > isdn216-10.rz.uni-karlsruhe.de: icmp: echo reply (ttl 253, id 28562, optlen=40 RR{isdn216-10.rz.uni-karlsruhe.de rzasc01.rz.uni-karlsruhe.de 129.13.197.62 rzstud1.rz.uni-karlsruhe.de r-ascend-netz.rz.uni-karlsruhe.de rzasc01.rz.uni-karlsruhe.de#0.0.0.0 0.0.0.0 0.0.0.0} EOL)

Programming Access

We will use the ping program once more in another example to show you how IP packet options can be accessed during programming. This example uses Version 1.38. When ping starts, it first checks the parameters passed. If they include -R, then the Echo Request packet has to include the Record Route IP option. For this purpose, ping uses the setsockopt() function to set the packet options for an existing socket.

The following example shows you how this function is invoked from within the source code of ping:

 if (setsockopt (s, IPPROTO_IP, IP_OPTIONS, rspace, sizeof (rspace)) < 0)
 {
      perror (_("ping: record route"));
      exit(1);
 }

The specified options will then be set in each packet sent over this socket in the future. The IPPROTO_IP parameter means that the option to be set belongs to the IP level. This does not necessarily mean that it is an IP option in the true sense; it could refer to another position within the IP header (e.g., IP_TTL also belongs to the IPPROTO_IP group). From the programming perspective, we always have to assume that the current kernel implementation might not support the desired IP option. In this case, setsockopt() returns the value -1, and ping outputs an error message and exits. Otherwise, an arbitrary number of packets with the packet option set can subsequently be sent over this socket.
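The buffer passed to setsockopt() has to contain the option exactly as it is to appear in the IP header. The following sketch shows how such a buffer could be filled for a Record Route option with room for nine addresses. It is not the code from ping; the names build_record_route(), RR_TYPE, RR_MIN_PTR, RR_MAX_ADDRS, and RR_BUF_LEN are assumptions made for this example.

```c
#include <stdint.h>
#include <string.h>

#define RR_TYPE      7                          /* Record Route option type */
#define RR_MIN_PTR   4                          /* pointer to the first address field */
#define RR_MAX_ADDRS 9                          /* more addresses do not fit */
#define RR_BUF_LEN   (3 + 4 * RR_MAX_ADDRS + 1) /* 39 option bytes + 1 padding byte */

/* Fill buf with an empty Record Route option followed by one zero byte,
 * which acts as End of Option List and pads the field to 40 bytes. */
void build_record_route(uint8_t buf[RR_BUF_LEN])
{
    memset(buf, 0, RR_BUF_LEN);     /* address fields start out as zeros */
    buf[0] = RR_TYPE;
    buf[1] = RR_BUF_LEN - 1;        /* option length: 39 bytes */
    buf[2] = RR_MIN_PTR;
}
```

A buffer prepared in this way can be passed to setsockopt() together with IPPROTO_IP and IP_OPTIONS, as in the ping excerpt above.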

Incoming ICMP packets received by ping are checked for their options as follows: An option pointer that points to the first Option field in the protocol header is computed. The first option is processed, and the option pointer is incremented so that it points to the next option. For the Record Route packet option, the pointer has to point to a byte that includes the number "7". ping doesn't actually have to take care of this value; like all other option-specific constants, it is defined in <linux/ip.h>:

 #define IPOPT_END       (0 |IPOPT_CONTROL)
 #define IPOPT_NOOP      (1 |IPOPT_CONTROL)
 #define IPOPT_SEC       (2 |IPOPT_CONTROL|IPOPT_COPY)
 #define IPOPT_LSRR      (3 |IPOPT_CONTROL|IPOPT_COPY)
 #define IPOPT_TIMESTAMP (4 |IPOPT_MEASUREMENT)
 #define IPOPT_RR        (7 |IPOPT_CONTROL)
 #define IPOPT_SID       (8 |IPOPT_CONTROL|IPOPT_COPY)
 #define IPOPT_SSRR      (9 |IPOPT_CONTROL|IPOPT_COPY)
 #define IPOPT_RA        (20|IPOPT_CONTROL|IPOPT_COPY)

If one of these packet options is found, then it is output.

14.3.3 The ip_options Class in the Linux Kernel

This section describes all functions of the ip_options class implemented in the Linux kernel. If options are passed to or from functions, then this is normally done by use of the ip_options data type. This type is defined in <linux/ip.h> and includes the variables, pointers, and constants required for all packet options.

ip_options_build()

net/ipv4/ip_options.c


This function takes the information about IP options from the socket object and creates the options part in the IP header.

The parameters passed here include a socket buffer, the packet options, the packet destination address, the routing table, and the is_frag variable. The socket buffer includes a datagram with a protocol header that is not yet complete. The passed packet options are copied to the end of the protocol header. If a Source Routing option exists, then the destination address of the packet is written to the address list of the packet option. If the packet is not a fragment, and if the Internet Timestamp or Record Route option exists, then the required data is inserted into the corresponding lists. If the packet is a fragment and one of the two options exists, then these options are replaced by No Operation.

ip_options_echo()

net/ipv4/ip_options.c


The ip_options_echo() routine takes the options from an IP packet received and uses them to create an echo packet (i.e., a reply to the incoming message). This function is normally used to send a reply when packets with IP options have been received (for example, to invert a Strict Source Routing option). The parameters passed here are a socket buffer and the destination options.

ip_options_fragment()

net/ipv4/ip_options.c


This function takes the fragment that was passed as socket buffer and overwrites every packet option whose Copy Flag is not set with the No Operation option. As described in Section 14.3.1, this flag is not set for the Internet Timestamp and Record Route options. The replacement by No Operation has the advantage that the length of the protocol header does not change.
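The replacement strategy can be illustrated in user space as follows. This is a sketch of the idea, not the kernel function itself; the name strip_uncopied_options() is hypothetical, and the option field is assumed to be available as a plain byte array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Overwrite every option whose Copy Flag (bit 7 of the type byte) is not
 * set with No Operation bytes. The length of the option field stays the same. */
void strip_uncopied_options(uint8_t *opts, size_t optlen)
{
    size_t i = 0;

    while (i < optlen) {
        uint8_t type = opts[i];
        uint8_t len;

        if (type == 0)              /* End of Option List */
            break;
        if (type == 1) {            /* No Operation */
            i++;
            continue;
        }
        if (i + 1 >= optlen)
            break;                  /* malformed: length byte missing */
        len = opts[i + 1];
        if (len < 2 || i + len > optlen)
            break;                  /* malformed: stop without changes */
        if (!(type & 0x80))
            memset(opts + i, 1, len);   /* replace by No Operation bytes */
        i += len;
    }
}
```

Applied to an option field that holds a Record Route option followed by a Loose Source Routing option, the function turns the seven bytes of the Record Route option into No Operation bytes and leaves the Loose Source Routing option untouched.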

ip_options_compile()

net/ipv4/ip_options.c


This function parses the Option field at the end of the IP header and compiles the result into an ip_options structure. ip_options_build() uses data structures readily prepared for packet options, but ip_options_compile() has to work through all options itself. The parameters passed here include the packet options and the socket buffer. This function works option by option until it reaches an End of Option List or the end of the protocol header. Any No Operation in the option list is skipped. If an error occurs in this procedure, then an ICMP message is returned to the sender.

ip_options_undo()

net/ipv4/ip_options.c


It can be necessary to delete the last entries in the packet options Source Routing, Record Route, and Internet Timestamp. The function ip_options_undo() is responsible for this task. It can be invoked after ip_options_echo(), for example. If an Echo Request with the Record Route option is sent to the local computer, then the function ip_options_echo() duplicates the options set in the incoming packet. These packet options are then used to return an Echo Reply packet. If ip_options_undo() were not invoked, the IP option would include two entries for the local computer. The packet options represent the only parameter passed here.

ip_options_get()

net/ipv4/ip_options.c


This function checks on whether the IP options can be accessed when setsockopt() is invoked. If so, then it returns 0; otherwise, it returns a negative error code (error codes are defined in the file <include/asm/errno.h>).

ip_forward_options()

net/ipv4/ip_options.c


If necessary, this function adds all information required about the local IP node to a packet that has to be forwarded. This information is added for the packet options Record Route, Strict Source Routing, and Internet Timestamp. The only parameter passed here is the appropriate socket buffer.

ip_options_rcv_srr()

net/ipv4/ip_options.c


This function checks the IP options Loose Source Routing and Strict Source Routing in an incoming packet. For example, if the destination address in the protocol header is the local address, and if the address list has not yet been fully visited, the packet may not be delivered locally. As with the previous function, the socket buffer is the only parameter passed here.

14.3.4 IP Options in the IP Layer

Incoming Packets

There are several ways an IP packet can move across the IP layer. It can enter either from the lower or from the higher layers (i.e., from the local Internet module). Depending on whether it is intended for the local computer, the packet is passed to the next higher or next lower layer. Figure 14-10 shows this relation and the position within the packet-handling process where functions are invoked to handle IP options in the Linux kernel.

Figure 14-10. IP Options in the IP layer.

graphics/14fig10.gif


If an IP packet enters the IP layer from a lower layer, then ip_rcv() is the first function invoked. The packet is passed as socket buffer, and it first has to pass the netfilters. Netfilters have the functionality of a firewall and can do address translations. To translate addresses, NF_HOOK() with the ip_rcv_finish() parameter is invoked. Chapter 19 describes how netfilters are handled and implemented. After this process, ip_rcv_finish() is the function executed next. The only parameter passed to this function is the socket buffer. It finds out the packet's path and checks its protocol header. If the header length is greater than five (i.e., more than 5 * 32 bits), then the packet includes an Option field, which causes the function ip_options_compile() to be invoked. The packet options are separated and stored in the opt data structure. Normally, Boolean variables (e.g., opt->is_strictroute) are set at this point. Subsequently, the opt->srr variable has to be tested. If it is not zero, then the packet option Loose Source Routing or Strict Source Routing is specified, which means that the function ip_options_rcv_srr() has to be invoked.
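The header-length test can be illustrated with a small helper. It assumes the standard IPv4 header layout, in which the first byte carries the version in its upper four bits and the header length, counted in 32-bit words, in its lower four bits. The function name ip_options_len() is hypothetical.

```c
#include <stdint.h>

/* Return the number of option bytes (including padding) announced by the
 * first byte of an IPv4 header, or -1 if the header length is below the
 * minimum of five 32-bit words. */
int ip_options_len(uint8_t first_byte)
{
    unsigned ihl = first_byte & 0x0fu;  /* header length in 32-bit words */

    if (ihl < 5)
        return -1;
    return (int)(ihl - 5) * 4;          /* everything beyond 20 bytes is options */
}
```

A header length of five yields zero option bytes, so no options have to be parsed; the largest value, 15, corresponds to 40 option bytes.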

The return value of the ip_rcv_finish() function is a pointer that points to either ip_local_deliver() or ip_forward(), depending on whether the packet has to be delivered locally or forwarded.

Local Packet Delivery

The function ip_local_deliver() is invoked if an IP packet has to be delivered to the local computer. This packet could be a fragment of a larger IP datagram, which has to be reassembled with the other fragments. This means that several things have to be checked (for example, whether all fragments have arrived, which is checked by the function ip_defrag()). Once all fragments have arrived, all options have to be removed from the first fragment to reassemble the fragments into the original datagram. The first fragment always includes all options that were copied when the datagram was fragmented. Next, the packet traverses the netfilters once more. The function ip_local_deliver_finish() completes a local packet delivery in the IP layer.

Forwarding Packets

The function ip_forward() is invoked in the event that an IP packet has to be forwarded. (See the center part of Figure 14-10.) The packet is checked again, including a test for the Strict Source Routing option. If this option exists, and the local address is not in the option field, then an ICMP message is returned to inform the sender accordingly. In addition, an ICMP Redirect message can be returned to the sender only if neither Strict Source Routing nor Loose Source Routing is specified in the option field.

A backup copy of the IP packet (including all packet options) has to be created, because the packet could be changed later on. The value of the TTL variable is decremented by one. If the packet is too big and the Don't-Fragment bit is set in the IP packet header, then the complete packet is discarded. At the end, the netfilter is called again, but this time with the ip_forward_finish() parameter.

The function ip_forward_finish() checks the length of the IP packet options. If the length is not zero, then the function ip_forward_options() handles the Record Route and Source Routing options; ip_send() is invoked in either case. If the packet is too big and has to be fragmented, then ip_send() invokes the ip_fragment() function. Depending on whether its Copy Flag is set, a packet option has to be copied to all fragments or appears in the first fragment only. This approach saves both space and time. It also means that ip_options_fragment() is invoked only if the socket buffer contains the first or only fragment.

The next function is ip_finish_output(), which completes the packet-forwarding process.

Handling Packets Created Locally

A packet created locally can take either of two paths across the IP layer:

  • The function ip_build_and_send_pkt() is invoked. Though the passed socket buffer contains a datagram, it doesn't have a protocol header yet. In this case, packet options are passed as parameters, separately from the payload, and all pointers in the header structure are set. If there are options, the header length, which was previously set to "5" (32-bit words), is corrected, and the function ip_options_build() is invoked, passing the socket buffer, the packet options, the destination address, and the routing table. Next, ip_send_check() computes the checksum of the packet header, before the netfilters are invoked with the output_maybe_reroute() parameter. The IP options play no further role on the remaining path as the packet travels through the IP layer. At the end, the packet is passed to the lower layers or ARP (Address Resolution Protocol; see Chapter 15).

  • The higher layers pass the IP packet as parameter of the function ip_queue_xmit() to the IP layer. Notice that the packet options are not directly passed as parameters to the function; ip_queue_xmit() can use a pointer referring to the socket to access them. The first step has to decide where the packet has to be sent to. If the Source Routing option is set, then the destination address of the packet is determined by the address specified next. This requires a check for whether the is_strictroute flag is set and whether the destination address differs from the router registered in the local routing table. If both are the case, the IP packet cannot be transmitted. If the route can be determined without problem, then the next step creates the remaining protocol header, as in the first way. The packet leaves the IP layer on the same path.
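Because the options are part of the protocol header, they are covered by the header checksum, which therefore has to be calculated after the options have been built. The following user-space sketch shows the calculation of the Internet checksum over a header; it is an illustration and not the kernel's optimized routine, and the function name ip_header_checksum() is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the one's-complement checksum over an IP header of len bytes
 * (len is a multiple of four). The checksum field inside the header has
 * to be zero while the checksum is calculated. */
uint16_t ip_header_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);  /* 16-bit big-endian words */

    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);            /* fold the carries back in */

    return (uint16_t)~sum;
}
```

If the function is run over a header that already contains a correct checksum, the result is zero, which is how a receiver can verify a header.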


       


    Linux Network Architecture
    ISBN: 131777203
    Year: 2004
    Pages: 187
