2.2 Address mapping and configuration techniques

In large internetworks automating address allocation and device configuration is a must. Long gone are the days when network administrators hand-crafted every single host; today's internetworks are simply too large and too dynamic to make this either practical or economical. For example, a workstation typically needs the following items resolved at boot time:

IP address and subnet mask (a basic requirement for any form of IP communication)
Host name, domain name, DNS server name (covered in section 2.3)
Default gateway (i.e., the nearest router)
Primary and secondary DNS server addresses
Primary and secondary WINS server address (for Microsoft networks)
Boot image (if the workstation has no local persistent storage capability)

At the very least a workstation needs an IP address and subnet mask, together with knowledge of how to reach the nearest router; otherwise, communication over an IP network is not practical. Today there are several protocols and services that are widely used to assist with address allocation, booting, and device configuration. These include ARP, RARP, Proxy ARP, BOOTP, and DHCP. Another important protocol used to indicate errors with their operations is ICMP. We will discuss these protocols in turn.

2.2.1 Address Resolution Protocol (ARP)

Address resolution is fundamental to the operation of IP networks and enables IP to be insulated from various types of hardware and networking media. The configuration of address resolution software can significantly affect network and application performance, and it is worth understanding some of the subtleties that could impact the network design. The key components of this functionality include a simple request-response protocol, a number of application-oriented implementations (including ARP, RARP, Proxy ARP, and Gratuitous ARP), and a database known as the ARP cache.

The Address Resolution Protocol (ARP) is used to resolve addressing queries where the IP address is known but the corresponding hardware address is not (e.g., we may know that a server on the local LAN can be reached at IP address 193.167.56.1, but we do not know what the MAC address of that server is). Without both pieces of relevant information we cannot sensibly place a frame on the wire. ARP is transparent to the end user and is not IP or media specific (variations exist with other protocol stacks such as AppleTalk's AARP).

When using ARP the host uses a lookup table (called the ARP cache) to maintain a list of associated IP and hardware addresses. This cache may contain static entries (typically hard-coded by the administrator and loaded at boot time) together with dynamically discovered associations. For example, under DOS we can view the ARP cache by typing the command:

    arp -a

To add a static entry we could, for example, type:

 arp -s 158.34.3.3 00-aa-00-61-c7-0a

ARP is also typically implemented on devices such as routers. For example, on a Cisco router we could examine the ARP cache using the show arp command, as follows:

    c3550#sh arp
    Protocol  Address          Age (min)  Hardware Addr   Type  Interface
    Internet  1.1.1.1          -          0010.7b1f.4a61  ARPA  Ethernet0/0
    Internet  1.1.1.5          28         00a0.c903.6077  ARPA  Ethernet0/0
    Internet  32.97.105.46     153        0000.0caa.2350  ARPA  Ethernet0/1
    Internet  1.1.1.6          4          00a0.c903.6064  ARPA  Ethernet0/0
    Internet  171.68.225.9     145        0000.0caa.2350  ARPA  Ethernet0/1
    Internet  209.17.176.120   142        0000.0caa.2350  ARPA  Ethernet0/1
    Internet  204.71.200.74    91         0000.0caa.2350  ARPA  Ethernet0/1
    Internet  128.32.18.166    156        0000.0caa.2350  ARPA  Ethernet0/1
    Internet  152.163.241.223  161        0000.0caa.2350  ARPA  Ethernet0/1
    Internet  38.15.254.206    144        0000.0caa.2350  ARPA  Ethernet0/1
    Internet  206.79.171.51    153        0000.0caa.2350  ARPA  Ethernet0/1

When a station wishes to transmit a packet, it first examines the ARP cache to recover the destination MAC address, using the destination IP address from the packet. If the MAC address cannot be resolved from the cache, the source node queues the packet and issues an ARP request, with the destination MAC address set to the broadcast address (0xFFFFFFFFFFFF, so the request is forwarded over bridges and switches). Within the ARP header the source and destination IP addresses are set using the known data, and the source hardware address is set to the source node's own MAC address. All nodes within the broadcast domain receive the frame and unpack the ARP header to see if the query is relevant. The node whose own IP address matches the destination IP address in the request fills in the missing hardware information and returns an ARP reply as a unicast frame. Upon receipt the source node updates its local ARP cache with the new binding (setting an aging timer for the entry to allow for possible topology or address binding changes). Since the addressing information is now complete, the queued packet can be transmitted. This whole process normally takes a few milliseconds and so is transparent to the user. All subsequent transmissions for this destination node can be translated locally using the cache.

However, if the source and destination nodes are on different subnets (either side of a router perhaps), then one of two actions may take place, as follows:

If the source and destination nodes are using natural masks, or understand subnetting, then the source will know that the destination IP address must be reached via the router. In this case the source will typically forward the packet directly to the default (or nearest) router. The router is responsible for forwarding the packet to its ultimate destination. If the MAC address of the router is unknown, then the source must issue an ARP request for the router itself. Thus, the packet will traverse the network with a consistent source and destination IP address, but the MAC addresses will be modified to reflect each data link hop.
If subnetting is used but the two nodes do not understand subnetting, then the router must run Proxy ARP to fool the nodes into believing that they are communicating directly (as explained later in the section).
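The two cases above come down to which IP address the sender tries to resolve with ARP. The following is a minimal Python sketch of that decision, not a description of any particular stack; the function name `arp_target` and the host addresses are hypothetical, chosen to sit on the 140.16.0.0 network used later in this section.

```python
import ipaddress


def arp_target(src_ip: str, prefix_len: int, dst_ip: str, default_router: str) -> str:
    """Return the IP address the sender must resolve with ARP."""
    local_net = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    if ipaddress.ip_address(dst_ip) in local_net:
        return dst_ip          # looks local: ARP for the destination itself
    return default_router      # remote: ARP for the router instead


# Hypothetical hosts; the router address is assumed for illustration.
same_subnet = arp_target("140.16.1.10", 24, "140.16.1.20", "140.16.1.1")
other_subnet = arp_target("140.16.1.10", 24, "140.16.2.20", "140.16.1.1")
# A host that only knows the natural /16 mask treats the remote node as local.
not_subnet_aware = arp_target("140.16.1.10", 16, "140.16.2.20", "140.16.1.1")
```

In the third call the sender would ARP directly for a node that sits behind the router, which is exactly the situation Proxy ARP is meant to handle.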

ARP runs directly over the MAC layer and is identified by EtherType (0x0806). An example ARP request is shown in the following Ethernet packet trace. In this example the source node (193.128.2.34) is looking for the MAC address associated with IP address 193.128.2.33.

    File:IPXDIRSV.ENC  Type:SNIFFER_ENC Mode:••••••• Records:4021
    ===========================================================
    Frame  : 52        Len   : 60        Error  : None
    T Elapsed: 01:27:34:333   T Delta : 00:00:00:000
    -------------------------------[mac]-----------------------
    Dest Mac : ffffffffffff   Sourc Mac: Xyplex12a36a   Type   : ARP
    -------------------------------[arp]-----------------------
    HW Type : 10Mb Ethernet  Protocol : IP        Opcode  : ARP REQuest
    HW AddLen: 6  Bytes    PR AddLen: 4  Bytes
    SrHW Addr: 08008712a36a SrPR Addr: 193 128 2 34
    DeHW Addr: ffffffffffff DePR Addr: 193 128 2 33
    ===============================[data:  0]====[pad: 18]========================
    002A 00 00 00 00 2 4F 4B D1 00 00 00 00 00 00 00 00  .....OK.........
    003A 00 00

Note that the destination MAC address is broadcast in both the frame header and data field (since it is unknown). The corresponding ARP response from node 193.128.2.33 would be as follows:

    File:IPXDIRSV.ENC  Type:SNIFFER_ENC Mode:••••••• Records:4021
    ===========================================================
    Frame  : 53        Len   : 60        Error  : None
    T Elapsed: 01:27:34:333   T Delta : 00:00:00:000
    -------------------------------[mac]-----------------------
    Dest Mac : Xyplex12a36a   Sourc Mac: Xyplex12a362   Type   : ARP
    -------------------------------[arp]-----------------------
    HW Type : 10Mb Ethernet  Protocol : IP        Opcode  : ARP RESPonse
    HW AddLen: 6  Bytes    PR AddLen: 4  Bytes
    SrHW Addr: 08008712a362 SrPR Addr: 193 128 2 33
    DeHW Addr: 08008712a36a DePR Addr: 193 128 2 34
    ===============================[data:  0]====[pad: 18]========================
    002A 00 00 00 00 2 4F 4B D1 00 00 00 00 00 00 00 00  .....OK.........
    003A 00 00

Note that the response is unicast, since all addresses have been resolved by this stage.
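The ARP payload in these traces can be reproduced with a few lines of Python. This is an illustrative sketch only: the helper name `build_arp` is an assumption, and the field values are taken from the request and response shown in the trace. The payload is 28 bytes, which together with the 14-byte Ethernet header and 18 bytes of padding gives the 60-byte frame length reported by the analyzer.

```python
import socket
import struct

# htype, ptype, hlen, plen, opcode, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"


def build_arp(opcode, sender_mac, sender_ip, target_mac, target_ip):
    """Pack an ARP payload for IP over Ethernet (opcode 1 = request, 2 = reply)."""
    return struct.pack(
        ARP_FORMAT,
        1,          # hardware type 1 = Ethernet
        0x0800,     # protocol type = IP
        6, 4,       # hardware and protocol address lengths in bytes
        opcode,
        bytes.fromhex(sender_mac), socket.inet_aton(sender_ip),
        bytes.fromhex(target_mac), socket.inet_aton(target_ip),
    )


# Values as they appear in the trace (target MAC is all ones in the request).
request = build_arp(1, "08008712a36a", "193.128.2.34", "ffffffffffff", "193.128.2.33")
reply = build_arp(2, "08008712a362", "193.128.2.33", "08008712a36a", "193.128.2.34")
```

Sending such a payload on the wire would additionally require a raw socket and an Ethernet header with EtherType 0x0806, which is outside the scope of this sketch.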

Issues with ARP on mixed-media LANs

Reference [4] discusses several popular LAN media types and describes how they can be integrated transparently via bridging. Hardware types currently supported by ARP include those shown in the following chart:

ID	Hardware type
1	Ethernet (10 Mb)
2	Experimental Ethernet (3 Mb)
3	Amateur Radio AX.25
4	Proteon ProNET Token Ring
5	Chaos
6	IEEE 802 Network
7	Arcnet
8	Hyperchannel
9	Lanstar
10	Autonet Short Address
11	LocalTalk
12	LocalNET (IBM PCNet or SYTEK LocalNET)
13	Ultra link
14	SMDS
15	Frame Relay
16	ATM
17	HDLC
18	Fibre Channel
19	ATM
20	Serial Line
21	ATM

One of the key differences between media types such as Ethernet and Token Ring/FDDI is bit ordering on the wire. Token Ring and FDDI use the noncanonical format (most significant bit first), whereas Ethernet uses the canonical form (least significant bit first). Converting the source and destination MAC addresses in the header is trivial, but since ARP frames also carry addressing information inside their payload, these fields need to be consistent on either side of a bridge, and the standards are not helpful in this respect. Another problem you may encounter is minor incompatibilities between Token Ring and Ethernet ARP implementations. For example, at least one bridge manufacturer I have come across fails to convert the hardware ID in the ARP packet when translating ARP frames from Token Ring to Ethernet. Some implementations of ARP are quite relaxed in dealing with this, but other implementations may reject an ARP frame where the hardware ID specified in the ARP source fields does not match the interface type over which the frame is received. In short, you need to check with the equipment manufacturer to establish what facilities are available to cope with these problems.
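Converting an address between canonical and noncanonical form amounts to reversing the bit order within each byte. The sketch below shows the operation a translating bridge would need to apply both to the frame header and to the hardware address fields inside the ARP payload; the function names are hypothetical, and the sample address is the source MAC from the earlier trace.

```python
def reverse_bits(byte: int) -> int:
    """Reverse the bit order of a single byte (e.g., 0x08 becomes 0x10)."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (byte & 1)
        byte >>= 1
    return result


def swap_bit_order(mac: bytes) -> bytes:
    """Convert a MAC address between canonical and noncanonical form."""
    return bytes(reverse_bits(b) for b in mac)


canonical = bytes.fromhex("08008712a36a")
noncanonical = swap_bit_order(canonical)
print(noncanonical.hex())   # prints 1000e148c556
```

Applying `swap_bit_order` twice returns the original address, so the same routine serves both directions of translation.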

Issues with ARP on NBMAs

Since ARP relies on broadcasts, it cannot be used over media types that do not support them (i.e., nonbroadcast multiaccess [NBMA] networks such as X.25 and Frame Relay). On NBMA networks you will need to configure static ARP entries at the edge devices. Since these connections are more likely to be PVCs, this should not present too much of a problem. If the network topology changes frequently or SVCs are used, then other techniques should be considered (e.g., ATMarp, as described in [4]). Alternatively, Inverse ARP (InARP) could be used to resolve DLCIs to IP addresses for Frame Relay networks.

Issues with ARP table size

On a large, bridged network there can be thousands of directly attached stations, and each station will cache address bindings for any device it communicates with. The ARP table has a finite size (usually determined by the OS), and if the table fills, existing entries must be discarded to make room for new ones. This can lead to a situation where ARP requests are continually rebroadcast and responded to unnecessarily. The level of traffic can seriously degrade performance, and in rare cases can cause a broadcast storm. You should, if possible, tune the size of your ARP caches in line with the worst-case number of active nodes on the network within the aging timer period.

Issues with ARP synchronization

Most ARP implementations, particularly those on end systems, do not allow the aging timer for the ARP cache to be modified. The default timer is typically 5 minutes, but it may also be as long as 20 to 60 minutes. This timer ensures that the ARP cache will be flushed of any entries that have not been refreshed within the allotted time period. This keeps the cache in sync with the current network topology and also ensures that the cache does not continue to grow indefinitely. In general, frequently active flows are rewarded. Less active flows are penalized, since any request for an address that has been flushed requires a new ARP request to be broadcast and resolved. Note that setting this aging timer too low could result in a great deal of unnecessary traffic. As the ARP timer approaches zero, more and more stations must issue ARP broadcast requests for addresses flushed from their local cache. In theory, with an ARP timer of zero all stations must issue ARP requests for every frame transmitted.
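The aging behavior described above can be modeled with a small cache class. This is a simplified sketch under stated assumptions: the class name `ArpCache`, its methods, and the injectable clock are illustrative and do not correspond to any operating system's implementation. The default of 300 seconds reflects the typical 5-minute timer mentioned in the text.

```python
import time


class ArpCache:
    """Toy ARP cache: entries expire unless refreshed within max_age_seconds."""

    def __init__(self, max_age_seconds=300.0, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock          # injectable so the timer can be simulated
        self.entries = {}           # IP address -> (MAC address, time learned)

    def learn(self, ip, mac):
        """Add or refresh a binding; refreshing restarts the aging timer."""
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        """Return the cached MAC, or None if a new ARP request is needed."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned = entry
        if self.clock() - learned >= self.max_age:
            del self.entries[ip]    # aged out: flush the stale binding
            return None
        return mac

    def delete(self, ip):
        """Explicitly remove one entry, in the spirit of 'arp -d'."""
        self.entries.pop(ip, None)
```

With `max_age_seconds=0` every lookup misses, which mirrors the observation that a zero timer forces an ARP request for every frame.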

If a station is reconfigured (e.g., by swapping out a line card) or uses some form of dynamic addressing (e.g., hot-swapping technology), then current ARP entries may be invalid. This may result in lost connectivity (typically for the duration of the aging timer). This could be a trivial annoyance or a more serious problem (especially if the device undergoing the change is a router or server). One way to resolve this problem is to flush all entries in the ARP cache or explicitly delete the out-of-date entry (e.g., using the arp -d command under DOS). We could then issue a broadcast ping (e.g., ping 140.42.255.255) to rebuild the table. For example, if a station loses connectivity to its default router, flushing the ARP table and issuing a broadcast ping should enable the station to locate its backup router (if configured). Broadcast pings can be more or less specific, as discussed in section 2.1.3. If flushing the ARP cache is not practical, then we can either wait for the entries to flush naturally or consider lowering the aging timer, subject to the caveat mentioned previously.

2.2.2 Proxy ARP

Proxy ARP is a variation of ARP, which enables older devices to communicate in a subnetted environment. It is sometimes referred to as promiscuous ARP, or ARP hack. Typical applications are as follows:

Subnetting, where devices do not understand subnets.
Multiple LAN segments share a common subnet number, and the segments are not bridged.
To provide IP address mapping for dial-in users.

Old terminal servers may not understand subnetting and may rely on natural IP masks only. In a subnetted environment, where routers are used, this can mean that communication across routers for such devices is not possible. Consider the example shown in Figure 2.7, where node-T does not understand subnetting (note the prefix is /16 even though all other nodes attached to the 140.16.0.0 class B network are using a subnetted /24 prefix). Router-1 has been configured with a /24 prefix on all interfaces and so is able to route requests from one subnet to the other; however, when node-T attempts to send traffic to node-S1, it firmly believes that node-S1 is a local host (i.e., directly reachable). It will, therefore, not forward this traffic to Router-1 (its default router), so we have a problem. On the left in Figure 2.7, we have a terminal server that does not understand subnetting. To reach node-S1, Proxy ARP must be enabled on Router-1; otherwise, node-T will issue ARP requests for node-S1 in vain. On the right we have a gateway performing NAT. If Router-2 does not have host-specific static routes configured, then it too will ARP in vain. Proxy ARP is run on the NAT gateway's external interface, and NATed addresses are mapped to the gateway's own MAC address.

Figure 2.7: Examples of Proxy ARP applications.

We can resolve this by running Proxy ARP on the 140.16.1.1 interface of Router-1. When node-T attempts to communicate with node-S1, it first broadcasts an ARP request for node-S1's MAC address. Since this is a link-layer broadcast, the router will not forward it, so node-S1 never sees the request. However, since the router now listens for ARP requests, it will answer any ARP request for a device it knows to be remote (since it is also subnet aware). In this case, instead of supplying node-S1's MAC address, it supplies its own MAC address for the .1 interface. On receipt of the ARP reply, node-T caches the IP address of node-S1 together with the MAC address of Router-1 in its ARP table and starts sending traffic. Note that node-T believes it is having a direct conversation with node-S1, whereas in reality it is forwarding traffic to the router. Remember that the source and destination IP addresses in this case are the real client and server IP addresses; the trickery is achieved via the IP address to MAC address mapping.
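The router's side of this exchange can be sketched as a simple decision: answer with the receiving interface's own MAC address if the requested IP address lies on a subnet reachable through a different interface. The function name, interface names, second subnet, and MAC addresses below are all hypothetical, and the sketch only considers directly connected subnets, whereas a real router would consult its routing table.

```python
import ipaddress


def proxy_arp_reply(receiving_if, interfaces, target_ip):
    """Return the MAC address to answer with, or None to stay silent.

    interfaces maps an interface name to (connected subnet, interface MAC).
    """
    target = ipaddress.ip_address(target_ip)
    for name, (subnet, _mac) in interfaces.items():
        if name != receiving_if and target in ipaddress.ip_network(subnet):
            # Target is reachable via another interface: answer by proxy
            # with the MAC address of the interface the request arrived on.
            return interfaces[receiving_if][1]
    return None   # target is local to the requester (or unknown): do not reply


# Hypothetical interface table for a router like Router-1.
router_1 = {
    "e0": ("140.16.1.0/24", "00-00-0c-00-00-01"),
    "e1": ("140.16.2.0/24", "00-00-0c-00-00-02"),
}
remote_answer = proxy_arp_reply("e0", router_1, "140.16.2.20")
local_answer = proxy_arp_reply("e0", router_1, "140.16.1.20")
```

Staying silent for local targets matters: if the router also answered those requests, it would compete with the genuine owner of the address.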

Figure 2.7 also illustrates another scenario for using Proxy ARP. To the right of the diagram we see a NAT gateway which, for security reasons, is translating node-C's source IP address before forwarding it to the wide area router for access to the Internet. In this example assume that node-C's NATed address is 194.32.16.55/24. Unless Router-2 has a host-specific static route configured, incoming traffic for node-C will trigger Router-2 to send out an ARP request to discover node-C's MAC address. Since node-C is on the other side of the gateway, it will never respond. To resolve this we could map the 194.32.16.55 address to the gateway's MAC address on the external .2 interface and also run Proxy ARP on that interface. This way the gateway responds by proxy, and the router forwards traffic to the gateway, which then performs reverse address translation and forwards packets to node-C. NAT is explained in more depth in section 2.5.2.

Issues with Proxy ARP

You should be careful to enable Proxy ARP only where necessary (note that some routers enable Proxy ARP by default). A router running Proxy ARP between subnets may cause some confusion and possible IP addressing problems (e.g., ARP caches on various hosts may display multiple IP address entries, each associated with the same MAC address).

2.2.3 RARP

During the boot process some network devices know their hardware addresses but not their IP addresses (e.g., diskless workstations). Reverse ARP (RARP) can be used by such a device to dynamically retrieve its IP address, based solely on knowledge of its hardware address. RARP is documented in [18]. Note that RARP uses the same frame structure as ARP but is identified by a different EtherType (0x8035). RARP requires one or more RARP servers to maintain a database of mappings from hardware address to IP addresses and respond to requests for mapping information from clients. For example, a diskless workstation could issue a RARP request (using the destination MAC broadcast address 0xFFFFFFFFFFFF) so that any RARP servers listening on the local subnet can respond. A RARP server simply examines the source hardware address in the RARP request and returns the relevant IP address if available. Note that extensions to RARP for dynamic address mapping services (Dynamic RARP, or DRARP) are documented in [19]; but since RARP is limited to providing addressing information, it has been largely superseded by more sophisticated protocols such as BOOTP and DHCP. These protocols are now more commonly used (especially DHCP), since they also facilitate the transfer of configuration data and boot images.

2.2.4 Gratuitous ARP

There are occasions where it is necessary for devices to use ARP for broadcasting unsolicited address mappings; this feature is generally referred to as Gratuitous ARP. In a scenario where the binding of IP addresses to MAC addresses is dynamic, such as when a client PC boots up and requests an IP address from a DHCP server, it is advantageous to inform the network about this new address binding immediately (e.g., a Windows NT client will issue a Gratuitous ARP immediately after DHCP has run successfully). A Gratuitous ARP frame is broadcast using the same format as the standard ARP request (EtherType 0x0806); however, both the source and destination IP address fields in the ARP payload are set to the station's recently acquired IP address. The source hardware address field is set to the station's MAC address, with the destination hardware address set to 00-00-00-00-00-00. In this way all devices that are listening can precharge their ARP caches, so that multiple ARP requests from devices subsequently wishing to communicate with this node are avoided. This feature therefore saves both time and bandwidth.
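The distinguishing feature of a Gratuitous ARP, as described above, is that the source and destination IP address fields carry the same value. A minimal Python sketch of building and recognizing such a payload follows; the function names are hypothetical, and the MAC and IP addresses are sample values only.

```python
import socket
import struct

# htype, ptype, hlen, plen, opcode, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"


def build_gratuitous_arp(mac_hex, ip):
    """Pack an ARP request announcing this station's own IP-to-MAC binding."""
    addr = socket.inet_aton(ip)
    return struct.pack(
        ARP_FORMAT, 1, 0x0800, 6, 4, 1,
        bytes.fromhex(mac_hex), addr,   # source hardware and IP address
        bytes(6), addr,                 # destination hardware 00-00-00-00-00-00,
    )                                   # destination IP = the station's own IP


def is_gratuitous(payload):
    """True if the source and destination IP address fields are identical."""
    fields = struct.unpack(ARP_FORMAT, payload[:28])
    return fields[6] == fields[8]


announcement = build_gratuitous_arp("08008712a36a", "193.128.2.34")
```

A listener that sees `is_gratuitous(...)` return true can update its cache entry for that IP address without ever having asked for it.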

In a scenario where the IP address binding may change more frequently while a system is active, this can cause connectivity problems because the default ARP cache aging timers are too slow to react to such events. For example, two gateways, G1 and G2, with addresses 140.1.1.1 and 140.1.1.2, could be configured in a high-availability configuration running the Virtual Router Redundancy Protocol (VRRP). In this case G1 is configured as master, with G2 backing up its IP address. In the stable state all end systems have the physical MAC address of G1 in their caches, associated with IP address 140.1.1.1. However, if G1 dies, then G2 takes ownership of address 140.1.1.1 but somehow needs to announce that its own MAC address is now associated with 140.1.1.1. By sending an unsolicited ARP broadcast, advertising this change of mapping, any devices that are listening have the opportunity to update their ARP cache entries immediately, rather than suffer interim connectivity problems until those ARP entries age out (note that since ARP frames are not acknowledged, the implementation may issue more than one broadcast to be sure that all interested parties see the announcements).

2.2.5 Bootstrap Protocol (BOOTP)

BOOTP was developed to enable remote booting of diskless hosts over a network and has been widely deployed to automate device configuration. The BOOTP protocol and its operations are documented in [20], with extensions in [21]. BOOTP enables a device with a minimal IP stack and little local configuration information to download boot code (and possibly configuration data) from a BOOTP server. The download protocol is not defined by BOOTP but is typically the Trivial File Transfer Protocol (TFTP; see [22]). Since BOOTP relies on limited broadcasts, both the client and server must be on the same subnet, or a router must be available that supports BOOTP forwarding or relaying, as described in Chapter 3 and [21]. BOOTP has been somewhat overshadowed by the Dynamic Host Configuration Protocol (DHCP) in the enterprise, but for legacy environments several enhancements to BOOTP enable it to interoperate with DHCP (see section 2.2.4).

Message format

BOOTP runs directly over UDP, although TCP operation is possible; the BOOTP server process uses port 67; the BOOTP client process uses port 68. UDP uses a simple checksum to check data integrity (a 16-bit one's complement of the one's complement sum of a pseudo-IP header, UDP header, and the data field [23]). The format of a BOOTP message is shown in Figure 2.8.
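The checksum calculation mentioned above can be sketched in Python as follows. The function names are assumptions made for illustration; the pseudo-header layout (source address, destination address, a zero byte, protocol number 17, and the UDP length) follows the UDP definition in [23], and a computed value of zero is transmitted as all ones.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum of data, padded with a zero byte if odd."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total


def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over pseudo-header + UDP header (checksum field zero) + data."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 17, len(segment))   # zero, protocol 17, length
    )
    checksum = ~ones_complement_sum(pseudo_header + segment) & 0xFFFF
    return checksum or 0xFFFF   # zero is reserved for "no checksum"
```

A receiver verifies the segment by running the same sum over the pseudo-header and the segment with the checksum filled in; the result should be 0xFFFF.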

Figure 2.8: BOOTP message format.

Field definitions

Code—1 indicates a request; 2 indicates a reply.
Hwtype—The type of hardware. Refer to [5] for a complete list. For example, Ethernet = 1, IEEE 802 = 6.
Length—Hardware address length in bytes. For example, Ethernet and Token Ring both use 6.
Hops—The client sets this to 0. It is incremented by a router, which relays the request to another server and is used to identify loops. Reference [20] suggests that a value of 3 indicates a loop.
Transaction ID—A random number used to match this boot request with the response it generates.
Seconds—Set by the client. It is the elapsed time since the client started its address acquisition or renewal process.
Flags field—The most significant bit is used as a broadcast flag. All other bits must be zero and are reserved for future use. Normally, BOOTP servers attempt to deliver BOOTREPLY messages directly to a client using unicast delivery. The destination address in the IP header is set to the BOOTP Your IP Address and the MAC address is set to the BOOTP client hardware address. If a host is unable to receive a unicast IP datagram until it knows its IP address, then this broadcast bit must be set to indicate to the server that the BOOTREPLY must be sent as an IP and MAC broadcast. Otherwise, this bit must be zero. This field is defined in [21].
Client IP addr—Set by the client. Either its known IP address, or 0.0.0.0.
Your IP addr—Set by the server if the client IP address field is 0.0.0.0.
Server IP addr—Set by the server.
Router IP addr—This is the address of a BOOTP relay agent to be used by the client. It is set by the forwarding agent when BOOTP forwarding is being used (see Chapter 3).
Client hardware addr—Set by the client and used by the server to identify which registered client is booting.
Server host name—Optional server host name terminated by 0x00.
Boot file name—The client either leaves this null or specifies a generic name or the boot file to be used. The string is null terminated (0x00). The server returns the fully qualified file name of a boot file suitable for the client.
Vendor-specific area—Optional information. Clients should set the first four bytes with a magic cookie. If a vendor-specific magic cookie is not used, the client should use 99.130.83.99 followed by an end tag (255) with the remaining bytes set to zero. The vendor-specific area can also contain vendor extensions; these are options that can be passed to the client at boot time along with its IP address. BOOTP shares many of the same options as DHCP; see [24] for full details.
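As a rough illustration of the field layout listed above, the following Python sketch packs a BOOTP request into its fixed 300-byte form. The function name `build_bootrequest` and the sample transaction ID are hypothetical; the field widths follow [20], and the vendor-specific area carries the 99.130.83.99 magic cookie followed by the end tag.

```python
import socket
import struct

# code, hwtype, length, hops, transaction ID, seconds, flags,
# client IP, your IP, server IP, router IP,
# client hardware addr, server host name, boot file name, vendor-specific area
BOOTP_FORMAT = "!BBBBIHH4s4s4s4s16s64s128s64s"
MAGIC_COOKIE = bytes([99, 130, 83, 99])


def build_bootrequest(xid, mac, client_ip="0.0.0.0", broadcast=False):
    """Pack a BOOTP request; short fields are zero-padded by struct."""
    return struct.pack(
        BOOTP_FORMAT,
        1,                              # code: 1 = request
        1,                              # hwtype: 1 = Ethernet
        len(mac),                       # hardware address length (6 for Ethernet)
        0,                              # hops: client sets this to 0
        xid,                            # transaction ID
        0,                              # seconds elapsed
        0x8000 if broadcast else 0,     # flags: most significant bit = broadcast
        socket.inet_aton(client_ip),    # client IP, or 0.0.0.0 if unknown
        bytes(4), bytes(4), bytes(4),   # your, server, and router IP (unset)
        mac,                            # client hardware address
        b"", b"",                       # server host name, boot file name (null)
        MAGIC_COOKIE + b"\xff",         # magic cookie followed by end tag (255)
    )


message = build_bootrequest(0x12345678, bytes.fromhex("08008712a36a"), broadcast=True)
```

The request would then be sent from UDP port 68 to port 67; socket handling is omitted here to keep the sketch focused on the message format.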

Operations

Once the client has determined its own hardware address (usually this is held locally in ROM or on the NIC), the BOOTP process proceeds as follows:

The client sends a BOOTP request (UDP source port 68, destination port 67) to the server, stating its hardware address. The client will use 0.0.0.0 for its own IP address and a limited broadcast 255.255.255.255 for the destination (server) address.
The server receives the request and looks for the associated IP address in its database (usually a BOOTP configuration file). The server fills in the remaining fields in the request message and returns a BOOTP response to the client (UDP source port 67, destination port 68), using one of the following methods:
- If the client IP address was included in the BOOTP request, then the server returns the datagram directly to the client. If the ARP cache on the server has not already cached the client IP and hardware address, then ARP will be used to resolve these addresses.
- If the client uses 0.0.0.0 as its address in the BOOTP request, then the server cannot use ARP to resolve this mapping, since the client knows only its hardware address. In this case the server must either have a mechanism for directly updating its own ARP cache, or it must send a limited broadcast response.
Once the client has processed the response, it has enough IP configuration data to download a boot file if required (typically via the TFTP protocol). The client can then execute the full boot process. In the case of a diskless device this process will often replace the minimal IP stack (loaded from ROM) with a full IP stack, downloaded using the boot file.
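The server's choice of how to deliver the reply, combining the two methods above with the broadcast flag described in the field definitions, can be summarized in a few lines of Python. This is a sketch of the decision only; the function and parameter names are hypothetical, and `can_preload_arp` stands for whatever mechanism a given server has for updating its own ARP cache directly.

```python
def reply_destination(client_ip, your_ip, broadcast_flag, can_preload_arp):
    """Return the IP destination address for a BOOTP reply."""
    if client_ip != "0.0.0.0":
        # The client already knows its address: unicast, resolving the
        # hardware address with ARP in the normal way if it is not cached.
        return client_ip
    if broadcast_flag or not can_preload_arp:
        # The client cannot receive unicast yet, or the server has no way
        # to insert the binding itself: fall back to a limited broadcast.
        return "255.255.255.255"
    # Server installs the (your_ip, client hardware address) binding in its
    # own ARP cache and unicasts to the newly assigned address.
    return your_ip
```

For example, a client that sends 0.0.0.0 with the broadcast flag set always receives its reply at 255.255.255.255, regardless of the server's capabilities.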

BOOTP can be used for centralized configuration of multiple clients. However, this requires a static table to be configured, with mapping entries for every client that requires service. This is clearly inflexible and a potential maintenance problem on large networks. This approach can be considered partly secure, however, since a client can be allocated an IP address by the BOOTP server only if it has the associated MAC address. One of the things you may have picked up from this discussion is that BOOTP uses limited broadcasts, and clearly these are nonroutable by their very nature. So how does BOOTP operate in internetwork environments? Chapter 3 describes a feature supported on many routers called BOOTP forwarding.

Issues with BOOTP over IEEE 802.5 Token Ring networks

BOOTP was originally introduced for Ethernet use, and its operations needed to be modified for Token Ring LANs because of the use of non-transparent bridging [4]. On Token Ring LANs the client should send its broadcast BOOTREQUEST with an All Routes Explorer RIF. This will enable servers and relay agents to cache the return route if required. For those server or relay agents that cannot cache the return route (e.g., because they are stateless), the BOOTREPLY message should be sent to the client's hardware address (extracted from the BOOTP message) with a Spanning Tree Rooted RIF. The bridge route will be recorded by the client and server (or BOOTP relay agent) via ARP. For further information on this topic refer to [21].

2.2.6 Dynamic Host Configuration Protocol (DHCP)

The Dynamic Host Configuration Protocol (DHCP) is based mainly on BOOTP, with several extensions. DHCP is an IETF protocol that was popularized by Microsoft through the Windows 95 and NT operating systems and is now a key component of many large enterprise networks. DHCP comprises two main features, as follows:

Address assignment mechanisms—for the assignment of permanent or temporary client network addresses, from either static or dynamic address lists held on the server.
Client/server protocol—a protocol that downloads host-specific configuration data from a DHCP server to a client. For example, the default gateway or WINS server address.

Address allocation can be achieved by one of the following methods:

Automatic—where DHCP assigns a permanent address to a host.
Dynamic—where DHCP leases an IP address to a client for a limited period of time. This allows efficient and automatic reuse of addresses that are no longer in use.
Manual—where addresses are statically mapped (usually by the network administrator). This method is typically used for devices such as routers, firewalls, or permanent servers.

It may be appropriate to run multiple DHCP servers on your network, each controlling a pool of addresses. If there are multiple servers, then a DHCP client will select the most appropriate response from those servers that answer the request. A DHCP server provides permanent storage of configuration parameters associated with clients. It stores a <key><value> entry for each client, the key being a unique identifier (e.g., a combination of IP subnet number and hardware address), and the value being the configuration parameters previously allocated to the client. This means that a DHCP client will tend to be allocated the same IP address by the server on successive occasions, provided the address pool is not oversubscribed.
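The key/value store described above explains why a client tends to receive the same address on successive requests. The sketch below models this with a dictionary keyed on subnet and hardware address; the class name `LeasePool`, the host range, and the hardware addresses are hypothetical, and lease timers and reclamation of old bindings are deliberately left out.

```python
import ipaddress


class LeasePool:
    """Toy DHCP address pool that remembers previous client bindings."""

    def __init__(self, subnet, first_host, last_host):
        self.subnet = subnet
        network = ipaddress.ip_network(subnet)
        self.free = [str(network.network_address + i)
                     for i in range(first_host, last_host + 1)]
        self.bindings = {}   # (subnet, hardware address) -> IP address

    def allocate(self, hwaddr):
        """Return the client's previous address if known, else a free one."""
        key = (self.subnet, hwaddr)
        if key in self.bindings:
            return self.bindings[key]      # same client, same address as before
        if not self.free:
            return None                    # pool oversubscribed in this sketch
        self.bindings[key] = self.free.pop(0)
        return self.bindings[key]


pool = LeasePool("140.16.1.0/24", 10, 11)
first = pool.allocate("08-00-87-12-a3-6a")
second = pool.allocate("08-00-87-12-a3-62")
again = pool.allocate("08-00-87-12-a3-6a")
```

A production server would also expire leases and recycle bindings when the pool runs low, which is when a returning client may be given a different address.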

Message format

DHCP messages use the same UDP ports as BOOTP—port 67 (server) and port 68 (client)—and DHCP devices can interwork with BOOTP devices, as described in section 2.2.4. The format of a DHCP message is shown in Figure 2.9. Note that all field definitions are as per BOOTP, with the exception of those defined explicitly here. The interested reader is referred to [24, 25] for further details.

Figure 2.9: DHCP message format.

Operations

This section briefly describes the DHCP client/server interaction for activities such as address allocation, configuration, and lease renewal.

Allocating a new network address

Assume that the DHCP server has a block of addresses from which it can satisfy new requests. Each server also maintains a database of allocated addresses and leases in permanent local storage. Remember that there may be multiple servers available on the network.

Field definitions (otherwise as per BOOTP)

Client hardware address—Set by the client. DHCP defines a client identifier option, which is used for client identification. If this option is not used, the client is identified by its MAC address.
Boot file name—The client either leaves this null or specifies a generic name, such as router, indicating the type of boot file to be used. In a DHCPDISCOVER request this is set to null. The server returns a fully qualified directory path name in a DHCPOFFER message. The value is terminated by 0x00.
Options—The first four bytes contain the magic cookie (99.130.83.99). The remainder comprises tagged parameters called options. A DHCP client must be prepared to receive DHCP messages with an options field of at least 312 bytes. Several options have been defined. One particular option—the DHCP message type option—must be included in every DHCP message. This option defines the type of the DHCP message. DHCP messages fall into one of the following categories:
- DHCPDISCOVER—broadcast by a client to find available DHCP servers.
- DHCPOFFER—a response from a server to a DHCPDISCOVER, offering IP address and other parameters.
- DHCPREQUEST—sent from a client to servers. This either requests the parameters offered by one of the servers and declines all other offers, verifies a previously allocated address after a system or network change (e.g., a reboot), or requests the extension of a lease on a particular address.
- DHCPACK—an acknowledgment from server to client with parameters, including IP address.
- DHCPNAK—a negative acknowledgment from server to client, indicating that the client's lease has expired or that a requested IP address is incorrect.
- DHCPDECLINE—sent from client to server indicating that the offered address is already in use.
- DHCPRELEASE—sent from client to server canceling a lease and relinquishing the network address.
- DHCPINFORM—sent by a client that already has an IP address (e.g., manually configured), requesting further configuration parameters from the DHCP server.

Additional options may be allowed, required, or not allowed, depending on the DHCP message type. Refer to [24].
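
To make the layout of the options field concrete, the following Python sketch encodes and decodes it as a magic cookie followed by tag/length/value triples. It is an illustration under stated assumptions rather than a complete implementation: the function names are invented for the example, and the option codes (53 for the DHCP message type, 0 for pad, 255 for end) and the numeric message type values are taken from the DHCP options specification rather than from the text above.

```python
from typing import Dict

MAGIC_COOKIE = bytes([99, 130, 83, 99])  # 99.130.83.99
OPT_PAD, OPT_MESSAGE_TYPE, OPT_END = 0, 53, 255

MESSAGE_TYPES = {
    1: "DHCPDISCOVER", 2: "DHCPOFFER", 3: "DHCPREQUEST", 4: "DHCPDECLINE",
    5: "DHCPACK", 6: "DHCPNAK", 7: "DHCPRELEASE", 8: "DHCPINFORM",
}


def build_options(options: Dict[int, bytes]) -> bytes:
    """Encode options as cookie + (tag, length, value)* + end marker."""
    out = bytearray(MAGIC_COOKIE)
    for tag, value in options.items():
        if not 0 < tag < 255 or len(value) > 255:
            raise ValueError("tag or value out of range")
        out += bytes([tag, len(value)]) + value
    out.append(OPT_END)
    return bytes(out)


def parse_options(field: bytes) -> Dict[int, bytes]:
    """Decode an options field; rejects data without the magic cookie."""
    if field[:4] != MAGIC_COOKIE:
        raise ValueError("missing magic cookie")
    options: Dict[int, bytes] = {}
    i = 4
    while i < len(field):
        tag = field[i]
        if tag == OPT_END:
            break
        if tag == OPT_PAD:  # pad is a single byte with no length
            i += 1
            continue
        if i + 1 >= len(field):
            raise ValueError("truncated option")
        length = field[i + 1]
        options[tag] = field[i + 2:i + 2 + length]
        i += 2 + length
    return options


def message_type_name(field: bytes) -> str:
    """Return the DHCP message type carried in an options field."""
    value = parse_options(field)[OPT_MESSAGE_TYPE]
    return MESSAGE_TYPES[value[0]]
```

For example, `message_type_name(build_options({53: bytes([1])}))` identifies the message as a DHCPDISCOVER.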

The DHCP client/server interaction, illustrated in Figure 2.10, proceeds as follows:

The client broadcasts a DHCPDISCOVER message on its local subnet. The DHCPDISCOVER message may include options that suggest values for the network address and lease duration. If multiple servers are available, each server may respond with a DHCPOFFER message, which includes an offered network address (your IP address) together with other configuration options. The servers may mark the address as offered to prevent the same address from being offered elsewhere in the interim.
The client receives one or more DHCPOFFER messages. The client chooses one based on the configuration parameters offered and broadcasts a DHCPREQUEST message, which includes the server identifier option, to indicate which server it has selected, as well as the requested IP address option taken from Your IP Address in the selected offer. In the event that no offers are received, if the client has cached its previous network address, the client may attempt to reuse that address if its lease is still valid.
The servers receive the DHCPREQUEST broadcast from the client. Those servers not selected by the DHCPREQUEST message use the message as notification that the client has declined that server's offer. The selected server saves the binding for the client to persistent storage (e.g., hard disk) and responds with a DHCPACK message containing appropriate configuration parameters. The combination of client hardware address and assigned network address constitutes a unique identifier for the client's lease and is used by both the client and server to uniquely identify a lease. The selected network address is inserted into the Your IP Address field in the DHCPACK message.
The client receives the DHCPACK message with configuration parameters. The client performs a final check on the parameters (e.g., using ARP for allocated network addresses) and notes the duration of the lease and the lease identification cookie specified in the DHCPACK message. At this point, the client is configured. If the client detects a problem with the parameters in the message (e.g., the address is already in use on the network), then the client sends a DHCPDECLINE message to the server and restarts the configuration process. The client should wait a minimum of ten seconds before restarting to avoid excessive network traffic in the event of looping. On receipt of a DHCPDECLINE, the server must mark the offered address as unavailable (and possibly inform the system administrator that there is a configuration problem). If the client receives a DHCPNAK message, the client restarts the process.
At any time the client may choose to relinquish its lease on a network address by sending a DHCPRELEASE message to the server. The client identifies the lease to be relinquished by including both its network address and its hardware address.
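
The client side of the exchange above can be summarized as a small state machine. The sketch below is a simplification for illustration: the state names (INIT, SELECTING, REQUESTING, BOUND) are borrowed from the DHCP specification's client state diagram, while the event names and the `next_state` and `run` helpers are hypothetical. It folds the final ARP check into the bound state, so a DHCPDECLINE is modeled as leaving BOUND.

```python
from enum import Enum
from typing import Dict, Iterable, Tuple


class State(Enum):
    INIT = "INIT"
    SELECTING = "SELECTING"    # DHCPDISCOVER sent, collecting offers
    REQUESTING = "REQUESTING"  # DHCPREQUEST sent, awaiting the verdict
    BOUND = "BOUND"            # DHCPACK received, client configured


# (current state, event) -> next state
TRANSITIONS: Dict[Tuple[State, str], State] = {
    (State.INIT, "DISCOVER_SENT"): State.SELECTING,
    (State.SELECTING, "OFFER_SELECTED"): State.REQUESTING,
    (State.REQUESTING, "ACK"): State.BOUND,
    (State.REQUESTING, "NAK"): State.INIT,    # restart the process
    (State.BOUND, "DECLINE_SENT"): State.INIT,  # address found in use
    (State.BOUND, "RELEASE_SENT"): State.INIT,  # lease relinquished
}


def next_state(state: State, event: str) -> State:
    """Apply one event, rejecting anything the exchange does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event} is not valid in state {state.value}")


def run(events: Iterable[str], start: State = State.INIT) -> State:
    """Feed a sequence of events through the machine."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state
```

A successful allocation is the sequence `["DISCOVER_SENT", "OFFER_SELECTED", "ACK"]`, which leaves the client in the BOUND state; replacing the final event with `"NAK"` returns it to INIT.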

Figure 2.10: Interaction between DHCP client and server.

Responses from the DHCP server to the DHCP client may be broadcast or unicast, depending on whether the client is able to receive a unicast message before the TCP/IP stack is fully configured; this varies between implementations.

Lease renewal

DHCP defines a process to control lease expiration and renewal for clients that have already been configured, but have not been active for some time. This process is as follows:

When a server sends the DHCPACK message to a client, it includes the lease time for the allocated address as one of the options in the message, together with two timer values, T1 and T2. The client is entitled to use the address for the duration of the lease time. Once the configuration is applied, the client also starts its own T1 and T2 timers, where T1 must be less than T2, and T2 must be less than the lease time. Reference [24] states that T1 defaults to (0.5 × lease time) and T2 defaults to (0.875 × lease time).
When timer T1 expires, the client unicasts a DHCPREQUEST message back to the originating server, requesting an extension to the lease period. The server typically responds with a DHCPACK message indicating the new lease time. Timers T1 and T2 are reset at the client accordingly. The server also resets its record of the lease time. If a DHCPACK is not received before timer T2 expires, the client broadcasts a DHCPREQUEST message to attempt to extend its lease. This request can be confirmed with a DHCPACK message from any DHCP server on the network.
In normal circumstances, an active client would continually renew its lease in this way indefinitely without letting the lease expire. However, if the client does not receive a DHCPACK message after its lease has expired, it must stop using the address. The client may then restart the process by issuing a DHCPDISCOVER broadcast.
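
The timer arithmetic is simple enough to check by hand, but a short Python sketch makes the phases explicit. The function names are assumptions made for the example, and the phase labels RENEWING and REBINDING are borrowed from the DHCP specification rather than from the description above; RENEWING corresponds to the unicast request after T1, REBINDING to the broadcast request after T2.

```python
from typing import Optional, Tuple


def renewal_timers(lease: float,
                   t1: Optional[float] = None,
                   t2: Optional[float] = None) -> Tuple[float, float]:
    """Return (T1, T2) in seconds, applying the defaults when unset."""
    t1 = 0.5 * lease if t1 is None else t1
    t2 = 0.875 * lease if t2 is None else t2
    if not 0 < t1 < t2 < lease:
        raise ValueError("expected 0 < T1 < T2 < lease time")
    return t1, t2


def lease_phase(elapsed: float, lease: float) -> str:
    """Classify a point in the lease lifetime using the default timers."""
    t1, t2 = renewal_timers(lease)
    if elapsed >= lease:
        return "EXPIRED"    # must stop using the address
    if elapsed >= t2:
        return "REBINDING"  # broadcast DHCPREQUEST to any server
    if elapsed >= t1:
        return "RENEWING"   # unicast DHCPREQUEST to the original server
    return "BOUND"
```

For a one-day lease of 86,400 seconds the defaults give T1 = 43,200 seconds (12 hours) and T2 = 75,600 seconds (21 hours).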

A host should use DHCP to reacquire or verify its current IP address and network configuration whenever the local network parameters have changed—for example, at system boot time or after a disconnection from the local network, since the local network configuration can change without the host's or user's knowledge. If a client has multiple IP interfaces, each of them must be configured by DHCP separately. For further information, please refer to [24, 25].

Issues with DHCP

DHCP relieves the network administrator of a great deal of manual configuration work. The ability for a device to be moved from network to network and to automatically obtain valid configuration parameters can be of great benefit to mobile users. Since IP addresses are allocated only when clients are actually active, it is possible (by using shorter lease times, as well as the fact that mobile clients do not generally require more than one address) to optimize the address space used by an organization. However, the following points should be considered when DHCP is being implemented:

DHCP runs over UDP and lacks built-in security. In normal operation, an unauthorized client could connect to a network and obtain a valid IP address and configuration. To prevent this, it is possible to preallocate IP addresses to particular MAC addresses (similar to BOOTP), but this increases the administration workload and removes the benefit of recycling of addresses. Unauthorized DHCP servers could also transmit false and potentially disruptive information to clients (possibly initiating a denial of service attack—see Chapter 5).
With automatic or dynamic address allocation it is generally not possible to predetermine the IP address of a client. In this case, if static DNS servers are also used, the DNS servers are unlikely to hold valid host-name-to-IP-address mappings for the clients. If having client entries in the DNS is important, you may use DHCP to manually assign IP addresses to those clients and then administer the client mappings in the DNS accordingly.
Using DHCP may have some impact on your installation if you are using security implementations that map user IDs to IP addresses (sometimes called source IP address-based security schemes). This is likely to cause problems if you use the dynamic allocation or leasing capability.
Devices that use large ARP cache aging timers may experience or cause problems if DHCP reissues IP addresses to different hosts before the previous ARP entries either age out or are manually flushed. For example, if a router maintained old entries in its ARP cache, then it could be temporarily forwarding packets using the wrong MAC address.

There is a relatively new enhancement (proposed as an IETF draft) for DHCP to improve resilience, called DHCP Safe Failover Protocol. Where a primary and a backup DHCP server are deployed, all DHCP requests are sent to both servers. The primary server updates the backup with lease information, and the backup takes over when the primary fails. Backup servers use a dedicated pool of addresses allocated by the primary to prevent duplicate IP addresses from being assigned. Servers synchronize when the primary is up. For further information, see [4].

2.2.7 Using BOOTP and DHCP concurrently

The format of DHCP messages is, for the most part, identical to the format of BOOTP messages (in fact, on network analyzer traces, DHCP messages are often interpreted as BOOTP). This enables BOOTP and DHCP clients to interoperate in certain circumstances. All DHCP messages must include a DHCP message type option (option code 53) in the options field. Any message without this option is, therefore, assumed to be a BOOTP message. DHCP servers will ordinarily discard any BOOTP message, unless configured by a system administrator to handle both BOOTP and DHCP clients. If BOOTP clients are supported, a DHCP server will respond to BOOTPREQUEST messages with BOOTPREPLY, instead of a DHCPOFFER. A DHCP server may offer static addresses or automatic addresses to a BOOTP client. Note that if an automatic address is offered to a BOOTP client, then it must have an infinite lease time, since the client has no concept of a lease mechanism. DHCP messages may be forwarded by routers configured as BOOTP relay agents. For further information on interoperability, refer to [26].
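
The server-side decision described in this section can be sketched as follows. This is an illustrative outline only: the function names and the default lease value are assumptions, the options are taken as an already-parsed mapping from option code to value, and the DHCP branch assumes the incoming message is a DHCPDISCOVER. The message type option code (53) and the use of 0xFFFFFFFF to represent an infinite lease come from the DHCP specifications.

```python
from typing import Dict, Optional, Tuple

OPT_MESSAGE_TYPE = 53        # DHCP message type option code
INFINITE_LEASE = 0xFFFFFFFF  # lease time value representing "infinity"


def classify(options: Dict[int, bytes]) -> str:
    """A message without the message type option is treated as BOOTP."""
    return "DHCP" if OPT_MESSAGE_TYPE in options else "BOOTP"


def plan_first_reply(options: Dict[int, bytes],
                     bootp_enabled: bool,
                     lease_seconds: int = 86400
                     ) -> Optional[Tuple[str, int]]:
    """Return (reply type, lease time), or None to discard the message."""
    if classify(options) == "DHCP":
        return ("DHCPOFFER", lease_seconds)
    if not bootp_enabled:
        return None  # BOOTP message discarded by default
    # A BOOTP client has no concept of a lease, so an automatically
    # assigned address must never expire.
    return ("BOOTPREPLY", INFINITE_LEASE)
```

A request carrying no option 53 therefore yields `("BOOTPREPLY", 0xFFFFFFFF)` when BOOTP support is enabled and is dropped otherwise.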

2.2.8 Internet Control Message Protocol (ICMP)

The Internet Control Message Protocol (ICMP) is important for addressing and router operations, since it typically informs a sender when destination addresses or port numbers are either unavailable or unresolvable. ICMP is essentially a diagnostic protocol that runs directly over IP. ICMP must be implemented by every IP module and uses IP datagrams to send messages that perform flow control, error reporting, routing manipulation, and other key functions. Network engineers make extensive use of the ubiquitous ping utility (described in Chapter 9), which uses ICMP's Echo facility to test reachability and response times for any device with an IP address. A response from ping means that network routing is operational between the two nodes and that the remote node is alive. ICMP allows routers and hosts to communicate between themselves for control purposes. It provides feedback about problems in the communication environment but does not make IP reliable.

ICMPv4

ICMP messages have a common header, including the type and code fields, plus the IP header and first 64 bits of data from the original datagram, where applicable. IPv4 ICMP functions are as follows (type fields are in parentheses):

Echo Request (8), Echo Reply (0)—allows return of information to verify paths.
Destination Unreachable (3)—indicates whether the net, host, protocol, or port is unreachable; whether fragmentation is needed; or whether the source route failed.
Source Quench (4)—is sent when the gateway discards a datagram due to a number of conditions, such as insufficient buffer space for incoming packets. The gateway may send source quench to a host that is transmitting too aggressively. There is no guarantee that the host or application will back off or even understand what to do in such circumstances.
Redirect (5)—sent when the gateway recognizes a shorter path or if a remote path becomes unavailable.
Router Advertisement (9), Router Solicitation (10)—The ICMP Router Discovery Protocol (IRDP) uses router advertisement and router solicitation messages to dynamically discover the addresses of routers on directly attached subnets. We cover IRDP specifically in Chapter 3.
Time Exceeded (11)—generally indicates that the TTL of a packet was exceeded in transit (or that the fragment reassembly time was exceeded).
Parameter Problem (12)—indicates header parameter problems such that it cannot complete processing of the datagram. This may include incorrect arguments in an option.
Timestamp Request (13), Timestamp Reply (14)—returns the time the sender last touched the message before sending it, the time the echoer first touched it on receipt, and the transmit time when the echoer last touched the message on sending it.
Information Request (15), Information Reply (16)—returns the number of the network it is on (obsolete).
Address Mask Request (17), Address Mask Reply (18)—broadcast by a host to discover the subnet mask for the network specified. Typically responded to by a router.

For protocol-specific details of ICMP the interested reader is referred to [27, 28].
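
As a concrete instance of the common header, the following Python sketch builds an Echo Request (type 8, code 0) of the kind sent by ping. It is a sketch under stated assumptions: the field layout (type, code, checksum, identifier, sequence number) and the 16-bit one's complement checksum follow the ICMP specification rather than the summary above, and the function names are invented for the example. It only constructs the bytes; sending them would require a raw socket and suitable privileges.

```python
import struct

ECHO_REQUEST, ECHO_REPLY = 8, 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int,
                       payload: bytes = b"") -> bytes:
    """Return an ICMPv4 Echo Request message (without the IP header)."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ECHO_REQUEST, 0, checksum,
                         identifier, sequence)
    return header + payload
```

A receiver can validate such a message by running the same checksum over the whole message, checksum field included; a result of zero indicates that the message arrived intact.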

ICMPv6

A new version of ICMP that will operate with IPv6 is specified in [29]. The new protocol is called ICMPv6. Each ICMPv6 message is preceded by an IPv6 header, and zero or more IPv6 extension headers. The ICMPv6 header is identified by an IPv6 Next Header value of 58. The following functions are currently specified:

Destination Unreachable (1)—Generated by a router or by the IPv6 layer in the originating node in response to a packet that cannot be delivered to its destination address for reasons other than congestion.
Packet Too Big (2)—must be sent by a router in response to a packet that it cannot forward because the packet is larger than the MTU of the outgoing link.
Time Exceeded (3)—If a router receives a packet with a hop limit of zero, or a router decrements a packet's hop limit to zero, it must discard the packet and send an ICMPv6 Time Exceeded message with Code 0 to the source of the packet. This indicates either a routing loop or too small an initial hop limit value.
Parameter Problem (4)—If an IPv6 node processing a packet finds a problem with a field in the IPv6 header or extension headers such that it cannot complete processing the packet, it must discard the packet and should send an ICMPv6 Parameter Problem message to the packet's source, indicating the type and location of the problem.
Echo Request (128)—Every node must implement an ICMPv6 Echo responder function that receives Echo Requests and sends corresponding Echo Replies. A node should also implement an application-layer interface for sending Echo Requests and receiving Echo Replies, for diagnostic purposes.
Echo Reply (129)—As per Echo Request.
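
The type numbers listed above follow a pattern that is easy to capture in code. In the ICMPv6 specification, types 0 to 127 are error messages and types 128 to 255 are informational messages. The Python sketch below encodes the listed types and that rule; the names `ICMPV6_TYPES`, `is_error`, and `describe` are hypothetical.

```python
ICMPV6_NEXT_HEADER = 58  # IPv6 Next Header value identifying ICMPv6

ICMPV6_TYPES = {
    1: "Destination Unreachable",
    2: "Packet Too Big",
    3: "Time Exceeded",
    4: "Parameter Problem",
    128: "Echo Request",
    129: "Echo Reply",
}


def is_error(icmp_type: int) -> bool:
    """Error messages have the high-order bit of the type field clear."""
    if not 0 <= icmp_type <= 255:
        raise ValueError("ICMPv6 type must fit in one byte")
    return icmp_type < 128


def describe(icmp_type: int) -> str:
    """Return a readable label for a type, known or not."""
    name = ICMPV6_TYPES.get(icmp_type, "Unknown")
    kind = "error" if is_error(icmp_type) else "informational"
    return f"{name} ({kind})"
```

For instance, `describe(2)` labels Packet Too Big as an error message, whereas `describe(128)` labels Echo Request as informational.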

ICMP protocol packet exchanges can be authenticated using the IP Authentication Header [30]. For further information on ICMPv6, the interested reader is directed to [29]. The implications of ICMP for routers are explained in Chapter 3. Further information on diagnostic tools that exploit ICMP is provided in Chapter 9.

ICMP redirects

ICMP Redirects (Type 5 in IPv4) are of particular importance for routers, since they are often used to redirect a source to a better next-hop router if problems are detected upstream. For example, if there are two LAN-attached routers, R1 and R2, and a host has a default route to R1, if R1 loses the upstream path it can send an ICMP Redirect to the host, informing it that there is a better path via R2 (this is to stop the host from sending traffic to R1 and relying on R1 to forward the traffic over the LAN to R2). Routers generally send ICMP Redirects when the following conditions are met:

The OS supports ICMP Redirects and is configured to send them.
The router's receiving interface for a packet must be the same interface on which the packet is currently being forwarded (i.e., in our example R1's LAN interface is receiving host traffic and then having to forward this traffic back out of the same interface to R2).
The IP source address of the routed packet must be on the same network or subnet as the next-hop IP address.
The packet is not source routed.
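
The four conditions above amount to a simple predicate that a router evaluates for each forwarded packet. The Python sketch below is a hedged illustration, not vendor logic: the `ForwardingDecision` record and its field names are hypothetical, and the third condition is interpreted as "the packet's source and the next hop both lie on the subnet of the shared interface".

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class ForwardingDecision:
    """Hypothetical summary of how a router is about to forward a packet."""
    in_interface: str
    out_interface: str
    source_ip: str
    next_hop_ip: str
    interface_network: str  # subnet of the outgoing interface
    source_routed: bool = False


def should_send_redirect(d: ForwardingDecision,
                         redirects_enabled: bool = True) -> bool:
    """Return True when all four redirect conditions are met."""
    subnet = ipaddress.ip_network(d.interface_network)
    return (
        redirects_enabled
        and d.in_interface == d.out_interface
        and ipaddress.ip_address(d.source_ip) in subnet
        and ipaddress.ip_address(d.next_hop_ip) in subnet
        and not d.source_routed
    )
```

In the R1/R2 example, a packet arriving on R1's LAN interface from a local host and leaving via the same interface toward R2 satisfies the predicate; the same packet forwarded out of a different interface does not.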

Note that ICMP Redirects may be disabled in some circumstances—for instance, when using HSRP in a high-availability cluster configuration [4]. You should check the vendor documentation for your particular platform.