15.3 Implementing the ARP Instance in the Linux Kernel

   


In theory, ARP would have to run an address resolution for each outgoing IP packet before transmitting it. However, this would significantly increase the required bandwidth. For this reason, address mappings are stored in a table, the so-called ARP cache, as the protocol learns them. We have mentioned the ARP cache several times before. This section describes how the ARP cache and the ARP instance are implemented in the Linux kernel.

Though the Address Resolution Protocol was designed for relatively generic use, to map addresses for different layers, it is not used by all layer-3 protocols. For example, the new Internet Protocol (IPv6) uses the Neighbor Discovery (ND) address resolution to map IPv6 addresses to layer-2 addresses. Though the operation of the two protocols (ARP and ND) is similar, they are actually two separate protocol instances. The Linux kernel designers wanted to utilize the similarity between the two protocols and implemented generic support for address resolution protocols in LANs, the so-called neighbour management.

A neighbour represents a computer that is reachable over layer-2 services (i.e., directly over the LAN). Using the neighbour interface and the available functions, you can implement special properties of either of the two protocols (ARP and Neighbor Discovery). The following sections introduce the neighbour interface and discuss the ARP functions. Chapter 23 describes how Neighbor Discovery is implemented.

15.3.1 Managing Reachable Computers in the ARP Cache

As was mentioned earlier, computers that can be reached directly (over layer 2) are called neighbor stations in Linux. Figure 15-4 shows that they are represented by instances of the neighbour structure.

Figure 15-4. Structure of the ARP cache and its neighbor elements.



The set of reachable computers is managed in the ARP cache, which is organized in a hash table. The hash function arp_hash() is used to map neighbour structures to rows in the hash table. If several structures fall into the same hash row, the collision is resolved by linking them linearly in a chain. The basic functions of the ARP hash table are handled by the neighbour management. This means that the ARP hash table is only an instance of the more general neigh_table structure.

The structures of the neighbour management and its linking are introduced below.

struct neighbour

include/net/neighbour.h


The neighbour structure is the representation of a computer that can be reached directly over the data-link layer. The ARP instance creates a neighbour structure as soon as a layer-3 protocol (normally, the Internet Protocol) asks for the layer-2 address of a computer in the LAN. This means that the ARP cache contains all reachable stations and, additionally, the addresses of stations that are currently being determined. To prevent the cache from growing endlessly, entries with layer-2 addresses that have not been requested are deleted after a certain time. The neighbour structure has the following parameters:

  • next: Because neighbor stations are organized in hash tables, and collisions are resolved by the chaining strategy (linear linking), the next field references the next neighbor structure in a hash row.

  • tbl: This pointer points to the neigh_table structure that belongs to this neighbour and manages the current entry.

  • parms: The neigh_parms structure includes several parameters about a neighbour computer (e.g., a reference to the associated timer and the maximum number of probes). (See the neigh_timer_handler() function, below.)

  • dev: This is a pointer to the corresponding network device.

  • timer: This is a pointer to a timer used to initiate the handling routine neigh_timer_handler().

  • ops: Neighbor options define several functions used to send packets to this neighbour. The functions actually used depend on the properties of the underlying medium (i.e., on the type of network device). Figure 15-5 shows the neigh_ops variants. For example, the hh options are used when the network device needs an address to be resolved and supports a cache for layer-2 headers, and direct is used for network devices that do not need address resolution, such as point-to-point connections. The functions available in a neigh_ops variant are used for different tasks involved in the address-resolution process (e.g., resolving an address (solicit()) or sending a packet to a reachable neighboring computer (connected_output())).

  • hardware_address: This array stores the physical address of the neighboring computer.

  • hh: This field refers to the cache entry for the layer-2 protocol of this neighbour computer. For example, an Ethernet packet header consists of the sender address, the destination address, and the ethertype field. It is not necessary to fill these fields every time; it is much more efficient to have them computed and readily stored, so that they need only be copied.

  • nud_state: This parameter manages the state (i.e., valid, currently unreachable, etc.) of the neighboring station. Figure 15-6 shows all states a neighbor can possibly take. These states will be discussed in more detail in the course of this chapter.

  • output(): This function pointer points to one of the functions in the neigh_ops structure. The value depends on the current state (nud_state) of the neighbour entry and the type of network device used. Figure 15-5 shows the possible combinations. The output() function is used to send packets to this neighboring station. Because a function pointer is used, the state of the entry does not have to be checked each time a packet is sent. Should this state ever change, then we can simply set a new pointer.

  • arp_queue: The ARP instance collects in this queue all packets to be sent for neighbour entries in the NUD_INCOMPLETE state (i.e., the neighboring computer currently cannot be reached). This means that they don't have to be discarded, but can be sent as soon as an address has been successfully resolved.

Figure 15-5. Available neighbor options.



struct neigh_table

include/net/neighbour.h


A neigh_table structure manages the neighbour structures of an address-resolution protocol (see Figure 15-4), and several tables like this can exist in one single computer. We describe only the special case with an ARP table here. The neigh_table instance of the ARP protocol can be reached either over the linked list of neigh_table structures or directly over the arp_tbl pointer.

The most important fields in a neighbour hash table are as follows:

  • next: As mentioned earlier, a separate neigh_table instance is created for each protocol, and these instances are linearly linked. This is the purpose of the next pointer. The neigh_tables variable points to the beginning of the list.

  • family: This field stores the address family of neighbour entries. The ARP cache contains IP addresses, so this field takes the value AF_INET.

  • constructor(): This function pointer is used to generate a new neighbour entry. Depending on the protocol instance, different tasks may be required to generate such an entry. This is the reason why each protocol should have a special constructor. In the arp_tbl structure, this pointer references the function arp_constructor(), which will be described later.

  • gc_timer: A garbage collection (GC) timer is created for each neigh_table cache. This timer checks the state of each entry and updates these states periodically. The handling routine used by this timer is neigh_periodic_timer().

  • hash_buckets [NEIGH_HASHMASK+1]: This table includes the pointers to the hash rows that link the neighbour entries linearly. The arp_hash() function is used to compute hash values.

  • phash_buckets [PNEIGH_HASHMASK+1]: This second hash table manages the neighbour structures entered when the computer is used as an ARP proxy.

struct neigh_ops

include/net/neighbour.h


The ops field of each neighbour structure includes a pointer to a neigh_ops structure. The available options define different types of neighbors and include several functions belonging to a neighbour type (connected_output(), hh_output(), etc.). For example, the functions needed to send packets to a neighboring computer are defined in the neighbour options. The following four types are available for entries in the ARP cache: generic, direct, hh, and broken.

The respective functions of these types are shown in Figure 15-5. Depending on the type of network device used, the ops fields for new neighbour structures in the arp_constructor() function are set to one of the following four options:

  • arp_direct_ops is used when the existing network device does not include hardware headers (dev->hard_header == NULL). These stations are directly reachable, and no layer-2 packet header is required (e.g., for PPP).

  • arp_broken_ops is reserved for special network devices (ROSE, AX25, and NETROM).

  • arp_hh_ops is used when the network device has a cache for layer-2 packet headers (dev->hard_header_cache).

  • arp_generic_ops is used when none of the above cases applies.
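The selection rule above can be sketched in user-space C. This is a simplified model and not the kernel's arp_constructor(): the fake_dev structure, its fields, the ops_kind enumeration, and choose_ops() are hypothetical names introduced for illustration, and the order in which the cases are tested is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device properties named in the text. */
struct fake_dev {
    bool has_hard_header;        /* models dev->hard_header != NULL */
    bool has_hard_header_cache;  /* models dev->hard_header_cache != NULL */
    bool is_broken_type;         /* models ROSE, AX25, and NETROM devices */
};

enum ops_kind { OPS_DIRECT, OPS_BROKEN, OPS_HH, OPS_GENERIC };

/* Pick the option variant for a new entry, following the four cases above. */
enum ops_kind choose_ops(const struct fake_dev *dev)
{
    assert(dev != NULL);
    if (!dev->has_hard_header)
        return OPS_DIRECT;       /* no layer-2 header needed, e.g. PPP */
    if (dev->is_broken_type)
        return OPS_BROKEN;
    if (dev->has_hard_header_cache)
        return OPS_HH;           /* layer-2 header cache available */
    return OPS_GENERIC;          /* none of the special cases */
}
```

An Ethernet-like device with a header cache would map to OPS_HH in this model, and a point-to-point device without hardware headers to OPS_DIRECT.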

The output() functions of the neigh_ops structure are particularly important. Each neighbour structure includes an output() pointer that points to a function used to send data packets to a neighboring station. For ARP cache entries in the NUD_REACHABLE, NUD_PERMANENT, or NUD_NOARP state, the output() pointer references the function connected_output() of the neigh_ops structure; it is the fastest of all. connected_output() assumes that the neighboring computer is reachable, because these three states mean either that the reachability was confirmed recently or that no confirmation is required (permanent entry or point-to-point).

For neighbour stations in other states, the output() pointer references the output() function of the neigh_ops structure, which is slower and more careful. Direct reachability is doubted, so an initial attempt is made to obtain a confirmation of the neighboring computer's reachability (probe).

Possible States for neighbour Entries

It is theoretically possible to leave the entries for all neighboring stations ever learned in the ARP cache. However, there are several reasons why these entries are valid for a limited period of time. First, it would mean memory wasted to maintain entries for all these computers, especially if there is little or no data exchange with them. Second, we have to keep these entries consistent. For example, there can be a situation when the network adapter in a computer is replaced and so this computer will have a different layer-2 address. This computer could no longer be reached with the old mapping. Therefore, it is assumed that the mapping stored for a computer is no longer valid if that computer has not sent anything for some time.

In practice, the size of the ARP cache is limited (normally to 512 entries), and old or rarely used entries are periodically removed by a kind of garbage collection. On the other hand, it could well be that a computer does not communicate over a lengthy time, which means that its table is empty. In fact, this was not possible up to kernel Version 2.4, because the size of a neigh_table structure was also limited downwards: No garbage collection was done when the table included fewer than gc_thresh1 values, which normally meant 128 entries. This bottom limit no longer exists in kernel Version 2.4 and higher. You can use the arp command (see Section 15.2) to view the contents of the ARP cache.

Each neighbour entry in the ARP cache has a state, which is stored in the nud_state field of the corresponding neighbour structure. Figure 15-6 shows all possible states and the most important state transitions. There are other transitions, but they hardly ever occur. We left them out for the sake of keeping the figure easy to understand. The states and state transitions are described below.

Figure 15-6. State transition diagram for neighbour entries in neighbour caches.



  • NUD_NONE: This entry is invalid. A neighbor normally is in this state only temporarily. New entries for the ARP cache are created by the neigh_alloc() function, but this state is changed immediately.

  • NUD_NOARP, NUD_PERMANENT: No address resolution is done for entries in these two states. NUD_NOARP are neighbors that do not require address resolution (e.g., PPP). Entries with the NUD_PERMANENT state were permanently set by the administrator and are not deleted by the garbage collection.

  • NUD_INCOMPLETE: This state means that there is no address mapping for this neighbor yet, but that it is being processed. This means that an ARP request has been sent, and the protocol is waiting for a reply.

  • NUD_REACHABLE: neighbour structures in this state are reachable with the fastest output() function (neigh_ops->connected_output()). An ARP reply packet from this neighbor was received, and its maximum age is neigh->parms->reachable_time time units. This interval is restarted when a normal data packet is received.

  • NUD_STALE: This state is taken when an entry has been REACHABLE, but reachable_time time units have expired. For this reason, it is no longer certain that the neighbouring computer can still be reached with the address mapping currently stored. For this reason, rather than using connected_output() to send packets to this neighbour, the slower neigh_ops->output() is used.

  • NUD_DELAY: If a packet needs to be sent to a station in the NUD_STALE state, then the NUD_DELAY state is set. An entry stays in this state, between the NUD_STALE and NUD_PROBE states, only temporarily. Of course, if the address mapping is confirmed once again, then the entry changes to the NUD_REACHABLE state.

  • NUD_PROBE: The entry in the ARP cache is in the probing phase: Consecutive ARP request packets are sent in an attempt to obtain the layer-2 address of this computer.

  • NUD_FAILED: The address mapping cannot be resolved for entries in this state. ARP tries to solve the problem by sending neigh_max_probes request packets. If it still doesn't get replies to these packets, then the state of the neighbour entry is set to NUD_FAILED. Subsequently, the garbage collection deletes all entries in this state from the ARP cache.

To understand the states better, we summarize three additional state combinations below:

  • NUD_IN_TIMER = (NUD_INCOMPLETE | NUD_DELAY | NUD_PROBE): An attempt is currently being made to resolve the address.

  • NUD_VALID = (NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE | NUD_PROBE | NUD_STALE | NUD_DELAY): The neighbour entry includes an address mapping, which has been valid.

  • NUD_CONNECTED = (NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE): The neighbour entry is valid and the neighboring computer can be reached.
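Because each state is a single bit, membership in one of these combinations is a simple mask test. The following sketch shows the idea; the numeric values are an assumption taken from 2.4-era kernel headers, and nud_in() is a hypothetical helper that does not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* State bits; the numeric values are assumed, not taken from the text. */
enum {
    NUD_NONE       = 0x00,
    NUD_INCOMPLETE = 0x01,
    NUD_REACHABLE  = 0x02,
    NUD_STALE      = 0x04,
    NUD_DELAY      = 0x08,
    NUD_PROBE      = 0x10,
    NUD_FAILED     = 0x20,
    NUD_NOARP      = 0x40,
    NUD_PERMANENT  = 0x80
};

#define NUD_IN_TIMER  (NUD_INCOMPLETE | NUD_DELAY | NUD_PROBE)
#define NUD_VALID     (NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE | \
                       NUD_PROBE | NUD_STALE | NUD_DELAY)
#define NUD_CONNECTED (NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE)

/* True if the entry's state bit belongs to the given combination. */
bool nud_in(unsigned state, unsigned mask)
{
    assert((state & (state - 1)) == 0);  /* at most one state bit is set */
    return (state & mask) != 0;
}
```

For example, an entry in NUD_STALE is still in NUD_VALID but no longer in NUD_CONNECTED, which is exactly the case in which the slower output() function is used.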

15.3.2 Operation of the Address Resolution Protocol (ARP)

Given that the ARP cache and other neighbour tables have been built as discussed in the previous section, this section describes how the Address Resolution Protocol (ARP) in the Linux kernel operates. We first discuss the routes different ARP packets take across the kernel and how the ARP instance operates. Figure 15-7 shows the routes of ARP request and ARP reply packets.

Figure 15-7. ARP requests and ARP replies traveling through the ARP instance.



Incoming ARP PDUs

arp_rcv() handles incoming ARP packets on layer 3. ARP packets are packed directly in layer-2 PDUs, so a separate layer-3 protocol definition (arp_packet_type) was created for the Address Resolution Protocol. This information and the protocol identifier ETH_P_ARP from the LLC header are used to identify that the packet is an ARP PDU and to treat it as such.

arp_rcv()

net/ipv4/arp.c


Once a computer has received it, an ARP PDU is passed to the ARP handling routine by the NET_RX software interrupt (net_rx_action). arp_rcv() first checks the packet for correctness by verifying the following criteria; each criterion is followed in parentheses by the condition that causes the packet to be dropped:

  • Is the net_device structure a correct network device (in_dev == NULL)?

  • Are the ARP PDU length and the addresses it contains correct (arp->ar_hln != dev->addr_len)?

  • Does the network device used require the ARP protocol at all (dev->flags & IFF_NOARP)?

  • Is the packet directed to another computer (PACKET_OTHERHOST) or intended for the LOOPBACK device?

  • The ar_pln field should have the value 4. Otherwise, the packet does not originate from a request for the layer-2 address of an IP address or a reply, respectively. Currently, the Linux kernel supports only address resolutions based on the Internet Protocol.

The packet is dropped if one of these conditions (in parentheses) is true. If the ARP packet is correct, it is checked to see whether the MAC type specified in the packet complies with the network device. For example, if the ARP packet arrived at an Ethernet card, then the hardware type in the ARP packet should be either ARPHRD_ETHER or ARPHRD_IEEE802. Interestingly, the Ethernet hardware identifier is also used for token ring and FDDI network devices.

Subsequently, packets are filtered out if they are neither ARP request nor ARP reply PDUs, or if they ask for the layer-2 address of a loopback address (127.x.x.x) or a multicast IP address.
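The checks described above can be summarized in a small user-space sketch. It is not the code of arp_rcv(): the arp_check structure is a hypothetical, flattened view of the values the function inspects, and the helper names are assumptions made for this illustration. The test for the ARP operation (request or reply) and the hardware-type comparison are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened view of what the receive routine inspects. */
struct arp_check {
    bool     has_in_dev;    /* false models in_dev == NULL */
    uint8_t  ar_hln;        /* hardware address length from the PDU */
    uint8_t  dev_addr_len;  /* models dev->addr_len */
    bool     dev_noarp;     /* models dev->flags & IFF_NOARP */
    bool     other_host;    /* packet type is PACKET_OTHERHOST */
    bool     loopback;      /* packet is intended for the loopback device */
    uint8_t  ar_pln;        /* protocol address length, 4 for IPv4 */
    uint32_t target_ip;     /* requested IP address, host byte order */
};

/* 127.x.x.x */
bool is_loopback_ip(uint32_t ip)  { return (ip >> 24) == 127; }

/* 224.0.0.0 through 239.255.255.255 */
bool is_multicast_ip(uint32_t ip) { return (ip >> 28) == 0xE; }

/* Returns true if the PDU passes the checks listed in the text. */
bool arp_pdu_acceptable(const struct arp_check *c)
{
    assert(c != NULL);
    if (!c->has_in_dev || c->ar_hln != c->dev_addr_len || c->dev_noarp)
        return false;
    if (c->other_host || c->loopback || c->ar_pln != 4)
        return false;
    if (is_loopback_ip(c->target_ip) || is_multicast_ip(c->target_ip))
        return false;
    return true;
}
```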

Further handling of a packet differs only slightly for an ARP request or ARP reply. Both types are entered in the ARP cache, or neigh_lookup() updates an existing entry.

An additional step for ARP requests returns a reply PDU to the requesting computer. To this end, the arp_send() function is used to compose an ARP reply packet (as shown below). One particularity here is that the computer can act as ARP proxy for other computers, in addition to listening to ARP requests with its own address. For example, this is necessary when the computer acts as firewall, and the firewall does not admit ARP requests. Consequently, this computer has to accept packets for other computers without the senders' knowledge. The computer acting as a firewall identifies itself to the ARP mechanism as these other computers. The work of the ARP proxy is done by arpd (ARP daemon).

neigh_lookup()

net/core/neighbour.c


This function is required to search the ARP cache for specific entries. If the neighbor we look for is found in the hash table, then a pointer to the neighbour structure is returned, and the reference counter of that ARP entry is incremented by one.
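A minimal sketch of such a lookup in a chained hash table follows. The entry and table structures, the table size, and hash_ip() are simplifications assumed for this example; in particular, hash_ip() is only a placeholder and does not reproduce arp_hash(), and the sketch ignores the locking and the device comparison a real lookup would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_MASK 0x1F  /* 32 rows; the size is an assumption */

struct entry {
    struct entry *next;    /* next entry in the same hash row */
    uint32_t      ip;      /* lookup key (IPv4 address) */
    int           refcnt;  /* number of users holding this entry */
};

struct table {
    struct entry *buckets[HASH_MASK + 1];
};

/* Placeholder hash: folds the four address bytes into a row index. */
unsigned hash_ip(uint32_t ip)
{
    return (ip ^ (ip >> 8) ^ (ip >> 16) ^ (ip >> 24)) & HASH_MASK;
}

/* Walk one hash row; on a hit, take a reference and return the entry. */
struct entry *table_lookup(struct table *t, uint32_t ip)
{
    assert(t != NULL);
    for (struct entry *e = t->buckets[hash_ip(ip)]; e != NULL; e = e->next) {
        if (e->ip == ip) {
            e->refcnt++;   /* the caller now holds a reference */
            return e;
        }
    }
    return NULL;           /* no entry for this address */
}
```

The caller is expected to drop the reference again once it no longer needs the entry.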

arp_send()

net/ipv4/arp.c


The arp_send() function takes as arguments all parameters to be set in an ARP PDU. It uses them to build an ARP packet with all fields properly set. The Hardware Type field and the layer-2 address are set in relation to the corresponding network device. The Internet Protocol is the only layer-3 protocol supported, so the fields for the layer-3 protocol type and the length of a layer-3 address always have the same values. Finally, the layer-2 packet header is added, and the complete packet is sent by dev_queue_xmit().

neigh_update()

net/core/neighbour.c


The function neigh_update(state) is used to set a new state (new_state). This has no effect for neighbour entries in the NUD_PERMANENT and NUD_NOARP states, because no state transitions are allowed from these states to another state. (See Figure 15-6.)

If the state should be NUD_CONNECTED, then neigh_connect() is invoked to set the output() function to neigh_connected_output(). If this is not the case, the function neigh_suspect() has to be invoked to obtain the opposite effect.

If the old state was invalid (if (!(old & NUD_VALID))), there might be packets waiting for this neighbor in the ARP queue. As long as the entry remains in the NUD_VALID state, and packets are still waiting in the queue, these will now be sent to the destination station.

Handling Unresolved IP Packets

So far, we have looked only at the case where an ARP PDU arrived in the computer and some action was taken in response. This section discusses how and when the Address Resolution Protocol resolves addresses. We know from Chapter 14 that an IP packet is generally sent by the ip_finish_output() function. The netfilter hook POST_ROUTING handles the last steps of this function in ip_finish_output2(). In the latter, the function pointer hh_output() is invoked for packets to a destination present in the layer-2 header cache. In contrast, the function pointer dst->neighbour->output() is used for network devices without layer-2 header cache. The function pointers hh_output() and output() of the neigh_ops options hide behind these two pointers. If the ARP entry is valid, then the pointers normally point to dev_queue_xmit(). (See Figure 15-5.) If there is no address resolution, then the output() pointer of the neighbour options is normally used; it points to neigh_resolve_output(). Of course, the IP packet can be sent immediately if the network device does not use ARP, so the pointer also points to dev_queue_xmit().

The benefits of function pointers become obvious at this point again. The protocol state (in this case, the state of the entry in the ARP cache) does not have to be checked every time; instead, we simply invoke the output() method. This means that the fast transmit function is invoked, or the address resolution method is used, depending on the entry's state. In summary, function pointers represent an elegant method of implementing stateful protocols.
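The idea can be illustrated with a small user-space sketch. All names here (nb, nb_ops, slow_output(), fast_output(), and so on) are hypothetical; the two switching functions merely model what neigh_connect() and neigh_suspect() do to the output() pointer, and the counters exist only to make the effect visible.

```c
#include <assert.h>
#include <stddef.h>

struct nb;
typedef int (*output_fn)(struct nb *n);

struct nb_ops {
    output_fn output;            /* careful path: may trigger resolution */
    output_fn connected_output;  /* fast path: the mapping is trusted */
};

struct nb {
    const struct nb_ops *ops;
    output_fn output;            /* what the sender actually calls */
    int fast_sent, slow_sent;    /* counters, only for this demonstration */
};

int slow_output(struct nb *n) { n->slow_sent++; return 0; }
int fast_output(struct nb *n) { n->fast_sent++; return 0; }

const struct nb_ops demo_ops = { slow_output, fast_output };

/* Models neigh_connect(): switch the entry to the fast path. */
void nb_connect(struct nb *n)
{
    assert(n != NULL && n->ops != NULL);
    n->output = n->ops->connected_output;
}

/* Models neigh_suspect(): fall back to the careful path. */
void nb_suspect(struct nb *n)
{
    assert(n != NULL && n->ops != NULL);
    n->output = n->ops->output;
}

/* The sender never inspects the state; it just calls the pointer. */
int send_packet(struct nb *n) { return n->output(n); }
```

The state check happens once, when the state changes, instead of once per packet.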

neigh_resolve_output()

net/core/neighbour.c


neigh_resolve_output(skb) is the second function that can be referenced by the output() function pointer. In contrast to neigh_connected_output(), it cannot be assumed in this case that the stored address resolution is valid. For this reason, neigh_event_send() is used first to check the state the neighbour entry is in and whether the packet specified by skb can be sent to the destination station without prior address resolution. If so, then dev->hard_header() creates a layer-2 PDU, and neigh->ops->queue_xmit() sends the packet. If the network device supports a layer-2 header cache, and no entry yet exists for this receiver, then neigh_hh_init() creates this entry.

If the packet cannot yet be sent (for example, because the neighbour entry is in the NUD_STALE or NUD_INCOMPLETE state), then neigh_event_send() stores the packet in the arp_queue of the neighbour entry.

neigh_event_send()

include/net/neighbour.h


The return value of neigh_event_send(neigh, skb) is a Boolean value showing whether the packet specified by skb can be sent to the destination station (return value 0) or the address resolution is currently invalid (1), which means that the packet should not be sent. The value 0 is returned immediately for neighboring computers in the NUD_NOARP, NUD_PERMANENT, and NUD_REACHABLE states; otherwise, the function __neigh_event_send() is invoked, which takes the following actions for the other states:

  • NUD_NONE: New neighbor entries in this state are initially set to NUD_INCOMPLETE. Next, the timer of this neighbour is set, and neigh->ops->solicit() starts the first attempt to resolve the address.

  • NUD_FAILED: A value of 1 is returned immediately, because the attempt to resolve the address of this station failed. No packets can be sent to this station.

  • NUD_INCOMPLETE: Packets intended for computers in the NUD_INCOMPLETE state are stored in the arp_queue of the neighbour entry. Subsequently, the value 1 is returned to prevent the packet from being sent. The packet is temporarily stored in the queue of that neighbour entry until the neighboring station can be reached or the attempt to transmit is considered to have failed.

  • NUD_STALE: In this case, the neighbour entry's state changes to NUD_DELAY, and the timer is set to expires = now + delay_probe_time. The timer's handling routine, neigh_timer_handler(), will then check the state of this entry as soon as the specified time has expired.

The function returns 1 if none of the above states is applicable.

neigh_connected_output()

net/core/neighbour.c


This function is the fastest way for neigh->output() to send a packet without using a stored layer-2 header. It is used only by neighbors in the NUD_REACHABLE state and for network devices that do not support hardware header caches. First, dev->hard_header() is invoked to create the layer-2 PDU; then, neigh->ops->queue_xmit() sends this PDU.

arp_solicit()

net/ipv4/arp.c


arp_solicit() is the actual function used to obtain the MAC address of a neighboring computer. It is used to send ARP Request packets.

The probes parameter in the neighbour structure stores the number of requests sent so far (i.e., the number of unsuccessful attempts since the neighbour entry was in the NUD_REACHABLE state).

arp_solicit() checks how many ARP requests have been sent. If the specified limit has not yet been exceeded, then an ARP request is sent by the arp_send() function. The ARP_REQUEST parameter specifies the packet type.

If the MAC address of the computer in question is already known (for example, from an earlier request), then an attempt is first made to send the ARP request in a unicast packet directly to the neighboring station. This means that a simple check is done to see whether this computer is still reachable at this address, without disturbing other computers in the same LAN. Notice that a maximum of neigh->parms->ucast_probes unicast requests are sent. Additional requests are then broadcast to all computers in the LAN. If the maximum number (neigh->max_probes) is exceeded again, then no more requests will be sent.
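The unicast-first strategy can be sketched as a small decision function. This is only a model of the behavior described above, not the code of arp_solicit(): the probe_limits structure and next_probe() are hypothetical, and the sketch assumes that max_probes is a total budget covering both unicast and broadcast requests.

```c
#include <assert.h>
#include <stddef.h>

enum probe_kind { PROBE_UNICAST, PROBE_BROADCAST, PROBE_NONE };

struct probe_limits {
    int ucast_probes;  /* models neigh->parms->ucast_probes */
    int max_probes;    /* assumed total budget of requests */
};

/* Decide how the next request is sent, given the number already sent. */
enum probe_kind next_probe(int sent, int mac_known,
                           const struct probe_limits *lim)
{
    assert(lim != NULL && sent >= 0);
    if (sent >= lim->max_probes)
        return PROBE_NONE;       /* budget used up: stop asking */
    if (mac_known && sent < lim->ucast_probes)
        return PROBE_UNICAST;    /* quietly re-check the old address */
    return PROBE_BROADCAST;      /* ask every computer in the LAN */
}
```

With ucast_probes set to 3 and max_probes set to 6, a known neighbor would be asked three times by unicast and then three times by broadcast before the attempts stop.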

neigh_timer_handler()

net/core/neighbour.c


This is a handling routine invoked by the timer of a neighbour entry in the ARP cache. In contrast to neigh_periodic_timer(), the timer calls at intervals specified in the neighbour entry, rather than continually.

The timer is set when an ARP request PDU is sent, among other events. The triggering time is set to expires = now + retrans_time, so that, once this time has expired, the handler can check whether a reply to this request has arrived.

One of the following actions is performed, depending on the current state of the neighbour entry:

  • NUD_VALID: The state of the ARP entry has changed to NUD_VALID since the time when the timer was set and the handling routine was executed. The corresponding computer is reachable, and its state is now set to NUD_REACHABLE.

    The neigh_connect(neigh) function ensures that the correct functions of a reachable computer are executed. For example, it sets the output() function to neigh->ops->connected_output().

  • NUD_DELAY: In this case, the state of the neighbour entry is changed to NUD_PROBE. The number of probes is set to zero, which means that the entry starts the probing phase.

  • NUD_PROBE: The entry in the ARP cache is in the probing phase; successive ARP request packets are sent in an attempt to resolve the computer's address.

When the number of sent requests (probes) has exceeded the maximum number (neigh_max_probes(probes)), it is assumed that the computer is not reachable, and its state changes to NUD_FAILED. If there are still packets for this computer in the queue of this neighbour entry, then the error_report() routine is invoked for each socket buffer, and finally the arp_queue is deleted.

If the maximum number of probes has not yet been exceeded, the neigh->ops->solicit() routine is invoked to send an ARP request. Before this request is sent, the timer is reinitialized, so that the timer handler will be invoked again as soon as neigh->parms->retrans_time time units (jiffies) have expired.
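The transitions handled by the timer can be modeled as follows. This is a user-space sketch under several assumptions and not the kernel routine: the tnb structure and its fields are hypothetical, the confirmed flag stands in for the check that a confirmation arrived in time, the counters replace the real timer and solicit() calls, and the sketch assumes that an entry leaving NUD_DELAY sends its first probe in the same invocation.

```c
#include <assert.h>
#include <stddef.h>

enum state { ST_INCOMPLETE, ST_REACHABLE, ST_DELAY, ST_PROBE, ST_FAILED };

struct tnb {
    enum state state;
    int probes;         /* requests sent so far */
    int max_probes;     /* limit before giving up */
    int queued;         /* packets waiting in the queue */
    int confirmed;      /* nonzero if a confirmation arrived meanwhile */
    int requests_sent;  /* demonstration counter for solicit() calls */
    int timer_armed;    /* nonzero if the handler re-armed the timer */
};

/* Models one expiry of the entry's timer. */
void timer_fired(struct tnb *n)
{
    assert(n != NULL);
    n->timer_armed = 0;
    if (n->confirmed) {               /* mapping confirmed: reachable */
        n->state = ST_REACHABLE;
        return;
    }
    if (n->state == ST_DELAY) {       /* start the probing phase */
        n->state = ST_PROBE;
        n->probes = 0;
    }
    if (n->probes >= n->max_probes) { /* give up and drop the queue */
        n->state = ST_FAILED;
        n->queued = 0;
        return;
    }
    n->timer_armed = 1;               /* re-arm before sending the request */
    n->probes++;
    n->requests_sent++;
}
```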

neigh_connect()

net/core/neighbour.c


neigh_connect(neigh) is invoked when the neighbour entry changes its state to NUD_CONNECTED. The output() function of this entry is set to connected_output(). If a hardware header exists, then the procedure to send a packet at the network device interface (hh->hh_output()) is set to neigh->ops->hh_output(), to be able to use the stored hardware header.

neigh_suspect()

net/core/neighbour.c


The neigh->output() functions are changed to neigh->ops->output(). This means that, if the fast way over the hardware header cache was previously used, it is no longer used now, so that, when the next packet is ready to be sent, a probe for the MAC address will be started (neighbor solicitation by neigh_resolve_output()).

neigh_destroy()

net/core/neighbour.c


A neighbour entry is deleted from the ARP cache, and its structures are released. Entries in the hardware header cache are also released. neigh_release() invokes neigh_destroy(). It first checks for whether there are still other pointers to this neighbour (if (atomic_dec_and_test(&neigh->refcnt))) and for whether the entry has already been marked as unused (if (neigh->dead)). Both conditions must be true before the neighbor may be deleted.
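The two conditions can be sketched like this. The rnb structure and nb_release() are hypothetical, a plain int replaces the kernel's atomic reference counter, and setting the destroyed flag stands in for actually freeing the entry and its cached headers.

```c
#include <assert.h>
#include <stddef.h>

struct rnb {
    int refcnt;     /* number of references still held */
    int dead;       /* nonzero once the entry was marked as unused */
    int destroyed;  /* set instead of freeing, for demonstration */
};

/* Drop one reference. The entry is destroyed only if this was the last
 * reference and the entry is already marked dead. Returns 1 if so. */
int nb_release(struct rnb *n)
{
    assert(n != NULL && n->refcnt > 0);
    if (--n->refcnt == 0 && n->dead) {
        n->destroyed = 1;
        return 1;
    }
    return 0;
}
```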

neigh_sync()

net/core/neighbour.c


This function has no effect for permanent neighbour entries (NUD_PERMANENT) or for network devices without ARP support (NUD_NOARP), and it returns immediately. Otherwise, the following actions are taken, depending on the entry's state:

  • NUD_REACHABLE: If an entry is in this state and a certain time (neigh->reachable_time) has expired since the last acknowledgement was received from the neighboring computer, either by an incoming packet or an explicit ARP request or ARP reply, then the entry is marked as NUD_STALE. This means that no sign of life has come from this computer over a certain period of time and so it probably no longer exists. The function neigh_suspect() is used to verify this situation; it tries to update that computer's state.

  • NUD_VALID: If the computer is known and an acknowledgement has arrived before the normal lifetime of the entry (neigh->reachable_time) expired, then its state is set to NUD_REACHABLE, and neigh_connect(neigh) is invoked.

neigh_sync() is invoked by neigh_update() before the new state is entered in the neighbour structure. The intention is to ensure that the current state be updated before state transitions occur.

neigh_periodic_timer()

net/core/neighbour.c


A timer is initialized for each neighbour cache. This timer periodically checks and updates the states of cache entries (i.e., it runs a so-called garbage collection). The relevant handling routine is the function neigh_periodic_timer(). It visits each entry in the cache and takes one of the following actions, depending on the entry's state:

  • NUD_PERMANENT: This is a permanent entry; nothing in its state has to be changed.

  • NUD_IN_TIMER: An attempt is currently being made to reach the specified computer by sending an ARP request packet. This also means that the timer of the neighbour entry is set, and the handling routine neigh_timer_handler() will run soon. In this case, the entry's state is updated at the same time, so that neigh_periodic_timer() changes nothing in the state of this entry.

  • NUD_FAILED: If a neighbour entry is in the NUD_FAILED state, or if the time neigh->parms->gc_staletime has expired, the computer is considered no longer reachable, and neigh_release() deletes this entry from the ARP cache.

  • NUD_REACHABLE: If an entry is marked as reachable, but neigh->parms->reachable_time jiffies have already passed since the last acknowledgment, then it is classified as old (NUD_STALE), and neigh_suspect() (described earlier) attempts to update this entry.

The function neigh_periodic_timer() runs as an independent tasklet in multi-processor systems.

Creating and Managing neighbour Instances

neigh_create()

net/core/neighbour.c


This function is responsible for creating a new neighbour entry and entering it in the respective neighbour cache. neigh_create() is normally invoked by the arp_bind_neighbour() function when neigh_lookup() was unsuccessful at finding the ARP entry of the interested computer. Accordingly, it creates a new entry.

To create a new neighbour entry, the function first initializes a neighbour structure in the appropriate neighbour instance (neigh_alloc()). If a constructor was defined for the entries in this table, then it is invoked now.

Before it adds a neighbour to the table, the function first checks whether such an entry already exists. If not, then the entry is added as the first element of the hash row to which pkey maps. A new entry is added at the beginning of the hash row, because the probability is high that it will be accessed next. The return value is a pointer to the new entry in the ARP cache.
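
The lookup-then-insert-at-head behaviour can be illustrated with a small chained hash table. This is a minimal user-space sketch, not the kernel implementation: hash_entry, hash_table, toy_lookup(), toy_create(), and the row count are all hypothetical names and values, and for brevity the sketch checks for an existing entry before allocating, whereas the text above describes allocation first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_HASHMASK 0x1F   /* 32 rows; an assumed table size */

/* Hypothetical stand-in for struct neighbour. */
struct hash_entry {
    struct hash_entry *next;
    unsigned int key;       /* stands in for the IP address (pkey) */
};

/* Hypothetical stand-in for the hash rows of a neigh_table. */
struct hash_table {
    struct hash_entry *rows[TOY_HASHMASK + 1];
};

static unsigned int toy_hash(unsigned int key)
{
    return key & TOY_HASHMASK;
}

/* Walk one hash row linearly until the key matches. */
struct hash_entry *toy_lookup(struct hash_table *tbl, unsigned int key)
{
    struct hash_entry *n;

    assert(tbl != NULL);
    for (n = tbl->rows[toy_hash(key)]; n != NULL; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}

/* Return the existing entry, or link a new one at the head of its row. */
struct hash_entry *toy_create(struct hash_table *tbl, unsigned int key)
{
    struct hash_entry *n = toy_lookup(tbl, key);
    unsigned int row;

    if (n != NULL)
        return n;                 /* entry already exists */

    n = malloc(sizeof(*n));
    if (n == NULL)
        return NULL;

    row = toy_hash(key);
    n->key = key;
    n->next = tbl->rows[row];     /* new entry becomes the first element */
    tbl->rows[row] = n;
    return n;
}
```

Because the newest entry sits at the front of its row, a lookup that follows immediately after creation terminates on the first comparison.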

neigh_alloc()

net/core/neighbour.c


neigh_alloc(tbl) creates a new neighbour structure for a specific neighbour table (tbl). The table has to be specified, because it holds some information required for the new entry. Before the neighbour structure is created, the function first checks whether the current table is full. tbl->gc_thresh3 is the absolute upper limit of the table; this limit must not be exceeded. gc_thresh2 is a threshold value that should be exceeded only briefly: the table may stay above this limit for a maximum of five seconds since the last flush, after which a garbage collection is run. The following query tests for these two conditions:

    if (tbl->entries > tbl->gc_thresh3 ||
        (tbl->entries > tbl->gc_thresh2 &&
         now - tbl->last_flush > 5*HZ))

If either condition holds, then neigh_forced_gc() runs a garbage collection and checks whether sufficient space was freed in the table. If the space freed is insufficient, the function returns NULL and doesn't create a new neighbour structure.

If the table can accommodate the new entry, a new neighbour structure is taken from the memory cache tbl->kmem_cachep and added to the table. The state of the new entry is set to NUD_NONE, and a pointer to the new neighbour structure is returned.
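
To make the two thresholds concrete, the test can be isolated in a small predicate. This is a sketch under stated assumptions: alloc_table is a hypothetical structure that carries only the fields used by the query, TOY_HZ stands in for the kernel's HZ constant with an assumed value of 100 ticks per second, and toy_needs_forced_gc() is a name invented here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HZ 100UL   /* assumed tick rate; stands in for the kernel's HZ */

/* Hypothetical subset of the table fields used by the check. */
struct alloc_table {
    int entries;              /* current number of entries */
    int gc_thresh2;           /* soft limit, may be exceeded briefly */
    int gc_thresh3;           /* hard limit */
    unsigned long last_flush; /* time of the last forced collection */
};

/* Nonzero if a forced garbage collection should run before allocating. */
int toy_needs_forced_gc(const struct alloc_table *tbl, unsigned long now)
{
    assert(tbl != NULL);
    return tbl->entries > tbl->gc_thresh3 ||
           (tbl->entries > tbl->gc_thresh2 &&
            now - tbl->last_flush > 5 * TOY_HZ);
}
```

With thresholds of 512 and 1024, for example, a table holding 600 entries triggers a collection only if the last flush lies more than five seconds in the past, whereas a table holding more than 1024 entries triggers one immediately.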

neigh_forced_gc()

net/core/neighbour.c


If a neighbour table is full (see neigh_alloc()), the garbage collector runs neigh_forced_gc() immediately. This function is invoked by neigh_alloc() to free memory space for new neighbour structures. Entries that meet all of the following conditions are deleted from the cache:

  • There is no longer any outside reference to the structure; only the cache's own reference remains (n->refcnt == 1).

  • The neighbour is not permanent (n->nud_state != NUD_PERMANENT).

  • An unresolved NUD_INCOMPLETE entry must have been in the cache for at least retrans_time, to avoid unnecessary duplication of request packets: (n->nud_state != NUD_INCOMPLETE || jiffies - n->used > n->parms->retrans_time).

The number of deleted entries is output as soon as this function has finished checking all neighbour entries.
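
The three conditions combine into a single test per entry. The following sketch shows that test in isolation; fgc_entry and fgc_can_drop() are hypothetical names, the flag values are assumptions, and retrans_time is stored directly in the entry instead of being reached through n->parms.

```c
#include <assert.h>
#include <stddef.h>

/* Only the states needed here; the numeric values are assumptions. */
enum {
    NUD_INCOMPLETE = 0x01,
    NUD_REACHABLE  = 0x02,
    NUD_PERMANENT  = 0x80
};

/* Hypothetical, trimmed-down stand-in for struct neighbour. */
struct fgc_entry {
    int refcnt;                 /* 1 means only the cache holds it */
    unsigned state;
    unsigned long used;         /* time of last use, in ticks */
    unsigned long retrans_time; /* minimum age of an incomplete entry */
};

/* Nonzero if a forced collection may remove this entry. */
int fgc_can_drop(const struct fgc_entry *n, unsigned long now)
{
    assert(n != NULL);
    return n->refcnt == 1 &&
           n->state != NUD_PERMANENT &&
           (n->state != NUD_INCOMPLETE ||
            now - n->used > n->retrans_time);
}
```

An incomplete entry that is younger than retrans_time therefore survives the collection, so the request that is still outstanding for it does not have to be repeated.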

arp_constructor()

net/ipv4/arp.c


Once neigh_create() has invoked the neigh_alloc() function to initialize a new neighbour structure, it invokes the appropriate constructor function for the specified neigh_table (for example, the arp_constructor() method for the ARP cache).

In the first step, arp_constructor() checks whether the network device used requires the ARP protocol. If this is not the case, then the state of this entry is set to NUD_NOARP. Next, it checks whether the hard_header_cache includes an entry for this network device. If so, then the neigh_ops field of this neighbour structure is set to arp_hh_ops. Otherwise, this neighbour entry uses the methods of arp_generic_ops. Finally, if the entry is in a NUD_VALID state, the connected_output() function is used to communicate with the neighbouring computer. Otherwise, the normal output() function is used.
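
The sequence of decisions in the constructor can be modelled as follows. This is a simplified sketch that follows the three steps in the paragraph above literally, not the kernel function: ctor_dev, ctor_neigh, ctor_setup(), the OPS_ and OUT_ constants, the flag values, and the set of states grouped under NUD_VALID are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* State flags for the sketch; values and the NUD_VALID set are assumptions. */
enum {
    NUD_NONE      = 0x00,
    NUD_REACHABLE = 0x02,
    NUD_STALE     = 0x04,
    NUD_DELAY     = 0x08,
    NUD_PROBE     = 0x10,
    NUD_NOARP     = 0x40,
    NUD_PERMANENT = 0x80,
    NUD_VALID     = NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE |
                    NUD_PROBE | NUD_STALE | NUD_DELAY
};

/* Hypothetical view of the two device properties the constructor checks. */
struct ctor_dev {
    int needs_arp;         /* device requires address resolution */
    int has_header_cache;  /* device offers a hard_header_cache method */
};

enum ctor_ops { OPS_GENERIC, OPS_HH };        /* arp_generic_ops, arp_hh_ops */
enum ctor_out { OUT_NORMAL, OUT_CONNECTED };  /* output(), connected_output() */

/* Hypothetical, trimmed-down stand-in for struct neighbour. */
struct ctor_neigh {
    unsigned state;
    enum ctor_ops ops;
    enum ctor_out out;
};

void ctor_setup(struct ctor_neigh *n, const struct ctor_dev *dev)
{
    assert(n != NULL && dev != NULL);

    /* Step 1: devices without ARP get a NUD_NOARP entry. */
    if (!dev->needs_arp)
        n->state = NUD_NOARP;

    /* Step 2: choose the operations set by header-cache support. */
    n->ops = dev->has_header_cache ? OPS_HH : OPS_GENERIC;

    /* Step 3: valid entries use the fast path, all others the normal one. */
    n->out = (n->state & NUD_VALID) ? OUT_CONNECTED : OUT_NORMAL;
}
```

A freshly allocated entry in the NUD_NONE state on an ARP-capable device thus starts with the normal output method and switches to the connected one only after resolution has succeeded.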

neigh_table_init()

net/core/neighbour.c


neigh_table_init() takes the following steps to initialize a new neigh_table structure:

  • It obtains memory for the neighbour cache (tbl->kmem_cachep = kmem_cache_create()).

  • It initializes a timer (tbl->gc_timer) and sets the expiry time to now + tbl->gc_interval + tbl->reachable_time. This timer calls neigh_periodic_timer() periodically.

  • It inserts the new table into a singly linked list, neigh_tables.

arp_hash()

net/ipv4/arp.c


The ARP table (arp_tbl) uses this function as its hash method. The hash value is computed on the basis of the IP address (primary_key) and is reduced to the ARP table size by using NEIGH_HASHMASK.
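
One way to reduce a 32-bit IP address to a row index is to fold its upper bits into the lower ones and then mask the result. The sketch below shows this idea; it is an illustration of the technique rather than a copy of arp_hash(), and the name toy_arp_hash(), the shift amounts, and the mask value 0x1F are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_HASHMASK 0x1F   /* assumed mask: table size 32, minus one */

/* Fold a 32-bit address into a row index in the range 0..TOY_HASHMASK. */
uint32_t toy_arp_hash(uint32_t ip)
{
    uint32_t h = ip;

    h ^= h >> 16;       /* mix the upper half into the lower half */
    h ^= h >> 8;
    h ^= h >> 3;
    h &= TOY_HASHMASK;  /* keep only as many bits as there are rows */

    assert(h <= TOY_HASHMASK);
    return h;
}
```

The folding steps matter because masking alone would consider only the lowest bits of the address, so addresses that differ only in their upper bytes would all fall on the same hash row.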


       


    Linux Network Architecture
    ISBN: 131777203
    EAN: N/A
    Year: 2004
    Pages: 187
