24.3 Connection Management

   


Being a connection-oriented protocol that supports a number of additional mechanisms, such as packet transmission in the correct order or urgent data, the TCP protocol is extremely complex. The protocol machine shown in Figure 24-5 is characterized by a total of twelve states. This complexity calls for extensive management of the current state of active connections.

Figure 24-5. The TCP state automaton.



24.3.1 The TCP State Machine

A TCP connection's state is stored in the state field of the associated sock structure. The response to the receipt of packets is different, depending on the state, so this state has to be polled for each incoming packet. There are three phases: the connection-establishment phase, the data-transmission phase, and the connection-teardown phase. Section 24.4 describes the protocol mechanisms of the data-transmission phase in detail. This section discusses the connection-establishment and connection-teardown phases.

As shown in Section 24.2, tcp_rcv_state_process() (net/ipv4/tcp_input.c) is the most important function for connection management, as long as the connection has not yet been established. Packets in the TIME_WAIT state are the only packets handled earlier in the tcp_v4_rcv() function.

tcp_rcv_state_process()

net/ipv4/tcp_input.c


The tcp_rcv_state_process() function mainly handles state transitions and the management work for the connection. Depending on the connection state, different actions are taken when a packet is received:

  • In the CLOSED state: The packet is dropped.

  • In the LISTEN state: If ACK or SYN flags are set, then the connection establishment is registered, and data is ignored.

  • In the SYN_SENT state: tcp_rcv_synsent_state_process() checks for correct connection establishment, and the connection is moved to the ESTABLISHED state. If this fails, then the remaining flags are processed.

  • If the PAWS check finds an error, then a DUPACK is returned, and the function is exited.

  • Subsequently, the sequence number is checked, and if a packet arrived out of order, then a DUPACK is returned and the packet is dropped.

  • If the RST flag is active, the connection is reset and the packet is dropped.

  • If a timestamp is present in the segment header, the recent timestamp stored locally is updated.

  • If the SYN flag is set, but invalid due to the sequence number, the connection is reset and the packet is dropped.

  • If the ACK flag is active, the next action is different, depending on the state:

    • In the SYN_RECV state: The connection state changes to ESTABLISHED, and the acknowledgement is processed.

    • In the FIN_WAIT_1 state: The connection state changes to FIN_WAIT_2 and the TIME_WAIT timer is set.

    • In the CLOSING state: Transition to the TIME_WAIT state occurs, if the packet is not out of order.

    • In the LAST_ACK state: The socket is reset, and the state changes to the CLOSED state, if the packet is not out of order.

  • If the URG flag is active, then the urgent data is processed (by the tcp_urg() function).

  • If the packet contains payload, then this data is processed or dropped, or an RST packet is sent, depending on the connection state.

  • The packet is deleted in all other cases.
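The ACK-dependent branch of the list above can be modeled in user space as a small lookup function. This is only a simplified sketch, not the kernel code: the enum constants, the function ack_transition(), and its in_order parameter are hypothetical names chosen for illustration, and the model ignores details such as whether the ACK actually covers the FIN that was sent.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the states relevant to ACK handling. */
enum conn_state {
    ST_CLOSED, ST_SYN_RECV, ST_ESTABLISHED, ST_FIN_WAIT_1, ST_FIN_WAIT_2,
    ST_CLOSING, ST_TIME_WAIT, ST_LAST_ACK
};

/*
 * Return the state that follows the receipt of a segment with the ACK flag.
 * in_order says whether the segment passed the sequence-number check;
 * CLOSING and LAST_ACK advance only for in-order segments.
 */
enum conn_state ack_transition(enum conn_state cur, bool in_order)
{
    switch (cur) {
    case ST_SYN_RECV:
        return ST_ESTABLISHED;
    case ST_FIN_WAIT_1:
        return ST_FIN_WAIT_2;
    case ST_CLOSING:
        return in_order ? ST_TIME_WAIT : cur;
    case ST_LAST_ACK:
        return in_order ? ST_CLOSED : cur;
    default:
        return cur;             /* an ACK causes no state change here */
    }
}
```

For example, ack_transition(ST_LAST_ACK, true) yields ST_CLOSED, whereas an out-of-order segment leaves the connection in ST_LAST_ACK.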

24.3.2 Establishing a Connection

A connection to the partner instance has to be established before a TCP instance can send payload. A connection is established on the basis of the so-called three-way handshake to reduce the probability of establishing a wrong connection. This could happen, for example, if a connection is established more than once because of timeouts, or when a connection is established between two TCP protocol instances before an existing connection is reset.

To begin establishing a connection, both TCP protocol instances define an initial value for the sequence number (Initial Sequence Number, ISN). These initial values are exchanged and acknowledged between the participating TCP protocol instances in the three-way handshake.
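Because a SYN occupies one sequence number, the acknowledgement for it carries the value ISN + 1, and all comparisons have to work modulo 2^32. The following user-space sketch illustrates both points. The function names seq_before(), expected_syn_ack(), and syn_acked() are hypothetical; the comparison uses the common signed-difference idiom and is not a copy of any kernel routine.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if seq1 lies before seq2 in 32-bit sequence space (wraparound-safe). */
bool seq_before(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq1 - seq2) < 0;
}

/* A SYN occupies one sequence number, so the peer acknowledges ISN + 1. */
uint32_t expected_syn_ack(uint32_t isn)
{
    return isn + 1u;            /* unsigned arithmetic wraps modulo 2^32 */
}

/* Check that the ACK number of a SYN-ACK acknowledges exactly our SYN. */
bool syn_acked(uint32_t isn, uint32_t ack_seq)
{
    return ack_seq == expected_syn_ack(isn);
}
```

With an ISN of 0xFFFFFFFF, the expected acknowledgement number wraps to 0, which is why a plain `<` comparison on sequence numbers would not be sufficient.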

The connection diagram shown in Figure 24-5 has the following states for the connection establishment phase:

  • LISTEN: After passive opening, the local TCP waits for a SYN as a request to establish a connection.

  • SYN_SENT: After sending a SYN, the local TCP waits for a connection establishment by the TCP instance of the communication partner.

  • SYN_RECV: The local TCP waits for an acknowledgement that the connection has been established (ACK to SYN).

  • ESTABLISHED: The connection is established, and the two communicating partners can exchange data; the connection-establishment phase is complete.
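The four states listed above and the transitions between them can be sketched as a small state function. This is a hypothetical user-space model: the names est_state, on_open(), on_segment(), and the SEG_ flag bits are assumptions made for illustration. The model also ignores that a listening socket in Linux stays in LISTEN and hands each new connection to a separate socket.

```c
/* Hypothetical model of the connection-establishment states. */
enum est_state {
    EST_CLOSED, EST_LISTEN, EST_SYN_SENT, EST_SYN_RECV, EST_ESTABLISHED
};

/* Flag bits of a received segment (only those relevant to the handshake). */
#define SEG_SYN 0x1
#define SEG_ACK 0x2

/* Next state after the local application opens the connection. */
enum est_state on_open(enum est_state cur, int active)
{
    if (cur != EST_CLOSED)
        return cur;
    return active ? EST_SYN_SENT : EST_LISTEN;
}

/* Next state after a segment with the given flags arrives. */
enum est_state on_segment(enum est_state cur, int flags)
{
    switch (cur) {
    case EST_LISTEN:
        return (flags == SEG_SYN) ? EST_SYN_RECV : cur;
    case EST_SYN_SENT:
        if (flags == (SEG_SYN | SEG_ACK))
            return EST_ESTABLISHED;     /* normal active open */
        if (flags == SEG_SYN)
            return EST_SYN_RECV;        /* simultaneous open */
        return cur;
    case EST_SYN_RECV:
        return (flags & SEG_ACK) ? EST_ESTABLISHED : cur;
    default:
        return cur;
    }
}
```

Feeding the client side a SYN-ACK and the server side first a SYN and then an ACK takes both ends to EST_ESTABLISHED, which corresponds to the three-way handshake.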

The most interesting functions during the establishment of a connection are those to initialize the sock structure, to set options before transmitting data, and to request a connection.

tcp_v4_init_sock()

net/ipv4/tcp_ipv4.c


This function runs various initialization actions: initialize queues and timers, initialize variables for slow start and maximum segment size, and set the appropriate state (TCP_CLOSE) and the pointer for PF_INET-specific routines.

tcp_setsockopt()

net/ipv4/tcp.c


This function sets the options selected by the service consumer for the TCP protocol: TCP_MAXSEG, TCP_NODELAY, TCP_CORK, TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT, TCP_SYNCNT, TCP_LINGER2, TCP_DEFER_ACCEPT, and TCP_WINDOW_CLAMP. The following options are important for the throughput of the TCP protocol:

  • TCP_MAXSEG: This option specifies the maximum segment length stored in the user_mss variable of the tcp_opt data structure.

  • TCP_NODELAY: This option deactivates the Nagle algorithm, which otherwise delays the sending of small segments, by setting a suitable value for the nonagle variable in the tcp_opt data structure. (See Section 24.4.3.)
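From the point of view of the service consumer, these two options are set with setsockopt() at level IPPROTO_TCP. The following is a minimal user-space sketch; the helper names open_tuned_tcp_socket() and tcp_nodelay_enabled() are hypothetical, and the segment size passed by the caller is an arbitrary example value.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a TCP socket with TCP_NODELAY set and, optionally, a cap on the
 * maximum segment size. mss == 0 leaves TCP_MAXSEG untouched.
 * Returns the descriptor, or -1 on error.
 */
int open_tuned_tcp_socket(int mss)
{
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* TCP_NODELAY: send small segments immediately. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) < 0)
        goto fail;

    /* TCP_MAXSEG has to be set before connect() to influence the SYN. */
    if (mss > 0 &&
        setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss)) < 0)
        goto fail;

    return fd;

fail:
    close(fd);
    return -1;
}

/* Read back TCP_NODELAY: 1 if set, 0 if clear, -1 on error. */
int tcp_nodelay_enabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

A caller would typically invoke open_tuned_tcp_socket() and then connect() on the returned descriptor; both options have to be in place before the connection is requested.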

tcp_connect()

net/ipv4/tcp_output.c


This function initializes an outgoing connection: It reserves memory for the data unit headers in the sk_buff structures, initializes the sliding-window variables, sets the maximum segment length (taking the service consumer options into account), sets the TCP header (including the SYN flag), sets the appropriate TCP state, initializes the timers and control variables for retransmission, and finally passes a copy of the initialized segment to the tcp_transmit_skb() routine to send and, subsequently, set the timer for retransmission of the connection-establishment segment.

Transition from CLOSED to SYN_SENT

To establish the connection, the client sends a packet with the SYN flag set, and then changes from the CLOSED state to the SYN_SENT state. This happens in the tcp_connect() method, which is invoked by tcp_v4_connect() (see Figure 24-6). The tcp_v4_connect() function is invoked when the client application calls connect() at the socket interface.

Figure 24-6. Transition to the SYN_SENT state.



tcp_connect() (net/ipv4/tcp_output.c) changes the state to SYN_SENT: tcp_set_state(sk, TCP_SYN_SENT);.

Transition from LISTEN to SYN_RECV

The server's TCP enters the LISTEN state when the server application calls listen() at the socket interface. When the TCP in the server receives a SYN segment in the LISTEN state, it changes to the SYN_RECV state. This happens in the tcp_create_openreq_child() function with the newsk->state = TCP_SYN_RECV; assignment. The left path in Figure 24-7 shows how this function is invoked. Subsequently, the tcp_rcv_state_process() function takes over further handling. This function uses the function pointer tcp->af_specific->conn_request() to invoke the tcp_v4_conn_request() function (net/ipv4/tcp_ipv4.c), which specifies the initial sequence number. Finally, the tcp_v4_send_synack() function is used to send a reply with the SYN and ACK flags set.

Figure 24-7. Transition from LISTEN to the SYN_RECEIVED and ESTABLISHED states.



Transition from SYN_SENT to ESTABLISHED

After it has received a packet with the SYN and ACK flags set, the client TCP sends an ACK to the server and changes from the SYN_SENT state into the ESTABLISHED state.

tcp_rcv_synsent_state_process()

net/ipv4/tcp_input.c


The appropriate part of this function checks on whether the ACK and SYN flags were set and then returns a packet with the ACK flag set. The TCP changes into the ESTABLISHED state:

    if (th->ack) {
        (...)
        if (!th->syn)
            goto discard;
        (...)
        tcp_set_state(sk, TCP_ESTABLISHED);
        (...)
        tcp_schedule_ack(tp);
        (...)
    }

Transition from SYN_SENT to SYN_RECEIVED

If the client TCP is in the SYN_SENT state and receives a packet with only the SYN flag set, it returns a packet with the SYN and ACK flags set to the server and changes into the SYN_RECEIVED state. This happens when both TCP protocol instances start establishing a connection simultaneously.

tcp_rcv_synsent_state_process()

net/ipv4/tcp_input.c


TCP changes into the SYN_RECEIVED state:

    if (th->syn) {
        tcp_set_state(sk, TCP_SYN_RECV);
        (...)
        tcp_send_synack(sk);
        (...)
    }

Transition from SYN_RECEIVED to ESTABLISHED

From the SYN_RECEIVED state, the server switches to the ESTABLISHED state as soon as it receives an ACK (ACK to SYN).

tcp_rcv_state_process()

net/ipv4/tcp_input.c


TCP changes into the ESTABLISHED state:

    if (th->ack) {
        switch(sk->state) {
        case TCP_SYN_RECV:
            (...)
            tcp_set_state(sk, TCP_ESTABLISHED);
        }
    }

Now the connection is established and the communication partners can exchange data.
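Seen from user space, the whole sequence of transitions is triggered by just three socket calls: listen() for the passive open, connect() for the active open, and accept() to pick up the established connection. The following sketch runs one complete handshake over the loopback interface. The function name loopback_handshake() is hypothetical, and the example assumes that the loopback interface is available.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Perform one complete handshake over the loopback interface.
 * Returns 0 if both ends obtain an established connection, -1 otherwise.
 */
int loopback_handshake(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd, cfd = -1, afd = -1, rc = -1;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a port */

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||               /* passive open: LISTEN */
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0)
        goto out;

    /* Active open: sends the SYN and blocks until the SYN-ACK arrives. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    /* Server side of the connection that the kernel has established. */
    afd = accept(lfd, NULL, NULL);
    if (afd >= 0)
        rc = 0;

out:
    if (afd >= 0)
        close(afd);
    if (cfd >= 0)
        close(cfd);
    close(lfd);
    return rc;
}
```

Note that connect() returns before accept() is called: the kernel completes the handshake on behalf of the listening socket and queues the established connection until the application accepts it.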

24.3.3 Tearing Down a Connection

A connection between two communicating partners can be terminated in either of two different ways: graceful close and abort.

  • Graceful close: The higher-layer protocols of both computers start tearing down the connection either simultaneously or consecutively. TCP monitors this process and ensures that the connection is not torn down until all data has been transmitted.

  • Abort: A higher-layer protocol forces the connection to be torn down. In this case, the process of tearing down a connection is not monitored, and so data can be lost.
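At the socket interface, the difference between the two variants can be observed from the peer's side. The sketch below is a user-space illustration under stated assumptions: a plain close() leads to a graceful close, whereas close() on a socket with SO_LINGER enabled and a linger time of zero aborts the connection with an RST. The helper names tcp_pair() and peer_view_of_close() are hypothetical, and the loopback interface is assumed to be available.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Build a connected TCP pair over loopback; returns 0 on success. */
static int tcp_pair(int fds[2])
{
    struct sockaddr_in a;
    socklen_t len = sizeof(a);
    int l = socket(AF_INET, SOCK_STREAM, 0);

    fds[0] = fds[1] = -1;
    if (l < 0)
        return -1;
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(l, (struct sockaddr *)&a, sizeof(a)) == 0 &&
        listen(l, 1) == 0 &&
        getsockname(l, (struct sockaddr *)&a, &len) == 0 &&
        (fds[0] = socket(AF_INET, SOCK_STREAM, 0)) >= 0 &&
        connect(fds[0], (struct sockaddr *)&a, sizeof(a)) == 0)
        fds[1] = accept(l, NULL, NULL);
    close(l);
    if (fds[1] < 0) {
        if (fds[0] >= 0)
            close(fds[0]);
        return -1;
    }
    return 0;
}

/*
 * Close one end of a fresh connection and report what the peer's recv()
 * sees: 0 for an orderly end of stream (FIN), or the errno value on error.
 * Returns -1 if the connection could not be set up.
 */
int peer_view_of_close(int abort_connection)
{
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };
    char buf[1];
    int fds[2], result;
    ssize_t n;

    if (tcp_pair(fds) < 0)
        return -1;
    if (abort_connection)
        setsockopt(fds[0], SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
    close(fds[0]);              /* sends FIN, or RST with zero linger time */

    n = recv(fds[1], buf, sizeof(buf), 0);
    result = (n == 0) ? 0 : errno;
    close(fds[1]);
    return result;
}
```

After a graceful close the peer reads an end of stream, while after an abort its recv() fails with ECONNRESET, so any data still in flight at that moment is lost.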

The state-transition diagram includes the following states for the connection-closing phase:

  • FIN_WAIT_1: The local TCP has initiated the connection teardown process and is waiting either for a FIN from the remote TCP or for the ACK to the FIN it sent.

  • FIN_WAIT_2: The local TCP has received the remote TCP's ACK to its FIN and is now waiting for the connection to be closed by the remote TCP (FIN).

  • CLOSING: Once a FIN has been sent and received, the local TCP waits for the final ACK.

  • TIME_WAIT: Once it has received the connection-closing ACK from the remote TCP, the local TCP has to wait until it is sure that the remote TCP has received the final ACK.

  • CLOSE_WAIT: A request to close the connection (FIN) has been received.

  • LAST_ACK: Having sent a FIN to acknowledge the connection teardown, the local TCP is now waiting for the final ACK.

  • CLOSED: The connection was closed.

Transition from ESTABLISHED to FIN_WAIT_1

Like the connection-establishment phase, the connection-teardown phase also uses a kind of three-way handshake. In this case, it is assumed that both communication partners are in the ESTABLISHED state. Specifically, a computer, A, initiates the connection closing by sending a packet with the FIN flag set to computer B and then switches to the FIN_WAIT_1 state. (See Figure 24-8.)

Figure 24-8. Transition from FIN_WAIT_1 to LAST_ACK.



tcp_close_state()

net/ipv4/tcp.c


This function switches the TCP to the next state, FIN_WAIT_1:

    /* ns: next state (FIN wait 1) */
    tcp_set_state(sk, ns);

Transition from ESTABLISHED to CLOSE_WAIT

When computer B receives the packet with the FIN flag set, it sends an ACK to computer A and switches from the ESTABLISHED state into the CLOSE_WAIT state. (See the left path in Figure 24-9.)

Figure 24-9. Transition from CLOSE_WAIT to FIN_WAIT_2 to TIME_WAIT and finally to CLOSING.



tcp_fin()

net/ipv4/tcp_input.c


This function is invoked if the FIN flag is set in the packet received, and TCP is switched to the CLOSE_WAIT state:

    switch(sk->state) {
    case TCP_SYN_RECV:
    case TCP_ESTABLISHED:
        /* Move to CLOSE_WAIT */
        tcp_set_state(sk, TCP_CLOSE_WAIT);
        if (th->rst)
            sk->shutdown = SHUTDOWN_MASK;
        break;
    (...)
    }

tcp_ack_snd_check() (net/ipv4/tcp_input.c) sends a packet with the ACK flag set: tcp_send_ack(sk);.

Transition from CLOSE_WAIT to LAST_ACK

While in the CLOSE_WAIT state, TCP tries to pass all data from the receive buffer to the higher-layer protocol as quickly as possible. TCP sends a FIN to the TCP in computer A only once the local application in computer B has no more data to send. This confirms the connection teardown process, and the TCP in computer B switches to the LAST_ACK state. (See Figure 24-8.)

tcp_close_state()

net/ipv4/tcp.c


TCP switches to the LAST_ACK state as follows:

    /* ns: next state (Last ACK) */
    tcp_set_state(sk, ns);

For the TCP in the FIN_WAIT_1 state in computer A, there are different ways to tear down the connection, depending on how computer B responds.

Transition from FIN_WAIT_1 to FIN_WAIT_2

If the application on computer B still has data to send after the FIN was received, computer B first acknowledges the FIN by sending an ACK and sends its own FIN only after all data has been sent. In this case, the TCP in computer A switches to the FIN_WAIT_2 state once it has received the ACK. (See the right-hand path in Figure 24-9.)

tcp_rcv_state_process()

net/ipv4/tcp_input.c


This switches TCP into the FIN_WAIT_2 state:

    if (th->ack) {
        switch(sk->state) {
        (...)
        case TCP_FIN_WAIT1:
            (...)
            tcp_set_state(sk, TCP_FIN_WAIT2);
            (...)
        (...)
        }
    }

Transition from FIN_WAIT_2 to TIME_WAIT

As soon as the TCP in the FIN_WAIT_2 state in computer A receives a FIN from computer B, it sends an ACK and switches into the TIME_WAIT state (right-hand path in Figure 24-9).

tcp_fin()

net/ipv4/tcp_input.c


    switch(sk->state) {
    (...)
    case TCP_FIN_WAIT2:
        /* Received a FIN - send ACK and enter TIME_WAIT */
        tcp_send_ack(sk);
        tcp_time_wait(sk, TCP_TIME_WAIT, 0);
        break;
    (...)
    }

Transition from FIN_WAIT_1 to TIME_WAIT

When computer A in the FIN_WAIT_1 state receives a segment that carries both the ACK to its own FIN and a FIN from computer B, it sends an ACK to acknowledge the connection-closing process and ends up in the TIME_WAIT state, passing through the FIN_WAIT_2 state on the way.

tcp_rcv_state_process()

net/ipv4/tcp_input.c


This function causes TCP to switch into the FIN_WAIT_2 state. If the FIN flag was set, then the transition from FIN_WAIT_2 to TIME_WAIT described above is initiated:

    switch(sk->state) {
    (...)
    case TCP_FIN_WAIT1:
        (...)
        tcp_set_state(sk, TCP_FIN_WAIT2);
        (...)
        sk->state_change(sk);
    (...)
    }

Immediately afterwards, the function tcp_rcv_state_process() uses tcp_data_queue() to invoke the tcp_fin() function, if the FIN flag was also set. (See Figure 24-9.) Next, an ACK is sent from within the tcp_fin() function, and the state is moved to TIME_WAIT:

    switch(sk->state) {
    (...)
    case TCP_FIN_WAIT2:
        /* Received a FIN -- send ACK and enter TIME_WAIT. */
        tcp_send_ack(sk);
        tcp_time_wait(sk, TCP_TIME_WAIT, 0);
        break;
    (...)
    }

Transition from FIN_WAIT_1 to CLOSING

If computer A in the FIN_WAIT_1 state receives a FIN before its own FIN has been acknowledged, it switches into the CLOSING state. This is the case when both sides close the connection simultaneously. (See Figure 24-9.)

tcp_fin()

net/ipv4/tcp_input.c


This function switches the TCP into the CLOSING state and sends a packet with the ACK flag set:

    switch(sk->state) {
    (...)
    case TCP_FIN_WAIT1:
        tcp_send_ack(sk);
        tcp_set_state(sk, TCP_CLOSING);
    (...)
    }

Transition from CLOSING to TIME_WAIT

While in the CLOSING state, TCP waits until it receives an ACK. Subsequently, it switches into the TIME_WAIT state. (See Figure 24-9.)

tcp_rcv_state_process()

net/ipv4/tcp_input.c


    if (th->ack) {
        switch(sk->state) {
        (...)
        case TCP_CLOSING:
            (...)
            tcp_time_wait(sk, TCP_TIME_WAIT, 0);
            (...)
        (...)
        }
    }

The TIME_WAIT State

The three different ways to tear down a connection all converge in the TIME_WAIT state. Computer A has to wait a specific period of time (twice the maximum segment lifetime, MSL) before the connection is finally closed.
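The teardown transitions described in this section can be summarized in one state function. The following is a hypothetical user-space model, not kernel code: the names td_state, td_event, and td_next() are assumptions made for illustration, EV_RCV_ACK stands for the ACK to the locally sent FIN, and EV_RCV_FIN_ACK stands for a segment that carries that ACK together with the peer's FIN.

```c
/* Hypothetical model of the connection-teardown states. */
enum td_state {
    TD_ESTABLISHED, TD_FIN_WAIT_1, TD_FIN_WAIT_2, TD_CLOSING,
    TD_TIME_WAIT, TD_CLOSE_WAIT, TD_LAST_ACK, TD_CLOSED
};

/* Events: the local application closes, a segment arrives, or 2 MSL expire. */
enum td_event { EV_CLOSE, EV_RCV_ACK, EV_RCV_FIN, EV_RCV_FIN_ACK, EV_TIMEOUT };

enum td_state td_next(enum td_state cur, enum td_event ev)
{
    switch (cur) {
    case TD_ESTABLISHED:
        if (ev == EV_CLOSE)       return TD_FIN_WAIT_1;  /* we send a FIN */
        if (ev == EV_RCV_FIN)     return TD_CLOSE_WAIT;  /* peer closes first */
        break;
    case TD_FIN_WAIT_1:
        if (ev == EV_RCV_ACK)     return TD_FIN_WAIT_2;
        if (ev == EV_RCV_FIN)     return TD_CLOSING;     /* simultaneous close */
        if (ev == EV_RCV_FIN_ACK) return TD_TIME_WAIT;
        break;
    case TD_FIN_WAIT_2:
        if (ev == EV_RCV_FIN)     return TD_TIME_WAIT;
        break;
    case TD_CLOSING:
        if (ev == EV_RCV_ACK)     return TD_TIME_WAIT;
        break;
    case TD_CLOSE_WAIT:
        if (ev == EV_CLOSE)       return TD_LAST_ACK;    /* we send our FIN */
        break;
    case TD_LAST_ACK:
        if (ev == EV_RCV_ACK)     return TD_CLOSED;
        break;
    case TD_TIME_WAIT:
        if (ev == EV_TIMEOUT)     return TD_CLOSED;
        break;
    default:
        break;
    }
    return cur;                 /* event is not relevant in this state */
}
```

Starting from TD_FIN_WAIT_1, the event sequences (EV_RCV_ACK, EV_RCV_FIN), (EV_RCV_FIN, EV_RCV_ACK), and (EV_RCV_FIN_ACK) all end in TD_TIME_WAIT, which mirrors the three paths of the active closer.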

Additional functions are required to handle the TIME_WAIT state, and the special cases that can occur in this state, correctly.

tcp_time_wait()

net/ipv4/tcp_minisocks.c


This function activates the TIME_WAIT state by initializing the tcp_tw_bucket structure and entering it in a hash table. As described earlier, this function is invoked by tcp_fin() when a connection changes to the TIME_WAIT state.

__tcp_tw_hashdance()

net/ipv4/tcp_minisocks.c


This function is a helper function of the tcp_time_wait() function. It adds the tcp_tw_bucket structure to a hash table.

tcp_timewait_kill()

net/ipv4/tcp_minisocks.c


This function deletes a connection, or its representation in the form of a tcp_tw_bucket structure, from the hash table for established connections.
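The interplay of entering a connection in the hash table, finding it again when a packet arrives, and deleting it once the waiting period is over can be sketched in user space as follows. This is a strongly reduced, hypothetical model: struct tw_bucket here holds only the connection identity, and the names tw_insert(), tw_lookup(), tw_kill(), the hash function, and the table size are assumptions that do not correspond to the kernel's implementation.

```c
#include <stdint.h>
#include <stdlib.h>

#define TW_HASH_SIZE 64u        /* hypothetical size; must be a power of two */

/* Reduced stand-in for tcp_tw_bucket: only the connection identity. */
struct tw_bucket {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    struct tw_bucket *next;     /* chain within one hash slot */
};

static struct tw_bucket *tw_table[TW_HASH_SIZE];

static unsigned tw_hash(uint32_t saddr, uint16_t sport,
                        uint32_t daddr, uint16_t dport)
{
    uint32_t h = saddr ^ daddr ^ (((uint32_t)sport << 16) | dport);

    h ^= h >> 16;
    h ^= h >> 8;
    return h & (TW_HASH_SIZE - 1u);
}

/* Enter a connection into the table; returns NULL if out of memory. */
struct tw_bucket *tw_insert(uint32_t saddr, uint16_t sport,
                            uint32_t daddr, uint16_t dport)
{
    struct tw_bucket *tw = malloc(sizeof(*tw));
    unsigned slot = tw_hash(saddr, sport, daddr, dport);

    if (tw == NULL)
        return NULL;
    tw->saddr = saddr;
    tw->sport = sport;
    tw->daddr = daddr;
    tw->dport = dport;
    tw->next = tw_table[slot];
    tw_table[slot] = tw;
    return tw;
}

/* Find the bucket for an incoming packet's address/port tuple. */
struct tw_bucket *tw_lookup(uint32_t saddr, uint16_t sport,
                            uint32_t daddr, uint16_t dport)
{
    struct tw_bucket *tw = tw_table[tw_hash(saddr, sport, daddr, dport)];

    for (; tw != NULL; tw = tw->next)
        if (tw->saddr == saddr && tw->sport == sport &&
            tw->daddr == daddr && tw->dport == dport)
            return tw;
    return NULL;
}

/* Unlink a bucket from the table and free it; returns 1 if it was found. */
int tw_kill(struct tw_bucket *victim)
{
    struct tw_bucket **pp = &tw_table[tw_hash(victim->saddr, victim->sport,
                                              victim->daddr, victim->dport)];

    for (; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == victim) {
            *pp = victim->next;
            free(victim);
            return 1;
        }
    }
    return 0;
}
```

Keeping only a small bucket per connection instead of a full socket structure is what makes it affordable to hold many connections in the TIME_WAIT state at once.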

The receipt of packets for a connection currently in the TIME_WAIT state is separated out in the function tcp_v4_rcv() and handed to the function tcp_timewait_state_process() for further handling.

tcp_timewait_state_process()

net/ipv4/tcp_minisocks.c


This function processes a packet in the TIME_WAIT state, a state in which packets can be received only under specific conditions. Specifically, in relation to the tcp_v4_rcv() function, the following three cases are handled: the receipt of a SYN packet, of a SYN-ACK packet, and of an RST packet. The connection is reestablished when a SYN packet arrives under certain conditions. Also, an ACK is sent as a response to a SYN-ACK packet under certain conditions. In contrast, an RST packet is normally sent as a response to an RST packet.


       


    Linux Network Architecture
    ISBN: 131777203
    EAN: N/A
    Year: 2004
    Pages: 187
