Section 8.5. TCP Congestion Control | Computer and Communication Networks (paperback)

8.5. TCP Congestion Control

TCP uses a form of end-to-end flow control. In TCP, when a sender sends a packet, the receiver acknowledges receipt of the packet. A sending source can use the acknowledgment arrival rate as a measure of network congestion. When it successfully receives an acknowledgment, a sender knows that the packet reached its desired destination. The sender can then send new packets on the network. Both the sender and the receiver agree on a common window size for packet flow. The window size represents the number of bytes that the source can send at a time. The window size varies according to the condition of traffic in the network to avoid congestion. Generally, a file of size $f$ with a total transfer time of $\Delta$ on a TCP connection results in a TCP transfer throughput denoted by $r$ and obtained from

Equation 8.1

$$r = \frac{f}{\Delta}$$

We can also derive the bandwidth utilization, $\rho_u$, assuming that the link bandwidth is $B$, by

Equation 8.2

$$\rho_u = \frac{r}{B}$$
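As a quick numerical illustration, the sketch below takes throughput to be file size divided by total transfer time, and utilization to be throughput divided by link bandwidth. The function names and the sample figures (a 1-MB file, a 2-second transfer, a 10-Mb/s link) are assumptions chosen only for the example.

```python
def tcp_throughput(file_size_bits: float, transfer_time_s: float) -> float:
    """Throughput r: file size f divided by total transfer time."""
    if transfer_time_s <= 0:
        raise ValueError("transfer time must be positive")
    return file_size_bits / transfer_time_s


def bandwidth_utilization(throughput_bps: float, link_bandwidth_bps: float) -> float:
    """Utilization: throughput r divided by link bandwidth B."""
    if link_bandwidth_bps <= 0:
        raise ValueError("link bandwidth must be positive")
    return throughput_bps / link_bandwidth_bps


# Hypothetical figures, chosen only for illustration.
f = 8_000_000        # file size: 1 MB expressed in bits
delta = 2.0          # total transfer time in seconds
B = 10_000_000       # link bandwidth: 10 Mb/s

r = tcp_throughput(f, delta)            # 4 Mb/s
rho_u = bandwidth_utilization(r, B)     # 40 percent of the link
```

With these figures, the connection moves 4 Mb/s and uses 40 percent of the link.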

TCP has three congestion-control methods: additive increase, slow start, and fast retransmit. The following subsections describe these three mechanisms, which are sometimes combined to form the TCP congestion-control scheme.

8.5.1. Additive Increase, Multiplicative Decrease Control

An important variable in TCP congestion control is the value of the congestion window. Each connection has a congestion window size, $\omega_g$. The congestion window represents the amount of data, in bytes, that a sending source is allowed to have in transit at a particular instant of time in a network. Additive increase, multiplicative decrease control performs a slow increase in the congestion window size when the congestion in the network decreases and a fast drop in the window size when congestion increases. Let $\omega_m$ be the maximum window size, in bytes, representing the maximum amount of unacknowledged data that a sender is allowed to send. Let $\omega_a$ be the advertised window sent by the receiver, based on its buffer size. Thus,

Equation 8.3

$$\omega_m = \min(\omega_g, \omega_a)$$

By having $\omega_m$ replace $\omega_a$, a TCP source is not permitted to transmit faster than the network or the destination can accept. The challenge in TCP congestion control is for the source node to find the right value for the congestion window. The congestion window size varies, based on the traffic conditions in the network. TCP watches for a timeout as a sign of congestion. Timeouts, like acknowledgments, can be used as feedback to find the best size for the congestion window. This caution is warranted because the consequences of too large a window are much worse than those of too small a window. This TCP technique requires that the timeout values be set properly. Two important factors in setting timeouts follow.

Timeouts are set based on the average round-trip time (RTT) and the RTT standard deviation.
The RTT is sampled once per completed round trip.
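One common way to turn these two factors into a timeout value is the smoothed estimator standardized in RFC 6298, sketched below. It is not taken from this text: the class name is hypothetical, the estimator tracks a smoothed mean deviation rather than a true standard deviation, and the clock-granularity term and minimum-timeout clamp of the RFC are omitted for brevity.

```python
class RttEstimator:
    """Smoothed RTT and deviation estimator in the style of RFC 6298 (simplified)."""

    def __init__(self, alpha: float = 0.125, beta: float = 0.25, k: float = 4.0):
        self.alpha = alpha      # gain for the smoothed RTT
        self.beta = beta        # gain for the RTT deviation
        self.k = k              # deviation multiplier in the timeout
        self.srtt = None        # smoothed RTT, seconds
        self.rttvar = None      # smoothed mean deviation, seconds

    def update(self, sample: float) -> float:
        """Feed one RTT sample (seconds) and return the new timeout."""
        if self.srtt is None:
            # First measurement initializes both estimates.
            self.srtt = sample
            self.rttvar = sample / 2.0
        else:
            # The deviation is updated first, using the old smoothed RTT.
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - sample)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        return self.timeout()

    def timeout(self) -> float:
        """Timeout = smoothed RTT plus k times the deviation."""
        return self.srtt + self.k * self.rttvar


estimator = RttEstimator()
first_timeout = estimator.update(0.100)    # one sample per round trip
second_timeout = estimator.update(0.200)   # a slower round trip raises the timeout
```

Because the deviation term is multiplied by four, a connection with highly variable RTTs gets a noticeably longer timeout than a stable connection with the same average RTT.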

Figure 8.6 depicts the additive-increase method. The congestion window is interpreted in terms of packets rather than bytes. Initially, the source congestion window is set to one packet. Once it receives an acknowledgment for the packet, the source increments its congestion window by one packet. So the source transmits two packets at that point. On successful receipt of both acknowledgments, the source once again increments the congestion window by one packet (additive increase).

Figure 8.6. Additive increase control for TCP congestion control

In practice, the source increments its congestion window by a small amount for each acknowledgment instead of waiting for both acknowledgments to arrive. If a timeout occurs, the source assumes that congestion is developing and therefore sets the congestion window size to half its previous value (multiplicative decrease). The minimum congestion window size is one maximum segment size (MSS), which represents one packet. In general, a TCP segment is defined as a TCP session packet containing part of a TCP bytestream in transit.
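The behavior described above can be sketched as a small simulation. The function name and the event labels `"rtt"` and `"timeout"` are hypothetical; the window is counted in packets, grows by one packet per fully acknowledged round trip, and is halved on a timeout, never dropping below one packet.

```python
def aimd_trace(events, initial_window: float = 1.0, min_window: float = 1.0):
    """Return the congestion window (in packets) after each event.

    events: iterable of "rtt" (a round trip in which every packet was
    acknowledged) or "timeout" (a loss signaled by a timeout).
    """
    window = initial_window
    trace = []
    for event in events:
        if event == "rtt":
            window += 1.0                               # additive increase
        elif event == "timeout":
            window = max(min_window, window / 2.0)      # multiplicative decrease
        else:
            raise ValueError(f"unknown event: {event!r}")
        trace.append(window)
    return trace


# Seven clean round trips, one timeout, then two more clean round trips.
trace = aimd_trace(["rtt"] * 7 + ["timeout"] + ["rtt"] * 2)
```

In this run the window climbs from 2 to 8 packets, falls to 4 on the timeout, and then resumes its one-packet-per-round-trip climb.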

8.5.2. Slow Start Method

Additive increase is ideally suited to a network operating near capacity. At the start of a connection, however, it would take a considerable amount of time to grow the congestion window one packet at a time. The slow-start method increases the congestion window size nonlinearly, and in most cases exponentially, in contrast to the linear growth of additive increase. Figure 8.7 shows the slow-start mechanism. In this case, the congestion window is again interpreted in packets instead of bytes.

Figure 8.7. Slow-start timing between a source and a destination

A source initially sets the congestion window to one packet. When its corresponding acknowledgment arrives, the source sets the congestion window to two packets. Now, the source sends two packets. On receiving the two corresponding acknowledgments, TCP sets the congestion window size to four packets. Thus, the number of packets in transit doubles for each round-trip time. This nonlinear trend of increase in the window size continues, as seen in the figure. With this method of congestion control, routers on a path may not be able to service the flow of traffic, as the volume of packets increases nonlinearly. This congestion-control scheme by itself may therefore lead to a new type of congestion. The slow-start method is normally used

just after a TCP connection is set up or
when a source is blocked, waiting for a timeout.
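The doubling per round trip follows from adding one packet to the window for every acknowledgment received. The minimal sketch below assumes that no packets are lost and that every packet in a window is acknowledged within the same round trip; the function name is hypothetical.

```python
def slow_start_windows(round_trips: int, initial_window: int = 1):
    """Congestion window (in packets) at the start of each round trip."""
    window = initial_window
    sizes = [window]
    for _ in range(round_trips):
        acks_received = window          # every packet sent is acknowledged
        for _ in range(acks_received):
            window += 1                 # one extra packet per acknowledgment
        sizes.append(window)
    return sizes


windows = slow_start_windows(4)
```

Starting from one packet, four loss-free round trips yield windows of 1, 2, 4, 8, and 16 packets.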

A new variable, the congestion threshold, is defined. This variable saves a value derived from the congestion window at the moment a timeout occurs: when a timeout occurs, the threshold is set to half the congestion window size. Then the congestion window is reset to one packet and ramped up all the way to the congestion threshold, using the slow-start method. Consider a typical sequence. Once the connection is established, a burst of packets is sent during slow start. Then, a number of packets are lost, and the source stalls, since the acknowledgments it is waiting for do not arrive. Finally, a timeout occurs, and the congestion window is reduced. Thus, a timeout results in the reduction of the congestion window, as in the previous scheme. Both the congestion threshold and the congestion window are reset, and slow start is used again to increase the congestion window size exponentially.

After the congestion threshold is reached, additive increase is used. At this point, packets may be lost for a while, and the source again stalls without receiving acknowledgments. Then, a timeout occurs, and the congestion window size is reduced. The congestion threshold is reset, and the congestion window is reset to one packet. Now, the source uses slow start to ramp up to the congestion threshold, after which additive increase is used. This pattern continues, creating a pulse-type plot. The reason for the large packet loss initially with slow start is that it is more aggressive in the beginning in order to learn about the network. This may result in a few packet losses, but it seems to be better than the conservative approach, in which the throughput is very small.
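The interplay of slow start, the congestion threshold, and additive increase can be sketched as follows. The function name, the set of round trips in which a timeout fires, and the initial threshold are all assumptions made for illustration; the text does not specify an initial threshold value.

```python
def window_trace(round_trips: int, timeout_rounds, initial_threshold: int = 8):
    """Congestion window (in packets) used in each round trip.

    timeout_rounds: indices of the round trips that end in a timeout.
    """
    window = 1
    threshold = initial_threshold
    trace = []
    for rtt in range(round_trips):
        trace.append(window)
        if rtt in timeout_rounds:
            threshold = max(window // 2, 1)     # half the window at the timeout
            window = 1                          # restart from one packet
        elif window < threshold:
            window = min(window * 2, threshold) # slow start, capped at threshold
        else:
            window += 1                         # additive increase
    return trace


trace = window_trace(12, timeout_rounds={6}, initial_threshold=8)
```

The trace rises quickly to the threshold, creeps upward by one packet per round trip, collapses to one packet at the timeout, and then repeats the cycle with the new, lower threshold, which produces the pulse-type pattern.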

8.5.3. Fast Retransmit Method

Fast retransmit is based on the concept of the duplicate acknowledgment (ACK). The additive-increase and slow-start mechanisms have idle periods, during which the source is blocked, waiting for an acknowledgment or a timeout. Fast retransmit sometimes triggers retransmission of a lost packet before the associated timeout expires.

Each time it receives an out-of-order packet, the destination should respond with a duplicate ACK of the last successful in-order packet that has arrived. This must be done even if the destination has already acknowledged the packet. Figure 8.8 illustrates the process. The first three packets are transmitted, and their acknowledgments are received.

Figure 8.8. Timing of retransmit method between a source and a destination

Now, assume that packet 4 is lost. Since it is not aware of this lost packet, the source continues to transmit packet 5 and beyond. However, the destination sends the duplicate acknowledgment of packet 3 to let the source know that it has not received packet 4. In practice, once the source receives three duplicate acknowledgments, it retransmits the lost packet. Fast recovery is another improvement to TCP congestion control. When congestion occurs, instead of dropping the congestion window to one packet, the congestion window size is dropped to half, and additive increase is used. Thus, the slow-start method is used only during the initial connection phase and when a timeout occurs. Otherwise, additive increase is used.
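The duplicate-acknowledgment rule can be sketched from the source's point of view. The function below is a hypothetical helper: it takes the stream of cumulative ACK numbers arriving at the source, where each ACK names the last in-order packet received, and reports which packets would be retransmitted once three duplicates have been seen.

```python
def packets_to_retransmit(acks, duplicate_threshold: int = 3):
    """Return the packet numbers that fast retransmit would resend."""
    retransmitted = []
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == duplicate_threshold:
                # The packet after the last in-order one is presumed lost.
                retransmitted.append(ack + 1)
        else:
            last_ack = ack
            duplicates = 0
    return retransmitted


# Packets 1-3 are acknowledged, packet 4 is lost, and packets 5-7 each
# trigger a duplicate ACK for packet 3. After the retransmission of
# packet 4, the destination acknowledges everything up to packet 7.
resent = packets_to_retransmit([1, 2, 3, 3, 3, 3, 7])
```

Here the third duplicate ACK for packet 3 causes packet 4 to be resent without waiting for its timeout.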

8.5.4. TCP Congestion Avoidance Methods

Network congestion is a traffic bottleneck between a source and a destination. Congestion avoidance uses precautionary algorithms to avoid possible congestion in a network. Otherwise, TCP congestion control is applied once congestion occurs in a network. TCP increases the traffic rate to a point where congestion occurs and then gradually reduces the rate. It would be better if congestion could be avoided. This would involve sending some precautionary information to the source just before packets are discarded. The source would then reduce its sending rate, and congestion could be avoided to some extent.

Source-Based Congestion Avoidance

Source-based congestion avoidance detects congestion early at the end hosts. An end host estimates congestion in the network by using the round-trip time and throughput that it measures. An increase in round-trip time can indicate that routers' queues on the selected routing path are growing and that congestion may happen. The source-based schemes can be classified into four basic algorithms:

Use of round-trip time (RTT) as a measure of congestion in the network. As queues in the routers build up, the RTT for each new packet sent out on the network increases. If the current RTT is greater than the average of the minimum and maximum RTTs measured so far, the congestion window size is reduced.
Use of RTT and window size to set the current window size. Let $\omega$ be the current window size, $\omega_o$ be the old window size, $r$ be the current RTT, and $r_o$ be the old RTT. A window-RTT product is computed as $(\omega - \omega_o)(r - r_o)$. If the product is positive, the source decreases the window size by a fraction of its old value. If the product is negative or 0, the source increases the window size by one packet.
Use of throughput as a measure to avoid congestion. During every RTT, a source increases the window size by one packet. The achieved throughput is then compared with the throughput when the window size was one packet smaller. If the difference is less than half the throughput at the beginning of the connection when the window size was one packet, the window size is reduced by one packet.
Use of throughput as a measure to avoid congestion, this time with two parameters: the algorithm compares the current throughput with the expected throughput.
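The first two algorithms in the list reduce to short decision rules, sketched below. The function names are hypothetical, and the decrease fraction of one-eighth is an assumed parameter, since the text says only "a fraction" of the old window.

```python
def rtt_signals_congestion(current_rtt: float, min_rtt: float, max_rtt: float) -> bool:
    """First algorithm: congestion is suspected when the current RTT exceeds
    the average of the minimum and maximum RTTs measured so far."""
    return current_rtt > (min_rtt + max_rtt) / 2.0


def window_rtt_update(window: float, old_window: float,
                      rtt: float, old_rtt: float,
                      decrease_fraction: float = 0.125) -> float:
    """Second algorithm: adjust the window from the sign of the
    window-RTT product (window - old_window) * (rtt - old_rtt)."""
    product = (window - old_window) * (rtt - old_rtt)
    if product > 0:
        # Window and RTT moved in the same direction: back off by an
        # assumed fraction of the old window, but keep at least one packet.
        return max(1.0, window - decrease_fraction * old_window)
    return window + 1.0     # product negative or zero: probe for more


suspected = rtt_signals_congestion(0.15, min_rtt=0.10, max_rtt=0.18)
shrunk = window_rtt_update(10, 8, rtt=0.12, old_rtt=0.10)
grown = window_rtt_update(10, 8, rtt=0.10, old_rtt=0.12)
```

In the second rule, a window that grew while the RTT also grew is taken as a sign that the extra packets are only filling router queues, so the window is cut back.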

TCP normalized is presented next as an example of the fourth algorithm.

TCP Normalized Method

In the TCP normalized method, the congestion window size is increased in the first few seconds, but the throughput remains constant, because the capacity of the network has been reached, resulting in an increase in the queue length at the router. Thus, an increase in the window size does not result in any increase in the throughput. This traffic over and above the available bandwidth of the network is called extra data. The idea behind TCP normalized is to maintain this extra data at a nominal level. Too much extra data may lead to longer delays and congestion. Too little extra data may lead to an underutilization of resources, because the available bandwidth changes owing to the bursty nature of Internet traffic. The algorithm defines the expected value of the rate $E[r]$ as

Equation 8.4

$$E[r] = \frac{\omega_g}{r_m}$$

where $r_m$ is the minimum of all the measured round-trip times, and $\omega_g$ is the congestion window size. We define $A_r$ as the actual rate and $(E[r] - A_r)$ as the rate difference. We also denote the maximum and minimum thresholds by $\rho_{max}$ and $\rho_{min}$, respectively. When the rate difference is very small, less than $\rho_{min}$, the method increases the congestion window size to keep the amount of extra data at a nominal level. If the rate difference is between $\rho_{min}$ and $\rho_{max}$, the congestion window size is unaltered. When the rate difference is greater than $\rho_{max}$, there is too much extra data, and the congestion window size is reduced. The decrease in the congestion window size is linear. The TCP normalized method attempts to maintain the traffic flow such that the difference between expected and actual rates lies in this range.
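The decision rule of the TCP normalized method can be sketched as follows. The expected rate is taken as the congestion window divided by the minimum measured round-trip time. The function and parameter names are hypothetical, the window is counted in packets and rates in packets per second, and the one-packet step in each direction is an assumption, since the text states only that the decrease is linear.

```python
def normalized_window_update(window: int, min_rtt: float, actual_rate: float,
                             min_threshold: float, max_threshold: float) -> int:
    """Return the next congestion window size, in packets."""
    expected_rate = window / min_rtt              # expected rate E[r]
    rate_difference = expected_rate - actual_rate
    if rate_difference < min_threshold:
        return window + 1                         # too little extra data: grow
    if rate_difference > max_threshold:
        return max(1, window - 1)                 # too much extra data: shrink
    return window                                 # within range: leave unchanged


# A 10-packet window with a 0.1-second minimum RTT gives an expected
# rate of 100 packets per second; thresholds are hypothetical.
grow = normalized_window_update(10, 0.1, actual_rate=95.0, min_threshold=10.0, max_threshold=30.0)
hold = normalized_window_update(10, 0.1, actual_rate=80.0, min_threshold=10.0, max_threshold=30.0)
shrink = normalized_window_update(10, 0.1, actual_rate=60.0, min_threshold=10.0, max_threshold=30.0)
```

The three calls cover the three cases: a rate difference of 5 grows the window to 11 packets, a difference of 20 leaves it at 10, and a difference of 40 shrinks it to 9.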