TCP is a reliable transport layer protocol that offers a full-duplex, connection-oriented byte stream service. The robustness of TCP makes it appropriate for wide area IP networks, where there is a higher chance of packet loss or reordering. What really complicates TCP are the flow control and congestion control mechanisms. These mechanisms often interfere with each other, so proper tuning is critical for high-performance networks. We start by explaining the TCP state machine, then describe in detail how to tune TCP, depending on the actual deployment. We also describe how to scale the TCP connection-handling capacity of servers by increasing the size of TCP connection state data structures.

FIGURE 3-8 presents an alternative view of the TCP state engine.

Figure 3-8. TCP State Engine Server and Client Node

This figure shows the server and client socket API at the top and the TCP module with the following three main states:

Connection Setup: This includes the collection of substates that collectively set up the socket connection between the two peer nodes. In this phase, the set of tunable parameters includes:
For a server, there are two trade-offs to consider:
Connection Established: This includes the main data transfer state (the focus of our tuning explanations in this chapter). The tuning parameters for congestion control, latency, and flow control will be described in more detail. FIGURE 3-8 shows two concurrent processes that read and write to the bidirectional full-duplex socket connection.

Connection Shutdown: This includes the set of substates that work together to shut down the connection in an orderly fashion. We will see important tuning parameters related to memory. Tunable parameters include:
Note: close_wait is no longer a tunable parameter. Instead, use tcp_time_wait_interval.

TCP Tuning on the Sender Side

TCP tuning on the sender side controls how much data is injected into the network and the remote client end. Several schemes operate concurrently, which complicates tuning. To make them easier to understand, we first separate the various components and then describe how these mechanisms work together. We describe two phases: Startup and Steady State. Startup Phase tuning is concerned with how fast we can ramp up sending packets into the network. Steady State Phase tuning is concerned with other facets of TCP communication, such as tuning timers, maximum window sizes, and so on.

Startup Phase

In Startup Phase tuning, we describe how the TCP sender initially starts to send data on a particular connection. One of the issues with a new connection is that there is no information about the capabilities of the network pipe. So we start by blindly injecting packets at a faster and faster rate until we understand the capabilities and adjust accordingly. Manual TCP tuning is required to change macro behavior, such as when we have very slow pipes, as in wireless, or very fast pipes, such as 10 Gbit/sec. Sending an initial maximum burst has proven disastrous. It is better to slowly increase the rate at which traffic is injected based on how well the traffic is absorbed. This is similar to starting from a standstill on ice. If we initially floor the gas pedal, we will skid, and then it is hard to move at all. If, on the other hand, we start slowly and gradually increase speed, we can eventually reach a very fast speed. In networking, the key concept is that we do not want to fill buffers. We want to inject traffic as close as possible to the rate at which the network and target receiver can service the incoming traffic. During this phase, the congestion window is much smaller than the receive window.
This means the sender controls the traffic injected into the receiver by computing the congestion window and capping the amount of injected traffic at the size of the congestion window. Any minor bursts can be absorbed by queues. FIGURE 3-9 shows what happens during a typical TCP session starting from idle.

Figure 3-9. TCP Startup Phase

The sender does not know the capacity of the network, so it slowly sends more and more packets into the network, trying to estimate the state of the network by measuring the arrival times of the ACKs and the computed round-trip times (RTTs). This results in a self-clocking effect. In FIGURE 3-9, we see that the congestion window initially starts at its minimum size, one maximum segment size (MSS), as negotiated in the three-way handshake during the socket connection phase. The congestion window then grows by one MSS for each ACK that is returned within the timeout, which doubles it roughly every round trip. This growth continues until the window reaches the cap set by the TCP tunable variable tcp_cwnd_max or until a timeout occurs. At that point, the ssthresh internal variable is set to half of tcp_cwnd_max. After a retransmit, ssthresh marks the point up to which the congestion window grows exponentially; beyond it, the window grows additively, as shown in FIGURE 3-9. Once a timeout occurs, the packet is retransmitted and the cycle repeats.

FIGURE 3-9 shows that there are three important TCP tunable parameters:
In different types of networks, you can tune these values slightly to change the rate at which you can ramp up. If you have a small network pipe, you want to reduce the packet flow, whereas if you have a large pipe, you can fill it up faster and inject packets more aggressively.

Steady State Phase

In the Steady State Phase, after the connection has stabilized and completed the initial startup phase, the socket connection reaches a phase that is fairly steady, and tuning is limited to reducing delays due to network and client congestion. Tuning must be based on average conditions, because there are always some fluctuations in the network and client data that can be absorbed. When tuning TCP in this phase, we look at the following network properties:
In short, tuning is adjusted according to the type of network and its key properties: propagation delay, link speed, and error rate. In some instances, TCP adjusts to these properties on its own by measuring the return of acknowledgments. We will look at various emerging network technologies (optical WAN, LAN, wireless, and so on) and describe how to tune TCP accordingly.
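
To make the interplay between link speed, propagation delay, and the startup phase more concrete, the following is a minimal Python sketch, not the actual TCP implementation. It assumes a loss-free path, a congestion window that doubles once per round trip up to ssthresh and then grows by one MSS per round trip, and a hard cap that stands in for tcp_cwnd_max. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def bandwidth_delay_product(link_bps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep the pipe full.

    bits/sec * msec / 1000 gives bits; dividing by 8 gives bytes.
    """
    return link_bps * rtt_ms // 8000


def simulate_startup(mss: int, ssthresh: int, cwnd_max: int,
                     rounds: int) -> List[int]:
    """Congestion window (bytes) at the start of each round trip.

    Simplified model (an assumption, not real stack behavior):
    no loss, exponential growth below ssthresh, additive growth
    at or above it, and a hard cap at cwnd_max.
    """
    cwnd = mss
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd < ssthresh:
            # Exponential phase: the window doubles each round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Additive phase: one more MSS each round trip.
            cwnd += mss
        cwnd = min(cwnd, cwnd_max)
    return history


def rtts_to_reach(target: int, history: List[int]) -> Optional[int]:
    """Index of the first round trip whose window is >= target, if any."""
    for i, window in enumerate(history):
        if window >= target:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical path: 100 Mbit/sec link with a 50 ms round-trip time.
    bdp = bandwidth_delay_product(100_000_000, 50)
    windows = simulate_startup(mss=1460, ssthresh=65536,
                               cwnd_max=1048576, rounds=600)
    print("window needed to fill the pipe (bytes):", bdp)
    print("round trips until the window covers it:",
          rtts_to_reach(bdp, windows))
```

The bandwidth-delay product is the amount of data that must be in flight to keep the link busy, so a faster link or a longer propagation delay calls for a larger maximum window, and under this model the number of round trips spent ramping up grows accordingly.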