Exceptional Conditions during TCP Communications
The TCP protocol is tremendously robust in the face of poor network conditions. It can survive slow connections, flaky routers, intermittent network outages, and a variety of misconfigurations, and still manage to deliver a consistent, error-free data stream.
However, TCP cannot overcome all problems. This section discusses the common exceptions, as well as some common programming errors.
Exceptions during connect()
Various errors are common during calls to connect().
The remote host is up, but no server is listening when the client tries to connect.
A client tries to connect to a remote host, but no server is listening on the indicated port. The connect() function aborts with a "Connection refused" (ECONNREFUSED) error.
The remote host is down when the client tries to connect.
A client tries to connect to a remote host, but the host is not running (it has crashed or is unreachable). In this case, connect() blocks until it times out with a "Connection timed out" (ETIMEDOUT) error. TCP is tolerant of slow network connections, so the timeout might not occur for many minutes.
The network is misconfigured.
A client tries to connect to a remote host, but the operating system can't figure out how to route the message to the desired destination, because of a local misconfiguration or the failure of a router somewhere along the way. In this case, connect() fails with a "Network is unreachable" (ENETUNREACH) error.
There is a programmer error.
Various errors are due to common programming mistakes. For example, an attempt to call connect() with a filehandle rather than a socket results in a "Socket operation on non-socket" (ENOTSOCK) error. An attempt to call connect() on a socket that is already connected results in a "Transport endpoint is already connected" (EISCONN) error. The ENOTSOCK error can also be returned by other socket calls, including bind(), listen(), accept(), and the socket option calls.
Exceptions during Read and Write Operations
Once a connection is established, errors are still possible. You are almost sure to encounter the following errors during your work with networked programs.
The server program crashes while you are connected to it.
If the server program crashes during a communications session, the operating system will close the socket. From your perspective, the situation is identical to the remote program closing its end of the socket deliberately.
On reads, this results in an EOF the next time a read call is made. On writes, this results in a PIPE exception, exactly as in the pipe examples in Chapter 2. If you intercept and handle the PIPE signal, the write call returns false and the error variable is set to "Broken pipe" (EPIPE). Otherwise, your program is terminated by the signal.
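The two outcomes of a vanished peer, EOF on read and a broken pipe on write, can be sketched as follows. This is an illustration in Python under stated assumptions: the helper names read_until_eof and safe_send are hypothetical. Python ignores the PIPE signal by default, so the failed write surfaces as a BrokenPipeError exception instead of terminating the program.

```python
import socket


def read_until_eof(sock, bufsize=4096):
    """Collect data until the peer closes; recv() returning b"" is the EOF."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:  # empty bytes: the peer closed its end
            break
        chunks.append(data)
    return b"".join(chunks)


def safe_send(sock, payload):
    """Return True if payload was handed to the OS, False if the peer is gone."""
    try:
        sock.sendall(payload)
        return True
    except (BrokenPipeError, ConnectionResetError):
        # EPIPE or ECONNRESET: the remote end has closed or crashed.
        return False
```

With a real TCP connection, the first write after the peer disappears may appear to succeed, and the error is reported only on a later write, once the remote host has answered with a reset. A local socket.socketpair() is a convenient stand-in for experimenting with these helpers.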
The server host crashes while a connection is established.
On the other hand, if the server host crashes while a TCP connection is active, the operating system has no chance to terminate the connection gracefully. At your end, the operating system has no way to distinguish between a dead host and one that is simply experiencing a very long network outage. Your host will continue to retransmit IP packets in hopes that the remote host will reappear. From your perspective, the current read or write call will block indefinitely.
At some later time, when the remote host comes back on line, it will receive one of the local host's retransmitted packets. Not knowing what to do with it, the host will transmit a low-level reset message, telling the local host that the connection has been rejected. At this point, the connection is broken, and your program receives either an EOF or a pipe error, depending on the operation.
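One way to avoid blocking indefinitely is to set the SO_KEEPALIVE option on the socket. In this case, the connection times out after some period of unresponsiveness, and the socket is closed. The keepalive timeout is relatively long (minutes in some cases, and hours by default on many systems) and cannot be changed through the portable socket interface, although some platforms provide their own options for tuning it.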
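Enabling keepalive takes a single socket option call. The sketch below is in Python, and the helper name enable_keepalive and its parameters are hypothetical. The optional tuning values rely on the platform-specific options TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT (present on Linux, for example), which are not part of the portable behavior. They are applied only when the running Python exposes them.

```python
import socket


def enable_keepalive(sock, idle=None, interval=None, count=None):
    """Turn on SO_KEEPALIVE; optionally tune probes where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Platform-specific extensions: skipped silently if not available here.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of silence before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before giving up
    )
    for name, value in tuning:
        if value is not None and hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)
    return sock
```

For example, `enable_keepalive(sock, idle=60, interval=10, count=3)` asks for probing to begin after a minute of silence on systems that honor those options, and simply enables the default keepalive behavior elsewhere.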
The network goes down while a connection is established.
If the network goes down while a connection is established, making the remote host unreachable, the current I/O operation blocks until connectivity is restored. In this case, however, when the network is restored the connection usually continues as if nothing happened, and the I/O operation completes successfully.
There are several exceptions to this, however. If, instead of simply going down, one of the routers along the way starts issuing error messages, such as "host unreachable," then the connection is terminated and the effect is similar to the first scenario above, in which the server program crashes. Another common situation is that the remote server has its own timeout system. In this case, it times out and closes the connection as soon as network connectivity is restored.
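A server-side timeout of the kind just mentioned might look like the following. This is only a sketch in Python of one possible approach: the function name serve_until_idle, the echo behavior, and the returned status strings are assumptions made for illustration. The server gives up on a client that stays silent for longer than idle_seconds.

```python
import socket


def serve_until_idle(conn, idle_seconds, bufsize=4096):
    """Echo data back until the peer closes or stays silent too long."""
    conn.settimeout(idle_seconds)  # applies to each recv() and sendall()
    try:
        while True:
            data = conn.recv(bufsize)
            if not data:
                return "peer closed"
            conn.sendall(data)
    except socket.timeout:
        return "idle timeout"
    finally:
        conn.close()  # the client sees EOF or an error on its next operation
```

From the client's point of view, a connection dropped this way is indistinguishable from one in which the server program closed the socket deliberately.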