14.8 Sockets and Standard I/O

In all our examples so far, we have used what is sometimes called Unix I/O: the read and write functions and their variants (recv, send, etc.). These functions work with descriptors and are normally implemented as system calls within the Unix kernel.

Another method of performing I/O is the standard I/O library. It is specified by the ANSI C standard and is intended to be portable to non-Unix systems that support ANSI C. The standard I/O library handles some of the details that we must worry about ourselves when using the Unix I/O functions, such as automatically buffering the input and output streams. Unfortunately, its handling of a stream's buffering can present a new set of problems we must worry about. Chapter 5 of APUE covers the standard I/O library in detail, and [Plauger 1992] presents and discusses a complete implementation of the standard I/O library.

The term stream is used with the standard I/O library, as in "we open an input stream" or "we flush the output stream." Do not confuse this with the STREAMS subsystem, which we will discuss in Chapter 31.

The standard I/O library can be used with sockets, but there are a few items to consider:

A standard I/O stream can be created from any descriptor by calling the fdopen function. Similarly, given a standard I/O stream, we can obtain the corresponding descriptor by calling fileno. Our first encounter with fileno was in Figure 6.9 when we wanted to call select on a standard I/O stream. select works only with descriptors, so we had to obtain the descriptor for the standard I/O stream.
TCP and UDP sockets are full-duplex. Standard I/O streams can also be full-duplex: we just open the stream with a type of r+, which means read-write. But on such a stream, an output function cannot be followed by an input function without an intervening call to fflush, fseek, fsetpos, or rewind. Similarly, an input function cannot be followed by an output function without an intervening call to fseek, fsetpos, or rewind, unless the input function encounters an EOF. The problem with these latter three functions is that they all call lseek, which fails on a socket.
The easiest way to handle this read-write problem is to open two standard I/O streams for a given socket: one for reading and one for writing.

Example: `str_echo` Function Using Standard I/O

We now show an alternate version of our TCP echo server (Figure 5.3), which uses standard I/O instead of read and writen. Figure 14.14 is a version of our str_echo function that uses standard I/O. (This version has a problem that we will describe shortly.)

Figure 14.14 `str_echo` function recoded to use standard I/O.

advio/str_echo_stdio02.c

     1 #include    "unp.h"

     2 void
     3 str_echo(int sockfd)
     4 {
     5     char     line[MAXLINE];
     6     FILE    *fpin, *fpout;

     7     fpin = Fdopen(sockfd, "r");
     8     fpout = Fdopen(sockfd, "w");

     9     while (Fgets(line, MAXLINE, fpin) != NULL)
    10         Fputs(line, fpout);
    11 }

Convert descriptor into input stream and output stream

7-10 Two standard I/O streams are created by fdopen: one for input and one for output. The calls to read and writen are replaced with calls to fgets and fputs.

If we run our server with this version of str_echo and then run our client, we see the following:

`hpux %` `tcpcli02 206.168.112.96`
`hello, world`	we type this line, but nothing is echoed
`and hi`	and this one, still no echo
`hello??`	and this one, still no echo
`^D`	and our EOF character
`hello, world`	and then the three echoed lines are output
`and hi`
`hello??`

There is a buffering problem here because nothing is echoed by the server until we enter our EOF character. The following steps take place:

We type the first line of input and it is sent to the server.
The server reads the line with fgets and echoes it with fputs.
The server's standard I/O stream is fully buffered by the standard I/O library. This means the library copies the echoed line into its standard I/O buffer for this stream, but does not write the buffer to the descriptor, because the buffer is not full.
We type the second line of input and it is sent to the server.
The server reads the line with fgets and echoes it with fputs.
Again, the server's standard I/O library just copies the line into its buffer, but does not write the buffer because it is still not full.
The same scenario happens with the third line of input that we enter.
We type our EOF character, and our str_cli function (Figure 6.13) calls shutdown, sending a FIN to the server.
The server TCP receives the FIN, which fgets reads, causing fgets to return a null pointer.
The str_echo function returns to the server main function (Figure 5.12) and the child terminates by calling exit.
The C library function exit calls the standard I/O cleanup function (pp. 162-164 of APUE). The output buffer that was partially filled by our calls to fputs is now output.
The server child process terminates, causing its connected socket to be closed, sending a FIN to the client, completing the TCP four-packet termination sequence.
The three echoed lines are received by our str_cli function and output.
str_cli then receives an EOF on its socket, and the client terminates.

The problem here is the buffering performed automatically by the standard I/O library on the server. There are three types of buffering performed by the standard I/O library:

Fully buffered means that I/O takes place only when the buffer is full, the process explicitly calls fflush, or the process terminates by calling exit. A common size for the standard I/O buffer is 8,192 bytes.
Line buffered means that I/O takes place when a newline is encountered, when the process calls fflush, or when the process terminates by calling exit.
Unbuffered means that I/O takes place each time a standard I/O output function is called.

Most Unix implementations of the standard I/O library use the following rules:

Standard error is always unbuffered.
Standard input and standard output are fully buffered, unless they refer to a terminal device, in which case, they are line buffered.
All other streams are fully buffered unless they refer to a terminal device, in which case, they are line buffered.

Since a socket is not a terminal device, the problem seen with our str_echo function in Figure 14.14 is that the output stream (fpout) is fully buffered. One way around this is to force the output stream to be line buffered by calling setvbuf. Another is to force each echoed line to be output by calling fflush after each call to fputs. But in practice, either of these solutions is still error-prone and may interact badly with the Nagle algorithm described in Section 7.9. In most cases, the best solution is to avoid using the standard I/O library altogether for sockets and operate on buffers instead of lines, as described in Section 3.9. Using standard I/O on sockets may make sense when the convenience of standard I/O streams outweighs the concerns about bugs due to buffering, but these are rare cases.

Be aware that some implementations of the standard I/O library still have a problem with descriptors greater than 255. This can be a problem with network servers that handle lots of descriptors. Check the definition of the FILE structure in your <stdio.h> header to see what type of variable holds the descriptor.