14.8 Sockets and Standard I/O

In all our examples so far, we have used what is sometimes called Unix I/O, the read and write functions and their variants (recv, send, etc.). These functions work with descriptors and are normally implemented as system calls within the Unix kernel.

Another method of performing I/O is the standard I/O library. It is specified by the ANSI C standard and is intended to be portable to non-Unix systems that support ANSI C. The standard I/O library handles some of the details that we must worry about ourselves when using the Unix I/O functions, such as automatically buffering the input and output streams. Unfortunately, its handling of a stream's buffering can present a new set of problems we must worry about. Chapter 5 of APUE covers the standard I/O library in detail, and [Plauger 1992] presents and discusses a complete implementation of the standard I/O library.

The term stream is used with the standard I/O library, as in "we open an input stream" or "we flush the output stream." Do not confuse this with the STREAMS subsystem, which we will discuss in Chapter 31.

The standard I/O library can be used with sockets, but there are a few items to consider: -
A standard I/O stream can be created from any descriptor by calling the fdopen function. Similarly, given a standard I/O stream, we can obtain the corresponding descriptor by calling fileno. Our first encounter with fileno was in Figure 6.9 when we wanted to call select on a standard I/O stream. select works only with descriptors, so we had to obtain the descriptor for the standard I/O stream. -
TCP and UDP sockets are full-duplex. Standard I/O streams can also be full-duplex: we just open the stream with a type of r+, which means read-write. But on such a stream, an output function cannot be followed by an input function without an intervening call to fflush, fseek, fsetpos, or rewind. Similarly, an input function cannot be followed by an output function without an intervening call to fseek, fsetpos, or rewind, unless the input function encounters an EOF. The problem with these latter three functions is that they all call lseek, which fails on a socket. -
The easiest way to handle this read-write problem is to open two standard I/O streams for a given socket: one for reading and one for writing.

Example: str_echo Function Using Standard I/O

We now show an alternate version of our TCP echo server (Figure 5.3), which uses standard I/O instead of read and writen. Figure 14.14 is a version of our str_echo function that uses standard I/O. (This version has a problem that we will describe shortly.)

Figure 14.14 str_echo function recoded to use standard I/O.

advio/str_echo_stdio02.c

     1 #include    "unp.h"

     2 void
     3 str_echo(int sockfd)
     4 {
     5     char    line[MAXLINE];
     6     FILE   *fpin, *fpout;

     7     fpin = Fdopen(sockfd, "r");
     8     fpout = Fdopen(sockfd, "w");

     9     while (Fgets(line, MAXLINE, fpin) != NULL)
    10         Fputs(line, fpout);
    11 }

Convert descriptor into input stream and output stream

7–10 Two standard I/O streams are created by fdopen: one for input and one for output. The calls to read and writen are replaced with calls to fgets and fputs.

If we run our server with this version of str_echo and then run our client, we see the following:

    hpux % tcpcli02 206.168.112.96
    hello, world        we type this line, but nothing is echoed
    and hi              and this one, still no echo
    hello??             and this one, still no echo
    ^D                  and our EOF character
    hello, world        and then the three echoed lines are output
    and hi
    hello??

There is a buffering problem here because nothing is echoed by the server until we enter our EOF character. The following steps take place: -
We type the first line of input and it is sent to the server. -
The server reads the line with fgets and echoes it with fputs. -
The server's standard I/O stream is fully buffered by the standard I/O library. This means the library copies the echoed line into its standard I/O buffer for this stream, but does not write the buffer to the descriptor, because the buffer is not full. -
We type the second line of input and it is sent to the server. -
The server reads the line with fgets and echoes it with fputs. -
Again, the server's standard I/O library just copies the line into its buffer, but does not write the buffer because it is still not full. -
The same scenario happens with the third line of input that we enter. -
We type our EOF character, and our str_cli function (Figure 6.13) calls shutdown, sending a FIN to the server. -
The server TCP receives the FIN, which fgets reads, causing fgets to return a null pointer. -
The str_echo function returns to the server main function (Figure 5.12) and the child terminates by calling exit. -
The C library function exit calls the standard I/O cleanup function (pp. 162–164 of APUE). The output buffer that was partially filled by our calls to fputs is now output. -
The server child process terminates, causing its connected socket to be closed, sending a FIN to the client, completing the TCP four-packet termination sequence. -
The three echoed lines are received by our str_cli function and output. -
str_cli then receives an EOF on its socket, and the client terminates.

The problem here is the buffering performed automatically by the standard I/O library on the server. There are three types of buffering performed by the standard I/O library: -
Fully buffered means that I/O takes place only when the buffer is full, the process explicitly calls fflush, or the process terminates by calling exit. A common size for the standard I/O buffer is 8,192 bytes. -
Line buffered means that I/O takes place when a newline is encountered, when the process calls fflush, or when the process terminates by calling exit. -
Unbuffered means that I/O takes place each time a standard I/O output function is called.

Most Unix implementations of the standard I/O library use the following rules: -
Standard error is always unbuffered. -
Standard input and standard output are fully buffered, unless they refer to a terminal device, in which case, they are line buffered. -
All other streams are fully buffered unless they refer to a terminal device, in which case, they are line buffered.

Since a socket is not a terminal device, the problem seen with our str_echo function in Figure 14.14 is that the output stream (fpout) is fully buffered. One way around this is to force the output stream to be line buffered by calling setvbuf. Another is to force each echoed line to be output by calling fflush after each call to fputs. But in practice, either of these solutions is still error-prone and may interact badly with the Nagle algorithm described in Section 7.9. In most cases, the best solution is to avoid using the standard I/O library altogether for sockets and operate on buffers instead of lines, as described in Section 3.9. Using standard I/O on sockets may make sense when the convenience of standard I/O streams outweighs the concerns about bugs due to buffering, but these are rare cases.

Be aware that some implementations of the standard I/O library still have a problem with descriptors greater than 255. This can be a problem with network servers that handle lots of descriptors. Check the definition of the FILE structure in your <stdio.h> header to see what type of variable holds the descriptor.