4.4 The `select` Function

The handling of I/O from multiple sources is an important problem that arises in many different forms. For example, a program may want to overlap terminal I/O with reading input from a disk or with printing. Another example occurs when a program expects input from two different sources but does not know which input will be available first. If the program tries to read from source A when input is available only from source B, the program blocks. To solve this problem, the program needs to block until input from either source becomes available. Blocking until at least one member of a set of conditions becomes true is called OR synchronization. The condition for the case described is "input available" on a descriptor.

One method of monitoring multiple file descriptors is to use a separate process for each one. Program 4.11 takes two command-line arguments, the names of two files to monitor. The parent process opens both files before creating the child process. The parent monitors the first file descriptor, and the child monitors the second. Each process echoes the contents of its file to standard output. If two named pipes are monitored, output appears as input becomes available.

Program 4.11 `monitorfork.c`

A program that monitors two files by forking a child process.

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include "restart.h"

    int main(int argc, char *argv[]) {
       int bytesread;
       int childpid;
       int fd, fd1, fd2;

       if (argc != 3) {
          fprintf(stderr, "Usage: %s file1 file2\n", argv[0]);
          return 1;
       }
       if ((fd1 = open(argv[1], O_RDONLY)) == -1) {
          fprintf(stderr, "Failed to open file %s:%s\n", argv[1], strerror(errno));
          return 1;
       }
       if ((fd2 = open(argv[2], O_RDONLY)) == -1) {
          fprintf(stderr, "Failed to open file %s:%s\n", argv[2], strerror(errno));
          return 1;
       }
       if ((childpid = fork()) == -1) {
          perror("Failed to create child process");
          return 1;
       }
       if (childpid > 0)                                         /* parent code */
          fd = fd1;
       else                                                       /* child code */
          fd = fd2;
       bytesread = copyfile(fd, STDOUT_FILENO);
       fprintf(stderr, "Bytes read: %d\n", bytesread);
       return 0;
    }

While using separate processes to monitor two file descriptors can be useful, the two processes have separate address spaces and so it is difficult for them to interact.

Exercise 4.16

How would you modify Program 4.11 so that it prints the total number of bytes read from the two files?

Answer:

Set up some form of interprocess communication before creating the child. For example, the parent process could create a pipe and the child could send its byte count to the pipe when it has finished. After the parent has processed its file, the parent could wait for the child and read the byte count from the pipe.
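The answer above can be sketched in code. The following is a minimal illustration rather than a modification of Program 4.11 itself: the names totalbytes and countbytes are hypothetical, the files are only counted instead of being echoed to standard output, and the child's count is sent through the pipe as a raw long value.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: read fd to end-of-file and return the byte count. */
static long countbytes(int fd) {
   char buf[1024];
   long total = 0;
   ssize_t n;

   while ((n = read(fd, buf, sizeof(buf))) != 0) {
      if (n == -1) {
         if (errno == EINTR)              /* restart if interrupted by signal */
            continue;
         return -1;
      }
      total += n;
   }
   return total;
}

/* Parent counts fd1, child counts fd2; returns the sum, or -1 on error. */
long totalbytes(int fd1, int fd2) {
   long childcount = -1;
   long parentcount;
   int pfd[2];
   pid_t childpid;
   ssize_t got;

   if (pipe(pfd) == -1)                     /* create the pipe before forking */
      return -1;
   if ((childpid = fork()) == -1) {
      close(pfd[0]);
      close(pfd[1]);
      return -1;
   }
   if (childpid == 0) {                                         /* child code */
      close(pfd[0]);
      childcount = countbytes(fd2);
      if (write(pfd[1], &childcount, sizeof(childcount)) !=
            (ssize_t)sizeof(childcount))
         _exit(1);
      _exit(0);
   }
   close(pfd[1]);                                              /* parent code */
   parentcount = countbytes(fd1);
   while (((got = read(pfd[0], &childcount, sizeof(childcount))) == -1) &&
          (errno == EINTR))
      ;
   close(pfd[0]);
   while ((waitpid(childpid, NULL, 0) == -1) && (errno == EINTR))
      ;
   if ((got != (ssize_t)sizeof(childcount)) || (parentcount == -1) ||
       (childcount == -1))
      return -1;
   return parentcount + childcount;
}
```

In this sketch the parent reads the pipe before waiting for the child. Because the child writes only a few bytes, either order works here, but reading first avoids a deadlock if the child ever had to write more than the pipe can buffer.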

The select call provides a method of monitoring file descriptors from a single process. It can monitor for three possible conditions: a read can be done without blocking, a write can be done without blocking, or a file descriptor has error conditions pending. Older versions of UNIX defined the select function in sys/time.h, but the POSIX standard now uses sys/select.h.

The nfds parameter of select gives the range of file descriptors to be monitored. The value of nfds must be at least one greater than the largest file descriptor to be checked. The readfds parameter specifies the set of descriptors to be monitored for reading. Similarly, writefds specifies the set of descriptors to be monitored for writing, and errorfds specifies the file descriptors to be monitored for error conditions. The descriptor sets are of type fd_set. Any of these parameters may be NULL, in which case select does not monitor any descriptors for the corresponding condition. The last parameter is a timeout value that forces a return from select after a certain period of time has elapsed, even if no descriptors are ready. When timeout is NULL, select may block indefinitely.

SYNOPSIS

    #include <sys/select.h>

    int select(int nfds, fd_set *restrict readfds,
               fd_set *restrict writefds, fd_set *restrict errorfds,
               struct timeval *restrict timeout);
    void FD_CLR(int fd, fd_set *fdset);
    int FD_ISSET(int fd, fd_set *fdset);
    void FD_SET(int fd, fd_set *fdset);
    void FD_ZERO(fd_set *fdset);

POSIX

On successful return, select clears all the descriptors in each of readfds, writefds and errorfds except those descriptors that are ready. If successful, the select function returns the number of file descriptors that are ready. A return value of 0 means that the timeout expired before any descriptor became ready. If unsuccessful, select returns -1 and sets errno. The following table lists the mandatory errors for select.

`errno`	cause
`EBADF`	one or more file descriptor sets specified an invalid file descriptor
`EINTR`	the `select` was interrupted by a signal before timeout or selected event occurred
`EINVAL`	an invalid timeout interval was specified, or `nfds` is less than 0 or greater than `FD_SETSIZE`

Historically, systems implemented the descriptor set as an integer bit mask, but that implementation does not work for more than 32 file descriptors on most systems. The descriptor sets are now usually represented by bit fields in arrays of integers. Use the macros FD_SET, FD_CLR, FD_ISSET and FD_ZERO to manipulate the descriptor sets in an implementation-independent way as demonstrated in Program 4.12.

The FD_SET macro sets the bit in *fdset corresponding to the fd file descriptor, and the FD_CLR macro clears the corresponding bit. The FD_ZERO macro clears all the bits in *fdset. Use these three macros to set up descriptor masks before calling select. Use the FD_ISSET macro after select returns, to test whether the bit corresponding to the file descriptor fd is set in the mask.
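The macros can be exercised without calling select at all. The sketch below builds a descriptor set from an array and computes the matching nfds value; the helper name buildreadset is hypothetical and is used only for this illustration.

```c
#include <sys/select.h>

/* Hypothetical helper: fill *setp from fds[0..n-1]. Returns the nfds value
   for select (largest descriptor plus one), or -1 if any is out of range. */
int buildreadset(const int fds[], int n, fd_set *setp) {
   int i;
   int maxfd = -1;

   FD_ZERO(setp);                          /* start with an empty set */
   for (i = 0; i < n; i++) {
      if ((fds[i] < 0) || (fds[i] >= FD_SETSIZE))
         return -1;
      FD_SET(fds[i], setp);                /* add this descriptor */
      if (fds[i] > maxfd)
         maxfd = fds[i];
   }
   return maxfd + 1;
}
```

After the call, FD_ISSET(fds[i], setp) is nonzero for every descriptor in the array, and FD_CLR can remove an individual descriptor again without disturbing the others.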

Program 4.12 `whichisready.c`

A function that blocks until one of two file descriptors is ready.

    #include <errno.h>
    #include <string.h>
    #include <sys/select.h>

    int whichisready(int fd1, int fd2) {
       int maxfd;
       int nfds;
       fd_set readset;

       if ((fd1 < 0) || (fd1 >= FD_SETSIZE) ||
           (fd2 < 0) || (fd2 >= FD_SETSIZE)) {
          errno = EINVAL;
          return -1;
       }
       maxfd = (fd1 > fd2) ? fd1 : fd2;
       FD_ZERO(&readset);
       FD_SET(fd1, &readset);
       FD_SET(fd2, &readset);
       nfds = select(maxfd+1, &readset, NULL, NULL, NULL);
       if (nfds == -1)
          return -1;
       if (FD_ISSET(fd1, &readset))
          return fd1;
       if (FD_ISSET(fd2, &readset))
          return fd2;
       errno = EINVAL;
       return -1;
    }

The function whichisready blocks until at least one of the two file descriptors passed as parameters is ready for reading and returns that file descriptor. If both are ready, it returns the first file descriptor. If unsuccessful, whichisready returns -1 and sets errno.

Program 4.13 `copy2files.c`

A function that uses select to do two concurrent file copies.

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/select.h>
    #include "restart.h"

    int copy2files(int fromfd1, int tofd1, int fromfd2, int tofd2) {
       int bytesread;
       int maxfd;
       int num;
       fd_set readset;
       int totalbytes = 0;

       if ((fromfd1 < 0) || (fromfd1 >= FD_SETSIZE) ||
           (tofd1 < 0) || (tofd1 >= FD_SETSIZE) ||
           (fromfd2 < 0) || (fromfd2 >= FD_SETSIZE) ||
           (tofd2 < 0) || (tofd2 >= FD_SETSIZE))
          return 0;
       maxfd = fromfd1;                     /* find the biggest fd for select */
       if (fromfd2 > maxfd)
          maxfd = fromfd2;

       for ( ; ; ) {
          FD_ZERO(&readset);                /* select modifies the set, so */
          FD_SET(fromfd1, &readset);        /* rebuild it on every pass */
          FD_SET(fromfd2, &readset);
          if (((num = select(maxfd+1, &readset, NULL, NULL, NULL)) == -1) &&
             (errno == EINTR))
             continue;
          if (num == -1)
             return totalbytes;
          if (FD_ISSET(fromfd1, &readset)) {
             bytesread = readwrite(fromfd1, tofd1);
             if (bytesread <= 0)
                break;
             totalbytes += bytesread;
          }
          if (FD_ISSET(fromfd2, &readset)) {
             bytesread = readwrite(fromfd2, tofd2);
             if (bytesread <= 0)
                break;
             totalbytes += bytesread;
          }
       }
       return totalbytes;
    }

The whichisready function of Program 4.12 is problematic because it always chooses fd1 if both fd1 and fd2 are ready, so a steady stream of input on fd1 can keep the caller from ever servicing fd2. The copy2files function of Program 4.13 avoids this bias by testing both descriptors after each return from select. It copies bytes from fromfd1 to tofd1 and from fromfd2 to tofd2 without making any assumptions about the order in which the bytes become available in the two directions. The function returns if either copy encounters an error or end-of-file.

The copy2files function of Program 4.13 can be generalized to monitor multiple file descriptors for input. Such a problem might be encountered by a command processor that was monitoring requests from different terminals. The program cannot predict which source will produce the next input, so it must use a method such as select. In addition, the set of monitored descriptors is dynamic: the program must remove a source from the monitoring set if an error condition arises on that source's descriptor.

The monitorselect function in Program 4.14 monitors an array of open file descriptors fd. When input is available on file descriptor fd[i], the program reads information from fd[i] and calls docommand. The monitorselect function has two parameters: an array of open file descriptors and the number of file descriptors in the array. The function restarts the select or read if either is interrupted by a signal. When read encounters other types of errors or an end-of-file, monitorselect closes the corresponding descriptor and removes it from the monitoring set. The monitorselect function returns when all descriptors have indicated an error or end-of-file.

The waitfdtimed function in Program 4.15 takes two parameters: a file descriptor and an ending time. It uses gettimeout to calculate the timeout interval from the end time and the current time obtained by a call to gettimeofday. (See Section 9.1.3.) If select returns prematurely because of a signal, waitfdtimed recalculates the timeout and calls select again. The standard does not say anything about the value of the timeout parameter or the fd_set parameters of select when it is interrupted by a signal, so we reset them inside the while loop.

You can use the select timeout feature to implement a timed read operation, as shown in Program 4.16. The readtimed function behaves like read except that it takes an additional parameter, seconds, specifying a timeout in seconds. The readtimed function returns -1 with errno set to ETIME if no input becomes available within seconds seconds of the call. If interrupted by a signal, readtimed restarts with the remaining time. Most of the complication comes from the need to restart select with the remaining time when select is interrupted by a signal. The select function does not provide a direct way of determining the time remaining in this case. The readtimed function in Program 4.16 sets the end time for the timeout by calling add2currenttime from Program 4.15. It passes this value to waitfdtimed, also from Program 4.15, to wait until the file descriptor can be read or the end time has passed.

Program 4.14 `monitorselect.c`

A function to monitor file descriptors using select.

    #include <errno.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/select.h>
    #include <sys/types.h>
    #include "restart.h"
    #define BUFSIZE 1024

    void docommand(char *, int);

    void monitorselect(int fd[], int numfds) {
       char buf[BUFSIZE];
       int bytesread;
       int i;
       int maxfd;
       int numnow, numready;
       fd_set readset;

       maxfd = 0;                  /* set up the range of descriptors to monitor */
       for (i = 0; i < numfds; i++) {
           if ((fd[i] < 0) || (fd[i] >= FD_SETSIZE))
              return;
           if (fd[i] >= maxfd)
              maxfd = fd[i] + 1;
       }
       numnow = numfds;
       while (numnow > 0) {            /* continue monitoring until all are done */
          FD_ZERO(&readset);                  /* set up the file descriptor mask */
          for (i = 0; i < numfds; i++)
             if (fd[i] >= 0)
                FD_SET(fd[i], &readset);
          numready = select(maxfd, &readset, NULL, NULL, NULL);  /* which ready? */
          if ((numready == -1) && (errno == EINTR))     /* interrupted by signal */
             continue;
          else if (numready == -1)                          /* real select error */
             break;
          for (i = 0; (i < numfds) && (numready > 0); i++) { /* read and process */
             if (fd[i] == -1)                         /* this descriptor is done */
                continue;
             if (FD_ISSET(fd[i], &readset)) {        /* this descriptor is ready */
                bytesread = r_read(fd[i], buf, BUFSIZE);
                numready--;
                if (bytesread > 0)
                   docommand(buf, bytesread);
                else  {    /* error or end-of-file on this descriptor, close it */
                   r_close(fd[i]);
                   fd[i] = -1;
                   numnow--;
                }
             }
          }
       }
       for (i = 0; i < numfds; i++)
           if (fd[i] >= 0)
               r_close(fd[i]);
    }

Program 4.15 `waitfdtimed.c`

A function that waits for a given time for input to be available from an open file descriptor.

    #include <errno.h>
    #include <string.h>
    #include <sys/select.h>
    #include <sys/time.h>
    #include "restart.h"
    #define MILLION 1000000L
    #define D_MILLION 1000000.0

    static int gettimeout(struct timeval end,
                                   struct timeval *timeoutp) {
       gettimeofday(timeoutp, NULL);
       timeoutp->tv_sec = end.tv_sec - timeoutp->tv_sec;
       timeoutp->tv_usec = end.tv_usec - timeoutp->tv_usec;
       if (timeoutp->tv_usec >= MILLION) {
          timeoutp->tv_sec++;
          timeoutp->tv_usec -= MILLION;
       }
       if (timeoutp->tv_usec < 0) {
          timeoutp->tv_sec--;
          timeoutp->tv_usec += MILLION;
       }
       if ((timeoutp->tv_sec < 0) ||
           ((timeoutp->tv_sec == 0) && (timeoutp->tv_usec == 0))) {
          errno = ETIME;
          return -1;
       }
       return 0;
    }

    struct timeval add2currenttime(double seconds) {
       struct timeval newtime;

       gettimeofday(&newtime, NULL);
       newtime.tv_sec += (int)seconds;
       newtime.tv_usec += (int)((seconds - (int)seconds)*D_MILLION + 0.5);
       if (newtime.tv_usec >= MILLION) {
          newtime.tv_sec++;
          newtime.tv_usec -= MILLION;
       }
       return newtime;
    }

    int waitfdtimed(int fd, struct timeval end) {
       fd_set readset;
       int retval;
       struct timeval timeout;

       if ((fd < 0) || (fd >= FD_SETSIZE)) {
          errno = EINVAL;
          return -1;
       }
       FD_ZERO(&readset);
       FD_SET(fd, &readset);
       if (gettimeout(end, &timeout) == -1)
          return -1;
       while (((retval = select(fd + 1, &readset, NULL, NULL, &timeout)) == -1)
               && (errno == EINTR)) {
          if (gettimeout(end, &timeout) == -1)
             return -1;
          FD_ZERO(&readset);
          FD_SET(fd, &readset);
       }
       if (retval == 0) {
          errno = ETIME;
          return -1;
       }
       if (retval == -1)
          return -1;
       return 0;
    }

Program 4.16 `readtimed.c`

A function to do a timed read from an open file descriptor.

    #include <sys/time.h>
    #include "restart.h"

    ssize_t readtimed(int fd, void *buf, size_t nbyte, double seconds) {
       struct timeval timedone;

       timedone = add2currenttime(seconds);
       if (waitfdtimed(fd, timedone) == -1)
          return (ssize_t)(-1);
       return r_read(fd, buf, nbyte);
    }

Exercise 4.17

Why is it necessary to test whether newtime.tv_usec is greater than or equal to a million when it is set from the fractional part of seconds? What are the consequences of having that value equal to one million?

Answer:

Since the value is rounded to the nearest microsecond, a fraction such as 0.999999999 might round to one million when multiplied by MILLION. The behavior of functions that use struct timeval values is not specified when the tv_usec field is not strictly less than one million.
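The rounding case can be checked in isolation. The sketch below converts an interval rather than adding to the current time, so its result does not depend on the clock; the function name secstotimeval is hypothetical, and the conversion assumes a nonnegative argument.

```c
#include <sys/time.h>

#define MILLION 1000000L
#define D_MILLION 1000000.0

/* Hypothetical helper: convert a nonnegative interval in seconds to a
   normalized struct timeval, that is, one with 0 <= tv_usec < MILLION. */
struct timeval secstotimeval(double seconds) {
   struct timeval tv;

   tv.tv_sec = (long)seconds;
   tv.tv_usec = (long)((seconds - (long)seconds)*D_MILLION + 0.5);
   if (tv.tv_usec >= MILLION) {        /* rounding produced a full second */
      tv.tv_sec++;
      tv.tv_usec -= MILLION;
   }
   return tv;
}
```

For an argument of 1.9999999 the fractional part rounds up to 1000000 microseconds, so without the test the result would carry tv_usec equal to one million; with the test it becomes 2 seconds and 0 microseconds.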

Exercise 4.18

One way to simplify Program 4.15 is to just restart the select with the same timeout whenever it is interrupted by a signal. What is wrong with this?

Answer:

If your program receives signals regularly and the time between signals is smaller than the timeout interval, waitfdtimed never times out.

The 2000 version of POSIX introduced a new version of select called pselect. The pselect function is identical to the select function, but it uses a more precise timeout structure, struct timespec, and allows for the blocking or unblocking of signals while it is waiting for I/O to be available. The struct timespec structure is discussed in Section 9.1.4. However, at the time of writing (March 2003), none of our test operating systems supported pselect.
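On a system that does provide pselect, a call might look like the following sketch. The function name waitreadablep and its parameters are hypothetical, and the feature-test macro at the top is an assumption about what some systems need before their headers declare pselect.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L    /* assumed: ask headers to declare pselect */
#endif
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <time.h>

/* Hypothetical helper: wait up to sec seconds plus nsec nanoseconds for fd
   to become readable. If maskp is not NULL, *maskp replaces the signal mask
   for the duration of the wait. Returns 1 if readable, 0 on timeout, and
   -1 with errno set on error. */
int waitreadablep(int fd, long sec, long nsec, const sigset_t *maskp) {
   fd_set readset;
   int ready;
   struct timespec timeout;

   if ((fd < 0) || (fd >= FD_SETSIZE)) {
      errno = EINVAL;
      return -1;
   }
   FD_ZERO(&readset);
   FD_SET(fd, &readset);
   timeout.tv_sec = sec;
   timeout.tv_nsec = nsec;           /* nanoseconds, not microseconds */
   ready = pselect(fd + 1, &readset, NULL, NULL, &timeout, maskp);
   if (ready == -1)
      return -1;
   return (ready > 0) ? 1 : 0;
}
```

Passing NULL for maskp leaves the signal mask unchanged, in which case the call behaves like select with a nanosecond-resolution timeout.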
