22.10 Influence of Disk I/O

Disk accesses can be a million times slower than memory accesses. This section explores the effect of disk I/O on server performance.

To measure this performance, modify the various servers to access the disk rather than a memory buffer to satisfy requests. If your server selects from a small number of request files, your measurements may not be accurate because the operating system buffers file I/O and most of the requests may be satisfied from memory rather than from disk.

One possibility is to create a large number of files whose names are numeric, say, 00000, 00001, 00002, etc. When a request comes in, the server could pick one of these files at random and access it to satisfy the request. Some users might not have enough free disk space to implement this solution.
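
The name-selection step for this approach might look like the following minimal sketch. It assumes the files all live in one directory and use five zero-padded digits; the function name random_numeric_name and its parameters are illustrative and do not come from the text.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: build the pathname of a randomly chosen file
 * named 00000, 00001, ... in directory dir. numfiles is the number of
 * files that were created (assumed to be between 1 and 100000).
 * Returns 0 on success, -1 if numfiles is out of range or buf is too small. */
int random_numeric_name(const char *dir, int numfiles,
                        char *buf, size_t bufsize) {
   int index;
   int len;

   assert(dir != NULL && buf != NULL);
   if (numfiles < 1 || numfiles > 100000)
      return -1;
   /* rand() only covers 0..RAND_MAX, which may be as small as 32767 */
   index = rand() % numfiles;
   len = snprintf(buf, bufsize, "%s/%05d", dir, index);
   if (len < 0 || (size_t)len >= bufsize)
      return -1;
   return 0;
}
```

Because rand is not required to be thread-safe, a threaded server would probably need a lock around the call or a per-thread generator.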

Another possibility is to use the system files that already exist. The idea is to create a list of the files on the server for which the user has read access. When a request comes in, the server randomly selects one of the files that is large enough to satisfy that request. Care must be taken to ensure that the process of selecting the file does not significantly burden the server.

Program 22.1 illustrates one way to keep file selection cheap. The program groups files into lists according to the logarithm of their sizes. Each list consists of records that each contain the full pathname and size of a file that is of at least a given size but less than 10 times the given size (the last list has no upper limit). The first list contains files of at least 10 bytes, the second has files of at least 100 bytes, and so on, so the minimum file size of each list is 10 times that of the previous list. If a server receives a request for a resource of size 1234 bytes, it should select at random one of the files from the list of files containing at least 10,000 bytes and transmit the required number of bytes from the selected file. The list that starts at 1000 bytes is not safe for this request because it can contain files smaller than 1234 bytes. Since each list is an array rather than a linked list, the server uses a random index to directly access the name of the file.
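
The selection rule can be sketched as follows. The helper names list_for_request and random_entry are hypothetical; only NUMRANGES and the list boundaries come from Program 22.1.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

#define NUMRANGES 5   /* same value as in Program 22.1 */

/* Hypothetical helper: return the index of the first list whose
 * smallest possible file (10, 100, 1000, ...) is at least as large as
 * the request, or -1 if no list can guarantee that many bytes. */
int list_for_request(off_t size) {
   off_t lower = 10;
   int lnum;

   assert(size >= 0);
   for (lnum = 0; lnum < NUMRANGES; lnum++, lower *= 10)
      if (size <= lower)
         return lnum;
   return -1;
}

/* Hypothetical helper: pick a random position in a list of count entries. */
int random_entry(int count) {
   assert(count > 0);
   return rand() % count;
}
```

For a request of 1234 bytes, list_for_request returns 3, the index of the list whose files all contain at least 10,000 bytes.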

Program 22.1 creates NUMRANGES lists. For NUMRANGES equal to 5, the lists contain files of sizes at least 10, 100, 1000, 10,000 and 100,000 bytes, so makefileinfo can satisfy access requests of up to 100,000 bytes. The makefileinfo program stores the full pathname and size of each file in a record of type fileinfo. Only files whose full pathname is at most MAXPATH characters long are inserted in the list. A value of 100 for MAXPATH picks up almost all files on most systems. We avoid using the system value PATH_MAX, which may be 1024 or greater, because every record would then reserve that much space for its pathname.

Program 22.1 takes two command-line arguments, the first specifying the base path of the directory tree under which to search for files and the second specifying the number of files to find for each list. The program uses the nftw system function to step through the file system. Each time makefileinfo visits a file, it calls insertfile with the full pathname and other parameters that give information about the file. This function keeps track of how many of the lists are full and returns 1 when all are full. The function nftw stops stepping through the directory tree when insertfile returns a nonzero value.

The function insertfile first checks that it was passed a file rather than a directory by checking the info parameter against FTW_F. It also verifies that the path fits in the list and uses the stat information to make sure that the file is a regular file. If all these conditions are satisfied, insertfile attempts a nonblocking open of the file for reading to make sure that the current process has read access to that file. A nonblocking open guarantees that the attempt does not block. If all these operations are successful, insertfile calls whichlist to determine which list the file should go into. The size of each list is kept in the array filecounts, and the function keeps track of the number of these entries that are equal to the maximum size of the list.
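
The read-access test can be isolated in a few lines. This is a sketch only; the name can_read is hypothetical, and it uses plain close where Program 22.1 uses r_close from the restart library.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if this process can open path for
 * reading, 0 otherwise. O_NONBLOCK keeps the open from waiting, as
 * could otherwise happen, for example, on a FIFO with no writer. */
int can_read(const char *path) {
   int fd;

   assert(path != NULL);
   if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
      return 0;
   close(fd);   /* the descriptor was only needed for the test */
   return 1;
}
```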

After the list is created, makefileinfo displays a list of counts and then calls showfiles to display the sizes and names of the files in each list. Comment out the call to showfiles after you are convinced that the program is working.

Modify Program 22.1 to make it usable by your servers. Replace the main function with a create_lists function that takes two parameters, the same values as the two command-line arguments of Program 22.1. This function creates the lists. Write an additional function, openfile, that takes a size as a parameter. The openfile function chooses one of the files that is at least as large as the size parameter, opens the file for reading, and returns the open file descriptor. If an error occurs, openfile returns -1 with errno set.
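
One possible shape for openfile is sketched below, assuming that create_lists has already filled files and filecounts with the layout of Program 22.1. The errno values and the decision to fall through to a larger list when the preferred list is empty are assumptions of this sketch, not requirements from the text.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>

#define MAXPATH 100
#define NUMRANGES 5

typedef struct {
   off_t filesize;
   char path[MAXPATH+1];
} fileinfo;

/* Same layout as in Program 22.1; assumed to be filled by create_lists. */
static int filecounts[NUMRANGES];
static fileinfo *files[NUMRANGES];

/* Sketch of openfile: open a random file that holds at least size bytes.
 * Returns an open file descriptor, or -1 with errno set. */
int openfile(off_t size) {
   off_t lower = 10;
   int fd;
   int index;
   int lnum;

   if (size < 0) {
      errno = EINVAL;          /* assumption: a negative size is invalid */
      return -1;
   }
   /* find the first nonempty list whose files all have >= size bytes */
   for (lnum = 0; lnum < NUMRANGES; lnum++, lower *= 10)
      if (size <= lower && filecounts[lnum] > 0)
         break;
   if (lnum == NUMRANGES) {
      errno = ENOENT;          /* assumption: report "no suitable file" */
      return -1;
   }
   assert(files[lnum] != NULL);
   index = rand() % filecounts[lnum];
   /* retry if a signal interrupts the open */
   while ((fd = open(files[lnum][index].path, O_RDONLY)) == -1 &&
          errno == EINTR)
      ;
   return fd;
}
```

The caller is responsible for closing the returned descriptor after the request has been satisfied.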

Modify one of the servers from Section 22.6, 22.7, 22.8 or 22.9 so that it satisfies requests from the disk rather than from a memory buffer. The server now takes two additional command-line arguments like those of Program 22.1 and creates the lists before accepting any connection requests. The server should display a message after creating the lists so that you can tell when to start your clients. Compare the results with those of the corresponding server that did not access the disk.
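
Once a file is open, the server has to move the requested number of bytes from the file to the client. The following sketch shows one way to do that with read and write; the name copybytes and the buffer size BLKSIZE are illustrative and are not taken from the servers of the earlier sections.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#define BLKSIZE 4096   /* illustrative buffer size */

/* Hypothetical helper: copy exactly nbytes from fromfd to tofd.
 * Returns 0 on success, -1 on error or if fromfd ends early. */
int copybytes(int fromfd, int tofd, off_t nbytes) {
   char buf[BLKSIZE];
   char *bp;
   ssize_t nread;
   ssize_t nwritten;
   size_t want;

   assert(nbytes >= 0);
   while (nbytes > 0) {
      want = (nbytes < BLKSIZE) ? (size_t)nbytes : BLKSIZE;
      nread = read(fromfd, buf, want);
      if (nread == -1 && errno == EINTR)
         continue;                 /* interrupted by a signal, retry */
      if (nread <= 0)
         return -1;                /* read error or premature end of file */
      bp = buf;
      while (nread > 0) {          /* write may accept only part of the data */
         nwritten = write(tofd, bp, (size_t)nread);
         if (nwritten == -1 && errno == EINTR)
            continue;
         if (nwritten == -1)
            return -1;
         bp += nwritten;
         nread -= nwritten;
         nbytes -= nwritten;
      }
   }
   return 0;
}
```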

Program 22.1 makefileinfo.c

A program that creates a list of files by walking through a directory tree.

    #include <fcntl.h>
    #include <ftw.h>
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include "restart.h"

    #define MAXPATH 100
    #define NUMRANGES 5

    typedef struct {
       off_t filesize;
       char path[MAXPATH+1];
    } fileinfo;

    static int filecounts[NUMRANGES];
    static fileinfo *files[NUMRANGES];
    static int maxnum;

    /* return the list number for a file of the given size, or -1 if too small */
    static int whichlist(off_t size) {
       int base = 10;
       int limit;
       int lnum;

       if (size < base)
          return -1;
       for (lnum = 0, limit = base*base;
            lnum < NUMRANGES - 1;
            lnum++, limit *= 10)
          if (size < limit)
             break;
       return lnum;
    }

    /* called by nftw for each object; returns 1 when all lists are full */
    static int insertfile(const char *path, const struct stat *statbuf,
               int info, struct FTW *ftwinfo) {
       int fd;
       int lnum;
       static int numfull = 0;

       if (info != FTW_F)
          return 0;
       if (strlen(path) > MAXPATH)
          return 0;
       if (!S_ISREG(statbuf->st_mode))          /* regular files only */
          return 0;
       if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
          return 0;
       if (r_close(fd) == -1)
          return 0;
       lnum = whichlist(statbuf->st_size);
       if (lnum < 0)
          return 0;
       if (filecounts[lnum] == maxnum)
          return 0;
       strcpy(files[lnum][filecounts[lnum]].path, path);
       files[lnum][filecounts[lnum]].filesize = statbuf->st_size;
       filecounts[lnum]++;
       if (filecounts[lnum] == maxnum)
          numfull++;
       if (numfull == NUMRANGES)
          return 1;
       return 0;
    }

    void showfiles(int which) {
       int i;

       fprintf(stderr, "List %d contains %d entries\n", which, filecounts[which]);
       for (i = 0; i < filecounts[which]; i++)   /* off_t may be wider than int */
          fprintf(stderr, "%*lld: %s\n", which + 6,
                          (long long)files[which][i].filesize,
                          files[which][i].path);
    }

    int main(int argc, char *argv[]) {
       int depth = 10;
       int ftwflags = FTW_PHYS;
       int i;

       if (argc != 3) {
          fprintf(stderr, "Usage: %s directory maxnum\n", argv[0]);
          return 1;
       }
       maxnum = atoi(argv[2]);
       for (i = 0; i < NUMRANGES; i++) {
          filecounts[i] = 0;
          files[i] = (fileinfo *)calloc(maxnum, sizeof(fileinfo));
          if (files[i] == NULL) {
             fprintf(stderr,"Failed to allocate memory for list %d\n", i);
             return 1;
          }
       }
       fprintf(stderr, "Max number for each range is %d\n", maxnum);
       if (nftw(argv[1], insertfile, depth, ftwflags) == -1) {
          perror("Failed to execute nftw");
          return 1;
       }
       fprintf(stderr, "**** nftw is done\n");
       fprintf(stderr, "Counts are as follows with sizes at most %d\n", maxnum);
       for (i = 0; i < NUMRANGES; i++)
          fprintf(stderr, "%d:%d\n", i, filecounts[i]);
       for (i = 0; i < NUMRANGES; i++)
          showfiles(i);
       return 0;
    }

UNIX Systems Programming: Communication, Concurrency and Threads
ISBN: 0130424110
EAN: 2147483647
Year: 2003
Pages: 274