4.9 Exercise: Atomic Logging

Team-FLY

Sometimes multiple processes need to output to the same log file. Problems can arise if one process loses the CPU while it is outputting to the log file and another process tries to write to the same file. The messages could get interleaved, making the log file unreadable. We use the term atomic logging to mean that multiple writes of one process to the same file are not mixed up with the writes of other processes writing to the same file.

This exercise describes a series of experiments to help you understand the issues involved when multiple processes try to write to the same file. We then introduce an atomic logging library and provide a series of examples of how to use the library. Appendix D.1 describes the actual implementation of this library, which is used in several places throughout the book as a tool for debugging programs.

The experiments in this section are based on Program 3.1, which creates a chain of processes. Program 4.19 modifies Program 3.1 so that the original process opens a file before creating the children. Each child writes a message to the file instead of to standard error. Each message is written in two pieces. Since the processes share an entry in the system file table, they share the file offset. Each time a process writes to the file, the file offset is updated.
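The shared file offset can be observed directly. The following is a minimal sketch, not part of Program 4.19: the function name shared_offset_after_fork and the file path passed to it are assumptions made for illustration. The parent opens the file, forks, waits for the child to write 5 bytes, and then asks for its own current offset.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's file offset after the child
   has written 5 bytes through the inherited descriptor, or -1 on error. */
long shared_offset_after_fork(const char *path) {
   int fd, status;
   off_t pos;
   pid_t pid;

   fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
   if (fd == -1)
      return -1;
   pid = fork();
   if (pid == -1) {
      close(fd);
      return -1;
   }
   if (pid == 0)                       /* child: write through the shared entry */
      _exit(write(fd, "child", 5) == 5 ? 0 : 1);
   if (waitpid(pid, &status, 0) == -1) {
      close(fd);
      return -1;
   }
   pos = lseek(fd, 0, SEEK_CUR);       /* parent wrote nothing, yet offset moved */
   close(fd);
   return (long)pos;
}
```

The parent never writes, but the offset it sees is 5 because parent and child share one system file table entry.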

Exercise 4.40

Run Program 4.19 several times and see if it generates output in the same order each time. Can you tell which parts of the output came from each process?

Answer:

On most systems, the output appears in the same order for most runs and each process generates a single line of output. However, this outcome is not guaranteed by the program. It is possible (though unlikely) for one process to lose the CPU before both parts of its output are written to the file. In this case, the output is jumbled.

Program 4.19 chainopenfork.c

A program that opens a file before creating a chain of processes.

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/stat.h>
    #define BUFSIZE 1024
    #define CREATE_FLAGS (O_WRONLY | O_CREAT | O_TRUNC)
    #define CREATE_PERMS (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)

    int main(int argc, char *argv[]) {
       char buf[BUFSIZE];
       pid_t childpid = 0;
       int fd;
       int i, n;

       if (argc != 3) {       /* check for valid number of command-line arguments */
          fprintf(stderr, "Usage: %s processes filename\n", argv[0]);
          return 1;
       }
                                            /* open the log file before the fork */
       fd = open(argv[2], CREATE_FLAGS, CREATE_PERMS);
       if (fd < 0) {
          perror("Failed to open file");
          return 1;
       }
       n = atoi(argv[1]);                              /* create a process chain */
       for (i = 1; i < n; i++)
          if ((childpid = fork()) != 0)      /* parent (or failed fork) leaves loop */
             break;
       if (childpid == -1) {
          perror("Failed to fork");
          return 1;
       }
                                           /* write twice to the common log file */
       sprintf(buf, "i:%d process:%ld ", i, (long)getpid());
       write(fd, buf, strlen(buf));
       sprintf(buf, "parent:%ld child:%ld\n", (long)getppid(), (long)childpid);
       write(fd, buf, strlen(buf));
       return 0;
    }

Exercise 4.41

Put sleep(1); after the first write function in Program 4.19 and run it again. Now what happens?

Answer:

Most likely, every process writes its first piece (the values of i and the process ID) before any process writes its second piece (the parent and child IDs), so the two halves of each message are separated in the file.

Exercise 4.42

Copy chainopenfork.c to a file called chainforkopen.c and move the code to open the file after the loop that forks the children. How does the behavior of chainforkopen.c differ from that of chainopenfork.c?

Answer:

Each process now has a different system file table entry, and so each process has a different file offset. Because of O_TRUNC, each open deletes what was previously written to the file. Each process starts writing from the beginning of the file, overwriting what the other processes have written. The last process to write has control of the final file contents.

Exercise 4.43

Run chainforkopen several times and see if it generates the same order of the output each time. Which process was executed last? Do you see anything unusual about the contents of the file?

Answer:

The process that outputs last may be different on different systems. If the last process writes fewer bytes than another process, the file contains additional bytes after the line written by the last process.

If independent processes open the same log file, the results might be similar to those of Exercise 4.43. The last process to output overwrites what was previously written. One way to try to solve this problem is to call lseek to move to the end of the file before writing.

Exercise 4.44

Copy chainforkopen.c to a file called chainforkopenseek.c. Add code before each write to perform lseek to the end of the file. Also, remove the O_TRUNC flag from CREATE_FLAGS. Run the program several times and observe the behavior. Use a different file name each time.

Answer:

The lseek operation works as long as the process does not lose the CPU between lseek and write. If it does, another process can seek to the same end-of-file position, and one of the two writes overwrites the other. On fast machines, you may have to run the program many times to observe this behavior. You can increase the likelihood of creating mixed-up output by putting sleep(1); between lseek and write.
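The bad interleaving can be replayed deterministically in a single process. The sketch below is an illustration only: the function name replay_seek_race and the two message strings are assumptions, and the two descriptors stand in for two independent processes that each opened the file on their own.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: replays "seek, seek, write, write" on two separate
   file table entries, then reads the file back into out.
   Returns the number of bytes in the file, or -1 on error. */
ssize_t replay_seek_race(const char *path, char *out, size_t outsize) {
   const char *first = "AAAA\n";
   const char *second = "BB\n";
   int fd1, fd2, rfd;
   ssize_t n;

   fd1 = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
   if (fd1 == -1)
      return -1;
   fd2 = open(path, O_WRONLY);         /* second, independent file offset */
   if (fd2 == -1) {
      close(fd1);
      return -1;
   }
   lseek(fd1, 0, SEEK_END);            /* "process 1" finds end of file at 0 */
   lseek(fd2, 0, SEEK_END);            /* "process 2" runs next: also sees 0 */
   if (write(fd1, first, strlen(first)) == -1 ||
       write(fd2, second, strlen(second)) == -1) {  /* second lands on first */
      close(fd1);
      close(fd2);
      return -1;
   }
   close(fd1);
   close(fd2);
   rfd = open(path, O_RDONLY);
   if (rfd == -1)
      return -1;
   n = read(rfd, out, outsize - 1);
   close(rfd);
   if (n >= 0)
      out[n] = '\0';
   return n;
}
```

The second write lands on top of the first, so the file holds the shorter message followed by the leftover tail of the longer one, the same kind of stray bytes described in the answer to Exercise 4.43.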

If a file is opened with the O_APPEND flag, every write automatically goes to the end of the file, regardless of the current file offset.

Exercise 4.45

Copy chainforkopen.c to a file called chainforkappend.c. Modify the CREATE_FLAGS constant by replacing O_TRUNC with O_APPEND. Run the program several times, possibly inserting sleep(1) between the write calls. What happens?

Answer:

The O_APPEND flag solves the problem of processes overwriting the log entries of other processes, but it does not prevent the individual pieces written by one process from being mixed up with the pieces of another.
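Both halves of this answer can be seen in a single-process replay. This is a sketch under stated assumptions: the function name replay_append_interleave and the message pieces are hypothetical, and the two descriptors again stand in for two processes that each opened the file with O_APPEND.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: writes one piece to fd and reports success. */
static int put_piece(int fd, const char *s) {
   return write(fd, s, strlen(s)) == (ssize_t)strlen(s) ? 0 : -1;
}

/* Hypothetical helper: two O_APPEND descriptors alternate two-piece
   messages; the file is then read back into out.
   Returns the number of bytes in the file, or -1 on error. */
ssize_t replay_append_interleave(const char *path, char *out, size_t outsize) {
   int fd1, fd2, rfd, err;
   ssize_t n;

   fd1 = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND,
              S_IRUSR | S_IWUSR);
   if (fd1 == -1)
      return -1;
   fd2 = open(path, O_WRONLY | O_APPEND);
   if (fd2 == -1) {
      close(fd1);
      return -1;
   }
   err = put_piece(fd1, "A1 ")      /* "process A", first piece  */
      || put_piece(fd2, "B1 ")      /* "process B" runs in between */
      || put_piece(fd1, "A2\n")     /* "process A", second piece */
      || put_piece(fd2, "B2\n");
   close(fd1);
   close(fd2);
   if (err)
      return -1;
   rfd = open(path, O_RDONLY);
   if (rfd == -1)
      return -1;
   n = read(rfd, out, outsize - 1);
   close(rfd);
   if (n >= 0)
      out[n] = '\0';
   return n;
}
```

Nothing is overwritten, since all 12 bytes survive, but the pieces of the two messages are mixed together on the first line.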

Exercise 4.46

Copy chainforkappend.c to a file called chainforkonewrite.c. Combine the pair of sprintf calls so that the program uses a single write call to output its information. How does the program behave?

Answer:

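The output is no longer interleaved. Each process now produces its whole message with one write to a file opened with O_APPEND, so no other process can write between the pieces.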
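One way to combine the two pieces is to format the whole message into a buffer and issue a single write. The following is a minimal sketch, not the book's solution: the function name append_record and its parameter list are assumptions chosen so that the fragment stands alone.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#define RECSIZE 1024

/* Hypothetical helper: formats one complete log line and appends it with
   a single write call. Returns the number of bytes written, or -1. */
int append_record(const char *path, int i, long pid, long ppid, long childpid) {
   char buf[RECSIZE];
   int fd, len;
   ssize_t written;

   len = snprintf(buf, sizeof(buf), "i:%d process:%ld parent:%ld child:%ld\n",
                  i, pid, ppid, childpid);
   if (len < 0 || len >= (int)sizeof(buf))       /* error or truncated line */
      return -1;
   fd = open(path, O_WRONLY | O_CREAT | O_APPEND,
             S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
   if (fd == -1)
      return -1;
   written = write(fd, buf, (size_t)len);        /* the one and only write */
   close(fd);
   return written == len ? len : -1;
}
```

In the process chain, a call such as append_record(argv[2], i, (long)getpid(), (long)getppid(), (long)childpid) would replace the two sprintf and write pairs.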

Exercise 4.47

Copy chainforkonewrite.c to a file called chainforkfprintf.c. Replace open with a corresponding fopen function. Replace the single write with fprintf. How does the program behave?

Answer:

The fprintf operation causes the output to be written to a buffer in the user area. Eventually, the I/O subsystem calls write to output the contents of the buffer. You have no control over when write is called except that you can force a write operation by calling fflush. Process output can be interleaved if the buffer fills in the middle of the fprintf operation. Adding sleep(1); shouldn't cause the problem to occur more or less often.
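The effect of the user-level buffer can be made visible by checking the file size before and after fflush. This is a sketch only: the function name buffered_sizes and the sample message are assumptions, and the size seen before the flush depends on the stdio implementation (typical implementations fully buffer a regular file, so a short message stays in the buffer).

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: writes one short line with fprintf and records the
   file size reported by stat before and after fflush.
   Returns 0 on success, -1 on error. */
int buffered_sizes(const char *path, long *before, long *after) {
   struct stat sb;
   FILE *fp;
   int ok = 0;

   fp = fopen(path, "w");
   if (fp == NULL)
      return -1;
   if (fprintf(fp, "i:%d process:%ld\n", 1, 42L) < 0)
      ok = -1;
   if (ok == 0 && stat(path, &sb) == 0)
      *before = (long)sb.st_size;      /* data usually still in the buffer */
   else
      ok = -1;
   if (ok == 0 && fflush(fp) == EOF)   /* force the underlying write */
      ok = -1;
   if (ok == 0 && stat(path, &sb) == 0)
      *after = (long)sb.st_size;
   else
      ok = -1;
   fclose(fp);
   return ok;
}
```

On such an implementation the file is still empty after fprintf returns and only receives the line once fflush runs, which is why the timing of the underlying write is outside the program's direct control.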

4.9.1 An atomic logging library

To make an atomic logger, we have to use a single write call to output information that we want to appear together in the log. The file must be opened with the O_APPEND flag. The following statement from the write man page guarantees that, when O_APPEND is set, moving to the end of the file and writing happen as one uninterrupted step.

If the O_APPEND flag of the file status flags is set, the file offset will be set to the end of the file prior to each write and no intervening file modification operation will occur between changing the file offset and the write operation.

In the examples given here, it is simple to combine everything into a single call to write, but later we encounter situations in which it is more difficult. Appendix D.1 contains a complete implementation of a module that can be used with a program in which atomic logging is needed. A program using this module should include Program 4.20, which contains the prototypes for the publicly accessible functions. Note that the interface is simple and the implementation details are completely hidden from the user.

Program 4.20 atomic_logger.h

The include file for the atomic logging module.

    int atomic_log_array(char *s, int len);
    int atomic_log_clear();
    int atomic_log_close();
    int atomic_log_open(char *fn);
    int atomic_log_printf(char *fmt, ...);
    int atomic_log_send();
    int atomic_log_string(char *s);

The atomic logger allows you to control how the output of programs that are running on the same machine is interspersed in a log file. To use the logger, first call atomic_log_open to create the log file. Call atomic_log_close when all logging is completed. The logger stores in a temporary buffer items written with atomic_log_array, atomic_log_string and atomic_log_printf. When the program calls atomic_log_send, the logger outputs the entire buffer, using a single write call, and frees the temporary buffers. The atomic_log_clear operation frees the temporary buffers without actually outputting to the log file. Each function in the atomic logging library returns 0 if successful. If unsuccessful, these functions return -1 and set errno.
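The buffering scheme described above can be sketched in a few lines. This is not the implementation from Appendix D.1: the sketch_log_ names, the single growing buffer, and the file permissions are all assumptions made for illustration, and the printf-style function is omitted.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

static int sketch_fd = -1;          /* log file, opened with O_APPEND */
static char *sketch_buf = NULL;     /* pending bytes not yet sent */
static size_t sketch_len = 0;

int sketch_log_open(const char *fn) {
   sketch_fd = open(fn, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);
   return sketch_fd == -1 ? -1 : 0;
}

int sketch_log_array(const char *s, size_t len) {    /* queue len bytes */
   char *p;
   if (len == 0)
      return 0;
   p = realloc(sketch_buf, sketch_len + len);
   if (p == NULL)
      return -1;
   memcpy(p + sketch_len, s, len);
   sketch_buf = p;
   sketch_len += len;
   return 0;
}

int sketch_log_string(const char *s) {               /* queue a C string */
   return sketch_log_array(s, strlen(s));
}

int sketch_log_clear(void) {                         /* discard pending bytes */
   free(sketch_buf);
   sketch_buf = NULL;
   sketch_len = 0;
   return 0;
}

int sketch_log_send(void) {           /* one write for everything queued */
   if (sketch_len == 0)
      return 0;
   if (write(sketch_fd, sketch_buf, sketch_len) != (ssize_t)sketch_len)
      return -1;
   return sketch_log_clear();
}

int sketch_log_close(void) {
   int ret = close(sketch_fd);
   sketch_fd = -1;
   sketch_log_clear();
   return ret;
}
```

The design point is that nothing reaches the file until the send call, so however many pieces a process queues, other processes see them arrive as one block.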

The atomic logging facility provides three formats for writing to the log. Use atomic_log_array to write an array of a known number of bytes. Use atomic_log_string to log a string. Alternatively, you can use atomic_log_printf with a syntax similar to fprintf. Program 4.21 shows a version of the process chain that uses the first two forms for output to the atomic logger.

Exercise 4.48

How would you modify Program 4.21 to use atomic_log_printf?

Answer:

Eliminate the buf array and replace the four lines of code involving sprintf, atomic_log_array and atomic_log_string with the following.

    atomic_log_printf("i:%d process:%ld ", i, (long)getpid());
    atomic_log_printf("parent:%ld child ID:%ld\n",
                      (long)getppid(), (long)childpid);

Alternatively, use the following single call.

    atomic_log_printf("i:%d process:%ld parent:%ld child:%ld\n",
                      i, (long)getpid(), (long)getppid(), (long)childpid);

Program 4.21 chainforkopenlog.c

A program that uses the atomic logging module of Appendix D.1.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include "atomic_logger.h"
    #define BUFSIZE 1024

    int main(int argc, char *argv[]) {
       char buf[BUFSIZE];
       pid_t childpid = 0;
       int i, n;

       if (argc != 3) {       /* check for valid number of command-line arguments */
          fprintf(stderr, "Usage: %s processes filename\n", argv[0]);
          return 1;
       }
       n = atoi(argv[1]);                              /* create a process chain */
       for (i = 1; i < n; i++)
          if ((childpid = fork()) != 0)      /* parent (or failed fork) leaves loop */
             break;
       if (childpid == -1) {
          perror("Failed to fork");
          return 1;
       }
       if (atomic_log_open(argv[2]) == -1) {             /* open atomic log file */
          fprintf(stderr, "Failed to open log file\n");
          return 1;
       }
                                    /* log the output, using two different forms */
       sprintf(buf, "i:%d process:%ld", i, (long)getpid());
       atomic_log_array(buf, strlen(buf));
       sprintf(buf, " parent:%ld child:%ld\n", (long)getppid(), (long)childpid);
       atomic_log_string(buf);
       if (atomic_log_send() == -1) {
          fprintf(stderr, "Failed to send to log file\n");
          return 1;
       }
       atomic_log_close();
       return 0;
    }

Exercise 4.49

Modify Program 4.19 to open an atomic log file after forking the children. (Do not remove the other open function call.) Repeat Exercises 4.40 through 4.47 after adding code to output the same information to the atomic logger as to the original file. Compare the output of the logger with the contents of the file.

Exercise 4.50

What happens if Program 4.19 opens the log file before forking the children?

Answer:

Logging should still be atomic. However, if the parent writes information to the log and doesn't clear it before the fork, the children have a copy of this information in their logging buffers.

Another logging interface that is useful for debugging concurrent programs is the remote logging facility described in detail in Appendix D.2. Instead of logging information being sent to a file, it is sent to another process that has its own environment for displaying and saving the logged information. The remote logging process has a graphical user interface that allows the user to display the log. The remote logger does not have a facility for gathering information from a process to be displayed in a single block in the log file, but it allows logging from processes on multiple machines.

Unix Systems Programming
UNIX Systems Programming: Communication, Concurrency and Threads
ISBN: 0130424110
EAN: 2147483647
Year: 2003
Pages: 274
