Waiting on Processes

They also serve who only stand and wait.

John Milton (1608-1674), On His Blindness [1652]

More often than not, a parent process needs to synchronize its actions by waiting until a child process has either stopped or terminated its actions. The wait system call allows the parent process to suspend its activity until one of these actions has occurred (Table 3.9).

Table 3.9. Summary of the wait System Call.

Include File(s) | <sys/types.h>, <sys/wait.h>
Manual Section | 2
Summary | pid_t wait(int *status);
Return on success | Child process ID or 0
Return on failure | -1
Sets errno | Yes

The activities of wait are summarized in Figure 3.11.

Figure 3.11. Summary of wait activities.


The wait system call accepts a single argument, a pointer to an integer, and returns a value of type pid_t. The pid_t data type is defined in the header file <sys/types.h> and is a signed integer type (on Linux it is typically an int). If the calling process has no child processes, wait returns immediately with a value of -1 and sets errno to ECHILD (10). However, if any child processes are still active, the calling process blocks (suspends its activity) until a child process terminates. When a waited-for child process terminates, its process ID (PID) is returned to the parent, and its status information is stored as an integer at the location referenced by the pointer status. The low-order 16 bits of that location contain the actual status information, and the high-order bits (assuming a 32-bit integer) are set to zero. The low-order 16 bits can be further subdivided into a low-order and a high-order byte. This information is interpreted in one of two ways:

  1. If the child process terminated normally, the low-order byte will be 0 and the high-order byte will contain the exit code (0-255):

    byte 3 | byte 2 | byte 1    | byte 0
    0      | 0      | exit code | 0

  2. If the child process terminated due to an uncaught signal, the low-order byte will contain the signal number and the high-order byte will be 0:

    byte 3 | byte 2 | byte 1 | byte 0
    0      | 0      | 0      | signal #

In this second situation, if a core file has been produced, the leftmost bit of byte 0 will be a 1. If a NULL argument is specified for wait, the child status information is not returned to the parent process; the parent is only notified of the child's termination.

Here are two programs, a parent (Program 3.9) and child (Program 3.10), that demonstrate the use of wait.

Program 3.9 The parent process.

File : p3.9.cxx
 /*
   A parent process that waits for a child to finish
 */
 #include <iostream>
 #include <iomanip>
 #include <sys/types.h>
 #include <sys/wait.h>
 #include <unistd.h>
 #include <cstdlib>
 using namespace std;
 int
 main(int argc, char *argv[] ){
   pid_t pid, w;
   int   status;
   if ( argc < 4 ) {
     cerr << "Usage " << *argv << " value_1 value_2 value_3\n";
     return 1;
   }
   for (int i = 1; i < 4; ++i)          // generate 3 child processes
     if ((pid = fork( )) == 0) {
       execl("./child", "child", argv[i], (char *) 0);
       _exit(127);                      // reached only if execl fails
     } else                             // assuming fork itself does not fail
       cout << "Forked child " << pid << endl;
   /*
     Wait for the children
   */
   while ((w=wait(&status)) && w != -1)
     cout << "Wait on PID: " << dec << w << " returns status of "
          << setw(4) << setfill('0') << hex
          << setiosflags(ios::uppercase) << status << endl;
   return 0;
 }

The parent program forks three child processes. Each child process is overlaid with the executable code for the child (found in Program 3.10). The parent process passes to each child a numeric value taken from the parent's command line. As each child process is produced, the parent process displays the child process ID. After all three processes have been generated, the parent process enters a loop to wait for the child processes to finish their execution. As each child process terminates, the value returned to the parent process is displayed.

Program 3.10 The child process.

File : p3.10.cxx
 /*
   The child process
 */
 #define _GNU_SOURCE
 #include <iostream>
 #include <iomanip>
 #include <cstdlib>
 #include <sys/types.h>
 #include <unistd.h>
 #include <signal.h>
 using namespace std;
 int
 main(int argc, char *argv[ ]){
   pid_t pid = getpid( );
   int   ret_value;
   srand((unsigned) pid);               // seed with our own PID
   ret_value = int(rand( ) % 256);      // generate a return value
   sleep(rand( ) % 3);                  // sleep a bit (0 to 2 seconds)
   if (atoi(*(argv + 1)) % 2) {         // assuming argv[1] exists!
     cout << "Child " << pid << " is terminating with signal 0009" << endl;
     kill(pid, 9);                      // commit hara-kiri (9 is SIGKILL)
   } else {
     cout << "Child " << pid << " is terminating with exit("
          << setw(4) << setfill('0') << setiosflags(ios::uppercase)
          << hex << ret_value << ")" << endl;
     exit(ret_value);
   }
 }

In the child program, the child process obtains its own PID using the getpid call. The PID value is used as a seed value for the srand function. A call to rand is used to generate a pseudorandom value (0-255) to be returned when the process exits. The child process then sleeps a random number of seconds (0-2). After sleeping, if the argument passed to the child process on the command line is odd (i.e., not evenly divisible by 2), the child process kills itself by sending signal 9 (SIGKILL) to its own PID. If the argument on the command line is even, the child process exits normally, returning the previously calculated return value. In both cases, the child process displays a message indicating what it will do before it actually executes the statements.

The source programs are compiled and the executables named parent and child respectively. They are run by calling the parent program. Two sample output sequences are shown in Figure 3.12.

Figure 3.12 Two runs of Programs 3.9 and 3.10.

linux$ parent 2 1 2                               <-- 1
Forked child 8975
Forked child 8976
Child 8976 is terminating with signal 0009
Forked child 8977
Wait on PID: 8976 returns status of 0009
Child 8977 is terminating with exit(008F)
Wait on PID: 8977 returns status of 8F00
Child 8975 is terminating with exit(0062)
Wait on PID: 8975 returns status of 6200

linux$ parent 2 2 1                               <-- 2
Forked child 8980
Forked child 8981
Forked child 8982
Child 8982 is terminating with signal 0009
Wait on PID: 8982 returns status of 0009
Child 8980 is terminating with exit(00B0)
Wait on PID: 8980 returns status of B000
Child 8981 is terminating with exit(00D3)
Wait on PID: 8981 returns status of D300

(1) Two even values and one odd.

(2) Two even values and one odd, but in a different order.

There are several things of interest to note in this output. In the first output sequence, one child process (PID 8976) has terminated before the parent has finished its process generation. Processes that have terminated but have not been waited upon by their parent process are called zombie processes. Zombie processes occupy a slot in the process table, consume no other system resources, and will be marked with the letter Z when a process status command is issued (e.g., ps -alx or ps -el). A zombie process cannot be killed [11] even with the standard Teflon bullet (e.g., at a system level: kill -9 process_id_number). Zombies are put to rest when their parent process performs a wait to obtain their process status information. When this occurs, any remaining system resources allocated for the process are recovered by the kernel. Should the child process become an orphan before its parent issues the wait, the process will be inherited by init, which, by design, will issue a wait for the process. On some very rare occasions, even this will not cause the zombie process to "die." In these cases, a system reboot may be needed to clear the process table of the entry.

[11] This miraculous ability is the source of the name zombie.

Both sets of output clearly show that when the child process terminates normally, the exit value returned by the child is stored in the second byte of the integer value referenced by the argument to the wait call in the parent process. Likewise, if the child terminates due to an uncaught signal, the signal value is stored in the first byte of the same referenced location. It is also apparent that wait will return with the information for the first child process that terminates, which may or may not be the first child process generated.

EXERCISE

Add the wait system call to the huh shell program (Program 3.7).

EXERCISE

Write a program that produces three zombie processes. Submit evidence, via the output of the ps command, that these processes are truly generated and are eventually destroyed.

EXERCISE

In Program 3.10, if the child process uses signal 8 (versus 9) to terminate, what is returned to the parent as the signal value? Why?

It is easy to see that the interpretation of the status information can be cumbersome, to say the least. At one time, programmers wrote their own macros to interrogate the contents of status. Now most use one of the predefined status macros. These macros are shown in Table 3.10.

Table 3.10. The wstat Macros.

Macro | Description
WIFEXITED(status) | Returns true if the child process exited normally.
WEXITSTATUS(status) | Returns the exit code or return value from main. Should be called only if WIFEXITED(status) has returned true.
WIFSIGNALED(status) | Returns true if the child exited due to an uncaught signal.
WTERMSIG(status) | Returns the signal that terminated the child. Should be called only if WIFSIGNALED(status) has returned true.
WIFSTOPPED(status) | Returns true if the child process is stopped.
WSTOPSIG(status) | Returns the signal that stopped the child. Should be called only if WIFSTOPPED(status) has returned true.

The argument to each of these macros is the integer status value (not the pointer to the value) that is returned to the wait call. The macros are most often used in pairs. The WIF macros are used as a test for a given condition. If the condition is true, the second macro of the pair is used to return the specified value. As shown below, these macros could be incorporated in the wait loop in the parent Program 3.9 to obtain the child status information:

...
while ((w = wait(&status)) && w != -1)
 if (WIFEXITED(status)) // test with macro
 cout << "Wait on PID: " << dec << w << " returns a value of "
 << hex << WEXITSTATUS(status) << endl; // obtain value
 else if (WIFSIGNALED(status)) // test with macro
 cout << "Wait on PID: " << dec << w << " returns a signal of "
 << hex << WTERMSIG(status) << endl; // obtain value
...

EXERCISE

Some systems support a WCOREDUMP macro. This macro is only called if the WIFSIGNALED macro returns a true. WCOREDUMP returns a true if the offending signal generates a core dump. Write your own version of the WCOREDUMP macro (inline function). You may need to check the signal manual page (Section 7) to determine what signals generate a core dump or do a bit of bit manipulation (see earlier discussion). Show that your macro works when a process receives a terminating signal that generates or does not generate a core image file.

While the wait system call is helpful, it does have some limitations. It will always return the status of the first child process that terminates or stops. Thus, if the status information returned by wait is not from the child process we want, the information may need to be stored on a temporary basis for possible future reference and additional calls to wait made. Another limitation of wait is that it will always block if status information is not available. Fortunately, another system call, waitpid, which is more flexible (and thus more complex), addresses these shortcomings. In most invocations, the waitpid call will block the calling process until one of the specified child processes changes state. The waitpid system call summary is shown in Table 3.11.

Table 3.11. Summary of the waitpid System Call.

Include File(s) | <sys/types.h>, <sys/wait.h>
Manual Section | 2
Summary | pid_t waitpid(pid_t pid, int *status, int options);
Return on success | Child PID or 0
Return on failure | -1
Sets errno | Yes

The first argument of the waitpid system call, pid, is used to stipulate the set of child process identification numbers that should be waited for (Table 3.12).

Table 3.12. Interpretation of pid Values by waitpid.

pid Value | Wait for
< -1 | Any child process whose process group ID equals the absolute value of pid.
-1 | Any child process, in a manner similar to wait.
0 | Any child process whose process group ID equals the caller's process group ID.
> 0 | The child process with this process ID.

The second argument, status, as with the wait call, references an integer status location where the status information of the child process will be stored if the waitpid call is successful. This location can be examined directly or with the previously presented wstat macros.

The third argument, options, may be 0 (don't care), or it can be formed by a bitwise OR of one or more of the flags listed in Table 3.13 (these flags are usually defined in the header file <sys/wait.h>). The flags are applicable to the specified child process set discussed previously.

Table 3.13. Flag Values for waitpid.

FLAG Value | Specifies
WNOHANG | Return immediately if no child has exited; do not block if the status cannot be obtained. In that case a value of 0 is returned, not the PID.
WUNTRACED | Also return if a child has stopped, not only if it has terminated.

If the value given for pid is -1 and the option flag is set to 0, the waitpid and wait system calls act in a similar fashion. If waitpid fails, it returns a value of -1 and sets errno to indicate the source of the error (Table 3.14).

Table 3.14. waitpid Error Messages.

# | Constant | perror Message | Explanation
4 | EINTR | Interrupted system call | Signal was caught during the system call.
10 | ECHILD | No child process | Process specified by pid does not exist, or the calling process has set the action of SIGCHLD to SIG_IGN (ignore signal).
22 | EINVAL | Invalid argument | Invalid value for options.
85 | ERESTART | Interrupted system call should be restarted | WNOHANG not specified, and unblocked signal or SIGCHLD was caught.

We can modify a few lines in our current version of the parent process (Program 3.9) to save the generated child PIDs in an array. This information can be used with the waitpid system call to coerce the parent process into displaying status information from child processes in the order of child process generation instead of their termination order. Program 3.11 shows how this can be done.

Program 3.11 A parent program using waitpid .

File : p3.11.cxx
 #include <iostream>
 #include <iomanip>
 #include <sys/types.h>
 #include <sys/wait.h>
 #include <unistd.h>
 #include <cstdlib>
 using namespace std;
 int
 main(int argc, char *argv[] ){
   pid_t pid[3], w;
   int   status;
   if ( argc < 4 ) {
     cerr << "Usage " << *argv << " value_1 value_2 value_3\n";
     return 1;
   }
   for (int i=1; i < 4; ++i)            // generate 3 child processes
     if ((pid[i-1] = fork( )) == 0) {
       execl("./child", "child", argv[i], (char *) 0);
       _exit(127);                      // reached only if execl fails
     } else                             // assuming fork itself does not fail
       cout << "Forked child " << pid[i-1] << endl;
   /*
     Wait for the children, in order of creation.
     The i < 3 test keeps the loop from reading past the end of pid[].
   */
   for (int i=0; i < 3 && (w=waitpid(pid[i], &status,0)) && w != -1; ++i){
     cout << "Wait on PID " << dec << w << " returns ";
     if (WIFEXITED(status))             // test with macro
       cout << " a value of " << setw(4) << setfill('0') << hex
            << setiosflags(ios::uppercase) << WEXITSTATUS(status) << endl;
     else if (WIFSIGNALED(status))      // test with macro
       cout << " a signal of " << setw(4) << setfill('0') << hex
            << setiosflags(ios::uppercase) << WTERMSIG(status) << endl;
     else
       cout << " unexpectedly!" << endl;
   }
   return 0;
 }

A run of this program (using the same child process, Program 3.10) confirms that the status information returned to the parent is indeed ordered by the sequence of child process generation, not by the order in which the processes terminated. Also, note that the status macros are used to evaluate the return from the waitpid system call (Figure 3.13).

Figure 3.13 Output of Program 3.11.

linux$ p3.11 2 2 1
Forked child 9772
Forked child 9773                               <-- 1
Child 9773 is terminating with exit(008B)       <-- 2
Forked child 9774
Child 9772 is terminating with exit(00CD)
Wait on PID 9772 returns a value of 00CD        <-- 3
Wait on PID 9773 returns a value of 008B
Child 9774 is terminating with signal 0009
Wait on PID 9774 returns a signal of 0009

(1) Order of creation: 9772, 9773, 9774

(2) Order of termination: 9773, 9772, 9774

(3) Order of wait: 9772, 9773, 9774

EXERCISE

The discussion in the text centers on a parent process waiting for a child process to terminate or stop. We already have the tools necessary for a child process to determine if its parent process has terminated. Show how this can be done. What are the advantages and disadvantages of your implementation?

On some occasions, the information returned from wait or waitpid may be insufficient. Additional information on resource usage by a child process may be sought. There are two BSD compatibility library functions, wait3 and wait4 [12], that can be used to provide this information (Table 3.15).

[12] It is not clear if these functions will be supported in subsequent versions of the GNU compiler, and they may limit the portability of programs that incorporate them. As these are BSD-based functions, _USE_BSD must be defined in the program code or defined on the command line when the source code is compiled.

Table 3.15. Summary of the wait3/wait4 Library Functions.

Include File(s) | #define _USE_BSD, followed by <sys/types.h>, <sys/resource.h>, <sys/wait.h>
Manual Section | 3
Summary | pid_t wait3(int *status, int options, struct rusage *rusage);
Summary | pid_t wait4(pid_t pid, int *status, int options, struct rusage *rusage);
Return on success | Child PID or 0
Return on failure | -1
Sets errno | Yes

The wait3 and wait4 functions parallel the wait and waitpid functions, respectively. The wait3 function waits for the first child process to terminate or stop. The wait4 function waits for the child process specified by pid. The pid argument is interpreted as it is for waitpid, so passing a value of -1 makes wait4 wait on the first child process in a manner similar to wait3, while a value of 0 restricts the wait to child processes in the caller's process group. Both functions accept option flags to indicate whether or not they should block and/or report on stopped child processes. These option flags are shown in Table 3.16.

Table 3.16. Option Flag Values for wait3/wait4.

FLAG Value | Specifies
WNOHANG | Return immediately if no child has exited; do not block if the status cannot be obtained. In that case a value of 0 is returned, not the PID.
WUNTRACED | Also return if a child has stopped, not only if it has terminated.

Both functions contain an argument that is a reference to a rusage structure. This structure is defined in the header file <sys/resource.h>. [13]

[13] On some systems, you may need a different header file instead of <sys/resource.h>, and you may need to explicitly link in the BSD library that contains the object code for the wait3/wait4 functions.

struct rusage {
 struct timeval ru_utime; /* user time used */
 struct timeval ru_stime; /* system time used */
 long ru_maxrss; /* maximum resident set size */
 long ru_ixrss; /* integral shared memory size */
 long ru_idrss; /* integral unshared data size */
 long ru_isrss; /* integral unshared stack size */
 long ru_minflt; /* page reclaims */
 long ru_majflt; /* page faults */
 long ru_nswap; /* swaps */
 long ru_inblock; /* block input operations */
 long ru_oublock; /* block output operations */
 long ru_msgsnd; /* messages sent */
 long ru_msgrcv; /* messages received */
 long ru_nsignals; /* signals received */
 long ru_nvcsw; /* voluntary context switches */
 long ru_nivcsw; /* involuntary context switches */
 };

If the rusage argument is non-null, the system populates the rusage structure with the current information from the specified child process. See the getrusage system call in Section 2 of the manual pages for additional information. The status macros (see the previous section on wait and waitpid) can be used with the status information returned by wait3 and wait4. The errors that wait3 and wait4 may report are listed in Table 3.17.

Table 3.17. wait3/wait4 Error Messages.

# | Constant | perror Message | Explanation
4 | EINTR | Interrupted system call | Signal was caught during the system call.
10 | ECHILD | No child process | Process specified by pid does not exist, or the calling process has set the action of SIGCHLD to SIG_IGN (ignore signal).
22 | EINVAL | Invalid argument | Invalid value for options.
85 | ERESTART | Interrupted system call should be restarted | WNOHANG not specified, and unblocked signal or SIGCHLD was caught.

EXERCISE

Modify Program 3.11 to use the wait4 library function. After each child terminates, have the parent process display the number of page faults the child process incurred. A page fault occurs when a program requests data that is not currently in memory. To satisfy the request, the operating system must locate the data and load it into memory. As loading data from a device takes time and slows down processing, the fewer page faults generated the better.

Interprocess Communication in Linux
Interprocess Communications in Linux: The Nooks and Crannies
ISBN: 0130460427
EAN: 2147483647
Year: 2001
Pages: 136