Section 2.9. Process Termination | Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture (2nd Edition)

2.9. Process Termination

The termination of a process results from one of three possible events. First, the process can explicitly call exit(2) or _exit(2), which causes all the threads in a multithreaded process to exit. The threads libraries include thr_exit(3T) and pthread_exit(3T) interfaces for programmatically terminating an individual user thread without causing the entire process to exit. Second, the process simply completes execution and falls through to the end of the main() function, which is essentially an implicit exit. Third, a signal is delivered, and the disposition for the signal is to terminate the process. This disposition is the default for some signals (see Section 2.11). One other possibility is that a process can explicitly call the abort(3C) function and cause a SIGABRT signal to be sent to the process. The default disposition for SIGABRT is to terminate the process and create a core file.
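The difference between an explicit exit and a signal-induced termination is visible to the parent through the status returned by waitpid(2). The following is a minimal user-level sketch using only standard POSIX calls; the function name observe_termination() and the enum are hypothetical names chosen for illustration, not part of any Solaris interface.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical selector: how the child should terminate. */
enum how { BY_EXIT, BY_SIGNAL };

/*
 * Fork a child that terminates in the requested way and report what the
 * parent observes: the exit code for a normal exit, the negated signal
 * number for a termination by signal, or -1000 on error.
 */
int
observe_termination(enum how how, int value)
{
    int status;
    pid_t pid;

    assert(how == BY_EXIT || how == BY_SIGNAL);

    pid = fork();
    if (pid == -1)
        return (-1000);

    if (pid == 0) {
        if (how == BY_EXIT)
            _exit(value);           /* explicit exit */
        (void) signal(value, SIG_DFL);
        (void) raise(value);        /* default disposition terminates */
        _exit(127);                 /* only reached for non-fatal signals */
    }

    if (waitpid(pid, &status, 0) != pid)
        return (-1000);
    if (WIFEXITED(status))
        return (WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return (-WTERMSIG(status));
    return (-1000);
}
```

Calling observe_termination(BY_EXIT, 7) should return 7, and observe_termination(BY_SIGNAL, SIGTERM) should return -SIGTERM. Passing SIGABRT instead exercises the abort(3C) style of termination and, depending on system settings, may leave a core file behind.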

Regardless of which event causes the process to terminate, the kernel exit function is ultimately executed, freeing whatever resources have been allocated to the process, such as the address space mappings, open files, etc., and setting the process state to SZOMB, or the zombie state. A zombie process is one that has exited and that requires the parent process to issue a wait(2) system call to gather the exit status. The only kernel resource that a process in the zombie state is holding is the process table slot. Successful execution of a wait(2) call frees the process table slot. Orphaned processes are inherited by the init process solely for this purpose.
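The zombie state can be observed from user level. The sketch below is an illustration under stated assumptions rather than a description of kernel code: it uses waitid(2) with WNOWAIT to learn that a child has exited without reaping it, then uses kill(2) with signal 0 to probe whether the PID still names a process, before and after the real wait. The function names are hypothetical, and the final probe assumes the PID is not reused in the brief interval after the wait.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return 1 if the PID still names a process (live or zombie), else 0. */
static int
pid_exists(pid_t pid)
{
    return (kill(pid, 0) == 0 || errno != ESRCH);
}

/*
 * Return a bit mask: bit 0 is set if the exited child was still visible
 * before the wait, bit 1 if it was still visible afterward.
 * Returns -1 on error.
 */
int
zombie_slot_demo(void)
{
    siginfo_t info;
    int before, after;
    pid_t pid = fork();

    if (pid == -1)
        return (-1);
    if (pid == 0)
        _exit(0);

    /* Block until the child has exited, but leave it unreaped. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == -1)
        return (-1);
    assert(info.si_pid == pid);
    before = pid_exists(pid);       /* zombie: slot still held */

    /* The real wait collects the status and releases the slot. */
    if (waitpid(pid, NULL, 0) != pid)
        return (-1);
    after = pid_exists(pid);

    return (before | (after << 1));
}
```

A return value of 1 means the child was visible as a zombie before the wait and gone afterward, which matches the description above.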

An exception to the above scenario is possible if a parent process uses the sigaction(2) system call to establish a signal handler for the SIGCLD signal and sets the SA_NOCLDWAIT flag (no child wait) in the sa_flags field of the sigaction structure. A process is sent a SIGCLD signal by the kernel when one of its child processes terminates. If a process installs a SIGCLD handler as described, the kernel sets the SNOWAIT bit in the calling (parent) process's p_flag field, signifying that the parent process is not interested in obtaining status information on child processes that have exited. The actual mechanics happen in two places: when the signal handler is installed and when the kernel gets ready to post a SIGCLD signal.

First, when the sigaction() call is executed and the handler is installed, if SA_NOCLDWAIT is set, then SNOWAIT is set in p_flag and the code loops through the child process list, looking for child processes in the zombie state. For each such child process found, the kernel freeproc() function is called to release the process table entry. (The kernel exit code, described below, will have already executed, since the process must have terminated; otherwise, it would not be in the zombie state.) Second, the kernel calls its internal sigcld() function to post a SIGCLD signal to a process that has had a child terminate. The sigcld() code calls freeproc() instead of posting the signal if SNOWAIT is set in the parent's p_flag field.
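From user level, the effect of SA_NOCLDWAIT can be seen with a short experiment. The sketch below makes two assumptions worth stating: it uses the POSIX spelling SIGCHLD for the signal the text calls SIGCLD, and it sets the flag with the default disposition rather than installing a handler function. The function name nocldwait_demo() is hypothetical. With the flag set, POSIX specifies that a wait for the child blocks until it has terminated and then fails with ECHILD, because no zombie is left to collect.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Set SA_NOCLDWAIT on SIGCHLD, fork a child that exits, then try to
 * wait for it. Returns the errno from waitpid() (ECHILD is expected),
 * 0 if the wait unexpectedly succeeded, or -1 on a setup error.
 */
int
nocldwait_demo(void)
{
    struct sigaction sa, old, cur;
    pid_t pid;
    int result;

    memset(&sa, 0, sizeof (sa));
    sa.sa_handler = SIG_DFL;
    sa.sa_flags = SA_NOCLDWAIT;     /* "no child wait" */
    (void) sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old) == -1)
        return (-1);

    /* Confirm that the flag was recorded for this process. */
    if (sigaction(SIGCHLD, NULL, &cur) == -1)
        return (-1);
    assert(cur.sa_flags & SA_NOCLDWAIT);

    pid = fork();
    if (pid == -1) {
        (void) sigaction(SIGCHLD, &old, NULL);
        return (-1);
    }
    if (pid == 0)
        _exit(0);

    /* Blocks until the child is gone, then fails: nothing to reap. */
    result = (waitpid(pid, NULL, 0) == -1) ? errno : 0;

    (void) sigaction(SIGCHLD, &old, NULL);  /* restore prior setting */
    return (result);
}
```

On a conforming system, nocldwait_demo() should return ECHILD.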

Having jumped ahead there for a second, let's turn our attention back to the kernel exit() function, starting with a summary of the actions performed.

exit()
        Exit all but 1 LWP (exitlwps())
        Clean up any doors created by the process
        Clean up any pending async I/Os
        Clean up any realtime timers
        Flush signal information (set ignore for all signals, clear posted signals)
        Set process LWP count to zero (p_lwpcnt = 0)
        NULL-terminate the process kernel thread linked list
        Set process termination time (p_mterm)
        Close all open file descriptors
        if (process is a session leader)
                Release control terminal
        Clean up any semaphore resources being held
        Release the process's address space
        Reassign orphan processes to next-of-kin
        Reassign child processes to init
        Set process state to zombie
        Set process p_wdata and p_wcode for parent to interrogate
        Call kernel sigcld() function to send SIGCLD to parent
                if (SNOWAIT flag is set in parent)
                        freeproc() /* free the proc table slot - no zombie */
                else
                        post the signal to the parent

The sequence of events outlined above is reasonably straightforward. It's a matter of walking through the process structure, cleaning up resources that the process may be holding, and reassigning child and orphan processes. Child processes are handed over to init, and orphan processes are linked to the next-of-kin process, which is typically the parent. Still, we can point out a few interesting things about process termination and the LWP/kthread model as implemented in Solaris.
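The last few steps of the summary (reassign children, mark the process a zombie, record the exit status, and either free the slot or notify the parent) can be modeled in a few lines of C. This is a toy model only: every type, field, and function name below is a hypothetical stand-in that echoes the names in the text, not the kernel's proc structure or its actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for process state and the SNOWAIT flag. */
enum toy_state { TOY_RUN, TOY_ZOMB, TOY_FREED };
#define TOY_SNOWAIT 0x1

struct toy_proc {
    enum toy_state  state;
    unsigned        flags;          /* TOY_SNOWAIT: parent opted out */
    int             wdata;          /* exit status for the parent */
    int             sigcld_pending; /* a SIGCLD has been posted */
    struct toy_proc *parent;
};

/* Release the (toy) process table slot. */
static void
toy_freeproc(struct toy_proc *p)
{
    p->state = TOY_FREED;
    p->parent = NULL;
}

/* Either free the child immediately or post SIGCLD to the parent. */
static void
toy_sigcld(struct toy_proc *child)
{
    struct toy_proc *pp = child->parent;

    assert(pp != NULL);
    if (pp->flags & TOY_SNOWAIT)
        toy_freeproc(child);        /* no zombie is left behind */
    else
        pp->sigcld_pending = 1;
}

/* Final steps of exit: hand children to init, become a zombie, notify. */
void
toy_exit(struct toy_proc *p, struct toy_proc *init,
    struct toy_proc **children, size_t nchildren, int status)
{
    size_t i;

    for (i = 0; i < nchildren; i++)
        children[i]->parent = init;
    p->state = TOY_ZOMB;
    p->wdata = status;
    toy_sigcld(p);
}

/* Parent collects the status of a zombie child and frees its slot. */
int
toy_wait(struct toy_proc *parent, struct toy_proc *child)
{
    int status;

    if (child->parent != parent || child->state != TOY_ZOMB)
        return (-1);
    status = child->wdata;
    toy_freeproc(child);
    return (status);
}
```

In this model, a process whose parent has TOY_SNOWAIT set goes straight to TOY_FREED inside toy_exit(), while any other process stays in TOY_ZOMB until the parent calls toy_wait().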

2.9.1. LWP and Kernel Thread Exit

The exitlwps() function is called immediately upon entry to the kernel exit() function and, as its name implies, is responsible for terminating all but one LWP in the process. If the number of LWPs in the process is 1 (the p_lwpcnt field in the proc structure) and there are no zombie LWPs (p_zombcnt is 0), then exitlwps() simply turns off the SIGWAITING signal and returns. SIGWAITING creates more LWPs in the process if runnable user threads are waiting for a resource. We certainly do not want to catch SIGWAITING signals and create LWPs when we're terminating.

If the process has more than one LWP, the LWPs must be stopped (quiesced) so that they are not actively changing state or attempting to grab resources (file opens, stack/address space growth, etc.). Essentially what happens is this:

1. The kernel loops through the list of LWP/kthreads in the process, setting the t_astflag in each kernel thread. If the LWP/kthread is running on a processor, the processor is forced to enter the kernel through the cross-call interrupt mechanism.
2. Inside the trap handler, which is entered as a result of the cross-call, the kernel tests the t_astflag (which is set) and determines which condition requires post-trap processing. The t_astflag specifically instructs the kernel that some additional processing is required following a trap.
3. The trap handler tests the process HOLDFORK flag and, if it is set in p_flag (which it is in this case), calls a holdlwp() function that, under different circumstances, would suspend the LWP/kthread.
4. During an exit, with EXITLWPS set in p_flag, the lwp_exit() function is called to terminate the LWP. If the LWP/kthread is in a sleep or stopped state, then it is set to run so that it can ultimately be quiesced as described.

The kernel lwp_exit() function does per-LWP/kthread cleanup, such as timers, doors, signals, and scheduler activations. Finally, the LWP/kthread is placed on the process's linked list of zombie LWPs, p_zomblist. Once all but one of the LWP/kthreads in the process have been terminated and placed on the process zombie list, the exit() code executes the functions summarized earlier in this section.
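The bookkeeping described above (flag every other LWP, make sleeping or stopped ones runnable, then move each flagged LWP to the zombie list) can be sketched as a small simulation. This is a toy model under loud assumptions: all type, field, and function names are hypothetical stand-ins for the kernel's, the cross-call and trap path are collapsed into a second loop, and no real scheduling takes place.

```c
#include <assert.h>
#include <stddef.h>

enum toy_lstate { TOY_ONPROC, TOY_SLEEP, TOY_STOPPED, TOY_RUNNABLE, TOY_LZOMB };

struct toy_lwp {
    enum toy_lstate state;
    int             astflag;        /* stand-in for t_astflag */
    struct toy_lwp  *next;
};

struct toy_lwp_proc {
    struct toy_lwp  *lwps;          /* live LWP list */
    struct toy_lwp  *zomblist;      /* stand-in for p_zomblist */
    int             lwpcnt;         /* stand-in for p_lwpcnt */
    int             zombcnt;        /* stand-in for p_zombcnt */
    int             sigwaiting_on;  /* SIGWAITING still enabled */
};

/* Per-LWP exit: unlink from the live list, push on the zombie list. */
static void
toy_lwp_exit(struct toy_lwp_proc *p, struct toy_lwp **linkp)
{
    struct toy_lwp *l = *linkp;

    *linkp = l->next;
    l->state = TOY_LZOMB;
    l->next = p->zomblist;
    p->zomblist = l;
    p->lwpcnt--;
    p->zombcnt++;
}

/* Terminate every LWP except the caller. */
void
toy_exitlwps(struct toy_lwp_proc *p, struct toy_lwp *caller)
{
    struct toy_lwp *l, **linkp;

    if (p->lwpcnt == 1 && p->zombcnt == 0) {
        p->sigwaiting_on = 0;       /* do not create LWPs while exiting */
        return;
    }

    /* Pass 1: make sleepers runnable and flag each LWP for processing. */
    for (l = p->lwps; l != NULL; l = l->next) {
        if (l == caller)
            continue;
        if (l->state == TOY_SLEEP || l->state == TOY_STOPPED)
            l->state = TOY_RUNNABLE;
        l->astflag = 1;
    }

    /* Pass 2: stands in for each flagged LWP trapping and exiting. */
    linkp = &p->lwps;
    while (*linkp != NULL) {
        if ((*linkp)->astflag)
            toy_lwp_exit(p, linkp);
        else
            linkp = &(*linkp)->next;
    }
    assert(p->lwpcnt == 1);
}
```

After toy_exitlwps() returns, only the calling LWP remains on the live list, and every other LWP sits on the zombie list in the TOY_LZOMB state.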

The pseudocode below summarizes the exitlwps() function.

exitlwps()
        if (process LWP count == 1)
                nuke SIGWAITING
                return
        else
                for (each LWP/kthread on the process linked list)
                        if (LWP/kthread is sleeping or stopped)
                                make it runnable
                        if (LWP/kthread is running on a processor)
                                t_astflag = 1;
                                poke_cpu() /* cross-call, to trap into the kernel */
                                        holdlwp()
                                                lwp_exit()
                                                place kthread/LWP on zombie list
                done (loop)
        place zombie threads on deathrow
        return to kernel exit()

Once the exit() code has completed, the process is in a zombie state, occupying only a process table entry and PID structure. When a wait() call is issued on the zombie, the kernel freeproc() function is called to free the process and PID structures.

2.9.2. Deathrow List

exitlwps() does one last bit of work before it returns to exit(). It places a zombie's kernel threads on deathrow.

The kernel maintains a list, called deathrow, of LWPs and kernel threads that have exited, so that a terminated LWP/kthread can be reused when a new one needs to be created (for example, during fork()). If an LWP/kthread is available on the list of zombies, the kernel does not need to allocate the data structures and stack for a new kthread; it simply uses the structures and stack from the zombie kthread and links the kthread to the process that issued the fork(2) (or thread_create()) call.

In the process creation flow, when the forklwp() code calls lwp_create(), lwp_create() first looks on deathrow for a zombie thread. If one exists, the LWP, kthread, and stack are linked to the process, and the kernel is spared the need to allocate a new kthread, an LWP, and stack space during the fork() process. The kernel simply grabs the structures from the deathrow list, links the pointers appropriately, and moves on. thread_create() (kernel thread create, not the user thread API), called from lwp_create(), is passed the LWP data and stack and thus avoids doing any kernel memory allocations.

A kernel thread, thread_reaper(), runs periodically and cleans up zombie threads that are sitting on deathrow. The list of zombie threads on deathrow is not allowed to grow without bound (it holds no more than 32 zombies), and the zombies are not left on deathrow forever.
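The deathrow idea is a bounded cache of recyclable objects, and it can be sketched as a simple free list. This is a toy model, not the kernel's implementation: all names are hypothetical stand-ins, the structure omits the LWP and stack, and the reaper's policy of trimming the list down to the cap of 32 quoted above is an assumption made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_DEATHROW_MAX 32         /* cap quoted in the text */

struct toy_thread {
    struct toy_thread *next;
    int               owner_pid;
};

struct toy_thread *toy_deathrow;    /* list of exited, reusable threads */
int toy_deathrow_len;
int toy_allocs;                     /* number of real allocations made */

/* On exit: park the thread structure on deathrow instead of freeing it. */
void
toy_to_deathrow(struct toy_thread *t)
{
    t->next = toy_deathrow;
    toy_deathrow = t;
    toy_deathrow_len++;
}

/* On create: reuse a deathrow entry if one exists, else allocate. */
struct toy_thread *
toy_lwp_create(int pid)
{
    struct toy_thread *t = toy_deathrow;

    if (t != NULL) {
        toy_deathrow = t->next;
        toy_deathrow_len--;
    } else {
        t = malloc(sizeof (*t));
        if (t == NULL)
            return (NULL);
        toy_allocs++;
    }
    t->next = NULL;
    t->owner_pid = pid;             /* link to the creating process */
    return (t);
}

/* Periodic reaper: free entries beyond the cap; returns number freed. */
int
toy_thread_reaper(void)
{
    int freed = 0;

    while (toy_deathrow_len > TOY_DEATHROW_MAX) {
        struct toy_thread *t = toy_deathrow;

        toy_deathrow = t->next;
        toy_deathrow_len--;
        free(t);
        freed++;
    }
    return (freed);
}
```

The payoff shows up in toy_allocs: as long as deathrow is not empty, toy_lwp_create() hands back a recycled structure and the allocation count does not change.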