3.5. Process Termination

A process can terminate voluntarily and explicitly, voluntarily and implicitly, or involuntarily. Voluntary termination can be attained in two ways: implicitly, by returning from the main() function, or explicitly, by calling exit().
Executing a return from the main() function literally translates into a call to exit(). The linker introduces the call to exit() under these circumstances.

Involuntary termination can be attained in three ways:
The termination of a process is handled differently depending on whether the parent is alive or dead. A process can terminate before its parent or after its parent.
In the first case, the child is turned into a zombie process until the parent makes the call to wait/waitpid(). In the second case, the init process will have inherited the role of the child's parent. When any process terminates, the kernel reviews all the active processes and verifies whether the terminating process is parent to any process that is still alive and active. If so, it changes that child's parent PID to 1.

Let's look at the example again and follow it through its demise. The process explicitly calls exit(0). (Note that it could just as well have called _exit(), executed return(0), or fallen off the end of main with neither call.) The exit() C library function then calls the sys_exit() system call. We can review the following code to see what happens to the process from here onward.

We now look at the functions that terminate a process. As previously mentioned, our process foo calls exit(), which calls the first function we look at, sys_exit(). We delve through the call to sys_exit() and into the details of do_exit().

3.5.1. sys_exit() Function

-----------------------------------------------------------------------
kernel/exit.c
asmlinkage long sys_exit(int error_code)
{
  do_exit((error_code&0xff)<<8);
}
-----------------------------------------------------------------------

sys_exit() does not vary between architectures, and its job is fairly straightforward: all it does is call do_exit() and convert the exit code into the format required by the kernel.

3.5.2. do_exit() Function

-----------------------------------------------------------------------
kernel/exit.c
707 NORET_TYPE void do_exit(long code)
708 {
709   struct task_struct *tsk = current;
710
711   if (unlikely(in_interrupt()))
712     panic("Aiee, killing interrupt handler!");
713   if (unlikely(!tsk->pid))
714     panic("Attempted to kill the idle task!");
715   if (unlikely(tsk->pid == 1))
716     panic("Attempted to kill init!");
717   if (tsk->io_context)
718     exit_io_context();
719   tsk->flags |= PF_EXITING;
720   del_timer_sync(&tsk->real_timer);
721
722   if (unlikely(in_atomic()))
723     printk(KERN_INFO "note: %s[%d] exited with preempt_count %d\n",
724           current->comm, current->pid,
725           preempt_count());
-----------------------------------------------------------------------

Line 707
The parameter code comprises the exit code that the process returns to its parent.

Lines 711-716
Verify against unlikely, but possible, invalid circumstances. These include running inside an interrupt handler (line 711), being the idle task, whose PID is 0 (line 713), and being the init task, whose PID is 1 (line 715). Each of these cases ends in a kernel panic.
Line 719
Here, we set PF_EXITING in the flags field of the process's task_struct. This indicates that the process is shutting down. For example, this is used when creating interval timers for a given process. The process flags are checked to see if this flag is set, which helps prevent wasteful processing.

-----------------------------------------------------------------------
kernel/exit.c
...
727   profile_exit_task(tsk);
728
729   if (unlikely(current->ptrace & PT_TRACE_EXIT)) {
730     current->ptrace_message = code;
731     ptrace_notify((PTRACE_EVENT_EXIT << 8) | SIGTRAP);
732   }
733
734   acct_process(code);
735   __exit_mm(tsk);
736
737   exit_sem(tsk);
738   __exit_files(tsk);
739   __exit_fs(tsk);
740   exit_namespace(tsk);
741   exit_thread();
...
-----------------------------------------------------------------------

Lines 729-732
If the process is being ptraced and the PT_TRACE_EXIT flag is set, we pass the exit code and notify the parent process.

Lines 735-742
These lines comprise the cleaning up and reclaiming of resources that the task has been using and will no longer need. __exit_mm() frees the memory allocated to the process and releases the mm_struct associated with this process. exit_sem() disassociates the task from any IPC semaphores. __exit_files() releases any files the task allocated and decrements the file descriptor counts. __exit_fs() releases all file system data.

-----------------------------------------------------------------------
kernel/exit.c
...
744   if (tsk->leader)
745     disassociate_ctty(1);
746
747   module_put(tsk->thread_info->exec_domain->module);
748   if (tsk->binfmt)
749     module_put(tsk->binfmt->module);
...
-----------------------------------------------------------------------

Lines 744-745
If the process is a session leader, it is expected to have a controlling terminal or tty. disassociate_ctty() disassociates the session leader from its controlling tty.
Lines 747-749
In these blocks, we decrement the reference counts for the modules that provide the task's execution domain and, if one is set, its binary format.

-----------------------------------------------------------------------
kernel/exit.c
...
751   tsk->exit_code = code;
752   exit_notify(tsk);
753
754   if (tsk->exit_signal == -1 && tsk->ptrace == 0)
755     release_task(tsk);
756
757   schedule();
758   BUG();
759   /* Avoid "noreturn function does return". */
760   for (;;) ;
761 }
...
-----------------------------------------------------------------------

Line 751
Set the task's exit code in the task_struct field exit_code.

Line 752
Send the SIGCHLD signal to the parent and set the task state to TASK_ZOMBIE. exit_notify() notifies the task's relations of its impending death. The parent is informed of the exit code, while the task's children have their parent set to the init process. The only exception to this is if another process exists within the same process group: in this case, that process is used as a surrogate parent.

Line 754
If exit_signal is -1 (meaning that no parent is to be signaled about this task's exit) and the process is not being ptraced, the kernel calls release_task() to release the process descriptor of this task and to reclaim its timeslice, rather than leaving a zombie behind.

Line 757
Yield the processor to a new process. As we see in Chapter 7, the call to schedule() will not return. All code past this point catches impossible circumstances or avoids compiler warnings.

3.5.3. Parent Notification and sys_wait4()

When a process is terminated, its parent is notified. Prior to this, the process is in a zombie state where all its resources have been returned to the kernel, but the process descriptor remains. The parent task (for example, the Bash shell) receives the signal SIGCHLD that the kernel sends to it when the child process terminates. In the example, the shell calls wait() when it wants to be notified. A parent process can ignore the signal by not installing a signal handler and can instead choose to call wait() (or waitpid()) at any point.
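The path from exit() through the zombie state to the parent's wait call can be observed from user space. The following is a minimal sketch, assuming a Linux system; the helper name spawn_and_reap() is hypothetical and is not part of the kernel or the C library. It forks a child that terminates with a given code, blocks in waitpid() until the child can be reaped, and returns the decoded exit status. It also hands back the raw status word so that the (error_code&0xff)<<8 conversion performed by sys_exit() can be seen; relying on that raw layout is a Linux-specific assumption, and portable code should use only the WIFEXITED() and WEXITSTATUS() macros.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits with exit_value, reap it,
 * and return the exit status as seen by the parent (or -1 on error).
 * If raw_status is not NULL, the undecoded status word is stored there.
 */
int spawn_and_reap(int exit_value, int *raw_status)
{
    int status = 0;
    pid_t reaped;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                  /* fork() failed */

    if (pid == 0)
        _exit(exit_value);          /* child: terminate immediately */

    /*
     * Parent: block until the child has terminated, then reap it.
     * Until this call completes, the child remains a zombie.
     * Retry if the call is interrupted by a signal.
     */
    do {
        reaped = waitpid(pid, &status, 0);
    } while (reaped < 0 && errno == EINTR);

    if (reaped != pid)
        return -1;

    if (raw_status)
        *raw_status = status;

    /* Only a normally exited child carries an exit code. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, spawn_and_reap(3, &raw) returns 3, and on Linux raw then holds 3 << 8. Calling it with 300 returns 44, because only the low 8 bits of the exit code (300 & 0xff) survive the conversion.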
The wait family of functions serves two general roles:
Our parent program can choose to call one of the four functions in the wait family:
Each function will in turn call sys_wait4(), which is where the bulk of the notification occurs. A process that calls one of the wait functions is blocked until one of its children terminates, or returns immediately if the child has already terminated (or if the parent is childless). The sys_wait4() function shows us how the kernel manages this notification:

-----------------------------------------------------------------------
kernel/exit.c
1031 asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struct rusage * ru)
1032 {
1033   DECLARE_WAITQUEUE(wait, current);
1034   struct task_struct *tsk;
1035   int flag, retval;
1036
1037   if (options & ~(WNOHANG|WUNTRACED|__WNOTHREAD|__WCLONE|__WALL))
1038     return -EINVAL;
1039
1040   add_wait_queue(&current->wait_chldexit,&wait);
1041 repeat:
1042   flag = 0;
1043   current->state = TASK_INTERRUPTIBLE;
1044   read_lock(&tasklist_lock);
...
-----------------------------------------------------------------------

Line 1031
The parameters include the PID of the target process, the address in which the exit status of the child should be placed, flags for sys_wait4(), and the address in which the resource usage information of the child should be placed.

Lines 1033 and 1040
Declare a wait queue entry and add the process to the wait queue. (This is covered in more detail in the "Wait Queues" section.)

Lines 1037-1038
This code checks for error conditions. If the system call is passed options that are invalid, the function fails and returns the error EINVAL.

Line 1042
The flag variable is set to 0 as an initial value. This variable is changed once the pid argument is found to match one of the calling task's children.

Line 1043
This code is where the calling process is marked as blocked. The state of the task is moved from TASK_RUNNING to TASK_INTERRUPTIBLE.

-----------------------------------------------------------------------
kernel/exit.c
...
1045   tsk = current;
1046   do {
1047     struct task_struct *p;
1048     struct list_head *_p;
1049     int ret;
1050
1051     list_for_each(_p,&tsk->children) {
1052       p = list_entry(_p,struct task_struct,sibling);
1053
1054       ret = eligible_child(pid, options, p);
1055       if (!ret)
1056         continue;
1057       flag = 1;
1058       switch (p->state) {
1059       case TASK_STOPPED:
1060         if (!(options & WUNTRACED) &&
1061           !(p->ptrace & PT_PTRACED))
1062           continue;
1063         retval = wait_task_stopped(p, ret == 2,
1064                  stat_addr, ru);
1065         if (retval != 0) /* He released the lock. */
1066           goto end_wait4;
1067         break;
1068       case TASK_ZOMBIE:
...
1072         if (ret == 2)
1073           continue;
1074         retval = wait_task_zombie(p, stat_addr, ru);
1075         if (retval != 0) /* He released the lock. */
1076           goto end_wait4;
1077         break;
1078       }
1079     }
...
1091     tsk = next_thread(tsk);
1092     if (tsk->signal != current->signal)
1093       BUG();
1094   } while (tsk != current);
...
-----------------------------------------------------------------------

Lines 1046 and 1094
The do while loop runs once for the calling task itself, then continues with each task returned by next_thread() until it arrives back at current.

Line 1051
Repeat the action on every process in the task's children list. Remember that this is the parent process that is waiting on its children's exit. The process is currently in TASK_INTERRUPTIBLE and iterating over its children list.

Line 1054
Determine whether the child p is eligible, given the pid and options parameters that were passed. Children that are not eligible are skipped.

Lines 1058-1079
Check the state of each of the task's children. Actions are performed only if a child is stopped or if it is a zombie. If a task is sleeping, ready, or running (the remaining states), nothing is done. If a child is in TASK_STOPPED and either the WUNTRACED option has been used or the child is being ptraced, we verify if the status of that child has been reported and return the child's information. If a child is in TASK_ZOMBIE, it is reaped.

-----------------------------------------------------------------------
kernel/exit.c
...
1106   retval = -ECHILD;
1107 end_wait4:
1108   current->state = TASK_RUNNING;
1109   remove_wait_queue(&current->wait_chldexit,&wait);
1110   return retval;
1111 }
-----------------------------------------------------------------------

Line 1106
If we have gotten to this point, the PID specified by the parameter is not a child of the calling process. ECHILD is the error used to notify us of this event.

Lines 1107-1111
At this point, the children list has been processed, and any children that needed to be reaped have been reaped. The parent's block is removed and its state is set to TASK_RUNNING once again. Finally, the task is removed from the wait queue.

At this point, you should be familiar with the various stages that a process goes through during its lifecycle, the kernel functions that make all this happen, and the structures the kernel uses to keep track of all this information. Now, we look at how the scheduler manipulates and manages processes to create the effect of a multithreaded system. We also see in more detail how processes go from one state to another.