To run any program, the operating system must first create a process. When a new process is created, a new entry is placed in the main process table. A new PCB is created and initialized, and the process identification portion of the PCB contains a unique process id number and the parent process id. The program counter is set to point to the program entry point, and the system stack pointers are set to define the stack boundaries for the process. The process is initialized with any of the attributes requested. If the process is not given a priority value, it is given the lowest priority value by default. The process initially does not own any resources unless there is an explicit request for resources or they have been inherited from the creator process. The state of the process is set to runnable, and the process is placed in the runnable or ready queue. Address space is allocated for the process. How much space is set aside can be determined by default, based on the type of process. The size can also be requested by the creator of the process, which can pass the size of the address space to the system at the time the process is created.

3.6.1 Parent-Child Process Relationship

A process that creates or spawns another process is a parent process to the spawned child process. The init process is the parent of all user processes. The init process is the very first process visible to the UNIX system when it boots. The init process brings the system up, runs other programs when necessary, and starts daemons. It has a PID of 1. The child process has its own unique PID, PCB, and a separate entry in the process table. The child process can also spawn a process. An executing application can create a tree of processes. For example, a parent process searches a hard drive for a specified HTML document. The HTML document name is written to a global data structure like a list, which contains all the requests for documents.
Once the document is located, it is removed from the request list and the path is written to another global data structure, which contains the paths of located documents. To ensure a good response to user requests, the process has a limit of five requests pending in the list. Once the limit has been reached, two new processes are spawned to handle the workload. For each process that reaches its limit, two new processes are spawned. Figure 3-9 shows a tree of processes created in this manner. A process has only one parent process, but a parent process can have many children.

Figure 3-9. A tree of processes. A process spawns two new processes if a certain condition is met.
A child process can be created with its own executable image or as a duplication of the parent process. As a duplicate of the parent, the child inherits many of the attributes of the parent, including its environment, priority and scheduling policy, resource limits, open files, and shared memory segments. If the child process advances the position pointer of an inherited open file, the change is also seen by the parent process. Closing the file in the child, however, closes only the child's descriptor. If the parent allocates any additional resources after the child has been created, they are not accessible to the child. In turn, if the child process allocates any resources, they are not accessible to the parent.

Some attributes of the parent are not inherited by the child. As mentioned earlier, the child does not inherit the parent's PID or PCB. Of course, each process will have different parents. The child does not inherit any file locks created by the parent or any pending signals. Timing information such as processor usage and creation time is reset for the child process.

Although these processes have this relationship, they function as separate processes. Each process has its own program counter and stack pointer. Because the data segments are copied, not shared, the child can change the values of its variables without affecting the parent's copy. The child and parent share the code segment and execute the instructions immediately following the system call that creates the child process. They do not execute those instructions in lock step because they compete for the processor with all the other processes loaded in memory.

Once created, the child process image can be replaced with another executable image. The code, data, and stack segments, as well as the heap, are overwritten with the new process image. The new process preserves its PID and PPID. Table 3-3 lists the attributes preserved by the new process after its executable image has been replaced. It also lists the system calls that return these attributes.
The environment variables are also preserved unless new environment variables were specified at the time the executable was replaced. Files that were open before the executable was replaced will still be open afterward. The new process will create files with the same file permissions. The CPU time will not be reset.

Table 3-3. Attributes Preserved by the New Process After Its Process Image Has Been Replaced with a New Process Image
3.6.1.1 The pstree Utility

The pstree utility in the Linux environment displays the running processes in a tree structure. The root of the tree is the init process.
These are some of the options that can be used with this utility:
Figure 3-10 shows the output of pstree -h in the Linux environment.

Figure 3-10 Output of pstree -h in the Linux environment.

   ka:~ # pstree -h
   init-+-applix
        |-atd
        |-axmain
        |-axnet
        |-cron
        |-gpm
        |-inetd
        |-9*[kdeinit]
        |-kdeinit-+-kdeinit
        |         |-kdeinit---bash---gimp---script-fu
        |         `-kdeinit---bash-+-man---sh---sh---less
        |                          `-pstree
        |-kdeinit---cat
        |-kdm-+-X
        |     `-kdm---kde---ksmserver
        |-kflushd
        |-khubd
        |-klogd
        |-knotify
        |-kswapd
        |-kupdate
        |-login---bash
        |-lpd
        |-mdrecoveryd
        |-5*[mingetty]
        |-nscd---nscd---5*[nscd]
        |-sshd
        |-syslogd
        |-usbmgr
        `-xconsole

3.6.2 Using the fork() Function Call

The fork() call creates a new process that is a duplication of the calling process, the parent. If it succeeds, fork() returns two values, one to the parent and one to the child process. It returns 0 to the child process and returns the PID of the child to the parent process. The parent and child processes continue to execute from the instruction immediately following the fork() call. If not successful, meaning no child process was created, -1 is returned to the parent process.
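The return-value convention of fork() can be sketched as follows. This is a minimal illustration, not a listing from the text: the helper name fork_demo, the counter variable, and the exit code 7 are assumptions made for the example, and waitpid() is used only so that the parent can observe the child's result.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

// Hypothetical helper: forks once and lets each process follow its own branch.
// Returns the child's exit code as seen by the parent, or -1 on failure.
int fork_demo()
{
    int counter = 1;                 // after fork(), each process has its own copy
    pid_t rt_value = fork();

    if (rt_value == -1) {            // no child was created; errno holds the reason
        std::perror("fork");
        return -1;
    }
    if (rt_value == 0) {             // child: fork() returned 0
        counter += 6;                // changes only the child's copy
        _exit(counter);              // exit code 7, chosen for illustration
    }

    // Parent: rt_value holds the PID of the child.
    int status = 0;
    if (waitpid(rt_value, &status, 0) == -1) {
        std::perror("waitpid");
        return -1;
    }
    // The parent's counter is untouched because memory was copied, not shared.
    if (counter != 1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling fork_demo() from main() runs the child branch and the parent branch once each; the parent learns what the child computed only through the exit status, not through the shared variable name.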
The fork() call will fail if the system does not have the resources to create another process. If there is a limit to the number of child processes the parent can spawn or to the number of system-wide executing processes, and that limit has been exceeded, the fork() call will fail. In that case, errno will be set to indicate the error.

3.6.3 Using the exec Family of System Calls

The exec family of functions replaces the calling process image with a new process image. The fork() call creates a new process that is a duplication of the parent process, whereas the exec function replaces the duplicate process image with a new one. The new process image is a regular executable file and is immediately executed. The executable can be specified as a path or a filename. These functions can pass command-line arguments to the new process. Environment variables can also be specified. There is no return value if the function is successful because the process image that contained the call to the exec function is overwritten. If unsuccessful, -1 is returned to the calling process. All of the exec() functions can fail under these conditions:
The exec functions are used with fork(). The fork() call creates and initializes the child process as a duplicate of the parent. The child process then replaces its process image by calling an exec() function. Example 3.2 shows an example of the fork-exec usage.

Example 3.2 Using the fork-exec system calls.

   //...
   RtValue = fork();
   if(RtValue == 0){
      // child process: replace its image with the direct program
      execl("/path/direct","direct",".",(char *)NULL);
   }

In Example 3.2, the fork() function is called and the return value is stored in RtValue. If RtValue is 0, then it is the child process. The execl() function is called. The first parameter is the path to the executable module, the second parameter is the execution statement, and the third parameter is the argument. The argument list is terminated by a null pointer. direct is a utility that lists all the directories and subdirectories from a given directory. There are six versions of the exec functions, each having a different calling convention and use.

3.6.3.1 execl() Functions

The execl(), execle(), and execlp() functions pass the command-line arguments as a list. The number of command-line arguments should be known at compile time in order for these functions to be useful.
These are examples of the syntax of the execl() functions using these arguments:

   char *const args[] = {"direct",".",NULL};
   char *const envp[] = {"files=50",NULL};

   execl("/path/direct","direct",".",(char *)NULL);
   execle("/path/direct","direct",".",(char *)NULL,envp);
   execlp("direct","direct",".",(char *)NULL);

Each call shows how the corresponding execl() function executes the direct program.
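A minimal sketch of the complete fork-exec pattern with a list-style exec call follows. It makes several assumptions that are not in the text: the echo utility stands in for the direct program so that the sketch can run on an ordinary system, the helper name run_echo is hypothetical, the child exits with 127 if the exec call fails (a common shell convention), and waitpid() is used by the parent to collect the result.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

// Hypothetical helper: runs "echo <message>" in a child process.
// Returns the child's exit code, or -1 if the child could not be run.
int run_echo(const char *message)
{
    pid_t pid = fork();
    if (pid == -1) {
        std::perror("fork");
        return -1;
    }
    if (pid == 0) {
        // Child: the argument list must end with a null pointer.
        // execlp() searches PATH because "echo" contains no slash.
        execlp("echo", "echo", message, static_cast<char *>(nullptr));
        // Reached only if execlp() failed; a successful call never returns.
        std::perror("execlp");
        _exit(127);
    }
    // Parent: wait for the child and extract its exit code.
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) {
        std::perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Code placed after the exec call runs only when the call has failed, which is why the error handling can follow it without an explicit test of the return value.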
3.6.3.2 execv() Functions

The execv(), execve(), and execvp() functions pass the command-line arguments in a vector of pointers to null-terminated strings. Because the arguments are passed in an array, these functions are useful when the number of command-line arguments is not known at compile time. argv[0] is usually the execution statement.
These are examples of the syntax of the execv() functions using these arguments:

   char *const arg[] = {"traverse",".",">","1000",NULL};
   char *const envp[] = {"files=50",NULL};

   execv("/path/traverse",arg);
   execve("/path/traverse",arg,envp);
   execvp("traverse",arg);

Each call shows how the corresponding execv() function executes the traverse program.

3.6.3.3 Determining Restrictions on exec() Functions

There is a limit on the size argv[] and envp[] can be when passed to the exec() functions. The sysconf() function can be used to determine the maximum size of the command-line arguments plus the size of the environment variables for the exec() functions that accept the envp[] parameter. To return the size, the name parameter of sysconf() should have the value _SC_ARG_MAX.
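The query described above might look like the following sketch. The wrapper name query_limit is an assumption made for illustration. Note that sysconf() returns -1 both when the call fails and when the limit is indeterminate, so errno is cleared first to tell the two cases apart.

```cpp
#include <unistd.h>
#include <cerrno>
#include <cstdio>

// Hypothetical wrapper around sysconf(): returns the value of the named
// limit, or -1 if the query failed or the limit is indeterminate.
long query_limit(int name)
{
    errno = 0;
    long value = sysconf(name);
    if (value == -1 && errno != 0)   // a real error, not just "no fixed limit"
        std::perror("sysconf");
    return value;
}
```

For example, query_limit(_SC_ARG_MAX) reports the combined size limit, in bytes, for the arguments and environment passed to the exec() functions.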
Another restriction when using exec() and the other functions used to create processes is the maximum number of simultaneous processes allowed per user id. To return this number, name has the value _SC_CHILD_MAX.

3.6.3.4 Reading and Setting Environment Variables

Environment variables are null-terminated strings that store system-dependent information such as paths to directories that contain commands, libraries, functions, and procedures used by a process. They can also be used to transmit any useful user-defined information between the parent and the child processes. They provide a mechanism for supplying specific information to a process without having it hardcoded in the program code. System environment variables are predefined and common to all shells and processes in that system. The variables are initialized by startup files. Below are the common system variables:
They can be stored in a file or in an environment list. The environment list will contain pointers to null-terminated strings. The variable:

   extern char **environ

points to the environment list when the process begins to execute. These strings will have the form:

   name=value

as explained earlier. Processes initialized with the functions execl(), execlp(), execv(), and execvp() will inherit the environment of the parent process. Processes initialized with the functions execve() and execle() set the environment for the new process. There are functions and utilities that can be called to examine, add, or modify environment variables. The getenv() function is used to determine whether a specific variable has been set. The parameter name is the environment variable in question. The function will return NULL if the specified variable has not been set. If the variable has been set, the function will return a pointer to a string containing the value.
For example:

   string Path;
   Path = getenv("PATH");

Here, the string Path is assigned the value contained in the predefined environment variable PATH. The setenv() function is used to change or add a variable to the environment of the calling process. The parameter name contains the name of the environment variable to be changed or added. It is assigned the value stored in value. If the name variable already exists, then the value is changed to value if the overwrite parameter is nonzero. If overwrite is 0, the content of the specified environment variable is not modified. setenv() returns 0 if it is successful and -1 if unsuccessful. The unsetenv() function removes the environment variable specified by name.

3.6.4 Using system() to Spawn Processes

The system() function is used to execute a command or executable program. It causes the execution of a fork-exec and a shell: system() executes a fork(), and the child process calls an exec() with a shell that executes the given command or program.
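The environment functions and system() can be combined in one short sketch. Everything named here is an assumption made for illustration: the helper child_sees_variable is hypothetical, the variable name and value are expected to be simple words without shell metacharacters, and a POSIX shell with the test command is assumed to be available. The sketch sets a variable with setenv(), confirms it with getenv(), and then asks the shell started by system() whether it sees the same value.

```cpp
#include <stdlib.h>
#include <sys/wait.h>
#include <string>

// Hypothetical helper: sets name=value in the calling process, then checks
// that a shell started through system() inherits it.
// Returns true if the shell command exited with status 0.
bool child_sees_variable(const std::string &name, const std::string &value)
{
    if (setenv(name.c_str(), value.c_str(), 1) != 0)   // overwrite is nonzero
        return false;

    const char *current = getenv(name.c_str());        // NULL if not set
    if (current == nullptr || value != current)
        return false;

    // The shell expands $name from the environment it inherited.
    std::string command = "test \"$" + name + "\" = \"" + value + "\"";
    int status = system(command.c_str());

    unsetenv(name.c_str());                            // remove the variable again
    return status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A call such as child_sees_variable("DEMO_FILES", "50") returns true when the shell sees the value; the variable is removed again before the function returns.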
The string parameter can be a system command or the name of an executable file. If successful, the function returns the termination status of the command or the return value (if any) of the program. Errors can happen at several levels: the fork() or exec() functions may fail, or the shell may not be able to execute the command or program. The function returns a value to the parent process. The function returns 127 if the exec() fails and -1 if some other error occurs. The return code of the command is returned if the function succeeds. This function does not affect the wait status of any of the child processes.

3.6.5 The POSIX Functions for Spawning Processes

Similar to the system() and fork-exec methods of process creation, the posix_spawn() functions create new child processes from specified process images. But the posix_spawn() functions create child processes with more fine-grained control. These functions control the attributes the child process inherits from the parent process, including:
They also control whether signals ignored by the parent will be ignored by the child or reset to a default action. Controlling file descriptors allows the child process independent access to the data streams opened by the parent. Being able to set the child's process group id affects how the child's job control will relate to that of the parent. The child's scheduling policy can be set to be different from the scheduling policy of the parent.
The difference between these two functions is that posix_spawn() has a path parameter and posix_spawnp() has a file parameter. The path parameter in the posix_spawn() function is the absolute or relative pathname to the executable program file. The file parameter in the posix_spawnp() function is the name of the executable program. If the parameter contains a slash, then file will be used as a pathname. If not, then the path to the executable is determined by the PATH environment variable.

The file_actions parameter is a pointer to a posix_spawn_file_actions_t structure:

   struct posix_spawn_file_actions_t
   {
      int __allocated;
      int __used;
      struct __spawn_action *actions;
      int __pad[16];
   };

The posix_spawn_file_actions_t is a data structure that contains information about the actions to be performed in the new process with respect to file descriptors. The file_actions parameter is used to modify the parent's set of open file descriptors into a set of file descriptors for the spawned child process. This structure can contain several file action operations to be performed in the sequence in which they were added to the spawn file actions object. These file action operations are performed on the open file descriptors of the parent process. These operations can duplicate, reset, add, delete, or close a specified file descriptor on behalf of the child process even before it is spawned. If the file_actions parameter is a null pointer, then the file descriptors opened by the parent process will remain open for the child process without any modifications. Table 3-4 lists the functions used to add file actions to the posix_spawn_file_actions object.

The attrp parameter points to a posix_spawnattr_t structure:

   struct posix_spawnattr_t
   {
      short int __flags;
      pid_t __pgrp;
      sigset_t __sd;
      sigset_t __ss;
      struct sched_param __sp;
      int __policy;
      int __pad[16];
   };

This structure contains information about the scheduling policy, process group, signals, and flags for the new process.
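A minimal sketch of how a file actions object might be used with posix_spawnp() follows. It is an illustration under stated assumptions rather than a listing from the text: the helper name spawn_echo_to_file is hypothetical, the echo utility is assumed to be reachable through PATH, the attrp parameter is passed as a null pointer so that default attributes apply, and the child inherits the parent's environment through environ. The single file action opens a file as descriptor 1, so the child's standard output goes to that file.

```cpp
#include <spawn.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

// Hypothetical helper: spawns "echo <text>" with the child's standard output
// redirected to out_path. Returns the child's exit code, or -1 on failure.
int spawn_echo_to_file(const char *text, const char *out_path)
{
    posix_spawn_file_actions_t actions;
    if (posix_spawn_file_actions_init(&actions) != 0)
        return -1;

    // In the child, open out_path as file descriptor 1 (standard output).
    int rc = posix_spawn_file_actions_addopen(&actions, 1, out_path,
                                              O_WRONLY | O_CREAT | O_TRUNC, 0644);
    pid_t pid = -1;
    if (rc == 0) {
        char arg0[] = "echo";
        // The spawn functions do not modify the argument strings.
        char *const argv[] = {arg0, const_cast<char *>(text), nullptr};
        // "echo" contains no slash, so PATH is searched.
        rc = posix_spawnp(&pid, "echo", &actions, nullptr, argv, environ);
    }
    posix_spawn_file_actions_destroy(&actions);
    if (rc != 0)                     // an error number is returned, not -1
        return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the file action cannot be carried out in the child, for example because out_path cannot be opened, the failure may surface as a child exit status of 127 rather than as an error returned by the spawn call.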
The descriptions of individual attributes are as follows:
Table 3-4. Functions Used to Add File Actions to the posix_spawn_file_actions Object
They are the bitwise-inclusive OR of 0 or more of the following:

   POSIX_SPAWN_RESETIDS
   POSIX_SPAWN_SETPGROUP
   POSIX_SPAWN_SETSIGDEF
   POSIX_SPAWN_SETSIGMASK
   POSIX_SPAWN_SETSCHEDPARAM
   POSIX_SPAWN_SETSCHEDULER

Table 3-5 lists the functions used to set and retrieve the individual attributes contained in the posix_spawnattr_t structure.

Table 3-5. Functions Used to Set and Retrieve the Individual Attributes Contained in the posix_spawnattr_t Structure
Example 3.3 shows how the posix_spawn() function can be used to create a process.

Example 3.3 Spawning a process, using the posix_spawn() function, that calls the ps utility.

   #include <spawn.h>
   #include <stdio.h>
   #include <errno.h>
   #include <iostream>

   using namespace std;

   int main()
   {
      //...
      posix_spawnattr_t X;
      posix_spawn_file_actions_t Y;
      pid_t Pid;
      char *const argv[] = {"/bin/ps","-lf",NULL};
      char *const envp[] = {"PROCESSES=2",NULL};   // the list must end with NULL
      posix_spawnattr_init(&X);
      posix_spawn_file_actions_init(&Y);
      // posix_spawn() returns an error number instead of setting errno
      errno = posix_spawn(&Pid,"/bin/ps",&Y,&X,argv,envp);
      perror("posix_spawn");
      cout << "spawned PID: " << Pid << endl;
      //...
      return(0);
   }

In Example 3.3, the posix_spawnattr_t and posix_spawn_file_actions_t objects are initialized. The posix_spawn() function is called with the arguments Pid, the path, Y, X, argv, which contains the command as the first element and the argument as the second, and envp, the environment list. If the posix_spawn() function is successful, then the value stored in Pid will be the PID of the spawned process. perror will display:

   posix_spawn: Success

and the Pid is sent to output. The spawned process, in this case, executes:

   /bin/ps -lf

These functions return the process id of the child process to the parent process in the pid parameter and return 0 as the return value. If the function is unsuccessful, no child process is created, thus no pid is returned, and an error value is returned as the return value of the function.

Errors can occur on three levels when using the spawn functions. An error can occur if the file_actions or attr objects are invalid. If this occurs after the function has successfully returned (the child process was spawned), then the child process may have an exit status of 127. If the spawn attribute functions cause an error, then the error produced for that particular function (listed in Tables 3-4 and 3-5) is returned. If the spawn function has already successfully returned, then the child process may have an exit status of 127.
Errors can also occur when attempting to spawn the child process. These errors would be the same errors produced by the fork() or exec() functions. If they occur, they will be the return values for the spawn functions. If the child process produces an error, it is not returned to the parent process. In order for the parent process to be aware that the child has produced an error, other mechanisms would have to be used since the error will not be stored in the child's exit status. Interprocess communication can be used, or the child could set some flag visible to the parent.

3.6.6 Identifying the Parent and Child with Process Management Functions

There are two functions that return the calling process's PID and the parent process's PID. getpid() returns the process id of the calling process. getppid() returns the parent id of the calling process. These functions are always successful; therefore, no errors are defined.
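The relationship between the two calls can be sketched with a single fork(). The helper name child_reports_parent is an assumption made for illustration; the parent records its own PID with getpid(), and the child compares that value with the result of getppid().

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child and checks that the child's getppid()
// matches the parent's getpid(). Returns true if they match.
bool child_reports_parent()
{
    pid_t parent_pid = getpid();     // PID of the calling (parent) process
    pid_t child = fork();
    if (child == -1)
        return false;
    if (child == 0) {
        // Child: its parent id should be the PID captured before fork().
        _exit(getppid() == parent_pid ? 0 : 1);
    }
    // Parent: stays alive until the child has finished its check.
    int status = 0;
    if (waitpid(child, &status, 0) == -1)
        return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The parent waits for the child before returning, so the child's parent id cannot change while the comparison is made.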