3.6 Creating a Process

To run any program, the operating system must first create a process. When a new process is created, a new entry is placed in the main process table. A new PCB is created and initialized, and the process identification portion of the PCB contains a unique process id number and the parent process id. The program counter is set to point to the program entry point, and the system stack pointers are set to define the stack boundaries for the process. The process is initialized with any of the attributes requested. If the process is not given a priority value, it is given the lowest priority value by default. The process initially does not own any resources unless there is an explicit request for resources or they have been inherited from the creator process. The state of the process is set to runnable, and the process is placed in the runnable, or ready, queue. Address space is allocated for the process. How much space is set aside can be determined by default, based on the type of process. The size can also be requested by the creator of the process, which can pass the size of the address space to the system at the time the process is created.

3.6.1 Parent-Child Process Relationship

A process that creates or spawns another process is a parent process to the spawned child process. The init process is the parent of all user processes. The init process is the very first process visible to the UNIX system when booted up. The init process brings the system up, runs other programs when necessary, and starts daemons. It has a PID of 1. The child process has its own unique PID, PCB, and a separate entry in the process table. The child process can also spawn a process. An executing application can create a tree of processes. For example, a parent process searches a hard drive for a specified HTML document. The HTML document name is written to a global data structure like a list, which contains all the requests for documents. Once the document is located, it is removed from the request list and the path is written to another global data structure, which contains the paths of located documents. To ensure a good response to the user requests, the process has a limit of five requests pending in the list. Once the limit has been reached, two new processes are spawned to handle the workload. For each process that reaches its limit, two new processes are spawned. Figure 3-9 shows a tree of processes created in this manner. A process has only one parent process, but a parent process can have many children.

Figure 3-9. A tree of processes. A process spawns two new processes if a certain condition is met.

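The shape of the tree in Figure 3-9 can be sketched with fork(). The following is a minimal sketch only, not the document-search application: the function name spawn_binary_tree and the use of exit statuses to count processes are assumptions made for illustration, and the spawning condition is reduced to a simple depth counter.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: every process at a nonzero depth forks two children,
// producing a binary tree of processes.
// Returns the number of descendant processes created below the caller,
// or -1 if a fork() or waitpid() call fails.
int spawn_binary_tree(int depth)
{
   if(depth <= 0){
      return 0;                        // leaf: spawns nothing
   }
   pid_t children[2];
   int created = 0;
   for(int i = 0; i < 2; i++){
      pid_t pid = fork();
      if(pid == -1){
         break;                        // no child was created
      }
      if(pid == 0){
         // Child: build its own subtree, then report 1 (itself) plus its
         // descendants through the exit status (limited to 0..255).
         int below = spawn_binary_tree(depth - 1);
         _exit(below < 0 ? 255 : 1 + below);
      }
      children[created++] = pid;       // parent: remember the child's PID
   }
   int total = 0;
   bool failed = (created < 2);
   for(int i = 0; i < created; i++){
      int status = 0;
      if(waitpid(children[i], &status, 0) == -1 || !WIFEXITED(status)
         || WEXITSTATUS(status) == 255){
         failed = true;
      }else{
         total += WEXITSTATUS(status);
      }
   }
   return failed ? -1 : total;
}
```

Calling spawn_binary_tree(2) from one process produces two children, each of which produces two children of its own. Each process has exactly one parent, and each parent waits for its own children.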

A child process can be created with its own executable image or as a duplication of the parent process. As a duplicate of the parent, the child inherits many of the attributes of the parent, including its environment, priority and scheduling policy, resource limits, open files, and shared memory segments. If the child process advances the position pointer of an inherited open file, or closes the file, this will also be seen by the parent process. If the parent allocates any additional resources after the child has been created, they are not accessible to the child. In turn, if the child process allocates any resources, they are not accessible to the parent.
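The shared file position can be observed directly. The following is a minimal sketch, assuming a POSIX system with a writable /tmp directory; the helper name offset_seen_by_parent and the temporary file name are hypothetical.

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns the file offset the parent observes after the
// child has moved the offset of an inherited descriptor, or -1 on error.
long offset_seen_by_parent(long child_offset)
{
   char name[] = "/tmp/forkdemoXXXXXX";
   int fd = mkstemp(name);             // temporary file, opened read/write
   if(fd == -1){
      return -1;
   }
   unlink(name);                       // the file is removed once fd is closed
   pid_t pid = fork();
   if(pid == -1){
      close(fd);
      return -1;
   }
   if(pid == 0){
      // Child: advance the position pointer of the inherited descriptor.
      off_t moved = lseek(fd, child_offset, SEEK_SET);
      _exit(moved == (off_t)-1 ? 1 : 0);
   }
   int status = 0;
   waitpid(pid, &status, 0);
   // Parent: query the current offset without moving it.
   long seen = (long)lseek(fd, 0, SEEK_CUR);
   close(fd);
   return seen;
}
```

The parent never calls lseek() to move the offset, yet it observes the position set by the child, because both descriptors refer to the same open file.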

Some attributes of the parent are not inherited by the child. As mentioned earlier, the child does not inherit the parent's PID or PCB. Of course, each process will have different parents. The child does not inherit any file locks created by the parent or any pending signals. Timing information such as processor usage and creation time is reset for the child process. Although these processes have this relationship, they function as separate processes. The program counters and stack pointers operate separately. Because the data segments are copied, not shared, the child can change the values of its variables without affecting the parent's copy. The child and parent share the code segment and execute the instructions immediately following the system call that creates the child process. They do not execute those instructions in lock step because they compete for the processor with all the other processes loaded in memory.
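The effect of copied data can be sketched as follows. This is an illustrative example only; the helper name parent_value_after_child_write is hypothetical, and the variable lives on the stack, which is copied for the child in the same way as the data segment.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: the child overwrites its copy of a variable; the
// function returns the value the parent still sees afterward (-1 on error).
int parent_value_after_child_write(int initial, int child_value)
{
   int value = initial;
   pid_t pid = fork();
   if(pid == -1){
      return -1;
   }
   if(pid == 0){
      value = child_value;             // changes only the child's copy
      _exit(value == child_value ? 0 : 1);
   }
   int status = 0;
   if(waitpid(pid, &status, 0) == -1){
      return -1;
   }
   return value;                       // unchanged in the parent
}
```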

Once created, the child process image can be replaced with another executable image. The code, data, and stack segments, as well as the heap, are overwritten with the new process image. The new process preserves its PID and PPID. Table 3-3 lists the attributes preserved by the new process after its executable image has been replaced. It also lists the system calls that return these attributes. The environment variables are also preserved unless new environment variables were specified at the time the executable was replaced. Files that were open before the executable was replaced will still be open afterward. The new process will create files with the same file permissions. The CPU time will not be reset.

Table 3-3. Attributes Preserved by the New Process After Its Process Image Has Been Replaced with a New Process Image

Attributes preserved              Function

Process ID                        getpid()
Parent Process ID                 getppid()
Process Group ID                  getpgid()
Session membership                getsid()
Real User ID                      getuid()
Real Group ID                     getgid()
Supplementary Group IDs           getgroups()
Time left on an alarm signal      alarm()
Nice value                        nice()
Time used so far                  times()
Process signal mask               sigprocmask()
Pending signals                   sigpending()
File size limit                   ulimit()
Resource limit                    getrlimit()
File mode creation mask           umask()
Current working directory         getcwd()
Root directory
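The first row of Table 3-3 can be checked with a short experiment. The following is a minimal sketch, assuming a POSIX shell is installed at /bin/sh; the helper name exec_keeps_pid is hypothetical. The child replaces its image with the shell, which prints its own PID, and the parent compares that value with the PID returned by fork().

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns true if the program started with execl()
// reports the same PID that fork() returned for the child.
bool exec_keeps_pid()
{
   int fds[2];
   if(pipe(fds) == -1){
      return false;
   }
   pid_t pid = fork();
   if(pid == -1){
      close(fds[0]);
      close(fds[1]);
      return false;
   }
   if(pid == 0){
      // Child: send standard output into the pipe, then replace the image.
      close(fds[0]);
      dup2(fds[1], STDOUT_FILENO);
      close(fds[1]);
      execl("/bin/sh", "sh", "-c", "echo $$", (char *)NULL);
      _exit(127);                      // reached only if execl() failed
   }
   close(fds[1]);
   char buffer[32] = {0};
   ssize_t total = 0;
   ssize_t n = 0;
   while(total < (ssize_t)sizeof(buffer) - 1 &&
         (n = read(fds[0], buffer + total, sizeof(buffer) - 1 - total)) > 0){
      total += n;
   }
   close(fds[0]);
   int status = 0;
   waitpid(pid, &status, 0);
   return std::atol(buffer) == (long)pid;   // the shell printed its own PID
}
```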

 
3.6.1.1 The pstree Utility

The pstree utility in the Linux environment displays a tree of processes. It shows the running processes in a tree structure. The root of the tree is the init process.

Synopsis

    pstree [-a] [-c] [-h|-Hpid] [-l] [-n] [-p] [-u] [-G|-U] [pid|user]
    pstree -V

These are some of the options that can be used with this utility:

    -a    Show command-line arguments
    -h    Highlight the current process and its ancestors
    -H    Like -h but highlight the specified process instead
    -n    Sort processes with the same ancestor by PID instead of by name
    -p    Show PIDs

Figure 3-10 shows the output of pstree -h in the Linux environment.

Figure 3-10. Output of pstree -h in the Linux environment.

    ka:~ # pstree -h
    init-+-applix
         |-atd
         |-axmain
         |-axnet
         |-cron
         |-gpm
         |-inetd
         |-9*[kdeinit]
         |-kdeinit-+-kdeinit
         |         |-kdeinit---bash---gimp---script-fu
         |         '-kdeinit---bash-+-man---sh---sh---less
         |                          '-pstree
         |-kdeinit---cat
         |-kdm-+-X
         |     '-kdm---kde---ksmserver
         |-kflushd
         |-khubd
         |-klogd
         |-knotify
         |-kswapd
         |-kupdate
         |-login---bash
         |-lpd
         |-mdrecoveryd
         |-5*[mingetty]
         |-nscd---nscd---5*[nscd]
         |-sshd
         |-syslogd
         |-usbmgr
         '-xconsole

3.6.2 Using the fork() Function Call

The fork() call creates a new process that is a duplication of the calling process, the parent. The fork() returns two values if it succeeds, one to the parent and one to the child process. It will return 0 to the child process and return the PID of the child to the parent process. The parent and child processes continue to execute from the instruction immediately following the fork() call. If not successful, meaning no child process was created, -1 is returned to the parent process.

Synopsis

    #include <unistd.h>

    pid_t fork(void);

The fork() will fail if the system does not have the resources to create another process. If there is a limit to the number of child processes the parent can spawn or the number of system-wide executing processes and that limit has been exceeded, the fork() will fail. In that case, errno will be set to indicate the error.
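The three possible outcomes of fork() can be handled as in the following minimal sketch. The helper name fork_and_collect is hypothetical, and the use of waitpid() to collect the exit status of the child is an addition for illustration.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that exits with exit_code and returns
// the status the parent collects, or -1 if fork() or waitpid() fails.
int fork_and_collect(int exit_code)
{
   pid_t pid = fork();
   if(pid == -1){
      // No child was created; errno describes the reason.
      std::fprintf(stderr, "fork: %s\n", std::strerror(errno));
      return -1;
   }
   if(pid == 0){
      _exit(exit_code);                // child: fork() returned 0
   }
   // Parent: fork() returned the PID of the child.
   int status = 0;
   if(waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)){
      return -1;
   }
   return WEXITSTATUS(status);
}
```

The child uses _exit() rather than exit() so that buffered output inherited from the parent is not flushed a second time.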

3.6.3 Using the exec Family of System Calls

The exec family of functions replaces the calling process image with a new process image. The fork() call creates a new process that is a duplication of the parent process, whereas the exec function replaces the duplicate process image with a new one. The new process image is a regular executable file and is immediately executed. The executable can be specified as a path or a filename. These functions can pass command-line arguments to the new process. Environment variables can also be specified. There is no return value if the function is successful because the process image that contained the call to the exec is overwritten. If unsuccessful, -1 is returned to the calling process.

All of the exec() functions can fail under these conditions:

  • Permissions are denied

    Search permission is denied for the executable's file directory

    Execution permission is denied for the executable file

  • Files do not exist

    Executable file does not exist

    Directory does not exist

  • File is not executable

    File is not executable because it is open for writing by another process

    File is not an executable file

  • Problems with symbolic links

    Loop exists when symbolic links are encountered while resolving the pathname to the executable

    Symbolic links cause the pathname to the executable to be too long

The exec functions are typically used together with fork(). The fork() creates and initializes the child process as a duplicate of the parent. The child process then replaces its process image by calling an exec() function. Example 3.2 shows an example of the fork-exec usage.

Example 3.2 Using the fork-exec system calls.
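    //...
    RtValue = fork();
    if(RtValue == 0){
       // child process: replace its image with the direct utility;
       // the argument list must end with a null pointer
       execl("/path/direct","direct",".",(char *)NULL);
    }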

In Example 3.2, the fork() function is called and the return value is stored in RtValue. If RtValue is 0, then the code is executing in the child process, and the execl() function is called. The first parameter is the path to the executable module, the second parameter is the execution statement, and the third parameter is the argument. The argument list passed to execl() must be terminated by a null pointer. direct is a utility that lists all the directories and subdirectories from a given directory. There are six versions of the exec functions, each having a different calling convention and use.
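A fuller version of the fork-exec pattern usually also handles the failure of execl() and lets the parent wait for the child. The following is a minimal sketch under stated assumptions: the direct utility is replaced by /bin/sh, which is assumed to exist, and the helper name run_with_execl is hypothetical.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs "sh -c command" in a child process created with
// fork() and execl(). Returns the exit status of the child, or -1 on error.
int run_with_execl(const char *command)
{
   pid_t pid = fork();
   if(pid == -1){
      return -1;                       // no child was created
   }
   if(pid == 0){
      // Child: path, execution statement, arguments, terminating null pointer.
      execl("/bin/sh", "sh", "-c", command, (char *)NULL);
      _exit(127);                      // execl() returns only on failure
   }
   int status = 0;
   if(waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)){
      return -1;
   }
   return WEXITSTATUS(status);
}
```

The exit status 127 in the child follows the shell convention for a command that could not be executed; it is a choice made for this sketch.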

3.6.3.1 execl() Functions

The execl(), execle(), and execlp() functions pass the command-line arguments as a list. The number of command-line arguments should be known at compile time in order for these functions to be useful.

  •  int execl(const char *path,const char *arg0,.../*, (char *)0 */); 

    path is the pathname to the program executable. It can be specified as an absolute pathname or a relative pathname from the current directory. The next arguments are the list of command-line arguments, from arg0 to argn. There can be n number of arguments. The list is to be followed by a NULL pointer.

  •  int execle(const char *path,const char *arg0,.../*, (char *)0, char *const envp[] */); 

    This function is identical to execl() except it has an additional parameter, envp[]. This parameter contains the new environment for the new process. envp[] is a pointer to a null-terminated array of null-terminated strings. Each string has the form:

     name=value 

    where name is the name of the environment variable and value is the string to be stored. envp[] can be assigned in this manner:

     char *const envp[] = {"PATH=/opt/kde2:/sbin", "HOME=/home",NULL}; 

    PATH and HOME are the environment variables in this case.

  •  int execlp(const char *file,const char *arg0,.../*, (char *)0 */); 

    file is the name of the program executable. The function uses the PATH environment variable to locate the executable. The remaining arguments list the command-line arguments as explained for the execl() function.

These are examples of the syntax of the execl() functions using these arguments:

    char *const args[] = {"direct",".",NULL};
    char *const envp[] = {"files=50",NULL};
    execl("/path/direct","direct",".",NULL);
    execle("/path/direct","direct",".",NULL,envp);
    execlp("direct","direct",".",NULL);

Each call shows the syntax with which an execl() function replaces the calling process image with the direct program. Because a successful call does not return, only one of these calls would be used at a given point in a program.

Synopsis

    #include <unistd.h>

    int execl(const char *path,const char *arg0,.../*,(char *)0 */);
    int execle(const char *path,const char *arg0,.../*,
               (char *)0,char *const envp[] */);
    int execlp(const char *file,const char *arg0,.../*,(char *)0 */);
    int execv(const char *path,char *const arg[]);
    int execve(const char *path,char *const arg[],
               char *const envp[]);
    int execvp(const char *file,char *const arg[]);
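The effect of the envp[] parameter of execle() can be sketched as follows. This is an illustrative example, assuming a POSIX shell at /bin/sh; the helper name run_with_single_variable and the variable GREETING are hypothetical. The child receives an environment that contains only the one string supplied by the caller.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs "sh -c command" with an environment made up of
// a single "name=value" string. Returns the exit status, or -1 on error.
int run_with_single_variable(const char *command, const char *assignment)
{
   // The new environment: one string followed by the terminating null pointer.
   char *const envp[] = {const_cast<char *>(assignment), NULL};
   pid_t pid = fork();
   if(pid == -1){
      return -1;
   }
   if(pid == 0){
      // envp follows the null pointer that ends the argument list.
      execle("/bin/sh", "sh", "-c", command, (char *)NULL, envp);
      _exit(127);                      // execle() returns only on failure
   }
   int status = 0;
   if(waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)){
      return -1;
   }
   return WEXITSTATUS(status);
}
```

For example, run_with_single_variable("test \"$GREETING\" = hello", "GREETING=hello") lets the shell confirm that the variable arrived in the new process image.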

3.6.3.2 execv() Functions

The execv(), execve(), and execvp() functions pass the command-line arguments in a vector of pointers to null-terminated strings. Because the arguments are passed in a vector rather than as a fixed list, these functions are useful when the number of command-line arguments is not known until runtime. argv[0] is usually the execution statement.

  •  int execv(const char *path,char *const arg[]); 

    path is the pathname to the program executable. It can be specified as an absolute pathname or relative pathname to the current directory. The next argument is the null-terminated vector that contains the command-line arguments as null-terminated strings. There can be n number of arguments. The vector is to be followed by a NULL pointer. arg[] can be assigned in this manner:

     char *const arg[] = {"traverse",".", ">","1000",NULL}; 

    This is an example of a function call:

     execv("traverse",arg); 

    In this case, the traverse utility will list all files in the current directory larger than 1000 bytes.

  •  int execve(const char *path,char *const arg[],char *const envp[]); 

    This function is identical to execv() except it has the additional parameter envp[] , described earlier.

  •  int execvp(const char *file,char *const arg[]); 

    file is the name of the program executable. The next argument is the null-terminated vector that contains the command-line arguments as null-terminated strings. There can be n number of arguments. The vector is to be followed by a NULL pointer.

These are examples of syntax of the execv() functions using these arguments:

    char *const arg[] = {"traverse",".", ">","1000",NULL};
    char *const envp[] = {"files=50",NULL};
    execv("/path/traverse",arg);
    execve("/path/traverse",arg,envp);
    execvp("traverse",arg);

Each call shows the syntax with which an execv() function replaces the calling process image with the traverse program. Because a successful call does not return, only one of these calls would be used at a given point in a program.
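Because the execv() functions take a vector, the argument list can be assembled while the program runs. The following is a minimal sketch, assuming C++11 for the usage shown afterward; the helper name run_argv is hypothetical.

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs the program named by args[0], located through
// PATH, with an argument vector built at runtime.
// Returns the exit status of the child, or -1 on error.
int run_argv(const std::vector<std::string> &args)
{
   if(args.empty()){
      return -1;
   }
   // Build the null-terminated vector of pointers that execvp() expects.
   std::vector<char *> argv;
   for(std::size_t i = 0; i < args.size(); i++){
      argv.push_back(const_cast<char *>(args[i].c_str()));
   }
   argv.push_back(NULL);

   pid_t pid = fork();
   if(pid == -1){
      return -1;
   }
   if(pid == 0){
      execvp(argv[0], &argv[0]);
      _exit(127);                      // execvp() returns only on failure
   }
   int status = 0;
   if(waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)){
      return -1;
   }
   return WEXITSTATUS(status);
}
```

A call such as run_argv({"sh", "-c", "exit 4"}) passes three arguments, but the same function works for any number of arguments collected at runtime.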

3.6.3.3 Determining Restrictions on exec() Functions

There is a limit on the size that argv[] and envp[] can have when passed to the exec() functions. The sysconf() function can be used to determine the maximum combined size of the command-line arguments and the environment variables for the exec() functions. To return the size, name should have the value _SC_ARG_MAX.

Synopsis

    #include <unistd.h>

    long sysconf(int name);

Another restriction when using exec() and the other functions used to create processes is the maximum number of simultaneous processes allowed per user id. To return this number, name has the value _SC_CHILD_MAX.
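Both limits can be queried as in the following minimal sketch. The structure ExecLimits and the function names are hypothetical. A return value of -1 from sysconf() is treated here as "no definite limit", which is an assumption that holds when errno is unchanged by the call.

```cpp
#include <unistd.h>

// Hypothetical container for the two limits discussed in this section.
struct ExecLimits
{
   long arg_max;     // bytes available for argv[] plus envp[], or -1
   long child_max;   // simultaneous processes per user id, or -1
};

// Queries both limits with sysconf().
ExecLimits query_exec_limits()
{
   ExecLimits limits;
   limits.arg_max = sysconf(_SC_ARG_MAX);
   limits.child_max = sysconf(_SC_CHILD_MAX);
   return limits;
}

// Hypothetical check: would arguments and environment of the given total
// size fit within the limit reported for _SC_ARG_MAX?
bool fits_within_arg_max(long bytes_needed)
{
   long limit = sysconf(_SC_ARG_MAX);
   return limit == -1 || bytes_needed <= limit;   // -1: no definite limit
}
```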

3.6.3.4 Reading and Setting Environment Variables

Environment variables are null-terminated strings that store system-dependent information such as paths to directories that contain commands, libraries, functions, and procedures used by a process. They can also be used to transmit any useful user-defined information between the parent and the child processes. They provide a mechanism for providing specific information to a process without having it hardcoded in the program code. System environment variables are predefined and common to all shells and processes in that system. The variables are initialized by startup files. Below are the common system variables:

    $HOME     The absolute pathname of your home directory
    $PATH     A list of directories to search for commands
    $MAIL     The absolute pathname of your mailbox
    $USER     Your user id
    $SHELL    The absolute pathname of your login shell
    $TERM     Your terminal type

They can be stored in a file or in an environment list. The environment list will contain pointers to null-terminated strings. The variable:

 extern char **environ 

points to the environment list when the process begins to execute. These strings will have the form:

 name=value 

as explained earlier. Processes initialized with the functions execl(), execlp(), execv(), and execvp() will inherit the environment of the parent process. Processes initialized with the functions execve() and execle() set the environment for the new process.
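The environment list can be walked through the environ pointer. The following is a minimal sketch; the function name environment_map is hypothetical, and entries without an equal sign are simply skipped.

```cpp
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>

extern char **environ;   // points to the environment list of the process

// Hypothetical helper: copies every "name=value" string of the environment
// list into a map keyed by variable name.
std::map<std::string, std::string> environment_map()
{
   std::map<std::string, std::string> result;
   if(environ == NULL){
      return result;
   }
   for(char **entry = environ; *entry != NULL; ++entry){
      const char *equals = std::strchr(*entry, '=');
      if(equals == NULL){
         continue;                     // not in name=value form
      }
      // Everything before the first '=' is the name; the rest is the value.
      result[std::string(*entry, equals - *entry)] = std::string(equals + 1);
   }
   return result;
}
```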

There are functions and utilities that can be called to examine, add, or modify environment variables. The getenv() is used to determine whether a specific variable has been set. The parameter name is the environment variable in question. The function will return NULL if the specified variable has not been set. If the variable has been set, the function will return a pointer to a string containing the value.

Synopsis

    #include <stdlib.h>

    char *getenv(const char *name);
    int setenv(const char *name, const char *value,
               int overwrite);
    int unsetenv(const char *name);

For example:

    string Path;
    const char *Value = getenv("PATH");   // NULL if PATH has not been set
    if(Value != NULL) Path = Value;

the string Path is assigned the value contained in the predefined environment variable PATH.

The setenv() function is used to change or add a variable to the environment of the calling process. The parameter name contains the name of the environment variable to be changed or added. It is assigned the value stored in value. If the name variable already exists, then the value is changed to value if the overwrite parameter is nonzero. If overwrite is 0, the content of the specified environment variable is not modified. setenv() returns 0 if it is successful and -1 if unsuccessful. The unsetenv() function removes the environment variable specified by name.
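The behavior of the overwrite parameter can be sketched as follows. The helper names env_or_default and overwrite_flag_is_honored are hypothetical, and the variable name passed in should be one that the program is free to modify.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the value of an environment variable, or
// fallback if getenv() reports that the variable has not been set.
std::string env_or_default(const char *name, const std::string &fallback)
{
   const char *value = getenv(name);
   return value != NULL ? std::string(value) : fallback;
}

// Hypothetical demonstration: sets, conditionally overwrites, and removes
// the variable called name. Returns true if every step behaved as described.
bool overwrite_flag_is_honored(const char *name)
{
   if(setenv(name, "first", 1) != 0) return false;
   if(setenv(name, "second", 0) != 0) return false;   // overwrite is 0: kept
   bool kept = (env_or_default(name, "") == "first");
   if(setenv(name, "second", 1) != 0) return false;   // nonzero: replaced
   bool replaced = (env_or_default(name, "") == "second");
   if(unsetenv(name) != 0) return false;
   bool removed = (getenv(name) == NULL);
   return kept && replaced && removed;
}
```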

3.6.4 Using system() to Spawn Processes

The system() function is used to execute a command or executable program. It performs a fork-exec and invokes a shell: system() executes a fork(), and the child process calls an exec() with a shell that executes the given command or program.

Synopsis

    #include <stdlib.h>

    int system(const char *string);

The string parameter can be a system command or the name of an executable file. If successful, the function returns the termination status of the command or return value (if any) of the program. Errors can happen at several levels: the fork() or exec() functions may fail, or the shell may not be able to execute the command or program.

The function returns a value to the parent process. The function returns 127 if the exec() fails and -1 if some other error occurs. The return code of the command is returned if the function succeeds. This function does not affect the wait status of any of the child processes.
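On POSIX systems the value returned by system() is encoded as a wait status, so the macros from <sys/wait.h> are commonly used to extract the exit code of the command. The following is a minimal sketch under that assumption; the helper name run_command is hypothetical.

```cpp
#include <cstdlib>
#include <sys/wait.h>

// Hypothetical helper: runs a command through system() and returns its exit
// code, or -1 if the command could not be started or was ended by a signal.
int run_command(const char *command)
{
   int status = system(command);
   if(status == -1){
      return -1;                       // the child could not be created
   }
   if(WIFEXITED(status)){
      return WEXITSTATUS(status);      // 127 means the shell could not run it
   }
   return -1;                          // terminated by a signal
}
```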

3.6.5 The POSIX Functions for Spawning Processes

Similar to the system() and fork-exec methods of process creation, the posix_spawn() functions create new child processes from specified process images. But the posix_spawn() functions create child processes with more fine-grained control. These functions control the attributes the child process inherits from the parent process including:

  • file descriptors

  • scheduling policy

  • process group id

  • user and group id

  • signal mask

They also control whether signals ignored by the parent will be ignored by the child or reset to a default action. Controlling file descriptors allows the child process independent access to the data streams opened by the parent. Being able to set the child's process group id affects how the child's job control will relate to that of the parent. The child's scheduling policy can be set to be different from the scheduling policy of the parent.

Synopsis

    #include <spawn.h>

    int posix_spawn(pid_t *restrict pid, const char *restrict path,
                    const posix_spawn_file_actions_t *file_actions,
                    const posix_spawnattr_t *restrict attrp,
                    char *const argv[restrict],
                    char *const envp[restrict]);
    int posix_spawnp(pid_t *restrict pid, const char *restrict file,
                     const posix_spawn_file_actions_t *file_actions,
                     const posix_spawnattr_t *restrict attrp,
                     char *const argv[restrict],
                     char *const envp[restrict]);

The difference between these two functions is posix_spawn() has a path parameter and posix_spawnp() has a file parameter. The path parameter in the posix_spawn() function is the absolute or relative pathname to the executable program file. The file parameter in the posix_spawnp() function is the name of the executable program. If the parameter contains a slash, then file will be used as a pathname. If not, then the path to the executable is determined by the PATH environment variable.

The file_actions parameter is a pointer to a posix_spawn_file_actions_t structure:

    struct posix_spawn_file_actions_t
    {
       int __allocated;
       int __used;
       struct __spawn_action *actions;
       int __pad[16];
    };

The posix_spawn_file_actions_t is a data structure that contains information about the actions to be performed in the new process with respect to file descriptors. The file_actions parameter is used to modify the parent's set of open file descriptors to a set of file descriptors for the spawned child process. This structure can contain several file action operations to be performed in the sequence in which they were added to the spawn file action object. These file action operations are performed on the open file descriptors of the parent process. These operations can duplicate, reset, add, delete, or close specified file descriptors on behalf of the child process even before it is spawned. If the file_actions parameter is a null pointer, then the file descriptors opened by the parent process will remain open for the child process without any modifications. Table 3-4 lists the functions used to add file actions to the posix_spawn_file_actions object.
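A file action can, for instance, redirect the standard output of the child to a file without affecting the parent. The following is a minimal sketch, assuming a POSIX shell at /bin/sh and a writable output path; the helper name spawn_capture_first_line is hypothetical.

```cpp
#include <fcntl.h>
#include <fstream>
#include <spawn.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

// Hypothetical helper: spawns "sh -c command" with the child's standard
// output opened on out_path, then returns the first line the child wrote.
// Returns an empty string if any step fails.
std::string spawn_capture_first_line(const char *command, const char *out_path)
{
   posix_spawn_file_actions_t actions;
   if(posix_spawn_file_actions_init(&actions) != 0){
      return "";
   }
   // In the child only: open out_path as descriptor 1 (standard output).
   int rc = posix_spawn_file_actions_addopen(&actions, STDOUT_FILENO, out_path,
                                             O_WRONLY | O_CREAT | O_TRUNC, 0644);
   pid_t pid = -1;
   if(rc == 0){
      char *const argv[] = {const_cast<char *>("sh"), const_cast<char *>("-c"),
                            const_cast<char *>(command), NULL};
      rc = posix_spawn(&pid, "/bin/sh", &actions, NULL, argv, environ);
   }
   posix_spawn_file_actions_destroy(&actions);
   if(rc != 0){
      return "";
   }
   int status = 0;
   if(waitpid(pid, &status, 0) == -1){
      return "";
   }
   std::ifstream in(out_path);         // the parent's own stdout is untouched
   std::string line;
   std::getline(in, line);
   return line;
}
```

The attrp parameter is passed as a null pointer here, so the child is spawned with default spawn attributes.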

The attrp parameter points to a posix_spawnattr_t structure:

    struct posix_spawnattr_t
    {
       short int __flags;
       pid_t __pgrp;
       sigset_t __sd;
       sigset_t __ss;
       struct sched_param __sp;
       int __policy;
       int __pad[16];
    };

This structure contains information about the scheduling policy, process group, signals, and flags for the new process. The descriptions of individual attributes are as follows:

    __flags     Used to indicate which process attributes are to be modified in the spawned process.
    __pgrp      The id of the process group to be joined by the new process.
    __sd        Represents the set of signals to be forced to use default signal handling by the new process.
    __ss        Represents the signal mask to be used by the new process.
    __sp        Represents the scheduling parameter to be assigned to the new process.
    __policy    Represents the scheduling policy to be used by the new process.

Table 3-4. Functions Used to Add File Actions to the posix_spawn_file_actions Object

File Action Attributes Functions

Descriptions

    int posix_spawn_file_actions_addclose(posix_spawn_file_actions_t *file_actions,
                                          int fildes);

Adds a close() action to the spawn file action object specified by file_actions. This causes the file descriptor fildes to be closed when the new process is spawned using this file action object.

    int posix_spawn_file_actions_addopen(posix_spawn_file_actions_t *file_actions,
                                         int fildes, const char *restrict path,
                                         int oflag, mode_t mode);

Adds an open() action to the spawn file action object specified by file_actions. This causes the file named path to be opened, with fildes as the resulting file descriptor, when the new process is spawned using this file action object.

    int posix_spawn_file_actions_adddup2(posix_spawn_file_actions_t *file_actions,
                                         int fildes, int newfildes);

Adds a dup2() action to the spawn file action object specified by file_actions. This causes the file descriptor fildes to be duplicated as the file descriptor newfildes when the new process is spawned using this file action object.

    int posix_spawn_file_actions_destroy(posix_spawn_file_actions_t *file_actions);

Destroys the specified file_actions object. This causes the object to be uninitialized. The object can then be reinitialized using posix_spawn_file_actions_init().

    int posix_spawn_file_actions_init(posix_spawn_file_actions_t *file_actions);

Initializes the specified file_actions object. Once initialized, it will contain no file actions to be performed.

The value of the __flags attribute is the bitwise-inclusive OR of zero or more of the following:

    POSIX_SPAWN_RESETIDS
    POSIX_SPAWN_SETPGROUP
    POSIX_SPAWN_SETSIGDEF
    POSIX_SPAWN_SETSIGMASK
    POSIX_SPAWN_SETSCHEDPARAM
    POSIX_SPAWN_SETSCHEDULER

Table 3-5 lists the functions used to set and retrieve the individual attributes contained in the posix_spawnattr_t structure.

Table 3-5. Functions Used to Set and Retrieve the Individual Attributes Contained in the posix_spawnattr_t Structure

Spawn Process Attributes functions

Descriptions

    int posix_spawnattr_getflags(const posix_spawnattr_t *restrict attr,
                                 short *restrict flags);

Returns the value of the __flags attribute stored in the specified attr object and stores it in the flags parameter.

    int posix_spawnattr_setflags(posix_spawnattr_t *attr, short flags);

Sets the value of the __flags attribute stored in the specified attr object to flags.

    int posix_spawnattr_getpgroup(const posix_spawnattr_t *restrict attr,
                                  pid_t *restrict pgroup);

Returns the value of the __pgrp attribute stored in the specified attr object and stores it in the pgroup parameter.

    int posix_spawnattr_setpgroup(posix_spawnattr_t *attr, pid_t pgroup);

Sets the value of the __pgrp attribute stored in the specified attr object to the pgroup parameter. The value is applied to the spawned process only if POSIX_SPAWN_SETPGROUP is set in the __flags attribute.

    int posix_spawnattr_getschedparam(const posix_spawnattr_t *restrict attr,
                                      struct sched_param *restrict schedparam);

Returns the value of the __sp attribute stored in the specified attr object and stores it in the schedparam parameter.

    int posix_spawnattr_setschedparam(posix_spawnattr_t *attr,
                                      const struct sched_param *restrict schedparam);

Sets the value of the __sp attribute stored in the specified attr object to the schedparam parameter. The value is applied to the spawned process only if POSIX_SPAWN_SETSCHEDPARAM is set in the __flags attribute.

    int posix_spawnattr_getschedpolicy(const posix_spawnattr_t *restrict attr,
                                       int *restrict schedpolicy);

Returns the value of the __policy attribute stored in the specified attr object and stores it in the schedpolicy parameter.

    int posix_spawnattr_setschedpolicy(posix_spawnattr_t *attr, int schedpolicy);

Sets the value of the __policy attribute stored in the specified attr object to the schedpolicy parameter. The value is applied to the spawned process only if POSIX_SPAWN_SETSCHEDULER is set in the __flags attribute.

    int posix_spawnattr_getsigdefault(const posix_spawnattr_t *restrict attr,
                                      sigset_t *restrict sigdefault);

Returns the value of the __sd attribute stored in the specified attr object and stores it in the sigdefault parameter.

    int posix_spawnattr_setsigdefault(posix_spawnattr_t *attr,
                                      const sigset_t *restrict sigdefault);

Sets the value of the __sd attribute stored in the specified attr object to the sigdefault parameter. The value is applied to the spawned process only if POSIX_SPAWN_SETSIGDEF is set in the __flags attribute.

    int posix_spawnattr_getsigmask(const posix_spawnattr_t *restrict attr,
                                   sigset_t *restrict sigmask);

Returns the value of the __ss attribute stored in the specified attr object and stores it in the sigmask parameter.

    int posix_spawnattr_setsigmask(posix_spawnattr_t *restrict attr,
                                   const sigset_t *restrict sigmask);

Sets the value of the __ss attribute stored in the specified attr object to the sigmask parameter. The value is applied to the spawned process only if POSIX_SPAWN_SETSIGMASK is set in the __flags attribute.

    int posix_spawnattr_destroy(posix_spawnattr_t *attr);

Destroys the specified attr object. The object can then be reinitialized using posix_spawnattr_init().

    int posix_spawnattr_init(posix_spawnattr_t *attr);

Initializes the specified attr object with default values for all of the attributes contained in the structure.
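The attribute functions are typically used in the sequence init, set, spawn, destroy. The following is a minimal sketch that places the child in a process group of its own, assuming a POSIX shell at /bin/sh; the helper name spawn_in_own_group is hypothetical. It also assumes that the process group has been set by the time posix_spawn() returns, which may not hold on every implementation.

```cpp
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

// Hypothetical helper: spawns a short-lived shell in a new process group and
// returns true if the group id of the child equals the PID of the child.
bool spawn_in_own_group()
{
   posix_spawnattr_t attr;
   if(posix_spawnattr_init(&attr) != 0){
      return false;
   }
   // A pgroup of 0 requests a new group whose id is the PID of the child.
   posix_spawnattr_setpgroup(&attr, 0);
   // The pgroup attribute is applied only if this flag is set.
   posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETPGROUP);

   short flags = 0;
   posix_spawnattr_getflags(&attr, &flags);      // read the flags back

   char *const argv[] = {const_cast<char *>("sh"), const_cast<char *>("-c"),
                         const_cast<char *>("exit 0"), NULL};
   pid_t pid = -1;
   int rc = posix_spawn(&pid, "/bin/sh", NULL, &attr, argv, environ);
   posix_spawnattr_destroy(&attr);
   if(rc != 0){
      return false;
   }
   pid_t group = getpgid(pid);         // query before the child is reaped
   int status = 0;
   waitpid(pid, &status, 0);
   return (flags & POSIX_SPAWN_SETPGROUP) != 0 && group == pid;
}
```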

Example 3.3 shows how the posix_spawn() function can be used to create a process.

Example 3.3 Spawning a process, using the posix_spawn() function, that calls the ps utility.
    #include <spawn.h>
    #include <stdio.h>
    #include <errno.h>
    #include <iostream>

    using namespace std;

    int main(void)
    {
       //...
       posix_spawnattr_t X;
       posix_spawn_file_actions_t Y;
       pid_t Pid;
       char *const argv[] = {"/bin/ps","-lf",NULL};
       char *const envp[] = {"PROCESSES=2",NULL};   // list must end with NULL
       posix_spawnattr_init(&X);
       posix_spawn_file_actions_init(&Y);
       posix_spawn(&Pid,"/bin/ps",&Y,&X,argv,envp);
       perror("posix_spawn");
       cout << "spawned PID: " << Pid << endl;
       //...
       return(0);
    }

In Example 3.3, the posix_spawnattr_t and posix_spawn_file_actions_t objects are initialized. The posix_spawn() function is called with the arguments &Pid, the path, &Y, &X, argv, which contains the command as the first element and the argument as the second, and envp, the environment list. If the posix_spawn() function is successful, then the value stored in Pid will be the PID of the spawned process. perror will display:

 posix_spawn: Success 

and the Pid is sent to output. The spawned process, in this case, executes:

 /bin/ps -lf 

These functions return the process id of the child process to the parent process in the pid parameter and return 0 as the return value. If the function is unsuccessful, no child process is created, thus no pid is returned, and an error value is returned as the return value of the function.

Errors can occur on three levels when using the spawn functions. An error can occur if the file_actions or attr objects are invalid. If this occurs after the function has successfully returned (the child process was spawned), then the child process may have an exit status of 127. If the spawn attribute functions cause an error, then the error produced for that particular function (listed in Tables 3-4 and 3-5) is returned. If the spawn function has already successfully returned, then the child process may have an exit status of 127.

Errors can also occur when attempting to spawn the child process. These errors would be the same errors produced by fork() or exec() functions. If they occur, they will be the return values for the spawn functions. If the child process produces an error, it is not returned to the parent process. In order for the parent process to be aware that the child has produced an error, other mechanisms would have to be used since it will not be stored in the child's exit status. Interprocess communication can be used or the child could set some flag visible to the parent.
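Both error paths can be checked by the parent: the return value of posix_spawn() and, if a child was created, its exit status. The following is a minimal sketch; the structure SpawnResult and the helper name spawn_shell are hypothetical, and whether a missing executable is reported through the return value or through exit status 127 is treated as implementation dependent.

```cpp
#include <cstdio>
#include <cstring>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

// Hypothetical result type for this sketch.
struct SpawnResult
{
   int spawn_error;   // value returned by posix_spawn(): 0 or an error number
   int exit_status;   // exit status of the child, or -1 if none was collected
};

// Hypothetical helper: spawns "shell_path -c command" and waits for it.
SpawnResult spawn_shell(const char *shell_path, const char *command)
{
   SpawnResult result = {0, -1};
   char *const argv[] = {const_cast<char *>("sh"), const_cast<char *>("-c"),
                         const_cast<char *>(command), NULL};
   pid_t pid = -1;
   result.spawn_error = posix_spawn(&pid, shell_path, NULL, NULL, argv, environ);
   if(result.spawn_error != 0){
      // The error number is the return value itself, so strerror() is
      // applied to it directly rather than to errno.
      std::fprintf(stderr, "posix_spawn: %s\n",
                   std::strerror(result.spawn_error));
      return result;
   }
   int status = 0;
   if(waitpid(pid, &status, 0) != -1 && WIFEXITED(status)){
      result.exit_status = WEXITSTATUS(status);   // 127 may mean a failed exec
   }
   return result;
}
```

An error produced by the command itself, such as a nonzero exit code, appears only in exit_status; spawn_error stays 0 because the child was created successfully.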

3.6.6 Identifying the Parent and Child with Process Management Functions

There are two functions that return the calling process's PID and the parent process's PID. getpid() returns the process id of the calling process. getppid() returns the parent id of the calling process. These functions are always successful; therefore no errors are defined.

Synopsis

    #include <unistd.h>

    pid_t getpid(void);
    pid_t getppid(void);
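The relationship between the two functions can be sketched as follows. The helper name child_sees_parent_pid is hypothetical; the parent stays alive until the child has exited, so the child is not adopted by another process while it runs.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns true if getppid() in the child reports the
// PID that getpid() returned in the parent before the fork().
bool child_sees_parent_pid()
{
   pid_t parent = getpid();            // PID of the calling process
   pid_t pid = fork();
   if(pid == -1){
      return false;
   }
   if(pid == 0){
      // Child: its parent id should be the PID recorded above.
      _exit(getppid() == parent ? 0 : 1);
   }
   int status = 0;
   if(waitpid(pid, &status, 0) == -1){
      return false;
   }
   return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```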



Parallel and Distributed Programming Using C++
ISBN: 0131013769
EAN: 2147483647
Year: 2002
Pages: 133
