In simple terms, a process is a single executable program that is running in its own address space.[10] It is distinct from a job or a command, which, on Unix systems, may be composed of many processes working together to perform a specific task. Simple commands like ls are executed as a single process. A compound command containing pipes will execute one process per pipe segment. For Unix systems, managing CPU resources must be done in large part by controlling processes, because the resource allocation and batch execution facilities available with other multitasking operating systems are underdeveloped or missing.
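To make the "one process per pipe segment" point concrete, here is a minimal Python sketch that builds the equivalent of the shell pipeline `printf 'b\na\n' | sort` by hand. The function name `run_pipeline` is hypothetical, and the sketch assumes that `printf` and `sort` are available as external commands on the search path.

```python
import subprocess

def run_pipeline():
    """Run the equivalent of: printf 'b\\na\\n' | sort, one process per segment."""
    # First segment: writes two lines into a pipe.
    producer = subprocess.Popen(["printf", r"b\na\n"], stdout=subprocess.PIPE)
    # Second segment: reads from that pipe and sorts the lines.
    consumer = subprocess.Popen(["sort"], stdin=producer.stdout,
                                stdout=subprocess.PIPE, text=True)
    producer.stdout.close()  # leave the consumer as the only reader of the pipe
    output, _ = consumer.communicate()
    producer.wait()
    return (producer.pid, consumer.pid), output

pids, output = run_pipeline()
print("PIDs of the two pipeline segments:", pids)
print(output, end="")
```

The two PIDs differ because each segment of the pipeline is a separate process, even though together they form a single command from the user's point of view.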
Unix processes come in several types. We'll look at the most common here.

2.2.1 Interactive Processes

Interactive processes are initiated from and controlled by a terminal session. Interactive processes may run either in the foreground or the background. Foreground processes remain attached to the terminal; the foreground process is the one with which the terminal communicates directly. For example, typing a Unix command and waiting for its output means running a foreground process. While a foreground process is running, it alone can receive direct input from the terminal. For example, if you run the diff command on two very large files, you will be unable to run another command until it finishes (or you kill it with CTRL-C).

Job control allows a process to be moved between the foreground and the background at will. For example, when a process is moved from the foreground to the background, the process is temporarily stopped, and terminal control returns to its parent process (usually a shell). The background job may be resumed and continue executing unattached to the terminal session that launched it. Alternatively, it may eventually be brought to the foreground, and once again become the terminal's current process. Processes may also be started initially as background processes.

Table 2-6 reviews the ways to control foreground and background processes provided by most current shells.

2.2.2 Batch Processes

Batch processes are not associated with any terminal. Rather, they are submitted to a queue, from which jobs are executed sequentially. Unix offers a very primitive batch command, but vendors whose customers require queuing have generally implemented something more substantial. Some of the best known are the Network Queuing System (NQS), developed by NASA and used on many high-performance computers including Crays, as well as several network-based process-scheduling systems from various vendors. These facilities usually support heterogeneous as well as homogeneous networks, and they attempt to distribute the aggregate CPU load evenly among the workstations in the network, a process known as load balancing or load leveling.

2.2.3 Daemons

Daemons are server processes, often initiated at boot time, that run continuously while the system is up, waiting in the background until a process requires their service.[11] For example, network daemons are idle until a process requests network access.
Table 2-7 provides a brief overview of the most important Unix daemons.
2.2.4 Process Attributes

Unix processes have many associated attributes. Some of the most important are:

2.2.4.1 The life cycle of a process

A new process is created in the following manner. An existing process makes an exact copy of itself, a procedure known as forking. The new process, called the child process, has the same environment as its parent process, although it is assigned a different process ID. Then, the image in the child process's address space is overwritten by the one the child will run; this is done via the exec system call. Hence, the often-used phrase fork-and-exec. The new program (or command) completely replaces the one duplicated from the parent. However, the environment of the parent still remains, including the values of environment variables; the assignments of standard input, standard output, and standard error; and its execution priority.

Let's make this picture a bit more concrete. What happens when a user runs a command like grep? First, the user's shell process forks, creating a new shell process to run the command. Then, the new shell process execs grep, which overlays the shell's executable image in memory with grep's, which begins executing. When the grep command finishes, the process dies.

This is the way that all Unix processes are created. The ultimate ancestor for every process on a Unix system is the process with PID 1, init, created during the boot process (see Chapter 4). init creates many other processes (all by fork-and-exec). Among them are usually one or more executing the getty program. The gettys are each assigned to a different serial line; they display the login prompt and wait for someone to respond to it. When someone does, the getty process execs the login program, which validates user logins, among other activities.[12]
Once the username and password are verified,[13] login execs the user's shell. Forking is not always required to run a new program, and login does not fork in this case. After logging in, the user's shell is the same process as the getty that was watching the unused serial line. That process changed programs twice by execing a new executable, and it will go on to create new processes to execute the commands that the user types. Figure 2-3 illustrates Unix process creation in the context of initial user login.
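The fork-and-exec sequence described above can be sketched in a few lines of Python using the standard os module, which wraps the underlying system calls. This is only an illustration of the mechanism, not how any particular shell is implemented; the function name `run_command` and the environment variable `DEMO_VAR` are hypothetical, and the sketch assumes `sh` is on the search path.

```python
import os
import sys

def run_command(argv):
    """Run argv as a shell would: fork, exec in the child, wait in the parent."""
    sys.stdout.flush()        # keep buffered output from being duplicated in the child
    pid = os.fork()           # the child is a copy of this process with a new PID
    if pid == 0:
        # Child: overwrite the copied image with the requested program.
        try:
            os.execvp(argv[0], argv)   # returns only if the exec fails
        except OSError:
            os._exit(127)              # conventional "command not found" status
    # Parent: wait for the child to exit and collect its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return pid, os.WEXITSTATUS(status)
    return pid, -os.WTERMSIG(status)   # child was killed by a signal

# The child inherits the parent's environment, as described in the text.
os.environ["DEMO_VAR"] = "inherited"
child_pid, exit_code = run_command(
    ["sh", "-c", 'echo "PID $$ sees DEMO_VAR=$DEMO_VAR"'])
print("parent", os.getpid(), "ran child", child_pid, "exit status", exit_code)
```

The PID that the exec'ed sh reports matches the PID returned by fork in the parent: exec changes the program a process is running without creating a new process, which is the same reason the getty, login, and login shell all share one PID.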
Figure 2-3. Unix process creation: fork and exec

When any process exits, it sends a signal to inform its parent process that it has completed. So, when a user logs out, her login shell sends a signal to its parent, init, as it dies, letting init know that it's time to create a new getty process for the terminal. init forks again and starts the getty, and the whole cycle repeats itself again and again as different users use that terminal.

2.2.4.2 Setuid and setgid file access and process execution

The purpose of the setuid and setgid access modes is to allow ordinary users to perform tasks requiring privileges and access rights that are ordinarily denied to them. For example, on many systems the write command is owned by the tty group, which also owns all of the terminal and pseudo-terminal device files. The write command has setgid access, allowing any user to use it to write a message to another user's terminal or window (to which they do not normally have any access). When users execute write, their effective GID is set to that of the group owner of the executable file (often /usr/bin/write) for the duration of the command.

Setuid and/or setgid access are also used by the printing subsystem, by programs like mailers, and by some other system facilities. However, setuid programs are also notorious security risks. In practice, setuid almost always means setuid to root, and the danger is that somehow, through program stupidity or their own cleverness or both, users will figure out a way to perform additional, unauthorized functions while the setuid command is running or to retain their inherited root status after the command ends. In general, setuid access should be avoided since it involves greater security risks than setgid, and almost any function can be performed by using the latter in conjunction with carefully designed groups. See Chapter 7 for a more detailed discussion of the security issues involved with setuid and setgid programs.

Keep in mind, though, that while setgid programs are safer than setuid ones, they are not risk-free themselves.

2.2.4.3 The relationship between commands and files

The Unix operating system does not distinguish between commands and files in the ways that some systems do. Aside from a few commands that are built into each Unix shell, Unix commands are executable files stored in one of several standard locations within the filesystem. Access to commands is exactly equivalent to access to these files. By default, there is no other privilege mechanism. Even I/O is handled via special files, stored in the directory /dev, which function as interfaces to the device drivers. All I/O operations look just like ordinary file operations from the user's point of view.

Unix shells use search paths to locate the executable images for commands that users enter. In its simplest form, a search path is simply an ordered list of directories in which to look for command executables, and it is typically set in an initialization file ($HOME/.profile or $HOME/.login). A faulty (incomplete) search path is the most common cause for "Command not found" error messages.

Search paths are stored in the PATH environment variable. Here is a typical PATH:

$ echo $PATH
/bin:/usr/ucb:/usr/bin:/usr/local/bin:.:$HOME/bin

The various directories in the PATH are separated by colons. The search path is used whenever a command name is entered without an explicit directory location. As an example, consider the following command:

$ od data.raw

The od command is used to display a raw dump of a file. To locate this command, the shell first looks for a file named od in /bin. If such a file exists, it is executed. If there is no od file in the /bin directory, /usr/ucb is checked next, followed by /usr/bin (where od is in fact usually located). If it were necessary, the search would continue in /usr/local/bin, the current directory, and finally the bin subdirectory of the user's home directory.
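The lookup just described can be sketched as a short Python function. It is a simplified model of the search, not the code any real shell uses; the name `find_command` is hypothetical, and the demonstration builds two throwaway directories holding a dummy od file rather than touching the real system directories.

```python
import os
import tempfile

def find_command(name, search_path):
    """Return the first executable called name in a colon-separated search path."""
    if "/" in name:
        # An explicit directory location bypasses the search path entirely.
        return name if os.access(name, os.X_OK) else None
    for directory in search_path.split(":"):
        directory = directory or "."   # an empty entry means the current directory
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate           # first match wins; later directories are ignored
    return None

# Two directories that both contain an executable named od.
with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    for directory in (first, second):
        path = os.path.join(directory, "od")
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\n")
        os.chmod(path, 0o755)
    found_first = find_command("od", first + ":" + second)
    found_second = find_command("od", second + ":" + first)
    missing = find_command("no-such-command", first + ":" + second)

print(found_first)
print(found_second)
print(missing)
```

Swapping the order of the two directories changes which copy of od is found, and a name that appears in none of them yields no result at all, which is the situation behind a "Command not found" message. Python's standard library provides shutil.which for real use.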
The order of the directories in the search path is important when more than one version of a command exists. Such effects come into play most frequently when both the BSD and the System V versions of commands are available on a system. In this case, you should put the directory holding the versions you want to use first in your search path. For example, if you want to use the BSD versions of commands such as ls and ln on a System V-based system, then put /usr/ucb ahead of /usr/bin in your search path. Similarly, if you want to use the System V-compatible commands available on some systems, put /usr/5bin ahead of /usr/bin and /usr/ucb in your search path. These same considerations will obviously apply to users' search paths that you define for them in their initialization files (see Section 4.2).

Most of the Unix administrative utilities are located in the directories /sbin and /usr/sbin. However, the locations of administrative commands can vary widely between Unix versions. These directories typically aren't in the search path unless you put them there explicitly. When executing administrative commands, you can either add these directories to your search path or provide the full pathname for the command, as in the example below:

# /usr/sbin/ping hamlet

I'm going to assume in my examples that the administrative directories have been added to the search path. Thus, I won't be including the full pathname for any of the commands I'll be discussing.