We mentioned earlier that typing CTRL-Z to suspend a job is similar to typing CTRL-C to stop a job, except that you can resume the job later. They are actually similar in a deeper way: both are particular cases of the act of sending a signal to a process.
A signal is a message that one process sends to another when some abnormal event takes place or when it wants the other process to do something. Most of the time, a process sends a signal to a subprocess it created. You're undoubtedly already comfortable with the idea that one process can communicate with another through an I/O pipeline; think of a signal as another way for processes to communicate with each other. (In fact, any textbook on operating systems will tell you that both are examples of the general concept of interprocess communication, or IPC.)
Depending on the version of UNIX, there are two or three dozen types of signals, including a few that can be used for whatever purpose a programmer wishes. Signals have numbers (from 1 to the number of signals the system supports) and names; we'll use the latter. You can get a list of all the signals on your system, by name and number, by typing kill -l. Bear in mind, when you write shell code involving signals, that signal names are more portable to other versions of UNIX than signal numbers.
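As a small illustration of why names travel better than numbers, the following sketch lists the signals and then converts between a name and its number. It assumes bash's built-in kill, which accepts a signal name or number after -l; the variable names are only for illustration.

```shell
# Print every signal the system supports, by number and name.
kill -l

# Assumption: bash's built-in kill translates in both directions.
num=$(kill -l INT)        # name -> number
name=$(kill -l "$num")    # number -> name
echo "INT is signal $num on this system; signal $num is $name"
```

Code that says INT keeps working even on a system where the number differs.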
8.3.1. Control-Key Signals
When you type CTRL-C, you tell the shell to send the INT (for "interrupt") signal to the current job; CTRL-Z sends TSTP (on most systems, for "terminal stop"). You can also send the current job a QUIT signal by typing CTRL-\ (control-backslash); this is sort of like a "stronger" version of CTRL-C. You would normally use CTRL-\ when (and only when) CTRL-C doesn't work.
As we'll see soon, there is also a "panic" signal called KILL that you can send to a process when even CTRL-\ doesn't work. But it isn't attached to any control key, which means that you can't use it to stop the currently running process. INT, TSTP, and QUIT are the only signals you can use with control keys.
You can customize the control keys used to send signals with options of the stty command. These vary from system to system (consult your manpage for the command), but the usual syntax is stty signame char. signame is a name for the signal that, unfortunately, is often not the same as the names we use here. Table 1-7 in Chapter 1 lists stty names for signals found on all versions of UNIX. char is the control character, which you can give using the convention that ^ (circumflex) represents "control." For example, to set your INT key to CTRL-X on most systems, use:
stty intr ^X
Now that we've told you how to do this, we should add that we don't recommend it. Changing your signal keys could lead to trouble if someone else has to stop a runaway process on your machine.
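If you do experiment with signal keys, it may help to save the terminal settings first so that you can put them back. The sketch below assumes your stty supports the -g option (which prints the settings in a form stty can read back) and that standard input is a terminal; outcome and saved are illustrative variable names.

```shell
if [ -t 0 ]; then
    saved=$(stty -g)     # current settings, in a form stty can re-read
    stty intr '^X'       # INT is now sent by CTRL-X
    stty -a              # the listing should now show the new intr key
    stty "$saved"        # restore the original settings
    outcome=restored
else
    outcome=skipped      # standard input is not a terminal
fi
echo "$outcome"
```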
Most of the other signals are used by the operating system to advise processes of error conditions, like a bad machine code instruction, bad memory address, or division by zero, or "interesting" events such as a timer ("alarm") going off. The remaining signals are used for esoteric error conditions of interest only to low-level systems programmers; newer versions of UNIX have even more signal types.
You can use the built-in shell command kill to send a signal to any process you created, not just the currently running job. kill takes as an argument the process ID, job number, or command name of the process to which you want to send the signal. By default, kill sends the TERM ("terminate") signal, which usually has the same effect as the INT signal you send with CTRL-C. But you can specify a different signal by using the signal name (or number) as an option, preceded by a dash.
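A minimal sketch of the two forms, using process IDs so that it also runs inside a script: sleep 300 stands in for a real job, and the variable names are illustrative. Both background processes receive TERM, once by default and once by name.

```shell
sleep 300 &
first=$!
sleep 300 &
second=$!

kill "$first"           # no option: kill sends TERM
kill -TERM "$second"    # the same signal, named explicitly

# Collect each exit status; "|| ...=$?" records a nonzero status.
first_status=0;  wait "$first"  || first_status=$?
second_status=0; wait "$second" || second_status=$?
echo "exit statuses: $first_status $second_status"
```

The two statuses should match, since both processes were ended by the same signal.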
kill is so named because of the nature of the default TERM signal, but there is another reason, which has to do with the way UNIX handles signals in general. The full details are too complex to go into here, but the following explanation should suffice.
Most signals cause a process that receives them to die; therefore, if you send any one of these signals, you "kill" the process that receives it. However, programs can be set up to trap specific signals (see Section 8.4) and take some other action. For example, a text editor would do well to save the file being edited before terminating when it receives a signal such as INT, TERM, or QUIT. Determining what to do when various signals come in is part of the fun of UNIX systems programming.
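To make the idea concrete, here is a rough sketch of a background task that "saves its work" when it receives TERM instead of simply dying. It borrows the trap built-in, which Section 8.4 covers properly; the temporary file and the variable names are assumptions made for the example.

```shell
state=$(mktemp)    # scratch file standing in for "the file being edited"

(
    # On TERM, record that we cleaned up, then exit normally.
    trap 'echo saved > "$state"; exit 0' TERM
    echo ready > "$state"
    while :; do sleep 1; done
) &
pid=$!

# Wait until the background task has installed its trap.
until [ "$(cat "$state")" = ready ]; do sleep 1; done

kill "$pid"        # send TERM
wait "$pid"        # the task exits with status 0 after cleaning up
result=$(cat "$state")
rm -f "$state"
echo "background task reported: $result"
```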
Here is an example of kill. Say you have an alice process in the background, with process ID 150 and job number 1, which needs to be stopped. You would start with this command:
$ kill %1
If you were successful, you would see a message like this:
[1]+ Terminated alice
If you don't see this, then the TERM signal failed to terminate the job. The next step would be to try QUIT:
$ kill -QUIT %1
If that worked, you would see this message:
[1]+ Exit 131 alice
The 131 is the exit status returned by alice. But if even QUIT doesn't work, the "last-ditch" method would be to use KILL:
$ kill -KILL %1
This produces the message:
[1]+ Killed alice
It is impossible for a process to trap a KILL signal (see Section 8.4); the operating system should terminate the process immediately and unconditionally. If it doesn't, then either your process is in one of the "funny states" we'll see later in this chapter, or (far less likely) there's a bug in your version of UNIX.
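The 131 in the QUIT example is not arbitrary. As a sketch, assuming the common shell convention that a process ended by a signal is reported with status 128 plus the signal number, and assuming bash's kill -l accepts such a status, you can decode it like this:

```shell
status=131
signum=$((status - 128))        # 131 - 128 = 3
signame=$(kill -l "$status")    # ask the shell which signal that is
echo "exit status $status = 128 + $signum (signal $signame)"
```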
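Here's another example. Suppose you want a command that kills all of your background jobs, passing along any options (such as a signal name) that you give it.
The solution to this task is simple, relying on jobs -p, which prints the process IDs of the shell's jobs: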
kill "$@" $(jobs -p)
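One way to package that line is as a shell function. The sketch below uses the hypothetical name killalljobs and adds a guard, not in the one-liner above, so that kill is not run with an empty list when there are no jobs.

```shell
# killalljobs [kill options]: signal every background job of this shell.
killalljobs() {
    pids=$(jobs -p)
    [ -n "$pids" ] || return 0    # nothing to do
    kill "$@" $pids               # unquoted on purpose: one argument per PID
}

sleep 300 &
p1=$!
sleep 300 &
p2=$!

killalljobs -TERM
wait                              # let the shell collect both jobs

# Count how many of the two processes still exist.
alive=0
for p in "$p1" "$p2"; do
    if kill -0 "$p" 2>/dev/null; then alive=$((alive + 1)); fi
done
echo "processes still alive: $alive"
```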
You may be tempted to use the KILL signal immediately, instead of trying TERM (the default) and QUIT first. Don't do this. TERM and QUIT are designed to give a process the chance to "clean up" before exiting, whereas KILL will stop the process, wherever it may be in its computation. Use KILL only as a last resort!
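The TERM, QUIT, KILL sequence can be sketched as a function. The name stop_gently and the one-second pause are assumptions for illustration; kill -0 sends no signal and merely tests whether the process still exists.

```shell
# stop_gently PID: try TERM, then QUIT, and KILL only as a last resort.
stop_gently() {
    pid=$1
    for sig in TERM QUIT KILL; do
        # If the signal cannot be delivered, the process is already gone
        # (or is not ours to signal).
        kill -"$sig" "$pid" 2>/dev/null || return 0
        sleep 1                                   # give it time to clean up
        kill -0 "$pid" 2>/dev/null || return 0    # gone: stop escalating
    done
    return 1                                      # survived even KILL
}

sleep 300 &
victim=$!
stop_gently "$victim" || echo "process $victim is still running"
status=0
wait "$victim" || status=$?
echo "exit status: $status"
```

Because the function stops as soon as the process disappears, a well-behaved program never sees anything stronger than TERM.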
You can use the kill command with any process you create, not just jobs in the background of your current shell. For example, if you use a windowing system, then you may have several terminal windows, each of which runs its own shell. If one shell is running a process that you want to stop, you can kill it from another window, but you can't refer to it with a job number because it's running under a different shell. You must instead use its process ID.
This is probably the only situation in which a casual user would need to know the ID of a process. The command ps gives you this information; however, it can give you lots of extra information as well.
ps is a complex command. It takes several options, some of which differ from one version of UNIX to another. To add to the confusion, you may need different options on different UNIX versions to get the same information! We will use options available on the two major types of UNIX systems, those derived from System V (such as many of the versions for Intel Pentium PCs, as well as IBM's AIX and Hewlett-Packard's HP-UX) and BSD (Mac OS X, SunOS, BSD/OS). If you aren't sure which kind of UNIX version you have, try the System V options first.
You can invoke ps in its simplest form without any options. In this case, it will print a line of information about the current login shell and any processes running under it (i.e., background jobs). For example, if you were to invoke three background jobs, as we saw earlier in the chapter, the ps command on System V-derived versions of UNIX would produce output that looks something like this:
PID   TTY     TIME  COMD
146   pts/10  0:03  -bash
2349  pts/10  0:03  alice
2367  pts/10  0:17  hatter
2389  pts/10  0:09  duchess
2390  pts/10  0:00  ps
The output on BSD-derived systems looks like this:
PID   TT  STAT  TIME  COMMAND
146   10  S     0:03  /bin/bash
2349  10  R     0:03  alice
2367  10  D     0:17  hatter teatime
2389  10  R     0:09  duchess
2390  10  R     0:00  ps
(You can ignore the STAT column.) This is a bit like the jobs command. PID is the process ID; TTY (or TT) is the terminal (or pseudo-terminal, if you are using a windowing system) the process was invoked from; TIME is the amount of processor time (not real or "wall clock" time) the process has used so far; COMD (or COMMAND) is the command. Notice that the BSD version includes the command's arguments, if any; also notice that the first line reports on the parent shell process, and in the last line, ps reports on itself.
ps without arguments lists all processes started from the current terminal or pseudo-terminal. But since ps is not a shell command, it doesn't correlate process IDs with the shell's job numbers. It also doesn't help you find the ID of the runaway process in another shell window.
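Within a single shell, the jobs command itself can bridge that gap: its -l option adds the process ID to each job line. A minimal sketch, with sleep 300 standing in for a real background job and illustrative variable names:

```shell
sleep 300 &
pid=$!

# -l adds the process ID to the usual job number, state, and command.
listing=$(jobs -l)
echo "$listing"

kill "$pid"    # clean up the example job
```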
To get this information, use ps -a (for "all"); this lists information on a different set of processes, depending on your UNIX version.
8.3.3.1. System V
Instead of listing all processes that were started under a specific terminal, ps -a on System V-derived systems lists all processes associated with any terminal that aren't group leaders. For our purposes, a "group leader" is the parent shell of a terminal or window. Therefore, if you are using a windowing system, ps -a lists all jobs started in all windows (by all users), but not their parent shells.
Assume that, in the previous example, you have only one terminal or window. Then ps -a will print the same output as plain ps except for the first line, since that's the parent shell. This doesn't seem to be very useful.
But consider what happens when you have multiple windows open. Let's say you have three windows, all running terminal emulators like xterm for the X Window System. You start background jobs alice, duchess, and hatter in windows with pseudo-terminal numbers 1, 2, and 3, respectively. This situation is shown in Figure 8-1.
Figure 8-1. Background jobs in multiple windows
Assume you are in the uppermost window. If you type ps, you will see something like this:
PID   TTY    TIME  COMD
146   pts/1  0:03  bash
2349  pts/1  0:03  alice
2390  pts/1  0:00  ps
But if you type ps -a, you will see this:
PID   TTY    TIME  COMD
146   pts/1  0:03  bash
2349  pts/1  0:03  alice
2367  pts/2  0:17  duchess
2389  pts/3  0:09  hatter
2390  pts/1  0:00  ps
Now you should see how ps -a can help you track down a runaway process. If it's hatter, you can type kill 2389. If that doesn't work, try kill -QUIT 2389, or in the worst case, kill -KILL 2389.
On BSD-derived systems, ps -a lists all jobs that were started on any terminal; in other words, it's a bit like concatenating the results of plain ps for every user on the system. Given the above scenario, ps -a will show you all processes that the System V version shows, plus the group leaders (parent shells).
Unfortunately, ps -a (on any version of UNIX) will not report processes that are in certain conditions where they "forget" things like what shell invoked them and what terminal they belong to. Such processes are known as "zombies" or "orphans." If you have a serious runaway process problem, it's possible that the process has entered one of these states.
Let's not worry about why or how a process gets this way. All you need to understand is that the process doesn't show up when you type ps -a. You need another option to ps to see it: on System V, it's ps -e ("everything"), whereas on BSD, it's ps -ax.
These options tell ps to list processes that either weren't started from terminals or "forgot" what terminal they were started from. The former category includes lots of processes that you probably didn't even know existed: these include basic processes that run the system and so-called daemons (pronounced "demons") that handle system services like mail, printing, network filesystems, etc.
In fact, the output of ps -e or ps -ax is an excellent source of education about UNIX system internals, if you're curious about them. Run the command on your system and, for each line of the listing that looks interesting, invoke man on the process name or look it up in the UNIX Programmer's Manual for your system.
User shells and processes are listed at the very bottom of ps -e or ps -ax output; this is where you should look for runaway processes. Notice that many processes in the listing have ? instead of a terminal. Either these aren't supposed to have one (such as the basic daemons) or they're runaways. Therefore it's likely that if ps -a doesn't find a process you're trying to kill, ps -e (or ps -ax) will list it with ? in the TTY (or TT) column. You can determine which process you want by looking at the COMD (or COMMAND) column.
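Scanning a long listing by eye gets tedious, so here is a rough sketch that filters ps -e output by command name. It assumes a ps that accepts the System V style -e option together with -o to choose columns (a trailing = suppresses the column header); sleep 300 stands in for the runaway process.

```shell
sleep 300 &
target=$!

# Columns: process ID, terminal, command name.
# Keep only the lines whose command name ends in "sleep".
matches=$(ps -e -o pid= -o tty= -o comm= | awk '$3 ~ /sleep$/')
echo "$matches"

kill "$target"    # clean up the example process
```

Each line that survives the filter begins with a process ID you could hand to kill; check the terminal column to make sure you pick the right one.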