Processes | Moving to Ubuntu Linux

You are going to hear a lot about processes, process status, monitoring processes, or killing processes. Reducing the whole discussion to its simplest form, all you have to remember is that any command you run is a process. Processes are also sometimes referred to as jobs.

Question: So what constitutes a process?

Answer: Everything.

The session program that executes your typed commands (the shell) is a process. The tools I am using to write this chapter are creating several processes. Every terminal session you have open, every link to the Internet, every game you have running: all these programs generate one or more processes on your system. In fact, there can be hundreds, even thousands, of processes running on your system at any given time. To see your own processes, try the following command:

# ps
  PID TTY          TIME CMD
 3119 pts/11   00:00:00 su
 3120 pts/11   00:00:00 bash
 3132 pts/11   00:00:00 ps

For a bit more detail, try using the u option. This shows all processes owned by you that currently have a controlling terminal. Even if you are running as root, you do not see system processes in this view. If you add the a option to that, you see every process attached to a terminal, whoever owns it. In this case, that reveals the subshell that did the su to root.

# ps au
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      4755  3.4  9.1  65376 41288 tty7     Ss+  Feb21 858:59 /usr/X11R6/bin/
mgagne   24156  0.0  0.1   4540   824 pts/2    Ss   Mar03   0:00 /bin/bash
mgagne   24449  0.0  0.2   4540  1200 pts/3    Ss   Mar03   0:00 /bin/bash
mgagne   24462  1.3  3.5  63260 15916 pts/3    Sl   Mar03 132:09 ./skype
mgagne   10069  0.0  0.2   4540  1148 pts/5    Ss   Mar07   0:00 /bin/bash
mgagne   15479  0.0  0.1   5096   824 pts/5    S+   Mar07   0:00 ssh -X -l mgagn
root      3119  0.0  0.2   3696  1192 pts/11   S    15:31   0:00 su
root      3120  0.0  0.4   4024  1876 pts/11   S    15:31   0:00 bash
root      3134  0.0  0.2   2396  1020 pts/11   R+   15:31   0:00 ps au
mgagne   26060  0.0  0.2   4544  1188 pts/1    Ss+  Mar09   0:00 /bin/bash

The most common thing you will do is add an x option, which also includes processes that have no controlling terminal. Combined with a, it shows all processes, controlled by a terminal or not, including those of other users. The administrator also wants to know about the l option, which stands for long. It is particularly useful because it shows the parent of every process: every process has another process that launched (or spawned) it, known as its parent process. In sysadmin short form, the parent's process ID is the PPID of the PID. When your system starts up, the first process is called init. It is the master process and the superparent of every process that comes along until the system is rebooted. Try this incarnation of the ps command for an interesting view of your system:

# ps alxww | more
F UID  PID PPID PRI NI  VSZ RSS WCHAN  STAT TTY TIME COMMAND
4   0    1    0  16  0 1568 524 -      S    ?   0:04 init [2]
1   0    2    1  34 19    0   0 ksofti SN   ?   0:00 [ksoftirqd/0]
1   0  653    6  10 -5    0   0 serio_ S<   ?   0:00
5   0 1886    1  17 -4 2424 892 -      S<s  ?   0:01 /sbin/udevd --daemon
1   0 2630    1  20  0    0   0 -      S    ?   0:00 [shpchpd_event]
1   0 2643    6  10 -5    0   0 gamepo S<   ?   0:00 [kgameportd]
4   0 3371    1  15  0 1680 488 syslog Ss   ?   0:00 /bin/dd bs 1 if /proc/kmsg of /var/run/klogd/kmsg
1 103 3373    1  17  0 2424 956 pipe_w Ss   ?   0:00 /sbin/klogd -P /var/run/klogd/kmsg
5 104 3392    1  16  0 2188 836 -      Ss   ?   0:03 /usr/bin/dbus-daemon --system
4 108 3428 3408  16  0 2008 860 -      S    ?   0:26 /usr/lib/hal/hald-addon-storage
1   0 3447    1  15  0 1928 656 -      Ss   ?   0:00 /sbin/dhcdbd --system

Again, this is a partial listing. You noticed, of course, that I threw a couple of new flags in there. The double w, or ww, displays each process's command-line options. A single w truncates the options at half a line.

The columns you see tell you a bit more about each process. The F field indicates the process flag. A 040 in that position indicates a process that forked, but didn't exec, whereas a 140 means the same, but that superuser privileges were used to start the process. The UID field represents the user ID, whereas PID and PPID are the process and parent process ID that I covered earlier. PRI and NI (priority and nice number) are featured later when I discuss performance issues. In fact, there are quite a number of information flags for the ps command. Every system administrator should take some time to read the man page. More importantly, play with the command and the various flags. You will be enlightened.
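As a small illustration of the PID and PPID relationship, here is a minimal sketch that asks ps for only a few columns. It assumes the procps version of ps found on Ubuntu, where the -o option selects output columns and -p selects processes by ID. The variable name parent is made up for this example.

```shell
# The shell keeps its own PID in $$ and its parent's PID in $PPID.
echo "shell PID: $$  parent PID: $PPID"

# Ask ps for selected columns only, for this shell and its parent.
# -o picks the columns; -p picks the processes by PID.
ps -o pid,ppid,stat,comm -p "$$,$PPID"

# "ppid=" prints the column without a header, handy for capturing the value.
parent=$(ps -o ppid= -p "$$" | tr -d ' ')
echo "ps reports the parent of $$ as $parent"
```

The value that ps reports should match the shell's own $PPID variable.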

Forests and Trees

With all the information displayed through ps, you are forgiven if your head is starting to hurt. It is a little like trying to see the forest but being overwhelmed by the sheer number of trees. And yet, all these processes are linked in some way. Luckily, your stock Linux distribution contains tools to make this easier. One of them is called pstree. Here's a sample of what you get by simply typing the command and pressing <Enter>:

init--|--NetworkManager
      |--NetworkManagerD
      |--atd
      |--bonobo-activati
      |--clock-applet
      |--cron
      |--cupsd
      |--2*[dbus-daemon]
      |--dbus-launch
      |--dd
      |--dhcdbd
      |--esd
      |--events/0
      |--fish-applet-2
      |--gconfd-2
      |--gdm----gdm--|--Xorg
                     |-x-session-manag----ssh-agent
      |--6*[getty]
      |--gksu----synaptic
      |--gnome-cups-icon----{gnome-cups-icon}
      |--gnome-keyring-d
      |--gnome-panel----{gnome-panel}
      |--gnome-power-man
      |--gnome-screensav----fuzzyflakes
      |--gnome-settings----{gnome-settings-}
      |--gnome-terminal--|--bash
                         |--gnome-pty-helpe
                         |--{gnome-terminal}
      |--gnome-vfs-daemo----{gnome-vfs-daemo}

This is only a partial listing, but notice that everything on the system stems from one super, ancestral process called init. Somewhere under there, I have a login that spawns a shell. From that shell, I start an X window session, from which spawns my GNOME display manager, then my login, and so on.

If you want a similar output, but in more detail, you can go back to your old friend, the ps command. Try the f flag, which in this case stands for forest, as in forest view. The following output is the result of my running ps axf. Again, this is a partial listing, but unlike the pstree listing, you also get process IDs, running states, and so on.

$ ps axf
 3356 ?        Ss     0:00 /bin/dd bs 1 if /proc/kmsg of /var/run/klogd/kmsg
 3358 ?        Ss     0:00 /sbin/klogd -P /var/run/klogd/kmsg
 3377 ?        Ss     0:00 /usr/bin/dbus-daemon --system
 3392 ?        Ss     0:02 /usr/sbin/hald
 3393 ?        S      0:00  \_ hald-runner
 3410 ?        S      0:49      \_ /usr/lib/hal/hald-addon-storage
 3666 ?        Ss     0:00 /usr/sbin/gdm
 3674 ?        S      0:03  \_ /usr/sbin/gdm
 3679 tty7     Rs+ 1548:41      \_ /usr/bin/X :0 -br -audit 0 -auth /var/lib/gd
 4850 ?        Ss     0:02      \_ x-session-manager
 4892 ?        Ss     0:00          \_ /usr/bin/ssh-agent /usr/bin/dbus-launch -
 3737 ?        Ssl    0:00 /usr/sbin/hpiod
 3746 ?        S      0:01 python /usr/sbin/hpssd

In the Linux world, you can find a number of programs devoted to deciphering those numbers. They make it possible to find out what processes are doing and how much time and resources they are using to do it, and to manage the resulting information.

Interrupting, Suspending, and Restarting Processes

Once in a while, I start a process that I think is going to take a few seconds, like parsing a large log file, scanning for some text, extracting something else, sorting the output, and finally sending the whole thing to a file. All of these are very ad hoc in terms of reporting. The trouble is this: Two and a half minutes go by and I start to get a little impatient. Had I thought that the process would take a while, I might have started it in the background.

When you start a process (by typing a command name and pressing <Enter>), you normally start that process in the foreground. In other words, your terminal is still controlling the process and the cursor sits there at the end of the line until the process completes. At that point, it returns to the command or shell prompt. For most (not all) processes, you can run things in the background, thus immediately freeing up your command line for the next task. You do this by adding an ampersand (&) to the end of the command before you press <Enter>.

$ sh long_process &
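To make the background idea concrete, here is a minimal sketch in which sleep stands in for the long-running command. The variable names bg_pid and status are made up for this example; $! and wait are standard shell features.

```shell
# Start a stand-in for a long-running command in the background.
sleep 2 &
bg_pid=$!            # $! holds the PID of the most recent background command
echo "started background process $bg_pid"

# The shell is free to do other work while the background process runs.
echo "doing something else in the meantime"

# wait blocks until the given process finishes and returns its exit status.
wait "$bg_pid"
status=$?
echo "process $bg_pid finished with exit status $status"
```

In an interactive shell you would rarely call wait yourself; the shell tells you when a background job is done.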

However, I've already confessed that I wasn't thinking that far ahead and as a result, I am sitting looking at a flashing cursor wondering if I did something wrong and just how long this process will take. Now, I don't want to end the process, but I would like to temporarily pause it so I can look at its output and decide whether I want to continue. As it turns out, I can do precisely that with a running process by pressing <Ctrl+Z>.

$ sh long_process
Ctrl-Z
[1]+  Stopped                 sh long_process

The process is now suspended. In fact, if you do a ps ax and you look for long_process, you see this:

 5328 ?        RN   2267:04 ./setiathome -nice 19
11127 tty1     S      0:00 rxvt -bg black -fg white -fn fixed
11128 pts/0    S      0:00 bash
11139 pts/0    S      0:00 ssh -l www website
11177 ?        S      0:00 smbd -D
11178 ?        S      0:00 smbd -D
11219 pts/2    T      0:01 sh long_process

Quick Tip

Do you want to see what jobs you have suspended? Try the jobs command.

I added a few processes in the preceding command snapshot because I wanted to show the state of the processes. That S you see in the third column of most of these processes means they are sleeping. At any given moment or snapshot of your system, almost every process is sleeping, and a small handful show up with an R to indicate that they are currently running or runnable, sometimes referred to as being in the run queue. The T you see beside the suspended process means that it is traced, or suspended.

Two other states you might see processes in are D and Z. The D means that your process is in an uninterruptible sleep and it is likely to stay that way (usually not a good sign). The Z refers to a process that has gone zombie. It may as well be dead and will be as soon as someone gets that message across.
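For a quick tally of how many processes are in each state, a short pipeline does the job. This is a minimal sketch assuming the procps ps found on Ubuntu. The variable name state_counts is made up for this example.

```shell
# "stat=" prints only the state column, with no header line.
# cut keeps the first letter (R, S, D, T, Z, ...) and drops modifiers such as s, + or <.
state_counts=$(ps -e -o stat= | cut -c1 | sort | uniq -c)
echo "$state_counts"
```

On a typical desktop, the S line should dwarf all the others, and ps itself accounts for at least one R.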

Getting back to the suspended process, you have a few choices. You can restart it from where it left off by typing fg at the shell prompt; in other words, you can continue the process in the foreground. The second option is to type bg, which tells the system (you guessed it) to run the suspended process in the background. If you do that, the process restarts with an ampersand at the end of the command as it did earlier.

$ bg
[1]+ sh long_process &

Your other option is to terminate the process, or kill it.

Killing Processes

You can usually interrupt a foreground process by pressing <Ctrl+C>, but that does not work with background processes. The command used to terminate a process is called kill, which is an unfortunate name for a command that does more than just terminate processes. By design, kill sends a signal to a job (or jobs). That signal is sent as an option (after a hyphen) to a process ID.

kill -signal_no PID

For instance, you can send the SIGHUP signal to process 7612 like this:

kill -1 7612

Signals are messages. They are usually referenced numerically, as with the ever popular kill -9, but there are a number of others. The ones you are most likely to use are 1, 9, and 15. These signals can also be referenced symbolically with these names.

Signal 1 is SIGHUP. This is normally used with system processes such as xinetd and other daemons. With these types of processes, a SIGHUP tells the process to hang up, reread its configuration files, and restart. Most applications just ignore this signal.

Signal 9 is SIGKILL, an unconditional termination of the process. Some administrators I know call this "killing with extreme prejudice." The process is not asked to stop, close its files, and terminate gracefully. It is simply killed. This should be your last resort approach to killing a process and it works 99 percent of the time. Only a small handful of conditions ever ignore the 9 signal.

Signal 15, the default, is SIGTERM, a call for normal program termination. The system asks the program to wrap it up and stop doing whatever it was doing.
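The difference between signals 15 and 9 is easier to see with a process that tries to clean up after itself. The following is a minimal sketch, not a real daemon: polite_worker is a made-up shell function that installs a handler for SIGTERM using the shell's trap builtin.

```shell
# A stand-in worker that tidies up when asked politely (SIGTERM).
polite_worker() {
    trap 'echo "worker: cleaning up"; exit 0' TERM
    while :; do sleep 1; done
}

polite_worker &
pid=$!
sleep 1                 # give the worker time to install its handler
kill -15 "$pid"         # same as kill -TERM, or plain kill with no signal
wait "$pid"
term_status=$?          # 0: the handler ran and the worker exited cleanly

polite_worker &
pid=$!
sleep 1
kill -9 "$pid"          # SIGKILL cannot be caught, so no cleanup happens
wait "$pid"
kill_status=$?          # 137, that is 128 + 9: terminated by signal 9

echo "TERM exit status: $term_status, KILL exit status: $kill_status"
```

The cleanup message appears only for the first worker. The second one never gets the chance to run its handler, which is why signal 9 is best kept as a last resort.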

Remember when you suspended a process earlier? That was another signal. Try this to get a feel for how this works. If you are running in an X display, start a digital xclock with a seconds display updated every second.

xclock -digital -update 1 &

You should see the second digits counting away. Now, find its process ID with ps ax | grep xclock. Pretend the process ID is 12136. Let's kill that process with a SIGSTOP.

kill -SIGSTOP 12136

The digits have stopped incrementing, right? Restart the clock.

kill -SIGCONT 12136
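If you are not running an X display, the same experiment works with any process. Here is a minimal sketch that uses sleep as a stand-in for xclock and checks the state column with ps. It assumes the procps ps found on Ubuntu. The variable names are made up for this example.

```shell
# Start a stand-in process in the background.
sleep 30 &
pid=$!

kill -STOP "$pid"                          # suspend it
sleep 1
stopped_state=$(ps -o stat= -p "$pid")     # should begin with T
echo "after SIGSTOP: $stopped_state"

kill -CONT "$pid"                          # bring it back to life
sleep 1
resumed_state=$(ps -o stat= -p "$pid")     # should begin with S (sleeping)
echo "after SIGCONT: $resumed_state"

kill "$pid"                                # tidy up with the default SIGTERM
wait "$pid" 2>/dev/null
```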

As you can see, kill is probably a bad name for a command that can suspend a process and then bring it back to life. For a complete list of signals and what they do, look in the man pages with this command:

man 7 signal

If you want to kill a process by specifying the symbolic signal, you use the signal name minus the SIG prefix. For instance, to send the 1 signal to xinetd, you could do this instead:

kill -HUP `cat /var/run/xinetd.pid`

Note that those are backward quotes around the previous command string.
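Modern shells also accept the $(command) form in place of backward quotes, and it is easier to read. The following is a minimal sketch of the PID-file idea using a made-up stand-in for a daemon. The fake_daemon function and the temporary files are assumptions for illustration only, and have nothing to do with how xinetd itself is written.

```shell
# Temporary files standing in for a daemon's log and PID file.
log=$(mktemp)
pidfile=$(mktemp)

# A made-up daemon that logs a line whenever it receives SIGHUP.
fake_daemon() {
    trap 'echo "reloading configuration" >> "$log"' HUP
    while :; do sleep 1; done
}

fake_daemon &
echo "$!" > "$pidfile"            # record the PID, as real daemons do
sleep 1                           # give it time to install its handler

kill -HUP "$(cat "$pidfile")"     # $(...) does the same job as backward quotes
sleep 2                           # the handler runs once the current sleep ends
reload_log=$(cat "$log")
echo "daemon log: $reload_log"

kill "$(cat "$pidfile")"          # default SIGTERM stops the stand-in
wait 2>/dev/null
rm -f "$log" "$pidfile"
```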