Unix provides the ability to monitor process execution and, to a limited extent, specify execution priorities. By doing so, you can control how CPU time is allocated and (indirectly) how memory is used. For example, you can expedite certain jobs at the expense of all others, or you can maintain interactive response times by forcing large jobs to run at lowered priority. This section discusses Unix processes and the tools available for monitoring and controlling process execution.
The uptime command gives you a rough estimate of the system load:
% uptime
3:24pm up 2 days, 2:41, 16 users, load average: 1.90, 1.43, 1.33
uptime reports the current time, how long the system has been up, and three load average figures. The load average is a rough measure of CPU use. These three figures report the average number of processes active during the last minute, the last five minutes, and the last 15 minutes. High load averages usually mean that the system is being used heavily and the response time is correspondingly slow. Note that the system's load average does not take into account the priorities of the processes that are running.
What's high? As usual, that depends on your system. Ideally, you'd like a load average under about 3-5 (per CPU), but that's not always possible given the workload that some systems are required to handle. Ultimately, "high" means high enough that you don't need uptime to tell you that the system is overloaded; you can tell from its response time.
Furthermore, different systems behave differently under the same load average. For example, on some workstations, running a single CPU-bound background job at the same time as X Windows will bring interactive response to a crawl even though the load average remains quite low. A low load average is no guarantee of a fast response time, because CPU availability is just one factor affecting overall system performance. You can generally expect to see higher typical load averages on server systems than on single-user workstations.
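Because the guideline above is expressed per CPU, it can help to normalize a reported load average by the number of processors. The following is a minimal sketch; load_per_cpu is a hypothetical helper name, and both the load figure and the CPU count are supplied by hand rather than read from the system.

```shell
# Divide a load average by a CPU count (both passed as arguments).
# load_per_cpu is a hypothetical helper name, not a standard command.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# Example: the one-minute figure from the uptime output above,
# on an assumed two-CPU system.
load_per_cpu 1.90 2     # prints 0.95
```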
15.2.1 The ps Command
The ps command gives a more complete picture of system activity. This utility produces a report summarizing execution statistics for current processes. The command's options control which processes are listed and what information is displayed about each one. The format of the command differs considerably between the BSD and System V forms.
To obtain an overall view of current system activity, the most useful form of the BSD-style command is ps aux, which produces a table of all processes, arranged in order of decreasing CPU usage at the moment when the ps command was executed. It is often useful to pipe this output to head, which displays the most active processes:
% ps aux | head -5
USER     PID    %CPU  %MEM  SZ    RSS   TTY  STAT  TIME   COMMAND
harvey   12923  74.2  22.5  223   376   p5   R     2:12   f77 -o test test.F
chavez   16725  10.9  50.8  1146  1826  p6   R N   56:04  g04 HgO.dat
wang     17026  3.5   1.2   354   240   co   I     0:19   vi benzene.txt
marj     7997   0.2   0.3   142   46    p3   S     0:04   csh
The meanings of the fields in this output (as well as others displayed by the -l option to ps) are given in Table 15-2.
The first line in the previous example shows that user harvey is running a Fortran compilation. This process has PID 12923 and is currently running or runnable. User chavez's process (PID 16725), executing the program g04, is also running or runnable, though at a lowered priority. From this display, it's obvious who is using the most system resources at this instant: harvey and chavez have about 85% of the CPU and 73% of the memory between them. However, although it does display total CPU time, ps does not average the %CPU or %MEM values over time in any way.
A vaguely similar listing is produced by the System V ps -ef command:
$ ps -ef
UID     PID    PPID   C   STIME     TTY      TIME   CMD
root    0      0      0   09:36:35  ?        0:00   sched
root    1      0      0   09:36:35  ?        0:02   /etc/init
...
marj    7997   1      10  09:49:32  ttyp3    0:04   csh
harvey  12923  11324  9   10:19:49  ttyp5    56:12  f77 -o test test.F
chavez  16725  16652  15  17:02:43  ttyp6    10:04  g04 HgO.dat
wang    17026  17012  14  17:23:12  console  0:19   vi benzene.txt
The columns hold the username, process ID, parent's PID (the PID of the process that created it), the current scheduler value, the time the process started, its associated terminal, its accumulated CPU time, and the command it is running. Note that the ordering is by PID, not resource usage. This form of ps is supported under Solaris, HP-UX, AIX, and Tru64. ps is also useful in pipes; a common use is:
% ps aux | grep chavez
This command lists the processes user chavez currently has running.
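One caveat with a plain grep is that it matches the string anywhere on the line, so it can also pick up the grep process itself or another user's command that happens to mention chavez. A possible refinement, sketched here with a hypothetical helper named procs_for_user, is to compare only the first (USER) column of the listing.

```shell
# Print the header plus only those rows whose first column (USER) equals $1.
# procs_for_user is a hypothetical helper name; it reads a
# "ps aux" style listing on standard input.
procs_for_user() {
    awk -v user="$1" 'NR == 1 || $1 == user'
}

# Usage on a live system:
#   ps aux | procs_for_user chavez
```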
You can use the sort command in conjunction with the System V version of ps to extract performance-related data from its process listings. For example, the following command finds processes using large amounts of memory (shown in the SZ field):
$ ps -el | head -1 ; ps -el | sort -k10,10nr | head -5
F       S  UID  PID     PPID    C    PRI  NI  SZ       ...  TIME      CMD
240001  A  603  630828  483460  240  120  20  9711568       29530:42  l703.exe
240001  A  603  573616  540786  240  120  20  9710404       29516:30  l802.exe
240001  A  0    221240  139322  0    60   20  6140          25:50     X
240001  A  0    303204  270428  0    60   20  2004          0:32      sendmail
240001  A  0    458898  270428  0    60   20  1996          0:07      IBM.Errmd
Some columns have been removed from this output for space reasons.
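Since the position of the SZ field varies between ps implementations, counting columns by hand is error-prone. The sketch below looks the column up by name in the header line instead. top_by_column is a hypothetical helper, and it assumes that every row has the same number of whitespace-separated fields as the header up to the chosen column (that is, no blank or multi-word fields appear before it).

```shell
# Sort a ps-style listing (on stdin) by the column named in $1, largest first.
# top_by_column is a hypothetical helper name.
top_by_column() {
    awk -v col="$1" '
        # Header: remember which field carries the requested name, pass it through.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; print; next }
        # Data rows: prefix each line with its sort key and a tab.
        c       { print $c "\t" $0 }
    ' | {
        IFS= read -r header          # keep the header out of the sort
        printf '%s\n' "$header"
        sort -nr | cut -f2-          # sort on the key, then strip it again
    }
}

# Usage on a live system:
#   ps -el | top_by_column SZ | head -6
```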
15.2.2 Other Process Listing Utilities
There are several useful, free system monitoring tools. In this section, we'll look at pstree and top.
pstree displays system processes in a tree-like structure, and it is accordingly useful for illuminating the relationships between processes and for a quick, pictorial snapshot of what is running on the system. pstree was written by Werner Almesberger. It can be found by itself on many network sites and as part of the psmisc package (ftp://sunsite.unc.edu/pub/Linux/system/status/ps). It is included by default on Linux, and FreeBSD includes it among the additional packages on the installation CDs.
Here is an example of its output:
$ pstree
init-+-alarmd
     |-anacron
     |-apmd
     |-atd
     |-crond
     |-gpm
     |-inetd-+-in.rlogind---bash---vi                       Two remote users.
     |       `-in.rlogind---bash---mkps---gbmat-+-grops
     |                                          |-gtbl
     |                                          `-gtroff
     |-kapm-idled
     |-7*[kdeinit]
     |-kdeinit-+-kdeinit                                    KDE clients.
     |         `-kdeinit---bash-+-pstree
     |                          |-xclock
     |                          |-xterm---tcsh---ls
     |                          `-2*[xterm---rlogin]
     |-kdeinit---cat
     |-keventd
     |-khubd
     |-kjournald
     |-klogd
     |-login---bash---startx---xinit-+-X                    X windows main processes.
     |                               `-startkde---ksmserver
     |-mdrecoveryd
     |-5*[mingetty]
     |-portmap
     |-rpc.statd
     |-sendmail
     |-sshd
     |-syslogd
     |-vmware-guestd
     |-xfs
     `-xinetd---fam
In general, all processes are listed by command name, and child processes appear to the right of their parent process. Thus, init appears at the extreme left of the display, appropriately, because it is the ultimate parent of every other process. The notation:
n*[command]
indicates that there are n processes running command. The sample output shows five mingetty processes.
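On this system, there are three groups of user processes, marked by the comments added to the display: two remote users logged in via rlogin, the KDE client processes, and the main X Windows processes started from the console login session.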
The remainder of the lines in the display are the usual system processes.
The top utility provides a continuous display of the system status and most active processes, which it automatically updates every few seconds. Versions of top are included with FreeBSD, HP-UX, Linux, and Tru64. The utility was written by William LeFebvre and is available from http://www.groupsys.com/top/.
Here is a snapshot of the display from a Linux system:
6:19pm up 13 days, 23:42, 1 user, load average: 0.03, 0.03, 0.00
28 processes: 27 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 7.7% user, 14.7% system, 0.0% nice, 77.6% idle
Mem: 6952K av, 6480K used, 472K free, 3996K shrd, 2368K buff
Swap: 16468K av, 2064K used, 14404K free
PID   USER    PRI  NI  SIZE   RSS   SHARE  STAT  %CPU  %MEM  TIME  COMMAND
1215  chavez  14   0   8908   8908  7940   S     1.1   9.4   0:03  kdeinit
1106  chavez  14   -1  12748  9420  1692   S <   0.9   9.9   0:14  X
1262  chavez  16   0   1040   1040  836    R     0.9   1.1   0:00  top
1201  chavez  9    0   10096  9.9M  9024   S     0.1   10.6  0:02  kdeinit
1     root    8    0   520    520   452    S     0.0   0.5   0:04  init
2     root    9    0   0      0     0      SW    0.0   0.0   0:00  keventd
...
The first five lines give general system information: uptime statistics, process counts by state, and current CPU, memory, and swap space usage. The rest of the display consists of output similar to that provided by various options to ps (with similar column headings), arranged in order of decreasing current CPU usage. In top displays, the %CPU column indicates very recent CPU consumption for each process (over the last minute or less of elapsed time).
The HP-UX version of top is display-only. By default, the top display is updated every five seconds. You can change that interval using these command forms:
All of these examples set the update interval to eight seconds. top runs continuously until you press the q key.
Most versions of top also allow you to interact with the processes that are being displayed. Pressing the k and r keys allows you to kill and renice a process, respectively (these actions are discussed in detail later in this chapter). In both cases, top will prompt you for the PID of the process that you want to affect.
15.2.3 The /proc Filesystem
All of the Unix versions we are considering except HP-UX support the /proc filesystem. This is a pseudo filesystem whose files are actually views into parts of kernel memory and its data structures.
On most systems, the /proc filesystem consists entirely of numbered files or subdirectories under /proc, each named for the corresponding process's PID. When these items are subdirectories, the available information about each process is divided among several files located within it. Here is an example from a Linux system:
$ ls /proc/1234
cmdline  cwd  environ  exe  fd  maps  mem  root  stat  statm  status
The per-process information contained in the /proc filesystem is generally available in other ways (e.g., via the ps command).
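For scripts, reading these files directly can still be convenient. The sketch below pulls a single field out of a file laid out like the Linux status file, which holds one "Key: value" pair per line. status_field is a hypothetical helper name, and the layout is an assumption that holds for Linux but not necessarily for other systems' /proc implementations.

```shell
# Print the value of one "Key: value" field from a Linux
# /proc/PID/status style file. $1 is the file, $2 the key.
# status_field is a hypothetical helper name.
status_field() {
    awk -F':[ \t]*' -v key="$2" '$1 == key { print $2 }' "$1"
}

# Usage on a Linux system (reports the state of the current shell):
#   status_field /proc/$$/status State
```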
Linux systems extend the /proc filesystem to include many other files and subdirectories that hold a great many system settings and current system data. For example, the cpuinfo file contains information about the processor on the computer:
$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 7
model name      : Pentium III (Katmai)
stepping        : 3
cpu MHz         : 497.847
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 992.87
These are some of the most useful files under /proc:
There are many, many more files in the /proc tree. However, I consider many of them to be of marginal use to those who are not programmers or script writers, because their information is available in a more convenient, prettier form via standard Unix commands.
In addition, the sys subdirectory tree provides access to kernel variables. Some of these files can be modified to change the corresponding system value. For example, the file kernel/panic holds the number of seconds to wait before rebooting after a kernel panic. These commands change the default value of 0 (immediately) to 60 seconds:
# cd /proc/sys/kernel
# cat panic
0
# echo "60" > panic
Such changes do not persist across boots, so you'll need to place such commands into a boot script to make them permanent.
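On Linux, the same kernel variables are also reachable through the sysctl command, which names each one by its path below /proc/sys with the slashes replaced by dots. The sketch below shows that mapping; sysctl_key is a hypothetical helper name, and the remark about /etc/sysctl.conf assumes a distribution whose boot scripts process that file.

```shell
# Convert a path under /proc/sys into the dotted key name used by sysctl.
# sysctl_key is a hypothetical helper name.
sysctl_key() {
    printf '%s\n' "$1" | sed -e 's|^/proc/sys/||' -e 's|/|.|g'
}

sysctl_key /proc/sys/kernel/panic     # prints kernel.panic

# As root on Linux, either of these sets the value on the running system:
#   echo 60 > /proc/sys/kernel/panic
#   sysctl -w kernel.panic=60
# A line reading "kernel.panic = 60" in /etc/sysctl.conf is applied at boot
# on systems that process that file (an assumption about the local setup).
```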
15.2.4 Kernel Idle Processes
Occasionally, you may see processes that seem to have accumulated a staggering amount of both CPU time and short-term CPU usage, as in these examples:
AIX
USER  PID  %CPU  %MEM  SZ  RSS  TTY  STAT  STIME   TIME     COMMAND
root  516  99.2  0.0   20  20   -    A     Mar 18  6028:47  kproc
Tru64
USER  PID  %CPU  %MEM  SZ    RSS  TTY  STAT  STIME   TIME      COMMAND
root  0    0.0   7.7   396M  17M  ??   R     Jan 23  49:46.53  [kernel idle]
Both listed processes are kernel idle processes, which indicate how much idle time (available CPU cycles that went unused) has accumulated since the last system reboot. On AIX systems, there are usually multiple kproc processes (and not all of them are necessarily idle). In any case, such processes are no cause for concern.
15.2.5 Process Resource Limits
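Unix provides very simple process resource limits. These are the limits that may be defined: total accumulated CPU time, the largest file that may be created, the maximum size of the data segment, the maximum size of the stack segment, the maximum size of a core file, and the maximum amount of physical memory a process may use.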
Resource limits are divided into two types: soft and hard. Soft limits are resource use limits currently applied by default when a new process is created. A user may increase these values up to the systemwide hard limits, beyond which only the superuser may extend them. Hard limits are thus defined as absolute ceilings on resource use.
The C shell and tcsh have two built-in commands for displaying and setting resource limits. The limit command displays current resource limits. The hard limits may be displayed by including the -h option on the limit command:
% limit                          % limit -h
cputime       1:00:00            cputime       unlimited
filesize      1048575 kbytes     filesize      unlimited
datasize      65536 kbytes       datasize      3686336 kbytes
stacksize     4096 kbytes        stacksize     262144 kbytes
coredumpsize  1024 kbytes        coredumpsize  unlimited
memoryuse     32768 kbytes       memoryuse     54528 kbytes
The bash and ksh equivalent command is ulimit (also supported in some Bourne shells). The -a and -Ha options will display the current soft and hard limits respectively; for example:
$ ulimit -a                      $ ulimit -Ha
time(seconds)     3600           time(seconds)     unlimited
file(blocks)      2097151        file(blocks)      2097151
data(kbytes)      65536          data(kbytes)      257532
stack(kbytes)     4096           stack(kbytes)     196092
memory(kbytes)    32768          memory(kbytes)    unlimited
coredump(blocks)  1024           coredump(blocks)  unlimited
Table 15-3 lists the commands that set the values of resource limits. They would usually be placed in users' login initialization files.
For example, the following commands increase the current CPU time limit to its maximum value and increase the memory use limit to 64 MB:
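In bash or ksh syntax, a minimal sketch of such commands might look like the following. It assumes that ulimit -m takes its value in kilobytes, as in the ulimit listing earlier in this section, and raise_limits is a hypothetical wrapper name used only to group the two commands. In the C shell and tcsh, the corresponding built-ins are limit and unlimit.

```shell
# Raise the soft CPU-time limit to the current hard limit, and set the
# soft memory-use limit to 64 MB (65536 KB).
# raise_limits is a hypothetical wrapper name.
raise_limits() {
    ulimit -S -t "$(ulimit -H -t)"    # soft CPU limit := hard CPU limit
    ulimit -S -m 65536                # soft memory-use limit := 64 MB
}

# Typical use from a login initialization file:
#   raise_limits
```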
Now for the bad news. On most Unix systems, resource limits are poorly implemented from an administrative standpoint, for several reasons. First, the hard limits are often hard-wired into the kernel and cannot be changed by the system administrator. Second, users can always change their own soft limits. All an administrator can do is place the desired commands into users' .profile or .cshrc files and hope. Third, the limits are on a per-process basis. Unfortunately, many real jobs consist of many processes, not just one. There is currently no way to impose limits on a parent process and all its children. Finally, in many cases, limits are not even enforced; this is most often true of the ones you probably care about the most: CPU time and memory use. You'll need to experiment to find out which ones are enforced on your system.
However, one limit that is often worth setting in user login initialization files is the core file size limit. If the users on your system will have little use for core files, set the limit to 0, preventing their creation.
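As a sketch of what such an entry could look like in a .profile for bash or ksh users: the -S option changes only the soft limit, so a user who does need a core file can still raise it again, up to the hard limit.

```shell
# Suppress core files for this shell and every process started from it.
# Only the soft limit is changed; the hard limit is left alone.
ulimit -S -c 0

ulimit -S -c        # prints 0
```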
15.2.6 Process Resource Limits Under AIX
AIX includes the structure for a more elaborate version of these limits, via the file /etc/security/limits (which may be modified directly or by the chuser command). It has stanzas of the form:
chavez:
        fsize = 2097151      Maximum file size.
        core = 0             Maximum core file size.
        cpu = 3600           Maximum CPU seconds.
        data = 131072        Maximum process data segment.
        rss = 65536          Maximum amount of physical memory.
        stack = 8192         Maximum process stack size.
Each stanza specifies the resource usage limits for the username that labels the stanza. These settings specify absolute limits on resource usage, and they cannot be overridden by the user.
To change chavez's memory use limit, use a command like this one:
# chuser rss=102400 chavez
This command sets chavez's default memory use limit to 100 MB by modifying or adding the rss line for chavez in /etc/security/limits. As usual, the limits set in the default stanza are applied for any user without specific settings of her own. Setting a limit to a value of -1 will allow unlimited use of that system resource.
You can also use SMIT to specify user per-process resource limits. The dialog is illustrated in Figure 15-1, and it displays the appropriate fields from the user account addition/modification screen.
Figure 15-1. Setting per-process Resource Limits with SMIT
15.2.7 Signaling and Killing Processes
Sometimes it's necessary to eliminate a process entirely; this is the purpose of the kill command. The syntax of the kill command, which is actually a general purpose process signaling utility, is as follows:
# kill [-signal] pids
pids is the process's identification number (or a space-separated list of process numbers), and signal is the (optional) signal to send to the process. The default signal is number 15, the TERM signal, which asks the process to terminate. In general, either the signal number or its symbolic name may be used (although on a few older System V systems, the signal must be specified numerically). You must be the superuser in order to kill someone else's process.
Sometimes, a process may still exist after a kill command. If this happens, execute the kill command with the -9 option, which sends the process signal number 9, appropriately named KILL. This almost always guarantees that the process will be destroyed. However, it does not allow the dying process to clean up before terminating and therefore may leave the process' files in an inconsistent state.
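The usual sequence, a polite TERM first and KILL only as a last resort, can be sketched as a small shell function. terminate is a hypothetical helper name, and the default grace period of five seconds is an arbitrary choice rather than a recommended value.

```shell
# Ask a process to terminate, then force it if it is still present.
# $1 is the PID; $2 is an optional grace period in seconds (default 5).
# terminate is a hypothetical helper name.
terminate() {
    target=$1
    grace=${2:-5}
    # If TERM cannot be delivered, there is no such process or no permission.
    kill -TERM "$target" 2>/dev/null || return 1
    while [ "$grace" -gt 0 ]; do
        # Signal 0 delivers nothing; it only tests whether the PID can be signaled.
        kill -0 "$target" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    # Still present after the grace period: send KILL.
    kill -KILL "$target" 2>/dev/null
}

# Usage:
#   terminate 12923        # wait up to 5 seconds before resorting to KILL
#   terminate 12923 30     # wait up to 30 seconds
```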
15.2.7.1 Killing multiple processes with killall
Although you can use the kill command to kill more than one process at the same time, many systems provide a killall command to make this process slightly easier. This command began life as part of the System V system shutdown procedures. In its simplest form, it kills all processes in the same process group as the process that invoked it (but not the calling process itself); thus, when invoked by init as part of a system shutdown, it will kill all processes running on the system. Like kill, killall optionally takes a signal name or number as its argument. This form of killall may also be useful in administrative scripts, and it is provided by Tru64, AIX, HP-UX, and Solaris.
Linux and FreeBSD offer an enhanced form of killall, which accepts a second argument: the name of a command. In this form, killall kills all processes running the specified command. For example, the following command sends a KILL signal to all processes running the find command:
# killall -KILL find
15.2.7.2 Processes that won't die
Occasionally, processes will not die even after being sent the KILL signal. The vast majority of such processes fall into one of three categories:
15.2.7.3 Pausing and restarting processes
The signals STOP and CONT may be used to suspend and then resume a running process. They use the same mechanism as the Ctrl-Z facility within user shells, but these signals may be sent by the superuser to any running process.
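As a brief sketch, both signals are sent with the ordinary kill command. The wrapper names suspend_proc and resume_proc are hypothetical and exist only to label the two steps.

```shell
# Suspend and later resume a process by PID.
# suspend_proc and resume_proc are hypothetical wrapper names.
suspend_proc() { kill -STOP "$1"; }
resume_proc()  { kill -CONT "$1"; }

# Usage:
#   suspend_proc 16725     # the process stops consuming CPU but stays in memory
#   resume_proc 16725      # execution continues where it left off
# On Linux and BSD-derived systems, a stopped process is typically shown
# with state T in the STAT column of ps.
```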