top


The top command is one of the most familiar performance tools. Most system administrators run top to see how their Linux and UNIX systems are performing. The top utility provides a great way to monitor the performance of processes and Linux as a whole. It is more accurate to call Linux processes tasks, but in this chapter we call them processes because that is what the tools call them.[1] top can be run as a normal user as well as root. Figure 3-1 shows typical top output from an idle system.

Figure 3-1. top output


The top display has two parts. The first third or so shows information about Linux as a whole. The remaining lines are filled with individual process information. If the window is stretched, more processes are shown to fill the screen.

Much of this general Linux information can also be obtained from several other commands. It is nice to have it all on one screen from one command, though. The first line shows the load average for the last one, five, and fifteen minutes. Load average indicates how many processes are running on a CPU or waiting to run; on Linux it also counts processes in uninterruptible sleep, which are usually waiting on disk I/O. The uptime command can be used to display load averages as well. Next comes process information, followed by CPU, memory, and swap. The memory and swap information is similar to the free command output. After we determine memory and CPU usage, the next question is, which processes are using them?
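A load average is easier to judge when it is compared with the number of CPUs. The following is a minimal sketch of that comparison, not something top does for you; the function name load_per_cpu is made up for this illustration, and the sample values are the one-minute load and CPU count from the multi-CPU batch output shown later in this section.

```shell
#!/bin/sh
# load_per_cpu LOAD NCPU  (hypothetical helper, not part of top)
# Divide a load average by the CPU count. A result above 1.00 suggests
# that processes are queuing for CPU time.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# One-minute load of 3.60 on a four-CPU server.
load_per_cpu 3.60 4

# On a live Linux system the inputs could be read like this:
#   load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(getconf _NPROCESSORS_ONLN)"
```

A result well below 1.00 means the CPUs are keeping up with the work offered to them.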

Most of the process information can be obtained from the ps command too, but top provides a nicer format that is easier to read. The most useful interactive top command is h for help, which lists top's other interactive commands.

Adding and Removing Fields

Fields can be added or removed from the display. The process output can be sorted by CPU, memory, or other metric. This is a great way to see what process is hogging memory. The top syntax and interactive options differ among Linux distributions. The help command quickly lists what commands are available. Many interactive options are available. Spend some time trying them out.

Figure 3-2 shows a Red Hat Enterprise Linux ES release 3 help screen.

Figure 3-2. top help screen


The f command adds or removes fields from the top output. Figure 3-3 is a Red Hat Enterprise Linux ES release 3 help screen showing what fields can be added.

Figure 3-3. top add/remove fields screen


Figure 3-4 shows a SUSE Linux 9.0 top help screen. You can see that the commands the two distributions offer differ greatly.

Figure 3-4. SUSE top help screen


Output Explained

Let's take a look at what the information from top means. We'll use the following output from top as an example:

16:30:30  up 16 days,  7:35,  2 users,  load average: 0.54, 0.30, 0.11
73 processes: 72 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait     idle
           total   13.3%    0.0%   20.9%   0.0%     0.0%    0.0%    65.7%
Mem:   511996k av,  498828k used,   13168k free,       0k shrd,  59712k buff
                    387576k actv,   68516k in_d,    9508k in_c
Swap:  105832k av,    2500k used,  103332k free                 343056k cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
10250 dave      20   0  1104 1104   888 R     3.8  0.2   0:00   0 top
10252 root      23   0   568  568   492 S     0.9  0.1   0:00   0 sleep
    1 root      15   0   512  512   452 S     0.0  0.1   0:04   0 init


The first line from top displays the load average information:

16:30:30 up 16 days, 7:35, 2 users, load average: 0.54, 0.30, 0.11


This output is similar to the output from uptime. You can see how long Linux has been up, the time, and the number of users. The 1-, 5-, and 15-minute load averages are displayed as well. Next, the process summary is displayed:

73 processes: 72 sleeping, 1 running, 0 zombie, 0 stopped


We see 73 total processes. Of those, 72 are sleeping, and one is running. There are no zombies or stopped processes. A process becomes a zombie when it exits and its parent has not yet waited for it with the wait(2) or waitpid(2) functions. Zombies linger when a running parent never collects the exit status of its children; if the parent exits first, the children are adopted by init, which waits for them. Zombies don't take up resources other than the entry in the process table. Stopped processes are processes that have been sent the STOP signal. See the signal(7) man page for more information.
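The same kind of summary can be approximated outside of top by counting process state letters. This is only a sketch: count_states is a hypothetical helper, it keys on the first letter of each state string, and it ignores states other than R, S, T, and Z, so its totals will not necessarily match top's.

```shell
#!/bin/sh
# count_states (hypothetical helper): read one process state string per
# line, as printed by "ps -eo stat=", and total them by first letter.
count_states() {
    awk '{ n[substr($1, 1, 1)]++ }
         END { printf "running=%d sleeping=%d stopped=%d zombie=%d\n",
                      n["R"], n["S"], n["T"], n["Z"] }'
}

# Made-up sample states: one running, three sleeping, one stopped, one zombie.
printf '%s\n' R S 'S<' Ss T Z | count_states

# On a live system:
#   ps -eo stat= | count_states
```

A zombie count that keeps growing usually points at a parent process that is not calling wait(2).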

Next up is the CPU information:

CPU states:  cpu    user    nice  system    irq  softirq  iowait     idle
           total   13.3%    0.0%   20.9%   0.0%     0.0%    0.0%    65.7%


The CPU lines describe how the CPUs spend their time. The top command reports the percentage of CPU time spent in user or kernel mode, running niced processes, and in idleness. The iowait column shows the percentage of time that the processor was waiting for I/O to complete while no process was executing on the CPU. The irq and softirq columns indicate time spent serving hardware and software interrupts. Linux kernels earlier than 2.6 don't report irq, softirq, and iowait.
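Percentages like these are typically derived from the cumulative tick counters on the cpu line of /proc/stat by sampling twice and dividing each counter's increase by the total increase. The sketch below illustrates that arithmetic under the assumption of the 2.6-style field order described in proc(5); cpu_percent is a hypothetical helper, and the counter values are made up so that the result matches the figures above.

```shell
#!/bin/sh
# cpu_percent BEFORE AFTER  (hypothetical helper)
# Each argument is a "cpu ..." line from /proc/stat. The first seven
# counters are assumed to be: user nice system idle iowait irq softirq.
cpu_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 8; i++) before[i] = $i; next }
        {
            total = 0
            for (i = 2; i <= 8; i++) { delta[i] = $i - before[i]; total += delta[i] }
            if (total == 0) { print "no ticks elapsed"; exit }
            printf "user=%.1f%% nice=%.1f%% system=%.1f%% idle=%.1f%% iowait=%.1f%%\n",
                   100 * delta[2] / total, 100 * delta[3] / total,
                   100 * delta[4] / total, 100 * delta[5] / total,
                   100 * delta[6] / total
        }'
}

# Made-up samples in which 1000 ticks elapse between the two readings.
cpu_percent "cpu 1000 0 2000 7000 0 0 0" "cpu 1133 0 2209 7657 0 1 0"

# On a live system, two real samples could be taken a few seconds apart:
#   a=$(head -n 1 /proc/stat); sleep 3; b=$(head -n 1 /proc/stat)
#   cpu_percent "$a" "$b"
```

Because the counters are cumulative since boot, a single reading only gives an average over the whole uptime; the difference between two readings is what describes the current interval.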

The memory information is next:

Mem:   511996k av,  498828k used,   13168k free,       0k shrd,  59712k buff
                    387576k actv,   68516k in_d,    9508k in_c


The first three metrics give a summary of memory usage. They list total usable memory, used memory, and free memory. These are all you need to determine whether Linux is low on memory.

The next five metrics identify how the used memory is allocated. The shrd field shows shared memory usage and buff is memory used in buffers. Memory that has been allocated to the kernel or user processes can be in three different states: active, inactive dirty, and inactive clean. Active, actv in top, indicates that the memory has been used recently. Inactive dirty, in_d in top, indicates that the memory has not been used recently and may be reclaimed. In order for the memory to be reclaimed, its contents must be written to disk. This process is called "laundering" and can be called a fourth temporary state for memory. Once laundered, the inactive dirty memory becomes inactive clean, in_c in top. Available at the time of this writing is an excellent white paper by Norm Murray and Neil Horman titled "Understanding Virtual Memory in Red Hat Enterprise Linux 3" at http://people.redhat.com/nhorman/papers/rhel3_vm.pdf.

The swap information is next:

Swap: 105832k av, 2500k used, 103332k free   343056k cached


The av field is the total amount of swap that is available for use, followed by the amount used and amount free. Last is the amount of memory used for cache by the kernel.
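A small free figure does not by itself mean that processes have used up memory, because the used total includes buffers and cache that the kernel can generally give back when needed. As a rough worked example, assuming that all of buff and cached is reclaimable (an approximation), the memory in the example output that is not buffers or cache can be estimated with the hypothetical helper below.

```shell
#!/bin/sh
# mem_without_cache USED BUFF CACHED  (hypothetical helper, kilobytes)
# Estimate memory in use excluding buffers and cache, on the assumption
# that buffers and cache can be reclaimed by the kernel.
mem_without_cache() {
    echo $(( $1 - $2 - $3 ))
}

# Values from the example output: 498828k used, 59712k buff, 343056k cached.
mem_without_cache 498828 59712 343056
```

For the example system this comes to 96060k out of 511996k, so the machine is far less short of memory than the 13168k free figure suggests.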

The rest of the top display is process information:

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
10250 dave      20   0  1104 1104   888 R     3.8  0.2   0:00   0 top
10252 root      23   0   568  568   492 S     0.9  0.1   0:00   0 sleep
    1 root      15   0   512  512   452 S     0.0  0.1   0:04   0 init


top shows as many processes as can fit on the screen. The fields are described well in the top(1) man page. Table 3-1 provides a summary of the fields.

Table 3-1. top Process Fields

Field     Description
PID       Process id number
USER      User name of the process owner
PRI       Priority of the process
SIZE      The size in kilobytes of the process including its code, stack, and data area
RSS       Total amount of memory in kilobytes used by the process
SHARE     The amount of shared memory used by the process
STAT      State of the process, normally R for running or S for sleeping
%CPU      Percentage of CPU this process has used since the last screen update
%MEM      Percentage of memory this process uses
TIME      Amount of CPU time this process has used since the process started
CPU       The CPU where the process last executed
COMMAND   The command being executed

Saving Customization

A very nice top feature is the capability to save the current configuration. Change the display as you please using the interactive commands and then press W to save the view. top writes a .toprc file in the user's home directory that saves the configuration. The next time this user starts top, the same display options are used.

top also looks for a default configuration file, /etc/toprc. This file is a global configuration file and is read by top when any user runs the utility. This file can be used to cause top to run in secure mode and also to set the refresh delay. Secure mode prevents non-root users from killing or changing the nice value of processes. It also prevents non-root users from changing the refresh value of top. A sample /etc/toprc file for our Red Hat Enterprise Linux ES release 3 looks like the following:

$ cat /etc/toprc
s3


The s indicates secure mode, and the 3 specifies three-second refresh intervals. Other distributions may have different formats for /etc/toprc.

The capability to kill processes is a pretty nice feature. If some user has a runaway process, the top command makes it easy to find and kill. Run top, show all the processes for a user with the u command, and then use k to kill the runaway process. top is not only a good performance monitoring tool; it can also be used to improve performance by killing the offending processes.

Batch Mode

top can also be run in batch mode. Try running the following command:

$ top -n 1 -b >/tmp/top.out


The -n 1 tells top to only show one iteration, and the -b option indicates that the output should be in text suitable for writing to a file or piping to another program such as less. Something like the following two-line script would make a nice cron job:

# cat /home/dave/top_metrics.sh
echo "**** " `date` " ****" >> /var/log/top/top.`date +%d`.out
/usr/bin/top -n 1 -b >> /var/log/top/top.`date +%d`.out


We could add it to crontab and collect output every 15 minutes.

# crontab -l
*/15 * * * * /home/dave/top_metrics.sh


The batch output makes it easy to take a thorough look at what is running while enjoying a good cup of coffee. All the processes are listed, and the output isn't refreshing every five seconds. If a .toprc configuration file exists in the user's home directory, it is used to format the display. The following output came from the top batch mode running on a multi-CPU Linux server. Note that we don't show all 258 processes from the top output.

10:17:21  up 125 days, 10:10,  4 users,  load average: 3.60, 3.46, 3.73
258 processes: 252 sleeping, 6 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total   41.0%    0.0%   21.4%   0.4%     0.4%    0.0%   36.5%
           cpu00   36.7%    0.0%   22.6%   1.8%     0.0%    0.0%   38.6%
           cpu01   46.2%    0.0%   17.9%   0.0%     0.9%    0.0%   34.9%
           cpu02   32.0%    0.0%   28.3%   0.0%     0.0%    0.0%   39.6%
           cpu03   49.0%    0.0%   16.9%   0.0%     0.9%    0.0%   33.0%
Mem:  4357776k av, 4321156k used,   36620k free,       0k shrd,  43860k buff
                   3261592k actv,  625088k in_d,   80324k in_c
Swap: 1048536k av,  191848k used,  856688k free          3920940k cached
  PID USER    PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM     TIME CPU COMMAND
17599 wwrmn    21   0  9160 6900  1740 R    12.2  0.1    0:01    1 logsw
 1003 coedev   15 -10 71128  65M 66200 S <   8.0  1.5  414:42    2 vmware-vmx
17471 wwrmn    15   0 10116 7868  1740 S     6.8  0.1    0:12    2 logsw
17594 wwrmn    18   0  9616 7356  1740 R     4.4  0.1    0:01    0 logsw
 6498 coedev   25   0 43108  36M 33840 R     4.0  0.8   9981m    1 vmware-vmx
17595 wwrmn    17   0  8892 6632  1740 S     3.0  0.1    0:01    3 logsw
17446 wwrmn    15   0 10196 7960  1740 S     2.8  0.1    0:13    3 logsw
17473 wwrmn    15   0  9196 6948  1740 S     2.8  0.1    0:02    1 logsw
17477 wwrmn    15   0  9700 7452  1740 S     2.3  0.1    0:04    2 logsw
  958 coedev   15 -10 71128  65M 66200 S <   2.1  1.5   93:53    3 vmware-vmx
 7828 coedev   15 -10 38144  33M 33524 S <   1.8  0.7   4056m    1 vmware-vmx
 6505 coedev   25   0     0    0     0 RW    1.8  0.0   3933m    1 vmware-rtc
 7821 coedev   15 -10 38144  33M 33524 S <   1.6  0.7   6766m    1 vmware-vmx
 6478 coedev   15 -10 43108  36M 33840 S <   1.6  0.8   6224m    0 vmware-vmx
17449 wwrmn    15   0  9820 7572  1740 S     1.6  0.1    0:07    3 logsw
 7783 coedev   15   0 47420  15M  1632 S     1.4  0.3   1232m    3 vmware
 6497 coedev   15 -10 43108  36M 33840 S <   0.9  0.8   3905m    1 vmware-vmx
 1002 coedev   15 -10 71128  65M 66200 S <   0.9  1.5   59:54    2 vmware-vmx
17600 jtk      20   0  1276 1276   884 R     0.9  0.0    0:00    2 top
 7829 coedev   25   0 38144  33M 33524 R     0.7  0.7   6688m    0 vmware-vmx
    1 root     15   0   256  228   200 S     0.0  0.0    2:25    0 init
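Saved batch output can also be trimmed down before reading it. The sketch below prints only the column header and the first few process rows of each snapshot in a file; top_procs is a hypothetical helper, and it assumes that every process list starts with a header row whose first field is PID and that process rows begin with a numeric PID.

```shell
#!/bin/sh
# top_procs N  (hypothetical helper)
# Read top batch output on stdin and print, for each snapshot, the
# header row followed by the first N process rows.
top_procs() {
    awk -v n="$1" '
        $1 == "PID"          { in_list = 1; count = 0; print; next }
        $1 !~ /^[0-9]+$/     { in_list = 0 }   # summary or separator line
        in_list && count < n { print; count++ }'
}

# Sample input taken from the single-CPU example earlier in this section.
top_procs 2 <<'EOF'
73 processes: 72 sleeping, 1 running, 0 zombie, 0 stopped
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
10250 dave      20   0  1104 1104   888 R     3.8  0.2   0:00   0 top
10252 root      23   0   568  568   492 S     0.9  0.1   0:00   0 sleep
    1 root      15   0   512  512   452 S     0.0  0.1   0:04   0 init
EOF
```

Run against a file collected by a cron job, for example with top_procs 5 < /tmp/top.out, this gives a quick view of the busiest processes at each sampling time.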


By now you can see why top is such a popular performance tool. The interactive nature of top and the ability to easily customize the output make it a great resource for identifying problems.



Linux Troubleshooting for System Administrators and Power Users
ISBN: 131855158