Performance Monitoring with top

The easiest to use of the process-monitoring utilities is top, so named because it was originally designed to list the top 10 processes currently running on the system, in descending order of CPU usage. The current version of FreeBSD's top by default shows you every process currently running in any state (somewhere around 40 processes on a freshly installed FreeBSD system).

The benefit that top provides is that it's interactive and works in real time. When you run it, it takes over your terminal and refreshes the display every second or two (the interval can be changed with the -s option), giving you near-instantaneous information about the state of the system. You can also pass commands to top, such as the kill and renice commands (covered later in this chapter), or give it different options for filtering and sorting the processes it shows you. This makes top an immensely useful tool for reining in an out-of-control server, fine-tuning the performance of certain tasks, or simply keeping an eye on things as you work in another window.

top Output Explained

When you run the top program, you get output similar to what's shown in Listing 15.1.

Listing 15.1. Sample Output of top
By default, top shows all the system's processes (no matter who owns them), whether they're active, idle, or in "zombie" mode, and how much CPU time they're taking up. The first useful bit of information is in the second line: the number of processes. This number varies from system to system, but chances are that many more processes are currently running than can fit on your screen. You can press the I ("eye") key to switch top into displaying only the processes that are active. Because the top program is interactive, there are a number of other commands you can issue while it is running, such as the K key followed by a process ID to kill a process. We will cover these commands a little later.

The next items to notice in the top output are the "load averages," which are fairly opaque metrics you can use to tell at a glance how busy the system is. The three values are the average number of jobs in the run queue (running or waiting for the CPU) over the last 1, 5, and 15 minutes, respectively; however, it's difficult to relate this to a profile of a system running real-world applications that vary greatly in resource consumption from job to job. A load of 1 generally means that the system is processing each job as it comes in, as though each one were a single person in line at the post office; if the load is higher than 1, it means processes, like post office customers, are stacking up in line and the system is becoming more congested.
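The post-office rule of thumb for load averages can be sketched in a few lines of Python. This is only an illustration, not part of top: the function name describe_load and its threshold parameter are invented for the example, and the default threshold of 1 is simply the figure used in the text. The standard library call os.getloadavg() returns the same three values that top prints in its header on UNIX-like systems.

```python
import os


def describe_load(load, threshold=1.0):
    """Classify one load-average value with the post-office rule of thumb.

    threshold=1.0 follows the text; any other value is the caller's
    own assumption about what counts as "busy" on a given machine.
    """
    return "keeping up" if load <= threshold else "backing up"


if __name__ == "__main__":
    # os.getloadavg() yields the 1-, 5-, and 15-minute load averages.
    # It is missing on some platforms and may fail, so guard both cases.
    try:
        averages = os.getloadavg()
    except (AttributeError, OSError):
        averages = None

    if averages is None:
        print("load averages are not available on this platform")
    else:
        for label, value in zip(("1 min", "5 min", "15 min"), averages):
            print(f"{label:>6}: {value:.2f} ({describe_load(value)})")
```

Comparing the three lines of output gives a rough sense of trend: a 1-minute figure well above the 15-minute figure suggests the line at the post office is getting longer rather than shorter.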
The header block contains more information about the RAM in the system than you'll probably ever find useful. You won't find a simple "used/free" graph of all available RAM here; instead, you see the states of all chunks of memory in the fourth and fifth lines in Listing 15.1.

Note: Here is where FreeBSD's robust memory management system shows its ugly underbelly. In UNIX, there is no such thing as a simple, clear division between memory that is used and memory that isn't used. The amount of RAM you have installed in your system is a nice figure to know, but it has little bearing on your day-to-day usage: you can't just add up the memory requirements of every application and calculate how many such programs you can fit into the RAM you've got installed. Because UNIX's memory model is heavily dependent on virtual memory, which is backed by swap (space on the hard disk that holds inactive data moved out of RAM), you can actually run far more applications than you'd think would normally fit into RAM. The only drawback is a decrease in speed as more data (that needs to be accessed more often) gets paged out to swap. See Chapter 2, "Installing FreeBSD," for a discussion of ways to optimize your swap partition for maximum efficiency, such as putting it near the edge of the disk.

Don't look at the Free block and assume that it represents all the memory available in the system. That block is only the memory that currently holds no data at all. What you should be looking at is the Active block because that describes memory in use by active processes, that is, programs that are currently running and not idle. The rest of the fields describe other states of use that may or may not be mutually exclusive, so adding up all the fields won't necessarily give you the amount of RAM you have. It will, in fact, probably add up to more.

The Swap fields are more straightforward.
Here, data is paged out to disk and back into RAM as needed, and usually the only fields that top shows are Used and Free. The numbers here add up predictably. It's probably more useful to look at the Swap fields than at the actual RAM fields to see how well your system is doing; if there's a lot of data in Swap (50 percent or more used), it means that data has been paged out fairly recently as a result of your physical RAM being full, and you may want to consider adding more memory.

A FreeBSD system rarely runs out of swap space. If it does, as with most UNIX implementations, the results will usually be benign (you'll see error messages, but the system won't destabilize). Occasionally, however, unpredictable behavior or instability will surface. You'll want to keep your swap as little used as possible, both for this reason and because everything naturally runs faster in RAM than in swap.

Next, notice that the processes are listed in descending order in the WCPU column. This column lists how much of the CPU's cycles are being used currently by each process (using a "weighted" scale, taking into account CPU cycles in which the process was in a "resident" state). Don't expect the column to add up to 100 percent; your CPU will only be lightly used most of the time, and most of the CPU's cycles will be unused. Take a look at the headers again; the CPU states line tells you how much of the processor is being used in each of the states it tracks (user, nice, system, interrupt, and idle), and you can relate these values fairly closely to the percentages in the CPU column.

Note: Some programs are designed to use 100 percent of the CPU, unless actively throttled by the configuration. For example, Qmail (an SMTP daemon that we will cover in Chapter 25, "Configuring Email Services") or a database back-end such as MySQL might run to 100 percent of the system's capacity during heavy load.
This is normal behavior and should not cause concern if the system's primary role is running those programs.

The CPU operates in discrete cycles, many millions per second (depending on its speed). Each of these cycles is dedicated to some part of some process, and over time a process will have used enough of these cycles to add up to a number measurable in seconds. This is what the TIME column tells you. Don't let the colon separator fool you into thinking that it's an hours:minutes reading of how long the process has existed; the value in the TIME column is the total CPU time that the process has used in system and user states combined, displayed as minutes:seconds. It may take minutes or hours of wall-clock time for a process to use enough cycles to accumulate a measurable number. If a process has a large value (such as mysqld in the sample output in Listing 15.1), it's usually because the process has been running for weeks or because it has become a runaway and has been taking up some huge percentage of the CPU during its runtime. You can easily check for the latter case by looking at the WCPU column.

The next parts of top's output that you should understand are the SIZE and RES columns. SIZE is the entirety of a process's allocated size, including the text, data, and stack components. Because parts of these components are shared systemwide, this column is not an accurate measure of how much memory a process is using. Instead, RES shows the resident memory value (the values in this column should add up to roughly the amount of memory currently in use). Both size values are "correct" in their own way, but you should use RES for determining the "traditional" amount of memory a process uses, the equivalent of what it would be reported as using in Windows or classic Mac OS.

The rest of the fields in top are less important or are self-explanatory. The C column tells you which CPU a process is using if your system has more than one.
PID is the process ID, a number that is assigned to each process upon execution, and USERNAME is the user who executed the process. STATE tells you which of the possible states a process is in, which isn't very informative unless it's zomb or zombie (which refers to a child process that has terminated but has not yet fully given up its process table space).

Using Interactive top Commands

You can also give top commands interactively to help sort through the information it gives you. Earlier in this section, you learned that you can press the I key to show only active processes. You can also press the U key to be prompted for a username; top then displays only processes owned by that username (use + as the username to show them all again). You can issue a kill command with the K key, which then prompts you for a PID to kill. The T key toggles whether the top process itself is displayed. These and other options are listed in the top man page (man top).

With this feature set, top serves as a very good all-around summary of what's going on in the system, and it allows you to handle the majority of the process-management tasks you'll have to perform. But top isn't a total solution; it doesn't give you detailed information about the processes themselves, and its interactive nature keeps top from being a scriptable tool or something that can be used in conjunction with pipes and other programs. For these functions, you use ps.
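To make the effect of the I and U keys concrete, here is a small Python model of the filtering they perform. It does not talk to top or to the kernel; the Proc record, the visible function, and the sample processes are all hypothetical, and treating "active" as "nonzero WCPU" is a simplifying assumption made only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    """A hypothetical, stripped-down row of top's process display."""
    pid: int
    username: str
    state: str
    wcpu: float  # weighted CPU percentage


def visible(procs, username="+", active_only=False):
    """Return the rows a top-like display would show.

    username="+" means "all users", mirroring the + answer to the U prompt.
    active_only=True mimics the I key; here "active" is assumed to mean
    wcpu > 0, which is a simplification for illustration.
    """
    shown = [p for p in procs if username == "+" or p.username == username]
    if active_only:
        shown = [p for p in shown if p.wcpu > 0.0]
    # Rows are listed in descending order of WCPU, as in the text.
    return sorted(shown, key=lambda p: p.wcpu, reverse=True)


if __name__ == "__main__":
    sample = [  # invented data, not taken from Listing 15.1
        Proc(1, "root", "select", 0.0),
        Proc(215, "mysql", "RUN", 42.5),
        Proc(480, "bob", "RUN", 3.1),
        Proc(481, "bob", "pause", 0.0),
    ]
    for p in visible(sample, username="bob", active_only=True):
        print(p.pid, p.username, p.state, p.wcpu)
```

The two filters are independent, which matches the way the keys behave as toggles: narrowing the display to one user does not change whether idle processes are hidden, and vice versa.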