9.1 Top Five Tuning Tips

These are the five most basic questions to ask when you see a system that is performing poorly. Note that all of the threshold guidelines here assume that the data is being gathered over 30-second intervals, unless otherwise noted.

9.1.1 Where Is the Disk Bottleneck?

Almost every system that feels sluggish is under a heavy disk I/O load. Look for disks that have a "service time" greater than 50 milliseconds and that are more than a few percent busy. The metric called service time really should be called "response time": it measures the delay between a process issuing a read request and that request being completed. This is very often in the critical path for user applications. It is quite possible for a heavily overloaded disk to have response times measured in thousands of milliseconds (not a typographic error). You can get this information from iostat -xnP 30 in Solaris:

 #  iostat -xnP 30
 ...
                     extended device statistics
     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
     0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s0
     0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s2
     0.7    2.7    5.3    9.5  0.0  0.1    0.0   16.1   0   2 c0t0d0s3
 ...
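
As a quick filter, a short awk pipeline can flag offending disks automatically. This is a rough sketch, assuming the layout shown above (asvc_t in the eighth field, %b in the tenth, with the device name last); the exact columns can shift between Solaris releases:

 #  Print any device whose response time (asvc_t, ms) exceeds 50
 #  and that is more than a few percent busy (%b).
 iostat -xnP 30 | awk '$8 > 50 && $10 > 5 { print $11, "asvc_t:", $8, "%b:", $10 }'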

In Linux, install the optional package that contains iostat (it ships as part of the sysstat package) and look at the await column:

 #  iostat -x -d 30
 Linux 2.4.2-2 (aiua)      07/29/2001
 ...
 Device: rrqm/s wrqm/s   r/s  w/s rsec/s wsec/s avgrq-sz avgqu-sz  await  svctm  %util
 hde       4.79   1.40 86.95 3.26 158.87  37.43     2.18     1.01  11.22   8.14   7.34
 hde1      0.00   0.00 82.15 0.00  82.15   0.00     1.00     0.46   5.64   5.64   4.63
 hde2      4.79   1.40  4.80 3.26  76.70  37.43    14.16     0.55  68.05  34.40   2.77
 hde5      0.00   0.00  0.00 0.00   0.02   0.00     8.00     0.00 100.00 100.00   0.00
 hdg       9.85   2.36 36.69 5.20 371.85  60.86    10.33     8.60 205.22  45.61  19.11
 hdg1      0.00   0.00  0.06 0.00   0.07   0.00     1.03     0.00   2.94   2.94   0.00
 hdg2      9.85   2.36 36.63 5.20 371.78  60.86    10.34     8.60 205.54  45.68  19.10
 ...
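
The same sort of filter works here. A minimal sketch, assuming the column layout shown above (await in the tenth field, %util in the twelfth), which can shift between sysstat versions:

 #  Print any Linux device whose average wait (await, ms) exceeds 50
 #  and that is more than a few percent busy (%util).
 iostat -x -d 30 | awk '$10 > 50 && $12 > 5 { print $1, "await:", $10, "%util:", $12 }'

Run against the output above, this would flag hdg and hdg2 (await over 200 ms, roughly 19% busy) but not hde5, which has a high await but is essentially idle.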

UFS filesystems that are otherwise idle will sometimes show high service times due to the action of fsflush; since the filesystem is basically idle, this isn't a problem. I've never seen this phenomenon on Linux systems. On a Solaris system with more than 512 MB of memory, the inode cache is already large enough; on small-memory systems, however, increasing that cache might help by reducing the number of disk I/Os needed for filesystem management.
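
On such a small-memory Solaris system, the inode and directory-name caches can be enlarged through /etc/system. The values below are purely illustrative assumptions, not a recommendation, and a reboot is required for them to take effect:

 * /etc/system fragment: enlarge the UFS inode cache and the
 * directory name lookup cache (DNLC) on a small-memory system.
 * These values are illustrative only.
 set ufs_ninode = 34906
 set ncsize = 34906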

There is basically one technique for approaching disk I/O problems: spreading out the load as much as possible. This might be best accomplished by moving "hot" directories or files to another, less loaded disk, or by investing in a disk array with a nonvolatile memory (NVRAM) cache for speeding up writes.
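
For example, relocating a hot directory can be as simple as copying it to the quieter disk and leaving a symbolic link behind. The paths here are hypothetical, and you should quiesce the applications using the directory before moving it:

 #  Copy the hot directory to a less loaded disk, preserving permissions,
 #  then point the old name at the new location.
 cd /export/home
 tar cf - hotdir | (cd /disk2 && tar xpf -)
 mv hotdir hotdir.old
 ln -s /disk2/hotdir /export/home/hotdir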

9.1.2 Do You Have Enough Memory?

In Solaris, there is one guiding light for whether you are short of memory, and that is the sr field of vmstat. Please remember that the first line of vmstat's output reports averages since boot and is worthless! The other fields are largely worthless in answering this question, especially in Solaris 7 and earlier. There are a lot of misconceptions about this, the two most prevalent being:

  • The size of the free list is not an indicator of a memory shortage, because Solaris will consume any unused memory for caching recently used files.

  • The number of page-ins or page-outs per second is a bad metric, because Solaris handles all filesystem I/O by means of the paging mechanism. Thousands of kilobytes paged in or out just means that the system is working.

Unfortunately, if you are used to looking at vmstat output from Solaris 7, Solaris 8 changes the ballgame. It now "properly" reports information: pages that are used for filesystem caching are reported as free (because they effectively are). So, now everyone who asked "Where did all my free memory go?" in Solaris 7 is asking "Where did all this free memory come from?" after upgrading to Solaris 8.

The way to tell if you are short of memory is still the rate at which the page scanner inspects pages to see if they can be freed. On Solaris 7 and earlier systems, the threshold at which you should order more memory is a sustained scan rate of about 250 pages per second. If you are looking at a Solaris 8 system, any page scanner activity at all means you are short of memory.
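
A simple pipeline can watch the scan rate directly. This is a minimal sketch, assuming the stock Solaris vmstat layout where sr is the twelfth field; it skips the two header lines and the worthless since-boot first sample:

 #  Sample the page scan rate every 30 seconds; a sustained nonzero
 #  value on Solaris 8 (or > 250 on Solaris 7) means you need memory.
 vmstat 30 | awk 'NR > 3 { print "sr:", $12 }'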

9.1.3 Are the Processors Overloaded?

The best metric for processor overloading is the length of the run queue. The way to find this out on both Solaris and Linux is to use the vmstat command and look at the values in the procs r column. If this number is greater than about four times the number of processors in the system, then processes are probably waiting too long for a slice of time.
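
The comparison is easy to script. A rough sketch for Solaris, assuming psrinfo reports one line per processor and that vmstat prints its headers only at the top (on Linux, count the processor lines in /proc/cpuinfo instead):

 #  Warn when the run queue (procs r, the first vmstat field) exceeds
 #  four times the number of processors.
 NCPU=`psrinfo | wc -l`
 vmstat 30 | awk -v n="$NCPU" 'NR > 3 && $1 > 4 * n { print "run queue", $1, "exceeds", 4 * n }'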

If the number of mutex stalls (available in Solaris as the smtx field of mpstat) is significant (over about 250 times the number of processors), you should upgrade your CPUs to faster ones, rather than adding more.
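
You can watch for that level of mutex contention with mpstat itself. A sketch, assuming smtx appears in mpstat's header line; its column position varies by release, so the field is located by name:

 #  Report any processor stalling on mutexes more than ~250 times per
 #  second; mpstat prints one line per CPU per interval.
 mpstat 30 | awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "smtx") c = i }
                  $1 != "CPU" && $c > 250 { print "CPU", $1, "smtx:", $c }'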

9.1.4 Are Processes Blocked on Disk I/O?

If a process is blocked, it is a sign of a disk bottleneck. You can view the number of blocked processes by means of the vmstat command's procs b field. Whenever there are any blocked processes, the system reports all CPU idle time as wait-for-I/O time. If the number of blocked processes approaches or exceeds the number of processes in the run queue, there is a severe disk bottleneck, and you should start looking for it with iostat .
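
A quick way to spot this condition is to compare vmstat's b and r fields over time. A minimal sketch, again assuming the stock layout (r first, b second) and skipping the headers and the since-boot first sample:

 #  Flag intervals where blocked processes rival or exceed the run queue.
 vmstat 30 | awk 'NR > 3 && $2 > 0 && $2 >= $1 { print "blocked:", $2, "runnable:", $1 }'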

If you are administering a very batch-oriented environment, you will see your fair share of blocked processes. This isn't necessarily something to worry about, but it's an indicator that improving disk subsystem performance will improve the throughput of your batch jobs.

9.1.5 Does System Time Heavily Dominate User Time?

If any of the CPU consumption utilities (vmstat, mpstat, etc.) indicate that more CPU time is being spent in the kernel (system time) than in running processes (user time), you might have a problem. One exception: the Solaris NFS server runs entirely inside the kernel, as does the Linux knfsd optional package, so on a busy NFS server, system time is expected to dominate.

If the system in question isn't an NFS server, tracking down the problem can be difficult. The best approach is to identify the top few processes in terms of processor-time utilization (with prstat, top, or ps) and then use a process-tracing utility like truss to see what system calls are being issued. If the system/user skew is accompanied by a high number of mutex stalls (mpstat's smtx column), start thinking about upgrading your processors to faster ones (but don't add more).
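
Putting those two steps together on Solaris might look like the following; the PID is hypothetical, and truss -c prints its system-call counts when you interrupt it:

 #  Step 1: find the top five CPU consumers (one 5-second sample).
 prstat -s cpu -n 5 5 1

 #  Step 2: count the system calls the worst offender is issuing
 #  (PID 12345 is hypothetical); interrupt with Ctrl-C after ~30
 #  seconds to see the per-call summary.
 truss -c -p 12345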


