4.11. Reading iostat

Previously we discussed the %b and wait fields of iostat's extended output. Many more fields provide other insights into disk behavior.

4.11.1. Event Size Ratio

The extended iostat output includes per-second averages for the number of events and their sizes, which appear in the first four columns. To demonstrate them, we captured the following output while a find / command was also running.

$ iostat -xnz 5
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.2    0.2    1.1    1.5  0.0  0.0    6.5    7.1   0   0 c0t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    7.7   0   0 c0t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    3.0   0   0 mairs:vold(pid512)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  179.8    0.0  290.4    0.0  0.0  0.6    0.0    3.5   0  64 c0t0d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  227.0    0.0  351.8    0.0  0.0  0.8    0.0    3.7   0  83 c0t0d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  217.2    0.0  358.6    0.0  0.0  0.8    0.0    3.8   0  84 c0t0d0

Observe the r/s and kr/s fields in the sample where the disk was 83% busy. At 83% busy the disk is reading only 351.8 Kbytes/sec; extrapolating from 83% to 100%, it would peak at a miserable 420 Kbytes/sec. Given that we know this disk can be driven at over 12 Mbytes/sec,[7] running at 420 Kbytes/sec (about 3% of the maximum) is a sign that something is seriously amiss. In this case, it is likely to be caused by the nature of the I/O: heavy random disk activity from the find command (which we can prove by using DTrace).
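The extrapolation above is simple arithmetic; a minimal sketch in Python, using the values from the 83%-busy iostat sample (the function name and the 12-Mbyte/sec maximum are illustrative):

```python
# Estimate a disk's peak throughput by scaling an observed
# busy-percentage sample up to 100% busy.
def peak_kbytes_per_sec(kr_s, kw_s, pct_busy):
    """Extrapolate observed throughput (Kbytes/sec) to 100% busy."""
    return (kr_s + kw_s) * 100.0 / pct_busy

# Values from the 83%-busy sample above.
peak = peak_kbytes_per_sec(kr_s=351.8, kw_s=0.0, pct_busy=83)
print(round(peak))                  # roughly 424 Kbytes/sec
print(round(peak / 12288 * 100))    # about 3% of a 12-Mbyte/sec disk
```

This is why %b alone is not enough: the same 83% busy figure can represent either a disk near its streaming limit or, as here, one crippled by random I/O.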
Had we been looking only at volume (kr/s + kw/s), a rate of 351.8 Kbytes/sec may have incorrectly implied that this disk was fairly idle. Another detail to notice is that there were on average 227 reads per second during that sample. There are certain overheads involved in asking a disk to perform an I/O event, so the number of IOPS (I/O operations per second) is useful to consider; here we would add r/s and w/s. Finally, we can divide kr/s by r/s to calculate the average read size: 351.8 Kbytes / 227 = 1.55 Kbytes. A similar calculation gives the average write size. A value of 1.55 Kbytes is small, but to be expected from the find command because it reads through small directory files and inodes.

4.11.2. Service Times

Three service times are available: wsvc_t, the average time spent on the wait queue; asvc_t, the average time spent active (sent to the disk device); and svc_t, the sum of wsvc_t and asvc_t. iostat prints these in milliseconds. The active service time is the most interesting; it is the time from when a disk device accepts an event to when it sends a completion interrupt. The source code behind iostat describes active time as "run" time.

The following demonstrates small active service times, caused by running dd on the raw device.

$ iostat -xnz 5
...
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  549.4    0.0 4394.8    0.0  0.0  1.0    0.0    1.7   0  95 c0t0d0

Next, we observe longer active service times while a find / runs.

$ iostat -xnz 5
...
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   26.2   64.0  209.6  127.1  2.8  1.5   31.2   16.9  43  80 c0t0d0

From the previous discussion of event size ratios, we can see that the dd command, pulling 4395 Kbytes/sec at 95% busy, is using the disk more effectively than the find / command, pulling 337 Kbytes/sec (209.6 + 127.1) at 80% busy. Now we can consider the average active service times, in the asvc_t column.
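The average-size calculation can be sketched as a small helper; a minimal sketch in Python, using the values from the find / sample (the function name is illustrative):

```python
# Average I/O size from iostat's rate columns: kr/s divided by r/s for
# reads, kw/s divided by w/s for writes.
def avg_io_kbytes(kbytes_per_sec, ops_per_sec):
    """Average event size in Kbytes; 0.0 if no events occurred."""
    return kbytes_per_sec / ops_per_sec if ops_per_sec else 0.0

# Values from the 83%-busy find / sample: 351.8 kr/s, 227.0 r/s.
avg_read = avg_io_kbytes(351.8, 227.0)
print(round(avg_read, 2))   # 1.55 Kbytes per read
```

The guard against zero ops matters in practice: iostat -z suppresses all-zero lines, but an unfiltered sample can report 0.0 r/s.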
For the dd command this was 1.7 ms, while for the find / command it was much slower at 16.9 ms. Faster is better, so this statistic directly describes average disk event behavior without any further calculation. It also helps to become familiar with what values are "good" or "bad" for your disks.

Note: iostat(1M) warns against believing service times for very idle disks.

Should the disk become saturated with requests, we may also see average wait queue times (wsvc_t). This indicates the average time penalty for disk events that have queued, and as such helps us understand the effects of saturation.

Lastly, disk service times are interesting from a disk perspective, but they do not necessarily equal application latency; that depends on what the file system is doing (caching, reading ahead). See Section 5.2 to continue the discussion of application latency from the file system.
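Comparisons like the one above can be automated by parsing a captured line of iostat -xn extended output; a minimal sketch in Python, where the 10-ms threshold and the parse_line helper are illustrative assumptions, not rules from iostat itself:

```python
# Parse one data line of `iostat -xn` extended output and flag slow
# average active service times. Column order: r/s w/s kr/s kw/s wait
# actv wsvc_t asvc_t %w %b device.
def parse_line(line):
    f = line.split()
    return {"r/s": float(f[0]), "w/s": float(f[1]),
            "wsvc_t": float(f[6]), "asvc_t": float(f[7]),
            "device": f[10]}

# Captured line from the find / sample above.
sample = "   26.2   64.0  209.6  127.1  2.8  1.5   31.2   16.9  43  80 c0t0d0"
stats = parse_line(sample)
if stats["asvc_t"] > 10.0:   # illustrative threshold, tune per disk
    print(f'{stats["device"]}: slow disk events, asvc_t = {stats["asvc_t"]} ms')
```

Any such threshold should be calibrated against what "good" looks like for your own disks, and, per the note above, ignored for very idle devices.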