Section 2.8. Clock Tick Woes


At some point in a discussion on CPU statistics, it is obligatory to lament the inaccuracy of a 100 hertz sample: What if each sample coincided with idle time, misrepresenting the state of the server?

Once upon a time, CPU statistics were gathered every clock tick, or every hundredth of a second.[7] As CPUs became faster, it became increasingly possible for fleeting activity to occur between clock ticks, and such activity would not be measured correctly. Now Solaris uses microstate accounting, which uses high-resolution timestamps to measure CPU statistics for every event, producing extremely accurate statistics. See Section 2.10.3 in Solaris Internals.

[7] In fact, once upon a time statistics were gathered every 60th of a second.

If you look through the Solaris source, you will see high-resolution counters just about everywhere. Even code that expects clock tick measurements will often source the high-resolution counters instead. For example:

cpu_sys_stats_ks_update(kstat_t *ksp, int rw)
{
...
        csskd->cpu_ticks_idle.value.ui64 =
            NSEC_TO_TICK(csskd->cpu_nsec_idle.value.ui64);
        csskd->cpu_ticks_user.value.ui64 =
            NSEC_TO_TICK(csskd->cpu_nsec_user.value.ui64);
        csskd->cpu_ticks_kernel.value.ui64 =
            NSEC_TO_TICK(csskd->cpu_nsec_kernel.value.ui64);
...
                                                See uts/common/os/cpu.c


In this code example, NSEC_TO_TICK converts from the microstate accounting value (which is in nanoseconds) to a ticks count. For more details on CPU microstate accounting, see Section 2.12.1.

While most counters you see in Solaris are highly accurate, sampling issues remain in a few minor places. In particular, the run queue length as seen from vmstat (kthr:r) is based on a sample that is taken every second; running vmstat with an interval of 5 prints, for each line, the average of five samples taken at one-second intervals. The following (somewhat contrived) example demonstrates the problem.

$ vmstat 2 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr cd s0 -- --   in   sy   cs us sy id
 0 0 23 1132672 198460 34 47 96 2  2  0 15  6  0  0  0  261  392  170  2  1 97
 0 0 45 983768 120944 1075 4141 0 0 0 0  0  0  0  0  0  355 2931  378  7 25 67
 0 0 45 983768 120944 955 3851 0 0 0  0  0  0  0  0  0  342 1871  279  4 22 73
 0 0 45 983768 120944 940 3819 0 0 0  0  0  0  0  0  0  336 1867  280  4 22 73
 0 0 45 983768 120944 816 3561 0 0 0  0  0  0  0  0  0  338 2055  273  5 20 75
$ uptime
  4:50am  up 14 day(s), 23:32,  4 users,  load average: 4.43, 4.31, 4.33


For this single CPU server, vmstat reports a run queue length of zero. However, the load averages (which are now based on microstate accounting) suggest considerable load. This was caused by a program that deliberately created numerous short-lived threads every second, such that the one-second run queue sample usually missed the activity.

The runq-sz column from sar -q suffers from the same problem, as does %runocc (which, for short-interval measurements, defeats the purpose of %runocc).

These are all minor issues, and a valid workaround is to use DTrace, with which statistics can be created at any accuracy desired. Demonstrations of this are in Section 2.14.




Solaris Performance and Tools: DTrace and MDB Techniques for Solaris 10 and OpenSolaris
ISBN: 0131568191
Year: 2007
Pages: 180