Section 6.11. Using DTrace for Memory Analysis | Solaris Performance and Tools: DTrace and MDB Techniques for Solaris 10 and OpenSolaris

6.11. Using DTrace for Memory Analysis

With the DTrace utility, you can probe more deeply into the sources of activity observed with higher-level memory analysis tools. For example, if you determine that a significant amount of paging activity is due to a memory shortage, you can determine which process is initiating the paging activity. In another example, if you see a significant amount of paging due to file activity, you can drill down to see which process and which file are responsible.

DTrace supports memory analysis through the vminfo provider and, optionally, through deeper tracing of virtual memory paging with the fbt provider.

The vminfo provider probes correspond to the fields in the "vm" named kstat. A probe provided by vminfo fires immediately before the corresponding vm value is incremented. Section 10.6.2 lists and describes the probes available from the vminfo provider. A probe takes the following arguments:

arg0. The value by which the statistic is to be incremented. For most probes, this argument is always 1, but for some it may take other values; these probes are noted in Section 10.4.
arg1. A pointer to the current value of the statistic to be incremented. This value is a 64-bit quantity that is incremented by the value in arg0. Dereferencing this pointer allows consumers to determine the current count of the statistic corresponding to the probe.
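
As a minimal sketch of how arg0 can be used, the following script sums arg0 for the pgpgin probe, one of the probes for which arg0 carries the number of pages rather than 1. Summing arg0 therefore counts pages paged in, whereas count() would count only probe firings. The aggregation name @pages and the column headings are illustrative choices, not part of the provider.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * arg0 is the amount by which the statistic is about to be incremented.
 * For pgpgin this is the number of pages paged in, so sum(arg0) yields
 * pages per process rather than the number of probe firings.
 */
vminfo:::pgpgin
{
        @pages[execname] = sum(arg0);
}

dtrace:::END
{
        printf("%-30s %10s\n", "EXECNAME", "PAGES IN");
        printa("%-30s %@10d\n", @pages);
}
```

Run the script as root and press Ctrl-C to print the per-process totals.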

For example, if vmstat shows the following paging activity, in which the api column indicates anonymous page-ins from the swap device, you can drill down with DTrace to see which processes are responsible.

# vmstat -p 3
     memory               page                executable        anonymous         filesystem
    swap    free   re   mf   fr   de   sr   epi   epo   epf   api   apo   apf   fpi   fpo   fpf
 1512488  837792  160   20   12    0    0     0     0     0  8102     0     0    12    12    12
 1715812  985116    7   82    0    0    0     0     0     0  7501     0     0    45     0     0
 1715784  983984    0    2    0    0    0     0     0     0  1231     0     0    53     0     0
 1715780  987644    0    0    0    0    0     0     0     0  2451     0     0    33     0     0

$ dtrace -n anonpgin'{@[execname] = count()}'
dtrace: description 'anonpgin' matched 1 probe
  svc.startd                                                        1
  sshd                                                              2
  ssh                                                               3
  dtrace                                                            6
  vmstat                                                           28
  filebench                                                       913
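
The same drill-down can be applied to all three page-in columns that vmstat -p reports (epi, api, and fpi). The following is a sketch only: it counts firings of the execpgin, anonpgin, and fspgin probes by process name and probe name, so that file system paging can be told apart from swap activity. The aggregation name @pgin and the output layout are illustrative.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * One clause for the three page-in probes that correspond to the
 * epi, api, and fpi columns of vmstat -p. Keying on probename keeps
 * the three kinds of page-in separate for each process.
 */
vminfo:::execpgin,
vminfo:::anonpgin,
vminfo:::fspgin
{
        @pgin[execname, probename] = count();
}

dtrace:::END
{
        printf("%-30s %-10s %8s\n", "EXECNAME", "PROBE", "COUNT");
        printa("%-30s %-10s %@8d\n", @pgin);
}
```

A process that appears mostly under fspgin is paging in file data, while one that appears under anonpgin is being paged in from the swap device.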

For examples of how to use DTrace for memory analysis, see Section 6.11.1 and Section 10.6.2.

6.11.1. Using DTrace to Estimate Memory Slowdowns

You can use DTrace to directly measure elapsed time around the page-in probes when a process is waiting for page-in from the swap device, as in this example.

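#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
        trace("Tracing... Hit Ctrl-C to end.\n");
}

/* Record the thread's virtual time when it goes on CPU. */
sched:::on-cpu
{
        self->on = vtimestamp;
}

/* Accumulate on-CPU time per process when the thread leaves the CPU. */
sched:::off-cpu
/self->on/
{
        @oncpu[execname] = sum(vtimestamp - self->on);
        self->on = 0;
}

/* Flag the thread as being in an anonymous page-in. */
vminfo:::anonpgin
{
        self->anonpgin = 1;
}

/* The page I/O has been set up: start the wall-clock timer. */
fbt::pageio_setup:return
{
        self->wait = timestamp;
}

/* The page I/O is done: accumulate the elapsed wait per process. */
fbt::pageio_done:entry
/self->anonpgin == 1/
{
        self->anonpgin = 0;
        @pageintime[execname] = sum(timestamp - self->wait);
        self->wait = 0;
}

dtrace:::END
{
        /* Convert nanoseconds to milliseconds before printing. */
        normalize(@oncpu, 1000000);
        printf("Who's on cpu (milliseconds):\n");
        printa(" %-50s %15@d\n", @oncpu);

        normalize(@pageintime, 1000000);
        printf("Who's waiting for pagein (milliseconds):\n");
        printa(" %-50s %15@d\n", @pageintime);
}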

Because the results are aggregated by execname, you can see which processes are held up most by paging.

# ./whospaging.d
Tracing... Hit Ctrl-C to end.
^C
Who's on cpu (milliseconds):
  svc.startd                                                 1
  loop.sh                                                    2
  sshd                                                       2
  ssh                                                        3
  dtrace                                                     6
  vmstat                                                    28
  pageout                                                   60
  fsflush                                                  120
  filebench                                                913
  sched                                                  84562
Who's waiting for pagein (milliseconds):
  filebench                                             230704

In the output of whospaging.d, the filebench command spent 913 milliseconds on CPU (doing useful work) and 230.7 seconds waiting for anonymous page-ins.
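
As a rough worked estimate, the two figures can be combined into the fraction of time spent waiting. This assumes that on-CPU time and page-in wait are the only significant components of filebench's time during the trace, which the script itself does not verify. The symbols t_cpu and t_wait are labels introduced here for the two reported values, in milliseconds.

```latex
\[
\frac{t_{\text{wait}}}{t_{\text{cpu}} + t_{\text{wait}}}
  = \frac{230704}{913 + 230704}
  = \frac{230704}{231617}
  \approx 0.996
\]
```

Under that assumption, roughly 99.6% of the time accounted for by the script was spent waiting for anonymous page-ins, and only about 0.4% was spent on CPU.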