6.11. Using DTrace for Memory Analysis

With the DTrace utility, you can probe more deeply into the sources of activity observed with higher-level memory analysis tools. For example, if you determine that a significant amount of paging activity is due to a memory shortage, you can determine which process is initiating the paging activity. Similarly, if you see a significant amount of paging due to file activity, you can drill down to see which process and which file are responsible.

DTrace allows for memory analysis through the vminfo provider and, optionally, through deeper tracing of virtual memory paging with the fbt provider.

The vminfo provider probes correspond to the fields in the "vm" named kstat. A probe provided by vminfo fires immediately before the corresponding vm value is incremented. Section 10.6.2 lists and describes the probes available from the vminfo provider. A probe takes the following arguments:

    arg0    The value by which the statistic is to be incremented. For most probes this argument is always 1, but for some probes it can take other values.

    arg1    A pointer to the current value of the statistic to be incremented. This value is a 64-bit quantity that is incremented by the value in arg0.
For example, if you see the following paging activity with vmstat, where the api column indicates page-ins from the swap device, you can drill down to investigate.

    # vmstat -p 3
        memory            page           executable     anonymous      filesystem
       swap   free  re  mf  fr  de  sr  epi  epo  epf  api  apo  apf  fpi  fpo  fpf
    1512488 837792 160  20  12   0   0    0    0    0 8102    0    0   12   12   12
    1715812 985116   7  82   0   0   0    0    0    0 7501    0    0   45    0    0
    1715784 983984   0   2   0   0   0    0    0    0 1231    0    0   53    0    0
    1715780 987644   0   0   0   0   0    0    0    0 2451    0    0   33    0    0

    # dtrace -n anonpgin'{@[execname] = count()}'
    dtrace: description 'anonpgin' matched 1 probe
      svc.startd          1
      sshd                2
      ssh                 3
      dtrace              6
      vmstat             28
      filebench         913

See Section 6.11.1 for examples of how to use DTrace for memory analysis, and Section 10.6.2 for descriptions of the vminfo probes.

6.11.1. Using DTrace to Estimate Memory Slowdowns

You can use DTrace to directly measure the elapsed time around the page-in probes while a process is waiting for a page-in from the swap device, as in this example.

    #!/usr/sbin/dtrace -s

    #pragma D option quiet

    dtrace:::BEGIN
    {
            trace("Tracing... Hit Ctrl-C to end.\n");
    }

    /* Note when this thread goes on CPU. */
    sched:::on-cpu
    {
            self->on = vtimestamp;
    }

    /* Add the time just spent on CPU to the per-command total. */
    sched:::off-cpu
    /self->on/
    {
            @oncpu[execname] = sum(vtimestamp - self->on);
            self->on = 0;
    }

    /* Flag this thread as performing an anonymous page-in. */
    vminfo:::anonpgin
    {
            self->anonpgin = 1;
    }

    /* Start the wait timer once the page I/O has been set up. */
    fbt::pageio_setup:return
    {
            self->wait = timestamp;
    }

    /* Stop the wait timer and add the elapsed time to the per-command total. */
    fbt::pageio_done:entry
    /self->anonpgin == 1/
    {
            self->anonpgin = 0;
            @pageintime[execname] = sum(timestamp - self->wait);
            self->wait = 0;
    }

    /* Convert nanoseconds to milliseconds and print both aggregations. */
    dtrace:::END
    {
            normalize(@oncpu, 1000000);
            printf("Who's on cpu (milliseconds):\n");
            printa(" %-50s %15@d\n", @oncpu);

            normalize(@pageintime, 1000000);
            printf("Who's waiting for pagein (milliseconds):\n");
            printa(" %-50s %15@d\n", @pageintime);
    }

With an aggregation by execname, you can see who is being held up by paging the most.

    # ./whospaging.d
    Tracing... Hit Ctrl-C to end.
    ^C
    Who's on cpu (milliseconds):
      svc.startd          1
      loop.sh             2
      sshd                2
      ssh                 3
      dtrace              6
      vmstat             28
      pageout            60
      fsflush           120
      filebench         913
      sched           84562
    Who's waiting for pagein (milliseconds):
      filebench      230704

In the output of whospaging.d, the filebench command spent 913 milliseconds on CPU (doing useful work) and 230.7 seconds waiting for anonymous page-ins.
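The two totals for filebench can be reduced to one figure: the share of its accounted time (on CPU plus page-in wait) that was spent waiting. The following is a minimal sketch of that arithmetic, not part of whospaging.d; the function name wait_share is hypothetical, and the inputs are simply the two millisecond values from the sample output. Because both aggregations are summed over every thread running under the same execname, the result is a rough indicator rather than a wall-clock measurement.

```shell
#!/bin/sh
# wait_share ONCPU_MS WAIT_MS
# Hypothetical helper (not part of DTrace or Solaris): prints the percentage
# of on-CPU plus page-in wait time that was spent waiting for page-ins.
wait_share() {
    awk -v oncpu="$1" -v waitms="$2" 'BEGIN {
        total = oncpu + waitms
        if (total == 0) { print "0.0"; exit }   # avoid division by zero
        printf "%.1f\n", 100 * waitms / total
    }'
}

# Values for filebench taken from the sample whospaging.d output.
wait_share 913 230704    # prints 99.6
```

By this measure, filebench spent about 99.6 percent of its accounted time waiting for anonymous page-ins, which suggests that the memory shortage, not CPU, was the limiting factor during the trace.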