2.13. Using DTrace to Explain Events from Performance ToolsDTrace can be exploited to attribute the source of events noted in higher-level tools such as mpstat(1M). For example, if we see a significant amount of system time (%sys) and a high system call rate (syscl), then we might want to know who or what is causing those system calls. # mpstat 2 CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl 0 117 0 1583 883 111 1487 593 150 6104 64 11108 7 92 0 1 1 106 0 557 842 0 1804 694 150 6553 84 10684 6 93 0 1 2 112 0 664 901 0 1998 795 143 6622 64 11227 6 93 0 1 3 95 0 770 1035 0 2232 978 131 6549 59 11769 7 92 0 1 ^C # dtrace -n 'syscall:::entry { @[execname] = count(); }' dtrace: description 'syscall:::entry ' matched 229 probes ^C inetd 1 svc.configd 1 fmd 2 snmpdx 2 utmpd 2 inetd 1 svc.configd 1 fmd 2 snmpdx 2 utmpd 2 svc.startd 13 sendmail 30 snmpd 36 nscd 105 dtrace 1311 filebench 3739725 Using the DTrace syscall provider, we can quickly identify which process is causing the most system calls. This dtrace one-liner measures system calls by process name. In this example, processes with the name filebench caused 3, 739, 725 system calls during the time the dtrace command was running. We can then drill deeper by matching the syscall probe only when the exec name matches our investigation target, filebench, and counting the syscall name. # dtrace -n 'syscall:::entry /execname == "filebench"/ { @[probefunc] = count(); }' dtrace: description 'syscall:::entry ' matched 229 probes ^C lwp_continue 4 lwp_create 4 mmap 4 schedctl 4 setcontext 4 lwp_sigmask 8 nanosleep 24 yield 554 brk 1590 pwrite 80795 lwp_park 161019 read 324159 pread 898401 semsys 1791717 Ah, so we can see that the semsys syscall is hot in this case. Let's look at what is calling semsys by using the ustack() DTrace action. # dtrace -n 'syscall::semsys:entry /execname == "filebench"/ { @[ustack()] = count();}' dtrace: description 'syscall::semsys:entry ' matched 1 probe ^C libc.so.1`_syscall6+0x1c filebench`flowop_start+0x408 libc.so.1`_lwp_start 10793 libc.so.1`_syscall6+0x1c filebench`flowop_start+0x408 libc.so.1`_lwp_start 10942 libc.so.1`_syscall6+0x1c filebench`flowop_start+0x408 libc.so.1`_lwp_start 11084 We can now identify which system call, and then even obtain the hottest stack trace for accesses to that system call. We conclude by observing that the filebench flowop_start function is performing the majority of semsys system calls on the system. |