Section 3.7. Tracing Processes

3.7. Tracing Processes

Several tools in Solaris can be used to trace the execution of a process, most notably TRuss and DTrace.

3.7.1. Using `TRuss` to Trace Processes

By default, truss traces system calls made on behalf of a process. It uses the /proc interface to start and stop the process, recording and reporting information on each traced event.

This intrusive behavior of TRuss may slow a target process down to less than half its usual speed. This may not be acceptable for the analysis of live production applications. Also, when the timing of a process changes, race-condition faults can either be relieved or created. Having the fault vanish during analysis is both annoying and ironic.^[2] Worse is when the problem gains new complexities.^[3]

^[2] It may lead to the embarrassing situation in which truss is left running perpetually.

^[3] Don't truss Xsun; it can deadlockwe did warn you!

TRuss was first written as a clever use of /proc, writing control messages to /proc/<pid>/ctl to manipulate execution flow for debugging. It has since been enhanced to trace LWPs and user-level functions. Over the years it has been an indispensable tool, and there has been no better way to get at this information.

DTrace now exists and can get similar information more safely. However TRuss will still be valuable for many situations. When you use TRuss for troubleshooting commands, speed is hardly an issue; of more interest are the system calls that failed and why. truss also provides many translations from flags into codes, allowing many system calls to be easily understood.

In the following example, we trace the system calls for a specified process ID. The trace includes the user LWP (thread) number, system call name, arguments and return codes for each system call.

$ truss -p 26274 /1:     lwp_wait(2, 0xFFFFFFFF7FFFEA4C) (sleeping...) /2:     pread(11, "\0\0\002\0\0\001\0\0\0\0".., 504, 0) = 504 /2:     pread(11, "\0\0\002\0\0\001\0\0\0\0".., 504, 0) = 504 /2:     semget(16897864, 128, 0)                         = 8 /2:     semtimedop(8, 0xFFFFFFFF7DEFBDF4, 2, 0xFFFFFFFF7DEFBDE0) = 0 /2:     pread(11, "\0\0\002\0\0\001\0\0\0\0".., 504, 0) = 504 /2:     pread(11, "\0\0\002\0\0\001\0\0\0\0".., 504, 0) = 504 /2:     semget(16897864, 128, 0)                         = 8 /2:     semtimedop(8, 0xFFFFFFFF7DEFBDF4, 2, 0xFFFFFFFF7DEFBDE0) = 0 /2:     semget(16897864, 128, 0)                         = 8 /2:     semtimedop(8, 0xFFFFFFFF7DEFBDF4, 2, 0xFFFFFFFF7DEFBDE0) = 0 /2:     semget(16897864, 128, 0)                         = 8 /2:     semtimedop(8, 0xFFFFFFFF7DEFBDF4, 2, 0xFFFFFFFF7DEFBDE0) = 0 /2:     semget(16897864, 128, 0)                         = 8 /2:     semtimedop(8, 0xFFFFFFFF7DEFBDF4, 2, 0xFFFFFFFF7DEFBDE0) = 0 /2:     semget(16897864, 128, 0)                         = 8 /2:     semtimedop(8, 0xFFFFFFFF7DEFBDF4, 2, 0xFFFFFFFF7DEFBDE0) = 0 /2:     semget(16897864, 128, 0)                         = 8 /2:     semtimedop(8, 0xFFFFFFFF7DEFBDF4, 2, 0xFFFFFFFF7DEFBDE0) = 0 ...

Optionally, we can use the -c flag to summarize rather than trace a process's system call activity.

$ truss -c -p 26274 ^C syscall               seconds   calls  errors read                     .002      10 semget                   .012      55 semtimedop               .015      55 pread                    .017      45                      --------  ------   ---- sys totals:              .047     165      0 usr time:               1.030 elapsed:                7.850

The truss command also traces functions that are visible to the dynamic linker (this excludes functions that have been locally scoped as a performance optimizationsee the Solaris Linker and Libraries Guide).

In the following example, we trace the functions within the target binary by specifying the -u option (trace functions rather than system calls) and a.out (trace within the binary, exclude libraries).

$ truss -u a.out -p 26274 /2@2:        -> flowop_endop(0xffffffff3735ef80, 0xffffffff6519c0d0, 0x0, 0x0) /2:      pread(11, "\0\0\002\0\0\001\0\0\0\0".., 504, 0) = 504 /2@2:        -> filebench_log(0x5, 0x10006ae30, 0x0, 0x0) /2@2:      -> filebench_log(0x3, 0x10006a8a8, 0xffffffff3735ef80, 0xffffffff6519c0d0) /2@2:      -> filebench_log(0x3, 0x10006a868, 0xffffffff3735ef80, 0xffffffff6519c380) /2@2:      -> filebench_log(0x3, 0x10006a888, 0xffffffff3735ef80, 0xffffffff6519c380) /2@2:      <- flowoplib_hog() = 0xffffffff3735ef80 /2@2:      -> flowoplib_sempost(0xffffffff3735ef80, 0xffffffff6519c380) /2@2:        -> filebench_log(0x5, 0x10006afa8, 0xffffffff6519c380, 0x1) /2@2:        -> flowop_beginop(0xffffffff3735ef80, 0xffffffff6519c380) /2:      pread(11, "\0\0\002\0\0\001\0\0\0\0".., 504, 0) = 504 /2@2:        -> filebench_log(0x5, 0x10006aff0, 0xffffffff651f7c30, 0x1) /2:      semget(16897864, 128, 0)                        = 8 /2:      semtimedop(8, 0xFFFFFFFF7DEFBDF4, 2, 0xFFFFFFFF7DEFBDE0) = 0 /2@2:        -> filebench_log(0x5, 0x10006b048, 0xffffffff651f7c30, 0x1) /2@2:        -> flowop_endop(0xffffffff3735ef80, 0xffffffff6519c380, 0xffffffff651f7c30) /2:      pread(11, "\0\0\002\0\0\001\0\0\0\0".., 504, 0) = 504 /2@2:      -> filebench_log(0x3, 0x10006a8a8, 0xffffffff3735ef80, 0xffffffff6519c380) ...

See truss(1M) for further information.

3.7.2. Using `apptrace` to Trace Processes

The apptrace command was added in Solaris 8 to trace calls to shared libraries while evaluating argument details. In some ways it is an enhanced version of an older command, sotruss. The Solaris 10 version of apptrace has been enhanced further, printing separate lines for the return of each function call.

In the following example, apptrace prints shared library calls from the date command.

$ apptrace date -> date     -> libc.so.1:int atexit(int (*)() = 0xff3c0090) <- date     -> libc.so.1:atexit() -> date     -> libc.so.1:int atexit(int (*)() = 0x11558) <- date     -> libc.so.1:atexit() -> date     -> libc.so.1:char * setlocale(int = 0x6, const char * = 0x11568 "") <- date     -> libc.so.1:setlocale() = 0xff05216e -> date     -> libc.so.1:char * textdomain(const char * = 0x1156c "SUNW_OST_OSCMD") <- date     -> libc.so.1:textdomain() = 0x23548 -> date     -> libc.so.1:int getopt(int = 0x1, char *const * = 0xffbffd04, const char * = 0x1157c "a:u") <- date     -> libc.so.1:getopt() = 0xffffffff -> date     -> libc.so.1:time_t time(time_t * = 0x225c0) <- date     -> libc.so.1:time() = 0x440d059e ...

To illustrate the capability of apptrace, examine the example output for the call to getopt(). The entry to getopt() can be seen after the library name it belongs to (libc.so.1); then the arguments to getopt() are printed. The option string is displayed as a string, "a:u".

apptrace can evaluate structs for function calls of interest. In this example, full details for calls to strftime() are printed.

$  apptrace -v strftime date -> date     -> libc.so.1:size_t strftime(char * = 0x225c4 "", size_t = 0x400, const char * = 0xff056c38 "%a %b %e %T %Z %Y", const struct tm * = 0xffbffc54)         arg0 = (char *) 0x225c4 ""         arg1 = (size_t) 0x400         arg2 = (const char *) 0xff056c38 "%a %b %e %T %Z %Y"         arg3 = (const struct tm *) 0xffbffc54 (struct tm) {         tm_sec: (int) 0x1         tm_min: (int) 0x9         tm_hour: (int) 0xf         tm_mday: (int) 0x7         tm_mon: (int) 0x2         tm_year: (int) 0x6a         tm_wday: (int) 0x2         tm_yday: (int) 0x41         tm_isdst: (int) 0x1         }         return = (size_t) 0x1c <- date     -> libc.so.1:strftime() = 0x1c Tue Mar  7 15:09:01 EST 2006 $

This output provides insight into how an application is using library calls, perhaps identifying faults where invalid data was used.

3.7.3. Using DTrace to Trace Process Functions

DTrace can trace system activity by using many different providers, including syscall to track system calls, sched to trace scheduling events, and io to trace disk and network I/O events. We can gain a greater understanding of process behavior by examining how the system responds to process requests. The following sections illustrate this:

Section 6.11
Section 2.15
Section 4.15

However DTrace can drill even deeper: user-level functions from processes can be traced down to the CPU instruction. Usually, however, just the function entry and return probes suffice.

By specifying the provider name as pidn, where n is the process ID, we can use DTrace to trace process functions. Here we trace function entry and return.

# dtrace  -F -p 26274 -n 'pid$target:::entry,pid$target:::return { trace(timestamp); }' dtrace: description 'pid$target:::entry, pid$target:::return ' matched 8836 probes CPU FUNCTION  18  -> flowoplib_sempost                       862876225376388  18    -> flowoplib_sempost                     862876225406704  18      -> filebench_log                       862876225479188  18        -> filebench_log                     862876225505012  18        <- filebench_log                     862876225606436  18      <- filebench_log                       862876225668788  18      -> flowop_beginop                      862876225733408  18        -> flowop_beginop                    862876225770304  18          -> pread                           862876225860508  18            -> _save_nv_regs                 862876225924036  18            <- _save_nv_regs                 862876226011512  18            -> _pread                        862876226056292  18            <- _pread                        862876226780092  18          <- pread                           862876226867256  18          -> gethrtime                       862876226940056  18          <- gethrtime                       862876227018644  18        <- flowop_beginop                    862876227106272  18      <- flowop_beginop                      862876227162292 ...

Unlike TRuss, DTrace does not stop and start the process for each traced function; instead, DTrace collects data in per-CPU buffers which the dtrace command asynchronously reads. The overhead when using DTrace on a process does depend on the frequency of traced events but is usually less than that of truss.

3.7.4. Using DTrace to Aggregate Process Functions

When processes are traced as in the previous example, the output may rush by at an incredible pace. Using aggregations can condense information of interest. In the following example, the dtrace command aggregated the user-level function calls of inetd while a connection was established.

# dtrace -n 'pid$target:a.out::entry { @[probefunc] = count(); }' -p 252 dtrace: description 'pid$target:a.out::entry ' matched 159 probes ^C ...   store_rep_vals                                                     2   store_retrieve_rep_vals                                            2   make_handle_bound                                                  6   debug_msg                                                         42   msg                                                               42   isset_pollfd                                                      58   find_pollfd                                                       71

In this example, debug_msg() was called 42 times. The column on the right counts the number of times a function was called while dtrace was running. If we drop the a.out in the probe description, dtrace TRaces function calls from all libraries as well as inetd.

3.7.5. Using DTrace to Peer Inside Processes

One of the powerful capabilities of DTrace is its ability to look inside the address space of a process and dereference pointers of interest. We demonstrate by continuing with the previous inetd example.

A function called debug_msg() sounds interesting if we were troubleshooting a problem. inetd's debug_msg() takes a format string and variables as arguments and prints them to a log file if it exists (/var/adm/inetd.log). Since the log file doesn't exist on our server, debug_msg() tosses out the messages.

Without stopping or starting inetd, we can use DTrace to see what debug_msg() would have been writing. We have to know the prototype for debug_msg(), so we either read it from the source code or guess.

# dtrace -n 'pid$target:a.out:debug_msg:entry { trace(copyinstr(arg0)); }' -p 252 dtrace: description 'pid$target:a.out:debug_msg:entry ' matched 1 probe CPU     ID                    FUNCTION:NAME   0  52162                  debug_msg:entry   Exiting poll, returned: %d   0  52162                  debug_msg:entry   Entering process_terminated_methods   0  52162                  debug_msg:entry   Entering process_network_events   0  52162                  debug_msg:entry   Entering process_nowait_req   0  52162                  debug_msg:entry   Entering accept_connection   0  52162                  debug_msg:entry   Entering run_method, instance: %s,                                               method: %s   0  52162                  debug_msg:entry   Entering read_method_context: inst: %s,                                               method: %s, path: %s   0  52162                  debug_msg:entry   Entering passes_basic_exec_checks   0  52162                  debug_msg:entry   Entering contract_prefork   0  52162                  debug_msg:entry   Entering contract_postfork   0  52162                  debug_msg:entry   Entering get_latest_contract ...

The first argument (arg0) contains the format string, and copyinstr() pulls the string from userland to the kernel, where DTrace is tracing. Although the messages printed in this example are missing their variables, they illustrate much of what inetd is internally doing. It is not uncommon to find some form of debug functions left behind in applications, and DTrace can extract them in this way.

3.7.6. Using DTrace to Sample Stack Backtraces

When we discussed the pstack command (Section 3.5.1), we suggested a crude analysis technique, by which a few stack backtraces could be taken to see where the process was spending most of its time. DTrace can turn crude into precise by taking samples at a configurable rate, such as 1000 hertz.

The following example samples user stack backtraces at 1000 hertz, matching on the PID for inetd. This is quite a useful DTrace one-liner.

# dtrace -n 'profile-1000hz /pid == $target/ { @[ustack()] = count(); }' -p 252 dtrace: description 'profile-1000hz ' matched 1 probe ^C ...               libc.so.1'_waitid+0x8               libc.so.1'waitpid+0x68               inetd'process_terminated_methods+0x74               inetd'event_loop+0x19c               inetd'start_method+0x190               inetd'_start+0x108                11               libc.so.1'__pollsys+0x4               libc.so.1'poll+0x7c               inetd'event_loop+0x70               inetd'start_method+0x190               inetd'_start+0x108                28               libc.so.1'__fork1+0x4               inetd'run_method+0x27c               inetd'process_nowait_request+0x1c8               inetd'process_network_events+0xac               inetd'event_loop+0x220               inetd'start_method+0x190               inetd'_start+0x108                53

The final stack backtrace was sampled the most, 53 times. By reading through the functions, we can determine where inetd was spending its on-CPU time.

Rather than sampling until Ctrl-C is pressed, DTrace allows us to specify an interval with ease. We added a tick-5sec probe in the following to stop sampling and exit after 5 seconds.

# dtrace -n 'profile-1000hz /pid == $target/ { @[ustack()] = count(); } tick-5sec { exit(0); }' -p 252