Finding out how the Oracle kernel measures itself is not a terribly difficult task. This section is based on studies of Oracle8 i and Oracle9 i kernels running on Linux systems. The Oracle software for your host operating system might use different system calls than our system uses. To find out, you should see how your Oracle system behaves by using a software tool that traces system calls for a specified process. For example, Linux provides a tool called strace to trace system calls. Other operating systems use different names for the tool. There is truss for Sun Solaris ( truss is actually the original system call tracing tool for Unix), sctrace for IBM AIX, tusc for HP-UX, and strace for Microsoft Windows . I shall use the Linux name strace to refer generically to the collection of tools that trace system calls.
|
The strace tool is easy to use. For example, you can observe directly what the Oracle kernel is doing, right when each action takes place, by executing a command like the following:
$ strace -p 12417 read(7,
In this example, strace shows that a Linux program with process ID 12417 (which happens to have been an Oracle kernel process on my system) has issued a read call and is awaiting fulfillment of that call (hence the absence of the right parenthesis in the output shown here).
It is especially instructive to view strace output and Oracle SQL trace output simultaneously in two windows, so that you can observe exactly when lines are emitted to both output streams. The write calls that the Oracle kernel uses to emit its trace data of course appear in the strace output exactly when you would expect them to. The appearance of these calls makes it easy to understand when Oracle kernel actions produce trace data. In Oracle9 i for Linux (and Oracle9 i for some other operating systems), it is especially easy to correlate strace output and SQL trace output, because values returned by gettimeofday appear directly in the trace file as tim values. By using strace and SQL trace simultaneously in this manner, you can positively confirm or refute whether your Oracle kernel behaves like the pseudocode that you will see in the following sections.
|
The Oracle kernel derives all of its timing statistics from the results of system calls issued upon the host operating system. Example 7-1 shows how a program like the Oracle kernel computes the durations of its own actions.
t0 = gettimeofday; # mark the time immediately before doing something do_something; t1 = gettimeofday; # mark the time immediately after doing it t = t1 - t0; # t is the approximate duration of do_something
The gettimeofday function is an operating system call found on POSIX-compliant systems. You can learn by viewing your operating system's documentation that gettimeofday provides a C language data structure to its caller that contains the number of seconds and microseconds that have elapsed since the Epoch, which is 00:00:00 Coordinated Universal Time (UTC), January 1, 1970.
|
Imagine the execution of Example 7-1 on a timeline, as shown in Figure 7-2. In the drawing, the value of the gettimeofday clock is t = 1492 when a function called do_something begins. The value of the gettimeofday clock is t 1 = 1498 when do_something ends. Thus the measured duration of do_something is t = t 1 - t = 6 clock ticks .
|
The Oracle kernel reports on two types of elapsed duration: the e statistic denotes the elapsed duration of a single database call, and the ela statistic denotes the elapsed duration of an instrumented sequence of instructions (often a system call) executed by an Oracle kernel process. The kernel performs these computations by executing code that is structured roughly like the pseudocode shown in Example 7-2. Notice that the kernel uses the method shown in Example 7-1 as the basic building block for constructing the e and ela metrics.
procedure dbcall { e0 = gettimeofday; # mark the wall time ... # execute the db call (may call wevent) e1 = gettimeofday; # mark the wall time e = e1 - e0; # elapsed duration of dbcall print(TRC, ...); # emit PARSE, EXEC, FETCH, etc. line } procedure wevent { ela0 = gettimeofday; # mark the wall time ... # execute the wait event here ela1 = gettimeofday; # mark the wall time ela = ela1 - ela0; # ela is the duration of the wait event print(TRC, "WAIT..."); # emit WAIT line }
The Oracle kernel reports not only the elapsed duration e for each database call and ela for each system call, but also the amount of total CPU capacity c consumed by each database call. In the context of the process state transition diagram shown in Figure 7-1, the c statistic is defined as the approximate amount of time that a process has spent in the following states:
On POSIX-compliant operating systems, the Oracle kernel obtains CPU usage information from a function called getrusage on Linux and many other operating systems, or a similar function called times on HP-UX and a few other systems. Although the specifications of these two system calls vary significantly, I will use the name getrusage to refer generically to either function. Each function provides its caller with a variety of statistics about a process, including data structures representing the following four characteristics:
Approximate time spent by the process in user running state
Approximate time spent by the process in kernel running state
Approximate time spent by the process's children in user running state
Approximate time spent by the process's children in kernel running state
Each of these amounts is expressed in microseconds, regardless of whether the data are accurate to that degree of precision.
|
The Oracle kernel performs c , e , and ela computations by executing code that is structured roughly like the pseudocode shown in Example 7-3. Notice that this example builds upon Example 7-2 by including executions of the getrusage system call and the subsequent manipulation of the results. In a method analogous to the gettimeofday calculations, the Oracle kernel marks the amount of user-mode CPU time consumed by the process at the beginning of the database call, and then again at the end. The difference between the two marks ( c0 and c1 ) is the approximate amount of user-mode CPU capacity that was consumed by the database call. Shortly I'll fill you in on exactly how approximate the amount is.
procedure dbcall { e0 = gettimeofday; # mark the wall time c0 = getrusage; # obtain resource usage statistics ... # execute the db call (may call wevent) c1 = getrusage; # obtain resource usage statistics e1 = gettimeofday; # mark the wall time e = e1 - e0; # elapsed duration of dbcall c = (c1.utime + c1.stime) - (c0.utime + c0.stime); # total CPU time consumed by dbcall print(TRC, ...); # emit PARSE, EXEC, FETCH, etc. line } procedure wevent { ela0 = gettimeofday; # mark the wall time ... # execute the wait event here ela1 = gettimeofday; # mark the wall time ela = ela1 - ela0; # ela is the duration of the wait event print(TRC, "WAIT..."); # emit WAIT line }
Top |