Active Session History


Active Session History (ASH) is absolutely new in Oracle Database 10g. Very simply, it is both the recording and retentionshort and long termof crucial session-state information. The reasoning behind the provisioning of this very interesting data is to allow the performance analyst to perform on-the-spot analysis of a current session as well as to be able to look at the overall breakdown of session-related information in the short term.

Session-State Information

You saw an example of a session-related analysis in Chapter 10, which presented the details of current performance issue and drilled down into the details of a Top Session. This section illustrates how ASH helps collect this information. The Database Control in Oracle Database 10g is, of course, able to roll this up and present it graphically and easily.

The session-state information for active sessions is collected by a very low-overhead direct memory sampling of the session-state information that is held in the SGA for all sessions. The display of session-state information is not exactly new, however. Way back, Oracle 7 introduced the V$SESION and V$SESSION_WAIT views, which could be combined to provide many details of the state of a particular session from a snapshot of it. The information, of course, was current at the time these views were queried, and this information was not stored anywhere for later analysis. To analyze this, let's use an example. The query in Listing 11.9 shows how you can obtain these session details. We have listed the various column and other SQL*Plus settings in the query because they control the heading, formatting, and placement, and add value to the query.

Listing 11.9. Query Against Views Showing Session-State Information
 SQL> set linesize 80 verify off SQL> column sid_serial heading "Sid,Ser#" format a10 SQL> column spid format a5 heading "OSPid" SQL> column process format a6 heading "ClntPr" SQL> column username format a17 heading "DB/OSUser" SQL> column status heading "Status" format a8 SQL> column program heading "Program" format a31  trunc SQL> column machine_prog heading "Machine/Program/Module" format a32  wrap SQL> column last_call_et format 9999.99 heading "LastCallMn" SQL> column logon_time format a18 heading "Logon Time" SQL> column waiting_event format a47 heading "Waiting on event + p1/p2/p3" trunc SQL> select s.sid || ',' || s.serial# sid_serial, p.spid, s.process,   2  s.username || '/' || s.osuser username, s.status,   3  to_char(s.logon_time, 'DD-MON-YY HH24:MI:SS') logon_time,   4  s.last_call_et/60 last_call_et,   5  w.event || ' / ' || w.p1 || ' / ' || w.p2 || ' / ' || w.p3 waiting_event,   6  s.machine || '/' ||  s.program || '/' || s.module machine_prog   7  from v$process p, v$session s, v$session_wait w   8  where s.paddr=p.addr  and s.sid in (&List_of_Oracle_SIDs)   9  and w.sid = s.sid  10  order by s.logon_time; Enter value for list_of_oracle_sids: 180,842 Sid,Ser#   OSPid ClntPr DB/OSUser         Status   Logon Time         LastCallMn ---------- ----- ------ ----------------- -------- ------------------ ---------- Waiting on event + p1/p2/p3                     Machine/Program/Module ----------------------------------------------- -------------------------------- 180,8310   3156  2368   SYSTEM/oracle     ACTIVE   25-JAN-05 06:00:16        .38 db file scattered read / 12 / 75857 / 8         tst10gclnt/dbsnmp@tst10gclnt (TN                                                 S V1-V3)/VPXDBA 842,47516  17847 17846  SYS/oracle        INACTIVE 25-JAN-05 18:55:06      10.02 SQL*Net message from client / 1650815232 / 1 /  tst10gclnt/sqlplus@tst10gclnt (T                                                 NS V1-V3)/ 

Using Session-State Information

The SQL in Listing 11.9 cleverly determines quite a lot of information from just the V$SESSION and V$SESSION_WAIT views; the V$PROCESS process is joined in order to provide operating-system process information (column OSPid) alone. Let's look in detail at what these columns are so that you can understand how ASH builds on this.

The Sid, Ser# (serial number), OSPid, ClntPr (client process ID, if available), Logon Time, and DB/OSUser (database and OS user) columns are pretty self-explanatory, and are available from the V$SESSION and V$PROCESS views. As well, the Machine/Program/Module column is taken directly from the corresponding V$SESSION columns. The MACHINE column is automatically populated and is a good one to look at when determining the source of the connection. The PROGRAM and MODULE columns are populated only when the DBMS_APPLICATION_INFO package is used within the program to record this information. For more details on usage of the DBMS_APPLICATION_INFO, refer to MetaLink Note #67549.1.

The V$SESSION.STATUS column shows whether the process was in the ACTIVE state within a database call or in the INACTIVE state after having completed it. A database call is an action being performed due to an SQL executed either directly by the session or indirectly due to a recursive SQL such as a trigger. Obviously, you would want to track all currently ACTIVE sessions.

In the case of an ACTIVE session, LastCallMn (V$SESSION.LAST_CALL_ET) shows how many minutes the session has clocked since the last new call the session issued directly or indirectly and is a good measure of long-running SQL queries. In the case of an INACTIVE session, this is a good indication of how long it has been since the client session interacted with the back-end process to make a new call.

Of course, the most interesting part is the detailed event information that is available in V$SESSION_WAIT. These columns are described in detail here so that you can understand them both in the context of older versions as well as within the context of Oracle Database 10g. Note that the meanings of these columns have not changed between versions. The columns of interest within V$SESSION_WAIT are listed in Table 11.2.

Table 11.2. Columns of Interest in V$SESSION_WAIT

Column Name

Meaning

EVENT

This VARCHAR2(64) column displays the current event on which the session is waiting or has last recorded. The contents are also known as the Wait events and form the basis for the Oracle Wait Interface tuning methodology.

P1, P2, and P3

These columns are basically parameters that can further describe the event. Their meaning changes with the event. These are discussed in great detail in the Oracle Database 10g Reference Manual, Appendix C. Also see the text columns P1TEXT, P2TEXT, and P3TEXT later in this table.

P1RAW, P2RAW, and P3RRAW

These columns expose the values of P1, P2, and P3 in a hexadecimal form. Generally not used.

P1TEXT, P2TEXT, and P2TEXT

These contain text that briefly describes the meaning of the P1, P2, and P3 columns.

WAIT_TIME

A non-zero value represents the session's last wait time in the unit shown here:

0: The session is currently waiting.

2: The duration of last wait is unknown.

1: The last wait was < 1/100th of a second in duration.

>0: The duration of the last wait in 1/100th second units (10 milliseconds).

<2: The internal time has probably wrapped.

SECONDS_IN_WAIT

When WAIT_TIME = 0, then this shows the number of seconds spent in the current wait, generally updated every three seconds on most platforms and versions. This value will not be reset on events with timeouts, such as row lock enqueues.

When WAIT_TIME != 0, this column shows the number of seconds since the start of the last wait.

STATE

This column interprets the value of WAIT_TIME and further describes the state that the session is in and takes any one of the following values:

WAITING: The session is currently waiting.

WAITED UNKNOWN TIME: The duration of the last wait is unknown (this normally appears when the TIMED_STATISTICS is defaulted to FALSE).

WAITED SHORT TIME: The last wait was < 1/100th of a second.

WAITED KNOWN TIME: WAIT_TIME equals the duration of last wait.

SEQ#

The sequence number that uniquely identifies one episode of the Wait event. This is incremented every time one Wait event completes and another is recorded. You can track the progress of a session from this column. More importantly, when this value does not increment even though the session consumes CPU, it is a sure indicator that the session is hanging. On the other hand, if the session is always ON CPU, but this value increments, it means that the session is performing some work and is probably spinning.

WAIT_CLASS

The classification text of the class of Wait event. For example, the wait could be classified as Idle, Configuration, Concurrency, User I/O, and so on. This helps to immediately identify the type of wait being suffered. This column is available in Oracle Database 10g only.


Hence, using this table, you can conclude that the session with SID 180 performed a db file scattered read event, which is another name for a multiblock read. For this event, the parameter P1 represents the file ID (12), P2 represents the starting block read (block #75857), and P3 represents the number of blocks read. In this case, you know that the DB_FILE_MULTIBLOCK_READ_COUNT parameter, which determines how many blocks should be read in a multiblock read, was set to 8, and hence eight blocks were read into the database buffer cache.

Similarly, the session with SID 842 is now inactive, waiting on the SQL*Net message from client, which normally represents a session waiting for input from a user.

The Connection to Active Session History

If a simple, static snapshot of the V$SESSION_WAIT and V$SESSION was able to provide so much information, would it not be beneficial to provide a set of snapshots of the session-state information so that a performance analyst could see what occurred within a session in terms of events? The obvious answer is a resounding yes, and this is exactly what Active Session History does.

ASH samples session-state information for active sessions once every second and exposes it via a number of views, the details of which you will see later in this section. ASH records these details in a circular memory buffer for quick access. CPU overheads associated with ASH have been claimed to be as low as 0.1% of 1 CPU. This sampling and recording is performed by the AWR-related processes, namely MMNL and MMON. This activity occurs without any latching and serialization, which results in the relatively low overhead for such collections at the expense of some accuracy.

As well, every 10th sample is also persisted to disk, effectively recording a snapshot of the session state for every 10th second in which the session was active. The persisted session-state history data is both timestamped and associated with the SNAP_ID of the corresponding AWR snapshot for that period. In fact, as noted before, ASH data is also collected via the same processes and stored within the same repository as AWR snapshots.

Thus, using ASH data, you can perform both on-the-spot analysis as well as look back into history and perform retroactive analysis because the ASH data is tied to a snapshot and a period. The advantage of immediate analysis is that you can obtain a larger number of samples directly from the memory buffer, while the data is coarser when viewed later using the persisted copy on disk.

A Little History

Sampling the SGA for performance statistics is not a new idea, although this is the first time such sampling is provided via the Oracle Database 10g kernel itself. The idea of sampling the SGA directly (Direct Memory Access, or DMA) probably evolved from a program named m2, which was the first program written by an Oracle Support analyst to determine V$SESSION_WAIT details in the early days of Oracle Version 7. Although it was never released or made available to Oracle users, m2 was probably the spark that ignited the concept of ASH.


ASH data is used in AWR snapshot and other collections as well. For example, we mentioned previously that AWR tracks and collects high-load SQL that occurs within the snapshot period rather than trolling through V$SQL for previously recorded SQL. It does this based on ASH data for SQL from sessions that either caused a majority of the waits during the period or consumed the most CPU. Other information collected directly from ASH includes "hot" files and segments as well as database metrics.

ASH data can be viewed both directly using the views as well as indirectly via OEM Database Control. We look at some of these views in greater detail next.

V$ACTIVE_SESSION_HISTORY

This view is the base ASH view that exposes the contents of the ASH buffers and in turn is based on the internal views, such as X$KEWASH and X$ASH. This view holds all the relevant details that both V$SESSION and V$SESSION_WAIT hold, along with some additional columns. Note that the column names for some of the common columns have changed. For example, the SID and SERIAL# columns are now known as SESSION_ID and SESSION_SERIAL# respectively. Table 11.3 describes only the new columns.

Table 11.3. Columns of Interest in V$ACTIVE_SESSION_HISTORY

Column Name

Meaning

SAMPLE_ID

The sample ID.

SAMPLE_TIME

The TIMESTAMP(3) column that describes the time at which that particular sample was captured.

SQL_ID

SQL identifier of the SQL statement that the session was executing at the time of sampling. This is a VARCHAR2(13) variable that has a more unusual footprint than the older SQL_HASH_VALUE from V$SESSION. This SQL_ID is used to search in V$SQL at the time of AWR snapshot collection for high-load SQL.

SQL_CHILD_NUMBER

Child number of the SQL statement that the session was executing at the time of sampling.

SQL_PLAN_HASH_VALUE

Hash value to get to the SQL plan that is stored in V$SQL_PLAN, matches with PLAN_HASH_VALUE. The information here may not always be available because it may be freed for other plans when the shared pool in the SGA is under memory pressure.

SQL_OPCODE

Indicates what phase of operation the SQL statement is in; maps to V$SESSION.COMMAND.

SERVICE_HASH

A hash that identifies the service; maps to V$ACTIVE_SERVICES.NAME_HASH.

SESSION_TYPE

Indicates the session typethat is, FOREGROUND or BACKGROUND.

SESSION_STATE

Indicates the session statethat is, whether the session is currently WAITING or ON CPU. This depends on the WAIT_TIME column. If it is 0, then the state is WAITING; if not, it is ON CPU. This column is actually a better representation of WAIT_TIME. If you see a nonblocking event such as db file sequential read, and the state is ON CPU, then it means that the process actually completed the last I/O and is currently not waiting. The event that is displayed is actually the last recorded Wait event.

QC_SESSION_ID and QC_SESSION_INSTANCE

Populated only when the session is a parallel query slave and indicates the query coordinator session and instance.

EVENT

The name of the event. If SESSION_STATE = WAITING, then it is the event for which the session was waiting at the time of sampling. If SESSION_STATE = ON CPU, then it is the event for which the session last waited before being sampled. EVENT_ID and EVENT# are also shown in this view and correspond to their equivalents in V$SESSION and V$SESSION_WAIT.

WAIT_TIME

The SESSION_STATE is derived from this column. It is 0 if the session was waiting at the time of sampling, and is the total wait time for the event for which the session last waited if the session was on the CPU when sampled. Whether or not WAIT_TIME = 0 is what is used to determine the SESSION_STATE at the time of sampling, rather than the actual value of WAIT_TIME itself. Maps to V$SESSION.WAIT_TIME.

TIME_WAITED

If SESSION_STATE = WAITING, then this is the amount of time that the session actually spent waiting for the event. This column is set for waits that were in progress at the time the sample was taken. If a Wait event lasted for more than a second and was caught waiting in more than one session sample row, then the actual time spent waiting for that Wait event will be populated in the last of those session sample rows. At any given time, this information will not be available for the latest session sample. TIME_WAITED is shown in number of microseconds (1,000th of a second). Note that this information is not recorded in the documentation.

CURRENT_OBJ#, CURRENT_FILE#, and CURRENT_BLOCK#

Along with the CURRENT_FILE# and CURRENT_BLOCK#, the CURRENT_OBJ# column is populated only if the session is waiting for some I/O-related events or for some enqueue waits. They map to the V$SESSION.ROW_WAIT_OBJ#, ROW_WAIT_FILE#, and ROW_WAIT_BLOCK# columns.

P1, P2, and P3

These have the same meanings as in the V$SESSION_WAIT view. Note, however, that the P1TEXT, P2TEXT, and P3TEXT columns as well as the P1RAW, P2RAW, and P3RAW columns are not displayed. These columns are rarely used, if ever; hence, omitting them saves memory in the ASH buffers.

PROGRAM, MODULE, ACTION, and CLIENT_ID

Along with MODULE, ACTION and CLIENT_ID describe the current environment of the connecting session. These are set using the DBMS_APPLICATION_INFO package.


Inconsistencies in Column Names

There is one more example of inconsistency in column naming in the V$ACTIVE_SESSION_HISTORY view. The view uses the name SESSION_ID and SESSION_SERIAL# for the generally well known SID and SERIAL# equivalent columns from V$SESION and V$SESSION_WAIT. This particular inconsistency in naming seems prevalent in many other views related to session information.


V$SESSION_WAIT_HISTORY

This view shows the last 10 waits for any session from V$ACTIVE_SESSION_HISTORY. This view can be accessed only via the SID and does not contain the SERIAL#, which uniquely identifies a session. Thus, when multiple sessions connect and disconnect within a short period of time, you might be getting incorrect datathat is, data from a previous session that completed but whose SID is being reused by a later, newer connection. We suggest ignoring this view and looking up such data from V$ACTIVE_SESSION_HISTORY. Unfortunately, it seems that OEM uses this view, and incorrectly.

WRH$_ACTIVE_SESSION_HISTORY

This is the persisted version of the ASH data. As indicated before, the session-state information in V$ACTIVE_SESSION_HISTORY from every 10th sample is written out to disk into this table. Each of these samples is associated with the SNAP_ID of the current AWR snapshot. In fact, output from the query listed in Listing 11.10 will confirm that this is indeed the case.

Listing 11.10. Query to Confirm AWR Snapshot Matches ASH Persisted Data

[View full width]

 select s.snap_id, min(begin_interval_time), max(end_interval_time), min(sample_time), max (sample_time) from sys.wrm$_snapshot s, sys.wrh$_active_session_history h where s.snap_id = h.snap_id group by s.snap_id; 

Size of Active Session History Buffers

As mentioned before, ASH maintains samples of the session-state information in its own buffers within the System Global Area (SGA). The amount of information that ASH maintains could quickly become substantial if not managed properly. Hence, ASH maintains a fixed size circular buffer that is allocated in the SGA at system startup using the formula:

ASH_Buffer_Size = Max (Min(No.of CPUs * 2 Mb, 5% of Shared Pool size, 30 MB), 1 Mb)

In other words, it is at least 1MB and goes up to a maximum of 30MB. In between these extremes, the size is determined by the number of CPUs (No. of CPUs * 2) or by the size of the shared pool (5% of the shared Pool size). Because ASH collects session-state information for active sessions only, it is obvious that the buffer will become full more quickly when there are more active sessions in the database. By default, this persistence, or writing out to disk, is set to occur when the buffers are 66% full. When this occurs, every 10th ASH record for each session is associated with the current snapshot and written out to disk as the rows in the WRH$_ACTIVE_SESSION_HISTORY table. This occurs in the same manner of persisting AWR data to disk. The ASH buffer space is then released for capturing newer session-state information.

Hidden Parameters for ASH

It seems that there are a few hidden parametersthose that start with _ (underscore) and should NOT be changed except under Oracle Support's directionrelated to ASH. This includes _ash_enable, which by default is set to trUE and seems to allow ASH to be switched off. Other parameters include _ash_disk_filter_ratio, the ratio of in-memory sample to flush (every 10th by default); _ash_eflush_trigger, the percentage at which an emergency disk flush should occur (66% by default); and _ash_sampling_interval, the time interval between ASH samples (1,000 milliseconds or 1 second by default). Changing these values may adversely affect the behavior of ASH. As the TV commercial says, "Don't try this at home" (or on your production instance)!


An Example of Spot Analysis Using ASH

To showcase how you can use ASH from SQL, let's use the same example of high-load SQL that we used in Chapter 10. The SQL query runs against the HR schema and joins EMPLOYEES and its related tables. In this example, the query was run simultaneously from two SQL*Plus sessions. The result was that both sessions contended on the cache buffer chain latch because they were trying to access the same blocks in memory while performing occasional physical reads. Listing 11.11 shows the query used to display the ASH data. The session and serial numbers of the sessions as they occurred in the tests are shown. Listing 11.12 shows the output of the query, partially snipped showing a time of interest in which this contention could be seen. Lines in this listing are numbered and these line numbers are used in the explanation that follows.

Listing 11.11. Query to Show Many Details from V$ACTIVE_SESSION_HISTORY for Two Known Contending Sessions
 column time format a8 column sid format a9 column event format a25 trunc column wt_time format 999999 column tm_waitd format 99999 column seq# format 9999 select to_char(sample_time,'HH24:MI:SS') time, session_id ||','|| session_serial# sid, session_state s_state,event,seq#, wait_time wt_time,time_waited tm_waitd from v$active_session_history where session_id in (136,158) and session_serial# in (67,11953) order by session_id,sample_time; 

Listing 11.12. Output of Query in Listing 11.11
  1  TIME     SID       S_STATE EVENT                      SEQ# WT_TIME TM_WAITD  2  -------- --------- ------- ------------------------- ----- ------- --------  3  <snip - First session>  4  23:53:09 136,67    ON CPU  SQL*Net message from clie    34    1134        0  5  23:53:10 136,67    ON CPU  SQL*Net message from clie    34    1134        0  6  23:53:11 136,67    ON CPU  latch: cache buffers chai    35    1949        0  7  23:53:12 136,67    ON CPU  latch: cache buffers chai    36       4        0  8  23:53:13 136,67    ON CPU  latch: cache buffers chai    36       4        0  9  23:53:14 136,67    ON CPU  latch: cache buffers chai    38   19462        0 10  23:53:15 136,67    WAITING latch: cache buffers chai    40       0    19506 11  23:53:16 136,67    ON CPU  latch: cache buffers chai    40   19506        0 12  <snip> 13  23:53:25 136,67    ON CPU  latch: cache buffers chai    49   19502        0 14  23:53:26 136,67    ON CPU  latch: cache buffers chai    49   19502        0 15  23:53:27 136,67    ON CPU  db file sequential read      51      84        0 16  23:53:28 136,67    ON CPU  latch: cache buffers chai    52       5        0 17  23:53:29 136,67    ON CPU  latch: cache buffers chai    54   19445        0 18  23:53:30 136,67    ON CPU  latch: cache buffers chai    54   19445        0 19  23:53:31 136,67    ON CPU  latch: cache buffers chai    56   19468        0 20  23:53:32 136,67    ON CPU  latch: cache buffers chai    58       2        0 21  23:53:33 136,67    ON CPU  latch: cache buffers chai    58       2        0 22  <snip - Second session> 23  23:53:08 158,11953 ON CPU  db file sequential read     206      26        0 24  23:53:09 158,11953 ON CPU  db file sequential read     206      26        0 25  23:53:10 158,11953 ON CPU  latch: cache buffers chai   216   19485        0 26  <snip> 27  23:53:25 158,11953 WAITING latch: cache buffers chai   253       0    13112 28  23:53:26 158,11953 ON CPU  latch: cache buffers chai   257   19642        0 29  23:53:27 158,11953 ON CPU  latch: cache buffers chai   262   19493        0 30  23:53:28 158,11953 ON CPU  latch: cache buffers chai   264       4        0 31  23:53:29 158,11953 ON CPU  latch: cache buffers chai   266   19583        0 32  23:53:30 158,11953 ON CPU  latch: cache buffers chai   267       4        0 

Let us now analyze the output shown in Listing 11.12. You can arrive at a number of conclusions from this data.

The Wait event in lines 4 and 5 show that the session was waiting on SQL*Net message from client. Although this event is classified as an idle event, you know that the session was active because ASH data would not have otherwise been collected! This is clarified by the STATE column, which shows ON CPU for these two lines. You can thus understand that between 23:59:09 and 23:59:10, the session was actually consuming CPU waiting on the client (the SQL*Plus session in this case), which in turn was processingprobably formatting the data sent across, which consumes some CPU.

In lines 710, you see that the session was waiting on the latch: cache buffer chain event. Lines 7 and 8 look the same, so you can conclude that the session was using CPU but never posted a Wait event during this period. In fact, from the nature and operation of a latch, you can conclude that the process was spinning on the latch. This is a basic operating system call where you can perform a set-and-test operation, where the process idly spins for a short period and tests for the latch before yielding the CPU. Although this spinning reduces operating-system process context switching (which is expensive in itself), it does consume some CPU. Monitoring of this process from the operating system supported the conclusion that a significant amount of user CPU was consumed by this process.

You see a state change between lines 8 and 9; the event is the same, but the SEQ# column has moved from 36 to 38. What happened to 37? Actually, ASH missed recording the event; the sampling period wasn't small enough to catch all the events. This looks like a weakness in ASH, and we will deal with this later. Line 10 implies some waiting on the same latch, probably for a latch sleep. The TIME_WAITED is now posted and is taken into account in other counters, such as V$SESSION_EVENT and V$SYSTEM_EVENT, as well as in V$LATCH and V$LATCH_CHILDREN. We hope you are now able to see how this information is rolled up from ASH into higher levels.

Note that in the definition of TIME_WAITED, we stated that "If a Wait event lasted for more than a second and was caught waiting in more than one session sample row, then the actual time spent waiting for that Wait event will be populated in the last of those session sample rows." You can see this occur in line 10.

Let us now look at lines 15, 23, 24, and 29. Except line 15, all are from the second session. And although you do see file I/O in the form of db file sequential read events from both sessions, you also see the contention on the cache buffer chain latch. Further analysis of other data, such as P1, P2, and P3 (not shown due to size and formatting limitations) helped you understand that these two sessions may be contending for the same blocks, and the solution may be to schedule them to run one after another.

You can also summarize information from V$ACTIVE_SESSION_HISTORY to obtain summary counts that will help you quickly get to the main bottleneck at hand, and support the conclusion arrived at here. This is made clear in the query and output shown in Listing 11.13.

Listing 11.13. Query to Confirm Conclusions Using Summaries on V$ACTIVE_SESSION_HISTORY
 SQL> column event format a35 trunc SQL> select session_id sid, event, session_state state,   2  count(*) times, sum(time_waited) tm_waited   3  from v$active_session_history   4  where program like 'sqlplus%'   5  group by session_id, event, session_state;        SID EVENT                               STATE        TIMES  TM_WAITED ---------- ----------------------------------- ------- ---------- ----------        136 db file sequential read             ON CPU          32          0        136 SQL*Net message from client         ON CPU           9          0        136 latch: cache buffers chains         ON CPU         426          0        136 latch: cache buffers chains         WAITING         15     430403        158 db file sequential read             ON CPU           2          0        158 SQL*Net message from client         ON CPU          17          0        158 latch: cache buffers chains         ON CPU         466          0        158 latch: cache buffers chains         WAITING         16     718216 

It is clear from this listing that the most waiting was on the latch: cache buffer chains event. In fact, you can conclude that both sessions spent almost 100% of their time on this latch, and were in fact contending for the same buffers in memory. This becomes apparent when the session is run alone. The output from the same query when run alone is shown in Listing 11.14.

Listing 11.14. Query to Show Wait Events When Run Alone
 SQL> column event format a35 trunc SQL> select session_id sid, event, session_state state,   2  count(*) times, sum(time_waited) tm_waited   3  from v$active_session_history   4  where session_id = 151 and session_serial# = 365   5  group by session_id, event, session_state; Sid EVENT                               STATE          TIMES    TM_WAITED ---- ----------------------------------- ------- ------------ ------------  151 db file sequential read             ON CPU           105            0  151 SQL*Net message from client         ON CPU           107            0 

This query shows that minimal if no waiting was recorded in the wait history for that session, and most work was only on the CPUprobably due to formatting and displaying the data when run alone by itself.

You should be able to cross-check these findings using the Time and Wait model views at the session level, namely the V$SESSION_WAIT_CLASS view using the SESSION and SERIAL# values. Note, however, that the data for that session in this view exists only as long as the said session is connected and alive. In contrast, the ASH data is available even after the session has disconnected.

Using OEM to Obtain Active Session History

You can use OEM Database Control to obtain session statistics from ASH. There are available from the Top Sessions link in the Performance screen or from any session-related drill-down. Figure 11.5 shows the Session Details screen for one session numbered 150.

Figure 11.5. The Session Details screen.


Note that the status, current SQL, and so on are obtained and displayed from the V$SESSION view. The menu includes the following:

  • Statistics. These are the session statistics available from the V$SESSTAT view. The names of the various statistics are derived from V$STATNAME.

  • Wait Events. This is the session wait history.

  • Open Cursors. This is a list of currently open cursors obtained from the V$OPEN_CURSORS view. It can be instrumental in figuring out the various activities completed by this session, when such cursor information is available.

  • Locks. This is a list of locks currently being faced by the session, and derived from the blocking session details available in V$SESSION.

Upon clicking the Wait Events link in the menu, the last 10 waits suffered by the session are displayed, as shown in Figure 11.6. Note, however, that these waits are shown regardless of whether the session was active. As well, this information is not from V$ACTIVE_SESSION_HISTORY, but from V$SESSION_WAIT_HISTORY; hence, you might be looking at incorrect data as explained when we discussed the latter view.

Figure 11.6. The Wait Events screen.


Dumping ASH Buffer Contents

As per the previously mentioned MetaLink Note #243132.1, it is possible to dump the contents of the ASH buffers. The command does not work as stated, however. The difference is minor and has to do with the statement in the MetaLink Note that the number specified in the command is the number of minutes, when it seems to be the number of seconds instead. To dump the contents of the ASH buffers, obtain a login as SYS. Then run the commands as shown in Listing 11.15.

Listing 11.15. Dumping Contents of the ASH Buffers to Trace
 SQL> oradebug setmypid Statement processed. SQL> oradebug dump ashdump 10000 Statement processed. 

The trace file can be found in the UDUMP directory. The file is named according to the convention of your platform, as with any other user trace file. Although the MetaLink Note suggests that the number specified is the number of minutes of historical data, it seems that only a large number produces all the output. ORADEBUG is a special command available in the SYS login to produce a variety of dumps and other traces. The file that is produced, after some header information is removed, is a comma-separated value (CSV) file that can be loaded into a Microsoft Excel spreadsheet or even another database table using SQL*Loader for analysis. The partial contents are shown in Listing 11.16. You will notice that there is a header line that specifies the full column list of V$ACTIVE_SESSION_HISTORY.

Listing 11.16. Partial Contents of the ASH Dump
 *** 2005-01-30 23:22:37.981 *** SERVICE NAME:(SYS$USERS) 2005-01-30 23:22:37.979 *** SESSION ID:(140.360) 2005-01-30 23:22:37.979 <<<ACTIVE SESSION HISTORY - PROCESS TRACE DUMP HEADER BEGIN>>> DBID, INSTANCE_NUMBER, SAMPLE_ID, SAMPLE_TIME, SESSION_ID, SESSION_SERIAL#, USER _ID, SQL_ID, SQL_CHILD_NUMBER, SQL_PLAN_HASH_VALUE, SERVICE_HASH, SESSION_TYPE, SQL_OPCODE, QC_SESSION_ID, QC_INSTANCE_ID, CURRENT_OBJ#, CURRENT_FILE#, CURRENT_ BLOCK#, EVENT_ID, SEQ#, P1, P2, P3, WAIT_TIME, TIME_WAITED, PROGRAM, MODULE, ACT ION, CLIENT_ID <<<ACTIVE SESSION HISTORY - PROCESS TRACE DUMP HEADER END>>> <<<ACTIVE SESSION HISTORY - PROCESS TRACE DUMP BEGIN>>> 845721274,1,3064021,"01-30-2005 23:22:35.178099000",166,1,0,"",0,0,165959219,2,0 ,0,0,4294967295,0,0,866018717,536,300,0,0,3004674,0,"oracle@ussccpvcs01t (CKPT)" ,"","","" <snipped> 845721274,1,3063605,"01-30-2005 23:15:26.638099000",170,1,0,"",0,0,165959219,2,0 ,0,0,4294967295,0,0,3539483025,7,300,0,0,2936573,0,"oracle@ussccpvcs01t (PMON)", "","","" 845721274,1,3063597,"01-30-2005 23:15:18.398099000",152,1,22,"c1599tjmzu1fc",0,0 ,3427055676,1,0,0,0,8548,3,34825,1148964757,1500,8552,2429532560,5,5008574,0,"em agent@ussccpvcs01t (TNS V1-V3)","emagent@ussccpvcs01t (TNS V1-V3)","","" <snipped> 845721274,1,3063515,"01-30-2005 23:13:53.928099000",145,9,54,"2b064ybzkwf1y",0,0 ,1594506128,1,47,0,0,47741,3,19916,218649935,295,2432172472,2432129528,0,1005847 ,0,"OMS","OEM.SystemPool","","" 845721274,1,3063453,"01-30-2005 23:12:50.068099000",150,19,58,"b5v2yu6f47cw2",0, 4106914013,3427055676,1,3,0,0,49788,5,10659,2652584166,252,5,10659,1,63,0,"sqlpl us@ussccpvcs01t (TNS V1-V3)","SQL*Plus","","" 845721274,1,3063452,"01-30-2005 23:12:49.038099000",150,19,58,"b5v2yu6f47cw2",0, 4106914013,3427055676,1,3,0,0,49788,5,10659,2652584166,252,5,10659,1,63,0,"sqlpl us@ussccpvcs01t (TNS V1-V3)","SQL*Plus","","" 

The Downside of ASH

Although the overhead for collection of ASH data is minimal, access to the ASH data in human-readable format, either via the OEM interface or the more traditional SQL access, is not without the normal overhead of SQL processing, including the need for a connection, some latching, as well as associated CPU usage. On a badly overloaded system, you may not be able to obtain a new SQL*Plus or OEM session in order to get to ASH data.

As well, when the system is heavily contending on the shared pool and library cache latches, your query may take a long time to parse and execute. Indeed, frequent queries to ASH data may impose additional waits for these latches. This is not the case with an external program that reads SGA directly. ASH, however, needs to run within the context of Oracle and hence will have to follow all the rules of the Oracle kernel as far as a normal process is concerned.

A workaround is to maintain a perpetual connection to the database using an account that has privileges to query ASH data. If you used precompiled code in a PL/SQL package that is pinned in the SGA and can produce canned ASH queries, then you can even get around the parsing issues.

It is very possible, indeed probable, to lose some wait information because the sampling rate is only once a second by default. You saw an example of this issue in the previous example. So how can this be countered? The answer lies in using Extended SQL Trace, better known as 10046 trace. Using this method, all the Wait events as well as the SQL executed by the session are dumped directly to the trace file, which can later be viewed and processed. These trace files are cumbersome to track and process, however. As well, you can only perform such a trace as the problem is occurring, not in a retrospective manner as with ASH.



    Oracle Database 10g Insider Solutions
    SUSE LINUX Enterprise Server 9 Administrators Handbook
    ISBN: 672327910
    EAN: 2147483647
    Year: 2006
    Pages: 214

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net