Making Sense of ADDM Data


Making Sense of ADDM Data

The best way to make sense of ADDM data is to look at the details of the ADDM report that we generated as shown in Listing 12.4. We will use portions of the report and comment on how it can be interpreted and used. This example showed many types of findings, typical in such tuning scenarios.

Finding 1: SQL Statements Consuming Significant Time

As stated previously, application-generated SQL statements are the cause of most of the issues. After all, what use would a database (and hence a database administrator!) be without users and the SQLs that they execute? Listing 12.7 shows one such section. It is, in fact, the first of the various findings.

Listing 12.7. Details of Finding 1 from the ADDM Report
 FINDING 1: 100% impact (3872 seconds) ------------------------------------- SQL statements consuming significant database time were found.    RECOMMENDATION 1: SQL Tuning, 56% benefit (1695 seconds)       ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID          "gr2244sy7phzs".          RELEVANT OBJECT: SQL statement with SQL_ID gr2244sy7phzs and          PLAN_HASH 2585813084          UPDATE HR.EMPLOYEES SET SALARY = SALARY + 10 WHERE EMPLOYEE_ID = :B1    RECOMMENDATION 2: SQL Tuning, 55% benefit (1677 seconds)       ACTION: Tune the PL/SQL block with SQL_ID "d10v1gbfqdxkz". Refer to the          "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide          and Reference"          RELEVANT OBJECT: SQL statement with SQL_ID d10v1gbfqdxkz          declare          time_to_sleep          number := 5;          start_employee_id         number := 100;          end_employee_id         number := 200;          emp_rec         hr.employees%ROWTYPE;          emp_id         number;          retval         number;          begin          emp_id := start_employee_id;          LOOP          EXIT WHEN emp_id > end_employee_id;          update hr.employees set salary = salary + 10          where employee_id = emp_id;          --          We do not invoke the sleep so this session keeps waiting on the other          ---          retval := sys.xxdba_pack.sess_sleep(time_to_sleep);          emp_id := emp_id + 1;          commit;          END LOOP;          dbms_output.put_line('Loop ended');          end;    RECOMMENDATION 3: SQL Tuning, 9.2% benefit (280 seconds)       ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID          "8ch2kqpk8snqh".          RELEVANT OBJECT: SQL statement with SQL_ID 8ch2kqpk8snqh and          PLAN_HASH 4106914013          select e.*          from hr.employees e, hr.departments d, hr.locations l,          hr.countries c, hr.regions r          where e.department_id = d.department_id          and d.location_id = l.location_id          and l.country_id = c.country_id          and c.region_id = r.region_id          and employee_id between substr(to_char(abs(dbms_random.random)),1,4)          and substr(to_char(abs(dbms_random.random)),1,4)    RECOMMENDATION 4: SQL Tuning, 4.2% benefit (129 seconds)       ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID          "5apdjnppbgpg3".          RELEVANT OBJECT: SQL statement with SQL_ID 5apdjnppbgpg3 and          PLAN_HASH 272231759          SELECT NULL FROM DUAL FOR UPDATE NOWAIT    RECOMMENDATION 5: SQL Tuning, 3% benefit (92 seconds)       ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID          "ft4twdjk99kxb".          RELEVANT OBJECT: SQL statement with SQL_ID ft4twdjk99kxb and          PLAN_HASH 645813381          select sn.snap_id, sn.instance_number, sn.end_interval_time,          sn.snap_level, (select unique 1 from dba_hist_baseline b where          (sn.snap_id >= b.start_snap_id) and b.dbid = :1 and (sn.snap_id <=          b.end_snap_id)) within_baseline_range, startup_time from          dba_hist_snapshot sn where sn.dbid = :2 and sn.instance_number = :3 

We need to point out a few things here. First, note that in each case, the findings are listed first, and the recommendations, actions, and rationale for each of these findings is listed next. We have considered the first of the findings in Listing 12.7, and this finding is considered to have an impact of 100%. This is listed along with the CPU time consumed in seconds. This value is the total number of CPU seconds consumed by each of the SQL statements that prompted this finding as recorded in each of the recommendations, and is actually the DB Time as seen from the Time and Wait model views. As noted before, ADDM uses this model to look at the resource-intensive pieces of the workload.

That said, one of the issues is apparent if you look closely at the SQLs for the first and second recommendation. The second SQL is actually a PL/SQL procedure that contains the first SQL. The variable emp_id is replaced by the bind variable :B1 and the SQL translated into upper case. Further, you may note that the second SQL took 1,677 seconds, while the previous one took 1,695 seconds. The problem is that in the case of PL/SQL blocks that execute SQL statements inside, the SQL statement is shown and considered separately, and not as part of the PL/SQL procedure. In other words, the first SQL statement was actually part of the second blocknamely, the PL/SQL block. Thus you can see that ADDM double counted the impact as well as other related statistics. Keep this in mind when looking at PL/SQL-based problems; the figures in the findings and recommendations may not exactly mean what they state, and you may have to manually inspect and adjust the values for SQL that originates from within PL/SQL. Such SQL is considered recursive in nature.

Recursive SQL

Recursive SQL is generally meant to be SQL executed by the SYS user to perform some internal action on behalf of normal users. Prime examples of this are SQLs for space management and extent allocation. Specifically, when a row is inserted into a table and no free block is found for that insert, the Oracle kernel automatically allocates the next free extent and records this in the internal data dictionary using SQL statements that are called recursive SQL. That said, recursive SQL is also SQL that is executed from within PL/SQL or from cascading actions such as triggers and constraints.

An easy method of identifying such occurrences is to look at the user ID executing this recursive SQL; this should not have a user ID of 0, meaning that it belongs to a non SYS user. This double counting of recursive SQL is not new in ADDM alone, but is spread throughout other utilities such as TKPROF. While SYS-generated recursive SQL in TKPROF can be specifically avoided using the SYS=NO option, SQL statements within PL/SQL are still counted under the Recursive SQL section of the output in TKPROF.

This topic is not well explained in the manuals and as a result has generated a lot of myths. You can determine the recursive level of SQL only in the raw form in extended SQL trace files. For further details, refer to MetaLink Note #39817.1, "Interpreting Raw SQL_TRACE and DBMS_SUPPORT.START_TRACE Output."


In Listing 12.4, we noted that the total database time spent was 3,037 seconds. This contrasts with the summary of 3,872 seconds in finding 1, which is substantially more than the listed total. Having explained the recursive SQL double counting, we can now exclude this value and recalculate the adjusted impact to be (1,677 + 280 + 129 + 92) seconds divided by the total of 3,037 seconds. This works out to about 72%, which is an understandable figure.

Another caveat is that ADDM will consider all SQL that is running in the database during the investigation period. Thus, recommendations 4 and 5 denote recursive and other performance managementrelated SQL. This is actually a good thing when comparing overall performance of the database as a whole, because you will need to include the load imposed by the internal operations of the database as well.

Note the text in the recommendations. They clearly indicate what the next step in the tuning exercise should be. For example, the action for recommendation 2 is to tune the PL/SQL block with the SQL ID d10v1gbfqdxkz. (Refer to the "Tuning PL/SQL Applications" chapter of Oracle's PL/SQL User's Guide and Reference.) In the other cases, you are advised to run the SQL advisors on the noted SQL. You will find that such clear actions are a common theme in the ADDM report. ADDM thus not only helps you, the performance analyst, with capturing the workload with the most impact, it also tells you what to do next.

Finding 2: SQL Statements Waiting for Locks

A good example of a finding that uses the Wait portion of the Time and Wait model used by ADDM is shown in Listing 12.8. You are aware that in order to be able to perform DML, the Oracle kernel must place exclusive row locks on the required rows, and that these locks will not be released until either a COMMIT or a ROLLBACK is executed. When another process wants to place the same type of lock on this row, you then have a locking conflict. This results in the blocked session posting an enqueue wait in the form of the event enq: TX row lock contention and waiting indefinitely for the blocking session to release the lock. This is deemed to be due to application logic, and the performance analyst is required to perform further analysis as seen in the Action section of this finding.

Listing 12.8. Details of Finding 2 from the ADDM Report
 FINDING 2: 55% impact (1676 seconds) ------------------------------------ SQL statements were found waiting for row lock waits.    RECOMMENDATION 1: Application Analysis, 55% benefit (1676 seconds)       ACTION: Trace the cause of row contention in the application logic. Use          given blocked SQL to identify the database objects involved.          Investigate application logic involving DML on these objects.       RATIONALE: The SQL statement with SQL_ID "gr2244sy7phzs" was blocked on          row locks.          RELEVANT OBJECT: SQL statement with SQL_ID gr2244sy7phzs          UPDATE HR.EMPLOYEES SET SALARY = SALARY + 10 WHERE EMPLOYEE_ID = :B1    SYMPTOMS THAT LED TO THE FINDING:       Wait class "Application" was consuming significant database time. (56%       impact [1688 seconds]) 

Finding 3: Session Management

Not only can ADDM point to problems observed within the database, it can suggest seeking an external solution to the problem. In Listing 12.9, you see such a case. ADDM has determined that the rate of session connection and disconnection is high, and thus suggests using a middle tier with a connection-pool mechanism to reduce this rate. Essentially, connection pooling allows user sessions to be serviced by a number of pre-established connections. User sessions connect and disconnect to the connection pool rather than directly to the database, thus overcoming this high-impact finding. As noted in Chapter 10, rates of an activity are actually determined and recorded in the V$SYSMETRIC view; ADDM checks these to determine whether these rates are being exceeded.

Listing 12.9. Details of Finding 3 from the ADDM Report
 FINDING 3: 12% impact (369 seconds) ----------------------------------- Session connect and disconnect calls were consuming significant database time.    RECOMMENDATION 1: Application Analysis, 12% benefit (369 seconds)       ACTION: Investigate application logic for possible reduction of connect          and disconnect calls. For example, you might use a connection pool          scheme in the middle tier. 

In reality, you see from Listing 12.10 that there was another knock-on effect of this high rate of session connection/disconnection. In finding 6 shown in Listing 12.10, ADDM noticed and recorded high activity in the session-allocation latch (appropriately named latch: session allocation).

Listing 12.10. Details of Finding 6 from the ADDM Report
 FINDING 6: 3.7% impact (111 seconds) ------------------------------------ Wait event "latch: session allocation" in wait class "Other" was consuming significant database time.    RECOMMENDATION 1: Application Analysis, 3.7% benefit (111 seconds)       ACTION: Investigate the cause for high "latch: session allocation"          waits. Refer to Oracle's "Database Reference" for the description of          this wait event. Use given SQL for further investigation.       RATIONALE: The SQL statement with SQL_ID "NULL-SQLID" was found waiting          for "latch: session allocation" wait event.          RELEVANT OBJECT: SQL statement with SQL_ID NULL-SQLID 

Other Findings

Everything is not yet perfect with the findings of ADDM. This is evident in Listing 12.11.

Listing 12.11. Details of Other Findings from the ADDM Report
 FINDING 4: 7.3% impact (220 seconds) ------------------------------------ Wait event "class slave wait" in wait class "Other" was consuming significant database time. <snipped> FINDING 5: 6.3% impact (190 seconds) ------------------------------------ Wait event "Queue Monitor Task Wait" in wait class "Other" was consuming significant database time. <snipped> FINDING 8: 3.4% impact (104 seconds) ------------------------------------ Wait class "Configuration" was consuming significant database time.    NO RECOMMENDATIONS AVAILABLE    ADDITIONAL INFORMATION: Waits for free buffers were not consuming       significant database time.       Waits for archiver processes were not consuming significant database       time.       Log file switch operations were not consuming significant database time       while waiting for checkpoint completion.       Log buffer space waits were not consuming significant database time.       High watermark (HW) enqueue waits were not consuming significant       database time.       Space Transaction (ST) enqueue waits were not consuming significant       database time.       ITL enqueue waits were not consuming significant database time. 

Findings 4 and 5 show that ADDM still does not properly consider idle events. The class slave wait and Queue Monitor Task Wait events are actually considered idle because they are "wait-for-work" type events. It seems that an exclusion of this type of event is not catered to; indications are that this will be fixed in Oracle Database 10g Release 2 as per MetaLink Note #3876475.8. This fix is also present in the 10.1.0.4 patch set.

Exclusion of Idle Events

STATSPACK in Oracle 9i and earlier can recognize and exclude the impact of a number of Wait events that are deemed to be idle. These are generally events where the back-end shadow process is waiting for a message or request for action either from the foreground client process or from another coordinator process. Such idle events are recorded in the STATS$IDLE_EVENT table in the STATSPACK schema and are used during STATSPACK reporting as a filter for exclusion. Note that the idle nature of these idle events to some extent depends on context; this topic generates a lot of debate in the Oracle user community.


As well, notice in finding 8 that the Configuration wait class was consuming some time, but no recommendations were available. To ADDM's credit, it did mention those configuration items that were not a problem, so you could pursue reasons other than those mentioned.

We saw other findings as well. For example, you saw that there was some contention for sequences, resulting in the recommendation of sequence caching, as well as a call for further investigation when using the ORDER setting in a RAC environment. As well, there is a separate section titled "Additional Information," as shown in Listing 12.12.

Listing 12.12. Additional Information in the ADDM Report
           ADDITIONAL INFORMATION           ---------------------- Wait class "Administrative" was not consuming significant database time. Wait class "Cluster" was not consuming significant database time. Wait class "Commit" was not consuming significant database time. Wait class "Network" was not consuming significant database time. Wait class "Scheduler" was not consuming significant database time. Wait class "User I/O" was not consuming significant database time. The analysis of I/O performance is based on the default assumption that the average read time for one database block is 10000 micro-seconds. 

This listing shows what you will not need to consider in your tuning efforts, thereby preventing wasted time.

Expected I/O Response Time

Note the last sentence in Listing 12.12. This assumption is hard-coded as a default value, as seen in the DBA_ADVISOR_DEF_PARAMETERS view. Listed as the DBIO_EXPECTED parameter for the ADDM Advisor Type, this is a seeded value and denotes that it takes 10,000 microseconds (10 milliseconds) for a single block read. If your I/O subsystem can provide better performance than this, then you should change this value. Note that this is used in the calculation of impact; hence, you should exercise caution. This value can be changed using the DBMS_ADVISOR.SET_DEFAULT_PARAMETER procedure for this parameter.

Cross-Verification from the Advisor Views

You can use some of the advisor views listed in Table 12.1 to cross-verify these findings, recommendations, and actions. Listing 12.13 shows an example of the findings.

Listing 12.13. Cross-Verify Findings from Advisor Views
 SQL> column task_name format a10 SQL> select task_name, finding_id, type, impact, message   2  from dba_advisor_findings   3  where task_id = 1494; TASK_NAME  FINDING_ID TYPE            IMPACT ---------- ---------- ----------- ---------- MESSAGE -------------------------------------------------------------------------------- TASK_1494           1 PROBLEM      478647593 SQL statements were found waiting for row lock waits. TASK_1494           2 SYMPTOM      482166171 Wait class "Application" was consuming significant database time. TASK_1494           3 PROBLEM       42730598 Soft parsing of SQL statements was consuming significant database time. TASK_1494           4 SYMPTOM       30738791 Contention for latches related to the shared pool was consuming significant data base time. <output snipped> 

Similar verifications can be obtained via the DBA_ADVISOR_RECOMMENDATIONS, DBA_ADVISOR_RATIONALE, and DBA_ADVISOR_ACTIONS views.

ADDM is still governed by a set of rules that were formed after observation of performance problems in a variety of environments. That does not necessarily mean that these ADDM rules are 100% perfect or that the recommendations that it presents for what it observes in your environment are always correct. You should exercise your judgement in following or implementing the advice provided by ADDM.




    Oracle Database 10g Insider Solutions
    SUSE LINUX Enterprise Server 9 Administrators Handbook
    ISBN: 672327910
    EAN: 2147483647
    Year: 2006
    Pages: 214

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net