Latch Free


The latch free wait event has three parameters: latch address, latch number, and the number of tries. In Oracle Database 10g, depending on the latch wait event, it can fall under the Concurrency, Configuration, or Other wait class. Keep the following key thoughts in mind when dealing with the latch free wait event.

  • Latches apply only to memory structures in the SGA. They do not apply to database objects. An Oracle SGA has many latches, and they exist to protect various memory structures from potential corruption by concurrent access.

  • The Oracle session is waiting to acquire a specific latch. The appropriate action to take depends on the type of latch the process competes for.

  • Up to Oracle9i Database, the latch free wait event represents all latch waits, but starting in Oracle Database 10g, common latches are broken out and have independent wait events.

What Is a Latch?

A latch is a form of lock. You should be very familiar with its concept and usage because you use it in your daily routines and conversations, whether you realize it or not. Each time you lock a door, you essentially apply a latch. Each time you get into your vehicle and buckle up, you latch yourself in.

In Oracle RDBMS, latches are simple locking devices. They are nothing more than memory elements that typically consist of three parts: PID (process ID), memory address, and length. They enforce exclusive access to shared data structures in the SGA and thereby protect the integrity of the memory structures against corruption. Access to data structures in the SGA must be serialized to prevent corruption that can result from multiple sessions modifying the same structure at the same time. The integrity of data structures must also be preserved while they are being inspected.

Differences between a Latch and a Lock

Latches are different from locks (enqueues) in several ways, although both are locking mechanisms. Table 6-1 compares latches and locks.

Table 6-1: Latches vs. Locks

Purpose
  Latches: Serve a single purpose: to provide exclusive access to memory structures. (Starting in Oracle9i Database, the cache buffers chains latches are shareable for read-only.)
  Locks: Serve two purposes: to allow multiple processes to share the same resource when the lock modes are compatible and to enforce exclusive access to the resource when the lock modes are incompatible.

Jurisdiction
  Latches: Apply only to data structures in the SGA. Protect memory objects, which are temporary. Control access to a memory structure for a single operation. Not transactional.
  Locks: Protect database objects such as tables, data blocks, and state objects. Application driven and control access to data or metadata in the database. Transactional.

Acquisition
  Latches: Can be requested in two modes: willing-to-wait or no-wait.
  Locks: Can be requested in six different modes: null, row share, row exclusive, share, share row exclusive, or exclusive.

Scope
  Latches: Information is kept in memory and is visible only to the local instance; latches operate at the instance level.
  Locks: Information is kept in the database and is visible to all instances accessing the database; locks operate at the database level.

Complexity
  Latches: Implemented using simple instructions, typically test-and-set, compare-and-swap, or other simple CPU instructions. Implementation is port specific because the CPU instructions are machine dependent. Lightweight.
  Locks: Implemented using a series of instructions with context switches. Heavyweight.

Duration
  Latches: Held briefly (in microseconds).
  Locks: Normally held for an extended period of time (transactional duration).

Queue
  Latches: When a process goes to sleep after failing to acquire a latch, its request is not queued and serviced in order (with a few exceptions; for example, the latch wait list latch has a queue). Latches are fair game and up for grabs.
  Locks: When a process fails to get a lock, its request is queued and serviced in order, unless the NOWAIT option is specified.

Deadlock
  Latches: Implemented in such a way that they are not subject to deadlocks.
  Locks: Support queuing and are subject to deadlocks. A trace file is generated each time a deadlock occurs.

Latch Family

There are three types of latches: parent, child, and solitary latches. The parent and solitary latches are fixed in the Oracle kernel code. Child latches are created at instance startup. The V$LATCH_PARENT and V$LATCH_CHILDREN views contain the statistics for parent and child latches, respectively. The V$LATCH view contains the statistics for solitary latches as well as the aggregated statistics of the parent latches and their children.
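For example, you can compare the aggregated figures that V$LATCH reports for a latch family with the per-child figures in V$LATCH_CHILDREN. The following query is only a sketch (it uses the cache buffers chains latch; substitute any latch name you are interested in):

select 'V$LATCH' source, gets, misses, sleeps
from   v$latch
where  name = 'cache buffers chains'
union all
select 'V$LATCH_CHILDREN (sum)', sum(gets), sum(misses), sum(sleeps)
from   v$latch_children
where  name = 'cache buffers chains';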

Latch Acquisition

A process may request a latch in the willing-to-wait or no-wait (immediate) mode. The no-wait mode is used on only a handful of latches. Latches that are acquired in the no-wait mode have statistics in the IMMEDIATE_GETS and IMMEDIATE_MISSES columns. These columns are part of the latch family of views: V$LATCH, V$LATCH_PARENT, and V$LATCH_CHILDREN. Generally, a no-wait mode is first used on latches with multiple children such as the redo copy latches. If a process fails to get one of the child latches in the first attempt, it will ask for the next, also with the no-wait mode. The willing-to-wait mode is used only on the last latch when all no-wait requests against other child latches have failed.

Latches that are acquired in the willing-to-wait mode have statistics in the GETS and MISSES columns. The GETS of a particular latch are incremented each time a process requests the latch in the willing-to-wait mode.

If the latch is available on the first request, the process acquires it. Before modifying the protected data structure, the process writes the recovery information in the latch recovery area so that PMON knows what to clean up if the process dies while holding the latch.

If the latch is not available, the process spins on the CPU for a short while and retries the latch. This spin-and-retry activity can be repeated up to _SPIN_COUNT times, which normally defaults to 2000. If the latch is obtained in one of the retries, the process increments the SPIN_GETS and MISSES statistics by 1; otherwise, the process posts the latch free wait event in the V$SESSION_WAIT view, yields the CPU, and goes to sleep. At the end of the sleep cycle, the process wakes up and retries the latch for up to another _SPIN_COUNT times. This spin, retry, sleep, and wake-up cycle is repeated until the latch is eventually acquired. The SLEEPS statistics (SLEEPS and SLEEP1 through SLEEP3 only; SLEEP4 through SLEEP11 are never updated) are updated only when the latch GET succeeds, not on each attempt.
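To see these statistics side by side, a query along the following lines works (a sketch; the miss percentage calculation is ours):

select name, gets, misses, spin_gets, sleeps,
       round(misses * 100 / decode(gets, 0, null, gets), 3) miss_pct
from   v$latch
where  misses > 0
order by sleeps;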

The only way out of the latch GET routine is to get the latch. So, what happens if the process holding the latch is dead? When a process fails to get a latch after a few tries, it posts the PMON process to check on the latch holder. If the latch holder is dead, PMON cleans up and releases the latch.

Every latch is associated with a level number ranging from 0 to 13 (depending on the version). The levels for solitary and parent latches are fixed in the Oracle kernel code. Child latches are created at instance startup and inherit the level number from their parent. The levels are used to prevent latch deadlocks. The following two rules govern how latches may be acquired so that deadlocks will be prevented:

  • When a process requests a latch in the no-wait mode, any level is allowed provided one pair of the latches shares the same level.

  • When a process requests a latch in the willing-to-wait mode, its level must be higher than the levels of all the latches currently held.

Short- and Long-Wait Latches

Most latches are short-wait latches, so processes shouldn't have to wait very long to acquire them. For these latches, Oracle processes go to sleep using the exponential backoff sleep algorithm, in which every other sleep time is doubled (1, 1, 2, 2, 4, 4, 8, 8, 16, 16, 32, 32, 64, 64 centiseconds, and so on). The maximum exponential sleep time is internally set by the _MAX_EXPONENTIAL_SLEEP parameter, which usually defaults to 2 seconds, but it is reduced to the value of _MAX_SLEEP_HOLDING_LATCH (default 4 centiseconds) if the sleeping process already holds one or more other latches. A process that owns a latch cannot be allowed to sleep for too long, because that would increase the chances of other processes missing on their latch requests.
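If you want to confirm these settings in your own instance, the underscore parameters can be read from X$KSPPI and X$KSPPSV as SYS, in the same way _KGHDSIDX_COUNT is queried later in this section. This is only a sketch; view, but do not change, these hidden parameters unless Oracle Support advises it.

select a.ksppinm parameter, b.ksppstvl value
from   x$ksppi a, x$ksppsv b
where  a.indx = b.indx
and    a.ksppinm in ('_spin_count',
                     '_max_exponential_sleep',
                     '_max_sleep_holding_latch');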

Some latches are long-wait latches. This means they are generally held longer, and Oracle processes that sleep on these latches depend on another process to wake them up. This is known as latch wait posting, and the behavior is controlled by the _LATCH_WAIT_POSTING parameter. There are only two long-wait latches in Oracle8i Database, as shown in the following listing (there are more in Oracle9i Database and Oracle Database 10g). The _LATCH_WAIT_POSTING parameter is obsolete in Oracle9i Database. Latches that use latch wait posting have statistics in the WAITERS_WOKEN column.

 
select name, immediate_gets, immediate_misses,
       gets, misses, sleeps, waiters_woken
from   v$latch
where  waiters_woken > 0;

                 IMMEDIATE  IMMEDIATE                              WAITERS
NAME                  GETS     MISSES       GETS   MISSES   SLEEPS   WOKEN
--------------- ---------- ---------- ---------- -------- -------- -------
shared pool              0          0   18464156     3124     1032     485
library cache        85508        124 1564400540  4334362  1516400  690419

Latch Classification

Starting in Oracle9i Database Release 2, latches can be assigned to classes, and each class can have a different _SPIN_COUNT value. In earlier releases, whenever the _SPIN_COUNT value is changed to benefit one latch, the change applies to all latches. This can increase the overall CPU utilization quite a bit, but the new latch classes feature solves the problem. Let's say the cache buffers chains latches have high SLEEPS numbers and CPU resources are not an issue. You can assign the cache buffers chains latches to a class with a higher _SPIN_COUNT value. A larger _SPIN_COUNT value reduces the number of MISSES and SLEEPS at the expense of CPU time. The X$KSLLCLASS (kernel service lock latches class) view contains the class metadata (or attributes) for all eight classes. The latch class is represented by the INDX column in the following listing:

 select indx, spin, yield, waittime 
from x$ksllclass;

      INDX       SPIN      YIELD   WAITTIME
---------- ---------- ---------- ----------
         0      20000          0          1
         1      20000          0          1
         2      20000          0          1
         3      20000          0          1
         4      20000          0          1
         5      20000          0          1
         6      20000          0          1
         7      20000          0          1

8 rows selected.

Each row in the X$KSLLCLASS view corresponds to a _LATCH_CLASS_n initialization parameter, which allows you to change the _SPIN_COUNT, YIELD, and WAITTIME values. For example, latch class 0 corresponds to _LATCH_CLASS_0, latch class 1 corresponds to _LATCH_CLASS_1, and so on up to _LATCH_CLASS_7. Let's say you decide to increase the _SPIN_COUNT of the cache buffers chains latches to 10,000. To do so, you need to know the latch number and make the following two entries in the INIT.ORA file:

select latch#, name
from   v$latchname
where  name = 'cache buffers chains';

    LATCH# NAME
---------- -------------------------------
        97 cache buffers chains

# Make these two entries in the INIT.ORA file and bounce the instance.
# This modifies the spin count attribute of class 1 to 10000.
_latch_class_1 = "10000"
# This assigns latch# 97 to class 1.
_latch_classes = "97:1"

select indx, spin, yield, waittime
from   x$ksllclass;

      INDX       SPIN      YIELD   WAITTIME
---------- ---------- ---------- ----------
         0      20000          0          1
         1      10000          0          1
         2      20000          0          1
         3      20000          0          1
         4      20000          0          1
         5      20000          0          1
         6      20000          0          1
         7      20000          0          1

8 rows selected.

select a.kslldnam, b.kslltnum, b.class_ksllt
from   x$kslld a, x$ksllt b
where  a.kslldadr = b.addr
and    b.class_ksllt > 0;

KSLLDNAM                    KSLLTNUM CLASS_KSLLT
------------------------- ---------- -----------
process allocation                 3           2
cache buffers chains              97           1
Note

Do not increase the _SPIN_COUNT value if the server does not have spare CPU resources. New and faster CPUs should be able to bear a higher _SPIN_COUNT value. The default value of 2000 was established a long time ago when CPU speeds were much slower.

What Does the Latch Free Wait Event Tell You?

When you see a latch free wait event in the V$SESSION_WAIT view, it means the process failed to obtain the latch in the willing-to-wait mode after spinning _SPIN_COUNT times and went to sleep. When processes compete heavily for latches, they also consume more CPU resources because of spinning. The result is a higher response time. Remember the response time formula from Chapter 1? In this case, the high CPU utilization is a secondary symptom, and it can falsely impress your capacity planner that the server needs faster or more CPUs.

The TOTAL_WAITS statistic of the latch free wait event in the V$SYSTEM_EVENT view tracks the number of times processes fail to get a latch in the willing-to-wait mode. The SLEEPS statistic of a particular latch in the V$LATCH view tracks the number of times processes sleep on that latch. Because a process has nothing else to do but sleep when it fails to get the latch after spinning for _SPIN_COUNT times, the TOTAL_WAITS should equal the sum of SLEEPS, as shown in the following code. However, there are times when TOTAL_WAITS is greater than the sum of SLEEPS, because the SLEEPS statistics are updated only when the latch GET operation succeeds, not on each attempt.

select a.total_waits, b.sum_of_sleeps
from (select total_waits from v$system_event where event = 'latch free') a,
     (select sum(sleeps) sum_of_sleeps from v$latch) b;

TOTAL_WAITS SUM_OF_SLEEPS
----------- -------------
414031680 414031680

Because latch free waits are typically quite short, it is possible to see a large number of waits (TOTAL_WAITS) that only account for a small amount of time. You should only be concerned when the TIME_WAITED is significant.
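One way to put the latch waits in perspective is to rank the latch-related events in V$SYSTEM_EVENT by TIME_WAITED. The following is only a sketch; in releases prior to Oracle Database 10g only the latch free row will appear:

select event, total_waits, time_waited
from   v$system_event
where  event = 'latch free'
or     event like 'latch:%'
order by time_waited;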

Latch Miss Locations

The V$LATCH_MISSES view keeps information on the locations within the Oracle kernel code where latch misses occur. This information is helpful to Oracle Support in diagnosing obscure latch wait cases. You can see the location information with the following query. Steve Adams has an excellent article on this subject at http://www.ixora.com.au/newsletter/2001_02.htm.

 select location, 
parent_name,
wtr_slp_count,
sleep_count,
longhold_count
from v$latch_misses
where sleep_count > 0
order by wtr_slp_count, location;
                                                                    LONGHOLD
LOCATION             PARENT_NAME          WTR_SLP_COUNT SLEEP_COUNT    COUNT
-------------------- -------------------- ------------- ----------- --------
. . .
kglupc: child        library cache              7879693    11869691        0
kghupr1              shared pool                8062331     5493370        0
kcbrls: kslbegin     cache buffers chains       9776543    14043355        0
kqrpfl: not dirty    row cache objects         15606317    14999100        0
kqrpre: find obj     row cache objects         20359370    20969580        0
kglhdgn: child:      library cache             23782557     9952093        0
kcbgtcr: fast path   cache buffers chains      26729974    23166337        0
kglpnc: child        library cache             27385354     7707204        0

Latches in Oracle Database 10g Release 1

Prior to Oracle Database 10g, all latch waits show up as the latch free wait event. You can determine the name of the latch that a process contended for by querying the V$LATCH view using the P2 parameter value of the latch free event from the V$SESSION_WAIT view or the event 10046 trace file. The P2 parameter contains the specific latch number. In Oracle Database 10g, common latches are broken out and have independent wait event names and statistics. Following is a listing from Oracle Database 10g Release 1.

select name
from v$event_name
where name like 'latch%'
order by 1;

NAME
----------------------------------------------------------------
latch activity
latch free
latch: In memory undo latch
latch: KCL gc element parent latch
latch: MQL Tracking Latch
latch: cache buffer handles
latch: cache buffers chains
latch: cache buffers lru chain
latch: checkpoint queue latch
latch: enqueue hash chains
latch: gcs resource hash
latch: ges resource hash list
latch: latch wait list
latch: library cache
latch: library cache lock
latch: library cache pin
latch: messages
latch: object queue header heap
latch: object queue header operation
latch: parallel query alloc buffer
latch: redo allocation
latch: redo copy
latch: redo writing
latch: row cache objects
latch: session allocation
latch: shared pool
latch: undo global data
latch: virtual circuit queues
28 rows selected.

Common Causes, Diagnosis, and Actions

Latch contention indicates that a latch is held by another process for too long. When coupled with high demand, the contention is magnified into noticeable performance degradation. Latch contention is common in high-concurrency environments. You shouldn't plan on eliminating latch free waits from your database; that is unreasonable. The latch free wait event will always show up in the V$SYSTEM_EVENT view; the question is where it ranks relative to other wait events in terms of TIME_WAITED. You should be concerned only when the TIME_WAITED is high, bearing in mind the instance startup time. If the systemwide time waited for latches is high, you can discover the hot latches that processes competed for from the number of SLEEPS in the V$LATCH view with the following code:

 select name, gets, misses, immediate_gets, immediate_misses, sleeps 
from v$latch
order by sleeps;
                                               IMMEDIATE  IMMEDIATE
NAME                       GETS     MISSES         GETS     MISSES     SLEEPS
-------------------- ---------- ---------- ------------ ---------- ----------
enqueue hash chains    42770950       4279            0          0       1964
shared pool             9106650       5400            0          0       2632
row cache objects      69059887      27938          409          0       7517
enqueues               80443314     330167            0          0      13761
library cache          69447172     103349       465827        190      44328
cache buffers chains 1691040252    1166249     61532689       5909     127478
. . .

Due to the variety of latches, the common causes and the appropriate actions to take depend on the type of latch that is being competed for. A detailed discussion of all the latches would take more space than we have available. We picked the five most common latches: shared pool, library cache, cache buffers chains, cache buffers lru chain, and row cache objects, and we'll show you how to diagnose and fix problems related to them.

Shared Pool and Library Cache Latches

The Oracle shared pool consists of many structures. The prominent ones are the dictionary cache, SQL area, and library cache. You can see other structures by querying the V$SGASTAT view. The shared pool latch protects the shared pool structures, and it is taken when allocating and freeing memory heaps. For example, it is taken when allocating space for new SQL statements (hard parsing), PL/SQL procedures, functions, packages, and triggers, as well as when chunks of space are aged out or freed to make room for new objects.

Prior to Oracle9i Database, the shared pool memory structures were protected by a solitary shared pool latch. Beginning with Oracle9i Database, up to seven child shared pool latches can be used to protect the shared pool structures. This is possible because Oracle9i Database can break the shared pool into multiple subpools if the server has at least four CPUs and the SHARED_POOL_SIZE is greater than 250MB. The number of subpools can be manually adjusted by the _KGHDSIDX_COUNT initialization parameter, which also supplies the appropriate number of child shared pool latches. If you manually increase the number of subpools, you should also increase the SHARED_POOL_SIZE because each subpool has its own structure, LRU list, and shared pool latch. Otherwise, the instance may not start due to the error ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","sga heap(5,0)","fixed allocation callback").

The following statistics are from an Oracle9i Database running on a 16-CPU server and using 256MB of SHARED_POOL_SIZE. The shared pool is broken into two subpools as shown by the _KGHDSIDX_COUNT parameter. There are two LRU lists as shown by the X$KGHLU (kernel generic heap LRU) view, and two of the seven child latches are used as shown by the V$LATCH_CHILDREN view.

select a.ksppinm, b.ksppstvl
from x$ksppi a, x$ksppsv b
where a.indx = b.indx
and a.ksppinm = '_kghdsidx_count';

KSPPINM KSPPSTVL
------------------ ----------
_kghdsidx_count 2

select addr, kghluidx, kghlufsh, kghluops, kghlurcr, kghlutrn, kghlumxa
from x$kghlu;

ADDR KGHLUIDX KGHLUFSH KGHLUOPS KGHLURCR KGHLUTRN KGHLUMXA
---------------- -------- ---------- ---------- -------- -------- ----------
80000001001581B8 2 41588416 496096025 14820 17463 2147483647
8000000100157E18 1 46837096 3690967191 11661 19930 2147483647

select addr, name, gets, misses, waiters_woken
from v$latch_children
where name = 'shared pool';

ADDR NAME GETS MISSES WAITERS_WOKEN
---------------- ------------- ----------- ---------- -------------
C00000004C5B06B0 shared pool 0 0 0
C00000004C5B0590 shared pool 0 0 0
C00000004C5B0470 shared pool 0 0 0
C00000004C5B0350 shared pool 0 0 0
C00000004C5B0230 shared pool 0 0 0
C00000004C5B0110 shared pool 1385021389 90748637 12734879
C00000004C5AFFF0 shared pool 2138031345 413319249 44738488

The library cache structures are the home for cursors, SQL statements, execution plans, and parsed trees, among other things. The structures are protected by the library cache latch. Oracle processes acquire the library cache latch when modifying, inspecting, pinning, locking, loading, or executing objects in the library cache structures. The number of child library cache latches that exist in an instance can be discovered by querying the V$LATCH_CHILDREN view as shown next. The number is usually the smallest prime number that is greater than the CPU_COUNT. This number can be modified by the _KGL_LATCH_COUNT parameter. Starting in Oracle9i Database, the V$SQLAREA view has a CHILD_LATCH column, which allows you to see how cursors are distributed across the library cache latches.

select count(*)
from v$latch_children
where name = 'library cache';
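In Oracle9i Database and later, you can also get a feel for how evenly cursors are spread across the child library cache latches using the CHILD_LATCH column mentioned earlier. This is only a sketch:

select child_latch, count(*) cursors, sum(executions) total_executions
from   v$sqlarea
group by child_latch
order by child_latch;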

Contention for Shared Pool and Library Cache Latches - Parsing

Contentions for the shared pool and library cache latches are mainly due to intense hard parsing. A hard parse applies to new cursors and cursors that are aged out and must be re-executed. Excessive hard parsing is common among applications that primarily use SQL statements with literal values. A hard parse is a very expensive operation, and a child library cache latch must be held for the duration of the parse.

  • Discover the magnitude of hard parsing in your database using the following query. The number of soft parses can be determined by subtracting the hard parse count from the total parse count.

     select a.*, sysdate-b.startup_time days_old
    from v$sysstat a, v$instance b
    where a.name like 'parse%';

    STATISTIC# NAME CLASS VALUE DAYS_OLD
    ---------- ------------------------- ----- ---------- ----------
    230 parse time cpu 64 33371587 4.6433912
    231 parse time elapsed 64 63185919 4.6433912
    232 parse count (total) 64 2137380227 4.6433912
    233 parse count (hard) 64 27006791 4.6433912
    234 parse count (failures) 64 58945 4.6433912
    Note

    A parse failure is related to the "ORA-00942: table or view does not exist" error or to running out of shared memory.

  • Discover the current sessions that perform a lot of hard parses:

     select a.sid, c.username, b.name, a.value,
    round((sysdate - c.logon_time)*24) hours_connected
    from v$sesstat a, v$statname b, v$session c
    where c.sid = a.sid
    and a.statistic# = b.statistic#
    and a.value > 0
    and b.name = 'parse count (hard)'
    order by a.value;

    SID USERNAME NAME VALUE HOURS_CONNECTED
    ---- ---------- ------------------ ---------- ---------------
    510 SYS parse count (hard) 12 4
    413 PMAPPC parse count (hard) 317 51
    37 PMHCMC parse count (hard) 27680 111
    257 PMAPPC parse count (hard) 64652 13
    432 PMAPPC parse count (hard) 105505 13

    The V$SESS_TIME_MODEL view in Oracle Database 10g shows where the hard parses are coming from by providing elapsed time statistics on hard and failed parses. Following is a sample from the V$SESS_TIME_MODEL view for a particular session:

     select *
    from   v$sess_time_model
    where  sid = (select max(sid) from v$mystat);

     SID    STAT_ID STAT_NAME                                             VALUE
    ---- ---------- ------------------------------------------------ ----------
     148 3649082374 DB time                                            11141191
     148 2748282437 DB CPU                                              9530592
     148 4157170894 background elapsed time                                   0
     148 2451517896 background cpu time                                       0
     148 4127043053 sequence load elapsed time                                0
     148 1431595225 parse time elapsed                                  3868898
     148  372226525 hard parse elapsed time                             3484672
     148 2821698184 sql execute elapsed time                            9455020
     148 1990024365 connection management call elapsed time                6726
     148 1824284809 failed parse elapsed time                                 0
     148 4125607023 failed parse (out of shared memory) elapsed time          0
     148 3138706091 hard parse (sharing criteria) elapsed time            11552
     148  268357648 hard parse (bind mismatch) elapsed time                4440
     148 2643905994 PL/SQL execution elapsed time                         70350
     148  290749718 inbound PL/SQL rpc elapsed time                           0
     148 1311180441 PL/SQL compilation elapsed time                      268477
     148  751169994 Java execution elapsed time                               0

    The hard parse statistics in the preceding output can be grouped as such:

    1. parse time elapsed
        2. hard parse elapsed time
            3. hard parse (sharing criteria) elapsed time
                4. hard parse (bind mismatch) elapsed time
        2. failed parse elapsed time
            3. failed parse (out of shared memory) elapsed time

  • Identify literal SQL statements that are good candidates for bind variables. The following query searches the V$SQLAREA view for statements that are identical in the first 40 characters and lists them when four or more instances of the statements exist. The logic assumes the first 40 characters of most of your application's literal statements are the same. Obviously, longer strings (for example, substr(sql_text,1,100)) and higher occurrences (for example, count(*) > 50) will yield a shorter report. Once you have identified the literal statements, you can advise the application developers on which statements to convert to use bind variables.

     select hash_value, substr(sql_text,1,80) 
    from v$sqlarea
    where substr(sql_text,1,40) in (select substr(sql_text,1,40)
    from v$sqlarea
    having count(*) > 4
    group by substr(sql_text,1,40))
    order by sql_text;

    HASH_VALUE SUBSTR(SQL_TEXT,1,80)
    ---------- -----------------------------------------------------------------
    2915282817 SELECT revenue.customer_id, revenue.orig_sys, revenue.service_typ
    2923401936 SELECT revenue.customer_id, revenue.orig_sys, revenue.service_typ
    303952184 SELECT revenue.customer_id, revenue.orig_sys, revenue.service_typ
    416786153 SELECT revenue.customer_id, revenue.orig_sys, revenue.service_typ
    2112631233 SELECT revenue.customer_id, revenue.orig_sys, revenue.service_typ
    3373328808 select region_id from person_to_chair where chair_id = 988947
    407884945 select region_id from person_to_chair where chair_id = 990165
    3022536167 select region_id from person_to_chair where chair_id = 990166
    3204873278 select region_id from person_to_chair where chair_id = 990167
    643778054 select region_id from person_to_chair where chair_id = 990168
    2601269433 select region_id from person_to_chair where chair_id = 990169
    3453662597 select region_id from person_to_chair where chair_id = 991393
    3621328440 update plan_storage set last_month_plan_id = 780093, pay_code
    2852661466 update plan_storage set last_month_plan_id = 780093, pay_code
    380292598 update plan_storage set last_month_plan_id = 780093, pay_code
    2202959352 update plan_storage set last_month_plan_id = 780093, pay_code
    . . .

In Oracle9i Database, you can query the V$SQL view for SQL statements that share the same execution plan, as the statements may be identical except for the literal values. They are the candidates for bind variables.

 select plan_hash_value, hash_value 
from v$sql
order by 1,2;

You should see a significant reduction in the contention for the shared pool and library cache latches when you convert literal SQL statements to use bind variables. The conversion is best done in the application. The workaround is to set the initialization parameter CURSOR_SHARING to FORCE. This allows statements that differ in literal values but are otherwise identical to share a cursor, and therefore reduces latch contention, memory usage, and hard parsing.
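As a sketch of the workaround, CURSOR_SHARING can be changed dynamically at the instance or session level; test it at the session level first, bearing in mind the caveats in the Caution that follows:

-- Instance-wide (can also be set in the parameter file).
alter system set cursor_sharing = FORCE;

-- Or limit the change to one session for testing.
alter session set cursor_sharing = FORCE;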

Caution

The CURSOR_SHARING feature has bugs in the earlier releases of Oracle8i Database. It is not recommended in environments with materialized views because it may cause prolonged library cache pin waits. Also, setting CURSOR_SHARING to FORCE may cause the optimizer to generate unexpected execution plans because the optimizer does not know the values of the bind variables. This may positively or negatively impact the database performance. Starting in Oracle9i Database, the optimizer will peek at the bind variable values in the session's PGA before producing an execution plan. This behavior is controlled by the parameter _OPTIM_PEEK_USER_BINDS. However, it applies only to statements that require hard parsing, which means the execution plan is based on the first value that is bound to the variable.

Whenever a SQL statement arrives, Oracle checks to see if the statement is already in the library cache. If it is, the statement can be executed with little overhead; this process is known as a soft parse. While hard parses are bad, soft parses are not good either. The library cache latch is acquired during a soft parse operation. Oracle still has to check the syntax and semantics of the statement, unless the statement is cached in the session's cursor cache. You can reduce the library cache latch hold time by properly setting the SESSION_CACHED_CURSORS parameter. (See Oracle Metalink notes #30804.1 and #62143.1 for more information.) However, the best approach is to reduce the number of soft parses, which can only be done through the application. The idea is to parse once, execute many instead of parse once, execute once. You can find the offending statements by querying the V$SQLAREA view for statements with high numbers of PARSE_CALLS.
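For example, a query along the following lines lists the heaviest parsers; the threshold is arbitrary and should be adjusted for your workload:

select parse_calls, executions, substr(sql_text, 1, 60) sql_text
from   v$sqlarea
where  parse_calls > 10000
order by parse_calls;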

Contention for Shared Pool Latch - Oversized Shared Pool

Thanks to the new multiple subpools architecture, large shared pools are not as bad starting in Oracle9i Database. However, if your database is not yet on Oracle9i Database, an oversized shared pool can increase the contention for the shared pool latch. This is because free memory in the shared pool is categorized and maintained on a number of buckets or free lists according to chunk size. Larger shared pools tend to have long free lists, and processes that need to allocate space in them must spend extra time scanning the long free lists while holding the shared pool latch. In a high-concurrency environment, this can result in serious shared pool latch contention (high SLEEPS and MISSES), especially if the application uses primarily literal SQL. If this is the case, there is no need for a large shared pool. Why keep unshared statements in the shared pool?

The alter session set events 'immediate trace name heapdump level 2' command lets you see the shared pool free lists. The command generates a trace file in the UDUMP directory. Search the trace file for the word "Bucket" and you will see the free memory chunks that are associated with each bucket. Alternatively, you can use the following script to help generate a query to list the shared pool free memory chunks. Once the query is generated, it is reusable in the database that generated the trace file. Do not use it in another database of a different version because this can produce wrong results. Also, this query will not run in Oracle Database 10g Release 1 because it limits the number of arguments in a case expression to 128. (See Oracle Metalink note #131557.1 and bug #3503496.) You may work around this limitation by rewriting the query using the DECODE and SIGN functions.

 SQL> oradebug setmypid 
Statement processed.
SQL> oradebug dump heapdump 2
Statement processed.
SQL> oradebug tracefile_name
/u01/admin/webmon/udump/orcl_ora_17550.trc
SQL> exit

$ grep Bucket /u01/admin/webmon/udump/orcl_ora_17550.trc > tmp.lst
$ sed 's/size=/ksmchsiz>=/' tmp.lst > tmp2.lst
$ sed 's/ Bucket //' tmp2.lst | sort -nr > tmp.lst

# Create a shell script based on the following and run it to generate
# the reusable query for the database.
echo "select ksmchidx, (case"
cat tmp.lst | while read LINE
do
  echo $LINE | awk '{print "when " $2 " then " $1}'
done
echo "end) bucket#,"
echo "count(*) free_chunks,"
echo "sum(ksmchsiz) free_space,"
echo "trunc(avg(ksmchsiz)) avg_chunk_size"
echo 'from x$ksmsp'
echo "where ksmchcls = 'free'"
echo "group by ksmchidx, (case"
cat tmp.lst | while read LINE
do
  echo $LINE | awk '{print "when " $2 " then " $1}'
done
echo "end);"

If you discover the database has long shared pool free lists and the application uses literal SQL, then you should consider reducing the SHARED_POOL_SIZE. This will reduce the contention for the shared pool latch. Be careful not to make the shared pool too small, as this can cause the application to fail on the ORA-04031 error. You should also pin reusable objects in the shared pool using the DBMS_SHARED_POOL.KEEP procedure. The V$DB_OBJECT_CACHE view has information on kept objects.
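The following sketch shows the idea; the package name is only a placeholder, and the DBMS_SHARED_POOL package must be installed (dbmspool.sql) before you can use it:

-- Pin a frequently used package in the shared pool (example object name).
exec dbms_shared_pool.keep('APPS.SOME_HOT_PACKAGE', 'P')

-- Verify which objects are currently kept.
select owner, name, type, sharable_mem
from   v$db_object_cache
where  kept = 'YES';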

Contention for Library Cache Latches - Statements with High Version Counts

Oracle uses multiple child cursors to distinguish SQL statements that are identical in their characters but cannot be shared because they refer to different underlying objects. For example, if there are three CUSTOMER tables in the database and each table is owned by a different schema, the SELECT * FROM CUSTOMER issued by each owner will have the same hash value but different child numbers. When a hash value of a statement being parsed matches the hash value of a statement with a high number of children or version count, Oracle has to compare the statement with all the existing versions. The library cache latch must be held for the duration of the inspection, and this may cause other processes to miss on their library cache latch requests. You can minimize this problem by having unique object names in the database. The following query lists all SQL statements in the V$SQLAREA with version counts greater than 20:

 select version_count, sql_text 
from v$sqlarea
where version_count > 20
order by version_count, hash_value;
Note

High version counts may also be caused by a bug related to the SQL execution progression monitoring feature in Oracle8i Database. (See Oracle Metalink note #62143.1.) The bug prevents SQL statements from being shared. This feature can be turned off by setting the _SQLEXEC_PROGRESSION_COST parameter to 0, which in turn suppresses all data in the V$SESSION_LONGOPS view.

Oracle provides the V$SQL_SHARED_CURSOR view to explain why a particular child cursor is not shared with existing child cursors. Each column in the view identifies a specific reason why the cursor cannot be shared.

-- The columns in uppercase are relevant to Oracle9i Database.
-- Oracle Database 10g Release 1 has eight additional columns and it
-- also contains the child cursor address and number.
select a.*, b.hash_value, b.sql_text
from v$sql_shared_cursor a, v$sqltext b, x$kglcursor c
where a.unbound_cursor || a.sql_type_mismatch ||
      a.optimizer_mismatch || a.outline_mismatch ||
      a.stats_row_mismatch || a.literal_mismatch ||
      a.sec_depth_mismatch || a.explain_plan_cursor ||
      a.buffered_dml_mismatch || a.pdml_env_mismatch ||
      a.inst_drtld_mismatch || a.slave_qc_mismatch ||
      a.typecheck_mismatch || a.auth_check_mismatch ||
      a.bind_mismatch || a.describe_mismatch ||
      a.language_mismatch || a.translation_mismatch ||
      a.row_level_sec_mismatch || a.insuff_privs ||
      a.insuff_privs_rem || a.remote_trans_mismatch ||
      a.LOGMINER_SESSION_MISMATCH || a.INCOMP_LTRL_MISMATCH ||
      a.OVERLAP_TIME_MISMATCH || a.sql_redirect_mismatch ||
      a.mv_query_gen_mismatch || a.USER_BIND_PEEK_MISMATCH ||
      a.TYPCHK_DEP_MISMATCH || a.NO_TRIGGER_MISMATCH ||
      a.FLASHBACK_CURSOR <> 'NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN'
and a.address = c.kglhdadr
and b.hash_value = c.kglnahsh
order by b.hash_value, b.piece;

Cache Buffers Chains Latches

When data blocks are read into the SGA, their buffer headers are placed on linked lists (hash chains) that hang off hash buckets. This memory structure is protected by a number of child cache buffers chains latches (also known as hash latches or CBC latches). Figure 6-1 illustrates the relationships between hash latches, hash buckets, buffer headers, and hash chains.


Figure 6-1: A depiction of the buffer cache in Oracle8i Database and above

A process that wants to add, remove, search, inspect, read, or modify a block on a hash chain must first acquire the cache buffers chains latch that protects the buffers on the chain. This guarantees exclusive access and prevents other processes from reading or changing the same chain from underneath it. Concurrency is hereby sacrificed for the sake of integrity.

Note

Starting in Oracle9i Database, the cache buffers chains latches can be shared for read-only. This reduces some of the contention, but will by no means eliminate it altogether. We have many Oracle9i production databases that are plagued by cache buffers chains latch contention.

The hash bucket for a particular block header is determined based on the modulus of the Data Block Address (DBA) and the value of the _DB_BLOCK_HASH_BUCKETS parameter. For example, hash bucket = MOD(DBA, _DB_BLOCK_HASH_BUCKETS). The contents of the buffer header are exposed through the V$BH and X$BH views. You can dump the buffer headers using the following command:

alter system set events 'immediate trace name buffers level 1';
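If you know a file and block number of interest, you can also work backward and see which child latch currently protects its buffer by joining X$BH to V$LATCH_CHILDREN on the latch address. This is a sketch; the file and block numbers are placeholders:

select b.hladdr, l.gets, l.misses, l.sleeps
from   x$bh b, v$latch_children l
where  b.hladdr = l.addr
and    b.file#  = 7
and    b.dbablk = 31823;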

Up to Oracle8.0, there is one cache buffers chains latch (hash latch) per hash bucket, and each hash bucket has one chain. In other words, the relationship between hash latches, hash buckets, and hash chains is 1:1:1. The default number of hash buckets is DB_BLOCK_BUFFERS / 4, rounded up to the next prime number. This value can be adjusted by the _DB_BLOCK_HASH_BUCKETS parameter. For example, if the DB_BLOCK_BUFFERS is 50000, this means the instance has 12501 hash latches, 12501 hash buckets, and 12501 hash chains.

Up to Oracle8.0: hash latch -> (1:1) -> hash bucket -> (1:1) -> hash chain

Starting in Oracle8i Database: hash latch -> (1:m) -> hash bucket -> (1:1) -> hash chain

Starting in Oracle8i Database, Oracle changed the nexus between hash latches and hash buckets to 1:m, but kept the nexus between hash buckets and hash chains unchanged at 1:1. Multiple hash chains can now be protected by a single hash latch. In doing this, Oracle drastically reduced the default number of hash latches. The default number of hash latches depends on the value of the DB_BLOCK_BUFFERS parameter. Usually, the number is 1024 when the buffer cache size is less than 1GB. The number of hash latches can be adjusted by the _DB_BLOCK_HASH_LATCHES parameter. You can quickly discover the number of hash latches in the instance in one of the following ways:

 select count(distinct(hladdr)) 
from x$bh;

COUNT(DISTINCT(HLADDR))
-----------------------
1024

select count(*)
from v$latch_children
where name = 'cache buffers chains';

COUNT(*)
----------
1024

The default number of hash buckets is 2 * DB_BLOCK_BUFFERS, and the value is adjustable by the _DB_BLOCK_HASH_BUCKETS parameter. So, if the DB_BLOCK_BUFFERS is 50000, the instance has 100000 hash buckets and 100000 hash chains but only 1024 hash latches (assuming the block size is 8K). As you can see, there is a drastic change in the number of hash buckets and hash latches from Oracle8.0 to Oracle8i Database. Many DBAs think this new architecture seems backward and that it will increase latch free contention because there are far fewer latches. However, Oracle was hoping that by increasing the number of hash chains by a factor of 8, each chain would be shorter and compensate for the lower number of latches. Don't bet on this; you will still see cache buffers chains latch contention.

Oracle Database 10g uses a different algorithm to determine the default number of hash buckets. Initially, it seems to be 1/4 of the DB_CACHE_SIZE, but further tests reveal that a hash bucket value remains constant over multiple DB_CACHE_SIZEs. For example, there are 65536 hash buckets when the DB_CACHE_SIZE is between 132MB and 260MB. This new algorithm also does not use prime numbers. Table 6-2 shows the default number of hash buckets in Oracle8i Database through Oracle Database 10g for various buffer cache sizes. In all cases, the DB_BLOCK_SIZE is 8K.

Table 6-2: Default Number of Hash Buckets in Select Oracle Databases

Oracle Database 10g on the Solaris Operating System

db_cache_size              32M      64M     128M     256M     512M    1024M    2048M
_ksmg_granule_size          4M       4M       4M       4M       4M      16M      16M
_db_block_buffers         3976     7952    15904    31808    63616   127232   254464
_db_block_hash_buckets    8192    16384    32768    65536   131072   262144   524288
_db_block_hash_latches    1024     1024     1024     1024     1024     1024     2048

Oracle9i Database on the Solaris Operating System

db_cache_size              32M      64M     128M     256M     512M    1024M    2048M
_ksmg_granule_size          4M       4M      16M      16M      16M      16M      16M
_db_block_buffers         4000     8000    16016    32032    64064   128128   256256
_db_block_hash_buckets    8009    16001    32051    64067   128147   256279   512521
_db_block_hash_latches    1024     1024     1024     1024     1024     1024     2048

Oracle8i Database on the Solaris Operating System

db_block_buffers          4000     8000    16016    32032    64064   128128   192192
_db_block_hash_buckets    8000    16000    32032    64064   128128   256256   384384
_db_block_hash_latches    1024     1024     1024     1024     1024     1024     2048

Contention for Cache Buffers Chains Latches - Inefficient SQL Statements

Inefficient SQL statements are the main cause of cache buffers chains latch contentions. When mixed with high concurrency, the time spent on latch free waits can be rather significant. This is a familiar scene in environments where the application opens multiple concurrent sessions that execute the same inefficient SQL statements that go after the same data set.

You will do well if you keep these three things in mind:

  • Every logical read requires a latch get operation and consumes CPU.

  • The only way to get out of the latch get routine is to get the latch.

  • Only one process can own a cache buffers chains latch at any one time, and the latch covers many data blocks, some of which may be needed by another process. (Again, as mentioned in a previous note, Oracle9i Database allows the cache buffers chains latches to be shared for read-only.)

Naturally, fewer logical reads mean fewer latch get operations and thus less latch competition, which translates to better performance. Therefore, you must identify the SQL statements that contend for the cache buffers chains latches and tune them to reduce the number of logical reads. SQL statements with high BUFFER_GETS (logical reads) per EXECUTIONS are the main culprits.
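A sketch of such a query follows; the thresholds are arbitrary and should be adjusted to your workload:

select buffer_gets, executions,
       round(buffer_gets / executions) gets_per_exec,
       substr(sql_text, 1, 60) sql_text
from   v$sqlarea
where  executions > 0
and    buffer_gets > 1000000
order by gets_per_exec;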

Note

Many DBAs make a grave mistake by simply increasing the number of cache buffers chains latches with the _DB_BLOCK_HASH_LATCHES parameter without first optimizing the SQL statements. While additional latches may provide some relief, these DBAs treat the symptom without fixing the problem.

Note

We experienced a lot of cache buffers chains latch contention after upgrading a database from Oracle8.1.7.4 to Oracle9.2.0.5 on the Sun Solaris platform. The new optimizer was generating and using poor execution plans for the application. Several hidden optimizer-related parameters that are disabled by default in Oracle8i Database are enabled in Oracle9i Database. The problem was fixed by unsetting some of those parameters. If you are having the same problem, the best way to handle it is through Oracle Support.

Contention for Cache Buffers Chains Latches - Hot Blocks

Hot blocks are another common cause of cache buffers chains latch contention. This happens when multiple sessions repeatedly access one or more blocks that are protected by the same child cache buffers chains latch. This is mostly an application issue. In most cases, increasing the number of cache buffers chains latches will do little to improve performance. This is because blocks are hashed to hash buckets and chains based on the block address and the number of hash buckets, and not the number of cache buffers chains latches. If the block address and the number of hash buckets remain the same, chances are those few hot blocks will still be covered by one cache buffers chains latch, unless the number of latches is drastically increased.

When sessions compete for the cache buffers chains latches, the best way to find out if you have a hot blocks situation is to examine the P1RAW parameter value of the latch free wait event. (Remember, in Oracle Database 10g this event is latch: cache buffers chains.) The P1RAW parameter contains the latch address. If the sessions are waiting on the same latch address, you have hot blocks. Based on the following example, there are hot blocks on chains covered by the 00000400837D7800 and 00000400837DE400 latches:

select sid, p1raw, p2, p3, seconds_in_wait, wait_time, state
from v$session_wait
where event = 'latch free'
order by p2, p1raw;

SID P1RAW P2 P3 SECONDS_IN_WAIT WAIT_TIME STATE
---- ---------------- --- --- --------------- ---------- ------------------
38 00000400837D7800 98 1 1 2 WAITED KNOWN TIME
42 00000400837D7800 98 1 1 2 WAITED KNOWN TIME
44 00000400837D7800 98 3 1 4 WAITED KNOWN TIME
58 00000400837D7800 98 2 1 10 WAITED KNOWN TIME
85 00000400837D7800 98 3 1 12 WAITED KNOWN TIME
214 00000400837D7800 98 1 1 2 WAITED KNOWN TIME
186 00000400837D7800 98 3 1 14 WAITED KNOWN TIME
149 00000400837D7800 98 2 1 3 WAITED KNOWN TIME
132 00000400837D7800 98 2 1 2 WAITED KNOWN TIME
101 00000400837D7800 98 3 1 4 WAITED KNOWN TIME
222 00000400837D7800 98 3 1 12 WAITED KNOWN TIME
229 00000400837D7800 98 3 1 4 WAITED KNOWN TIME
230 00000400837D7800 98 3 1 11 WAITED KNOWN TIME
232 00000400837D7800 98 1 1 20 WAITED KNOWN TIME
257 00000400837D7800 98 3 1 16 WAITED KNOWN TIME
263 00000400837D7800 98 3 1 5 WAITED KNOWN TIME
117 00000400837D7800 98 4 1 4 WAITED KNOWN TIME
102 00000400837D7800 98 3 1 12 WAITED KNOWN TIME
47 00000400837D7800 98 3 1 11 WAITED KNOWN TIME
49 00000400837D7800 98 1 1 2 WAITED KNOWN TIME

99 00000400837D9300 98 1 1 32 WAITED KNOWN TIME

51 00000400837DD200 98 1 1 1 WAITED KNOWN TIME

43 00000400837DE400 98 1 1 2 WAITED KNOWN TIME
130 00000400837DE400 98 1 1 10 WAITED KNOWN TIME
89 00000400837DE400 98 1 1 2 WAITED KNOWN TIME
62 00000400837DE400 98 0 1 -1 WAITED KNOWN TIME
150 00000400837DE400 98 1 1 9 WAITED KNOWN TIME
195 00000400837DE400 98 1 1 3 WAITED KNOWN TIME
67 00000400837DE400 98 1 1 2 WAITED KNOWN TIME

The next step is to see what blocks are covered by the latch. You should also capture the SQL statements that participate in the competition. This is because a cache buffers chains latch covers many blocks, and you can identify the hot blocks by the tables that are used in the SQL statements. In Oracle8i Database and above, you can identify the hot blocks based on their TCH (touch count) values using the following query. Generally, hot blocks have higher touch count values. However, bear in mind that the touch count is reset to 0 when a block is moved from the cold to the hot end of the LRU list. Depending on the timing of your query, a block with a 0 touch count value is not necessarily cold.

-- Using the P1RAW from the above example (00000400837D7800).
select a.hladdr, a.file#, a.dbablk, a.tch, a.obj, b.object_name
from x$bh a, dba_objects b
where (a.obj = b.object_id or a.obj = b.data_object_id)
and a.hladdr = '00000400837D7800'
union
select hladdr, file#, dbablk, tch, obj, null
from x$bh
where obj in (select obj from x$bh where hladdr = '00000400837D7800'
              minus
              select object_id from dba_objects
              minus
              select data_object_id from dba_objects)
and hladdr = '00000400837D7800'
order by 4;

HLADDR FILE# DBABLK TCH OBJ OBJECT_NAME
---------------- ----- ------- ---- ----------- --------------------
00000400837D7800 16 105132 0 19139 ROUTE_HISTORY
00000400837D7800 16 106156 0 19163 TELCO_ORDERS
00000400837D7800 26 98877 0 23346 T1
00000400837D7800 16 61100 0 19163 TELCO_ORDERS
00000400837D7800 16 26284 0 19059 FP_EQ_TASKS
00000400837D7800 7 144470 0 18892 REPORT_PROCESS_QUEUE
00000400837D7800 8 145781 0 18854 PA_EQUIPMENT_UNION
00000400837D7800 249 244085 0 4294967295
00000400837D7800 7 31823 1 18719 CANDIDATE_EVENTS
00000400837D7800 13 100154 1 19251 EVENT
00000400837D7800 7 25679 1 18730 CANDIDATE_ZONING
00000400837D7800 7 8271 1 18719 CANDIDATE_EVENTS
00000400837D7800 7 32847 2 18719 CANDIDATE_EVENTS
00000400837D7800 8 49518 2 18719 CANDIDATE_EVENTS
00000400837D7800 7 85071 2 18719 CANDIDATE_EVENTS
00000400837D7800 275 76948 2 4294967295
00000400837D7800 7 41039 3 18719 CANDIDATE_EVENTS
00000400837D7800 7 37967 4 18719 CANDIDATE_EVENTS
00000400837D7800 8 67950 4 18719 CANDIDATE_EVENTS
00000400837D7800 7 33871 7 18719 CANDIDATE_EVENTS
00000400837D7800 7 59471 7 18719 CANDIDATE_EVENTS
00000400837D7800 8 8558 24 18719 CANDIDATE_EVENTS

As previously mentioned, hot blocks are an application issue. Find out why the application has to repeatedly access the same block (or blocks) and check if there is a better alternative.

As for the workaround, the idea is to spread the hot blocks across multiple cache buffers chains latches. This can be done by relocating some of the rows in the hot blocks. The new blocks have different block addresses and, with any luck, they are hashed to buckets that are not covered by the same cache buffers chains latch. You can spread the blocks in a number of ways, including:

  • Deleting and reinserting some of the rows by ROWID.

  • Exporting the table, increasing the PCTFREE significantly, and importing the data. This minimizes the number of rows per block, spreading them over many blocks. Of course, this is at the expense of storage, and full table scan operations will be slower.

  • Minimizing the number of records per block in the table. This involves dumping a few data blocks to get an idea of the current number of rows per block. Refer to the "Data Block Dump" section in Appendix C for the syntax. The "nrow" value in the trace file shows the number of rows per block. Export and truncate the table. Manually insert the number of rows that you determined is appropriate and then issue the ALTER TABLE table_name MINIMIZE RECORDS_PER_BLOCK command. Truncate the table and import the data.

  • For indexes, you can rebuild them with higher PCTFREE values, bearing in mind that this may increase the height of the index (see the sketch after this list).

  • Consider reducing the block size. Starting in Oracle9i Database, Oracle supports multiple block sizes. If the current block size is 16K, you may move the table or recreate the index in a tablespace with an 8K block size. This too will negatively impact full table scan operations. Also, using multiple block sizes increases management complexity.
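As a sketch of the last two points, the object and tablespace names below are placeholders, and the smaller-block tablespace requires a matching DB_nK_CACHE_SIZE to be configured:

-- Rebuild a hot index with more free space per block (may increase its height).
alter index hot_index rebuild pctfree 40;

-- Move a hot table to a tablespace with a smaller block size.
alter table hot_table move tablespace ts_8k;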

For other workarounds, if the database is on Oracle9i Database Release 2 or higher, you may consider increasing the _SPIN_COUNT value as discussed earlier. As a last resort, you may increase the number of hash buckets through the _DB_BLOCK_HASH_BUCKETS parameter. This practice is rarely necessary starting in Oracle8i Database. If you do this, make sure you provide a prime number; if you don't, Oracle will round it up to the next highest prime number.

Contention for Cache Buffers Chains Latches - Long Hash Chains

Multiple data blocks can be hashed to the same hash bucket, and they are linked together with pointers on the hash chain that belongs to the hash bucket. The number of blocks on a hash chain can run into hundreds for large databases. A process has to sequentially scan the hash chain for the required block while holding the cache buffers chains latch. We call this "chasing the chain." The latch must be held longer when scanning a long chain and may cause another process to miss on its cache buffers chains latch request.

Up to Oracle8.0, it is easy to determine the length of a particular hash chain because the relationships between hash latches, hash buckets, and hash chains are 1:1:1, and the length of a hash chain is equal to the number of blocks protected by its latch. The following query reports the number of blocks per hash chain. Chains with 10 or more blocks are considered long.

 select hladdr, count(*) 
from x$bh
group by hladdr
order by 2;

Oracle8i Database changed the nexus between hash latches and hash buckets to 1:m. This means you can no longer determine the length of a particular hash chain. Rather, you can only determine the number of blocks covered by a particular hash latch, and a hash latch covers multiple hash chains. The preceding query will now report the number of blocks on each hash latch. Before you can determine if a hash latch is overloaded, the ratio between hash latches and hash buckets must first be determined. According to the following example, each hash latch protects 125 hash chains. If you allow up to 10 blocks per hash chain, you should be concerned only when you see values close to or more than 1250 from the preceding query. The hash chain length can be reduced by increasing the number of hash buckets through the _DB_BLOCK_HASH_BUCKETS parameter. You will find that this is rarely necessary in the newer releases of Oracle.

 _DB_BLOCK_HASH_BUCKETS = 128021 
_DB_BLOCK_HASH_LATCHES = 1024
Ratio = 128021 / 1024 = 125

Cache Buffers Lru Chain Latches

In addition to the hash chains, buffer headers are also linked to other lists such as the LRU, LRUW, and CKPT-Q. The LRU and LRUW lists are nothing new; they are the two original lists in the buffer cache. The LRU list contains buffers in various states, while the LRUW list contains only dirty buffers. The LRU and LRUW lists are mutually exclusive, and together they are called a working set. Each working set is protected by a cache buffers lru chain latch. In other words, the number of working sets in the buffer cache is determined by the number of cache buffers lru chain latches. You can see the working sets by querying the X$KCBWDS (kernel cache buffer working sets descriptors) view as follows. Notice the child latch ADDR is the SET_LATCH address.

LRU + LRUW = A Working Set

 select set_id, set_latch 
from x$kcbwds
order by set_id;

SET_ID SET_LATC
---------- --------
1 247E299C
2 247E2E68
3 247E3334
4 247E3800
5 247E3CCC
6 247E4198
7 247E4664
8 247E4B30

select addr
from v$latch_children
where name = 'cache buffers lru chain'
order by addr;

ADDR
--------
247E299C
247E2E68
247E3334
247E3800
247E3CCC
247E4198
247E4664
247E4B30

Typically, foreground processes access the LRU lists when looking for free buffers. The DBWR background process accesses the LRU lists to move buffers that have been cleaned off the LRUW list and to move dirty buffers onto the LRUW list. All processes must acquire the cache buffers lru chain latch before performing any kind of operation on a working set.

Buffers in the DB_2K_CACHE_SIZE, DB_4K_CACHE_SIZE, DB_8K_CACHE_SIZE, DB_16K_CACHE_SIZE, DB_32K_CACHE_SIZE, DB_CACHE_SIZE, DB_KEEP_CACHE_SIZE, and DB_RECYCLE_CACHE_SIZE pools are divided among the cache buffers lru chain latches. You should have at least one cache buffers lru chain latch per buffer pool or DBWR process, whichever is higher; otherwise, the working set can be too long. In Oracle9i Database and Oracle Database 10g, the default number of cache buffers lru chain latches is 4 times the CPU_COUNT, unless the DB_WRITER_PROCESSES is greater than 4, in which case the default is the product of DB_WRITER_PROCESSES and CPU_COUNT. The number of cache buffers lru chain latches can be adjusted upward by the _DB_BLOCK_LRU_LATCHES parameter in CPU_COUNT increments.
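To check how many working sets (and therefore cache buffers lru chain latches) your instance actually has, and to compare that with the CPU count, something like the following sketch works:

select count(*) lru_latches
from   v$latch_children
where  name = 'cache buffers lru chain';

select value cpu_count
from   v$parameter
where  name = 'cpu_count';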

Competition for the cache buffers lru chain latch is symptomatic of intense buffer cache activity caused by inefficient SQL statements. Statements that repeatedly scan large unselective indexes or perform full table scans are the prime culprits. Capture the active SQL statement that is associated with the latch free wait event of the cache buffers lru chain latch. (Oracle Database 10g has a separate wait event for this latch called latch: cache buffers lru chain.) Tune the statement to reduce the number of logical and physical reads.
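In releases prior to Oracle Database 10g, one way to catch the offending statements is to look for sessions currently waiting on the latch free event whose P2 is the cache buffers lru chain latch number, and then pull their SQL by hash value. This is only a sketch:

select s.sid, s.sql_hash_value, w.p1raw latch_addr
from   v$session s, v$session_wait w
where  s.sid = w.sid
and    w.event = 'latch free'
and    w.p2 = (select latch# from v$latchname
               where name = 'cache buffers lru chain');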

Row Cache Objects Latches

The row cache objects latch protects the Oracle data dictionary (or row cache, because information is stored as rows instead of blocks). Processes must obtain this latch when loading, referencing, or freeing entries in the dictionary. This latch is a solitary latch up to Oracle8i Database. With the new multiple subpools architecture in Oracle9i Database, multiple copies of child row cache objects latches exist. Oracle Database 10g has a separate wait event for this latch known as latch: row cache objects.

Starting in Oracle7.0, the data dictionary became part of the shared pool; before that, every dictionary object had a separate DC_* initialization parameter. The change in Oracle7.0 means the data dictionary can no longer be tuned directly; it can only be tuned indirectly through the SHARED_POOL_SIZE parameter. The V$ROWCACHE view has statistics for every data dictionary object. You can discover the hottest object using the following query:

 select cache#, type, parameter, gets, getmisses, modifications mod 
from v$rowcache
where gets > 0
order by gets;

CACHE# TYPE PARAMETER GETS GETMISSES MOD
------ ----------- ------------------ ---------- ---------- ------
7 SUBORDINATE dc_user_grants 1615488 75 0
2 PARENT dc_sequences 2119254 189754 100
15 PARENT dc_database_links 2268663 2 0
10 PARENT dc_usernames 7702353 46 0
8 PARENT dc_objects 11280602 12719 400
7 PARENT dc_users 81128420 78 0
16 PARENT dc_histogram_defs 182648396 51537 0
11 PARENT dc_object_ids 250841842 3939 75

Row cache tuning is very limited. The best solution is to reduce dictionary access based on the output from the V$ROWCACHE. For example, if sequences are the problem, consider caching the sequences. Views that contain multiple table joins and view-on-views will increase this latch contention. The common workaround is simply to increase the SHARED_POOL_SIZE.
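For example, if DC_SEQUENCES shows heavy activity, caching the busiest sequences reduces the dictionary gets. The sequence name and cache size below are placeholders:

alter sequence app_owner.order_seq cache 1000;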



