In the previous chapter we took a brief look at how to set up STATSPACK, how to execute snapshots at different points in time, and how to run STATSPACK reports. As stated in the previous chapter, STATSPACK is useful for analysis and detection of bottlenecks. In my opinion Oracle Enterprise Manager is better at providing focused, easy-to-use drill-down into the Oracle Database Wait Event Interface. The Oracle Enterprise Manager Capacity Planner tool can gather and compare historical sets of statistics much as STATSPACK can.
Tip | STATSPACK is currently the official Oracle Corporation performance monitoring and tuning tool. |
The easiest method of briefly explaining STATSPACK and its use is to display an example output report and describe the different sections step by step as textual inserts into the report itself. The example used in this section is a range of 10 level 6 snapshots with the database in a consistent state across all snapshots.
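As a reminder of the mechanics covered in the previous chapter, a report like the one shown here is produced from SQL*Plus, connected as the PERFSTAT user. This is a minimal sketch; the snapshot level and report script are standard, but your snapshot IDs will of course differ:

```sql
-- Take a level 6 snapshot (level 6 adds SQL execution plan information)
EXECUTE statspack.snap(i_snap_level => 6);

-- ... let the workload run, taking further snapshots at intervals ...

-- Generate the report; spreport.sql prompts for begin and end snapshot IDs
@?/rdbms/admin/spreport.sql
```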
Tip | With the TIMED_STATISTICS parameter set to TRUE, events are listed in order of wait time as opposed to number of waits. Wait times are much more significant than numbers of waits. |
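TIMED_STATISTICS is a dynamic parameter and can be switched on without bouncing the database; its overhead is negligible on modern hardware:

```sql
-- Enable timed statistics instance-wide (dynamic parameter)
ALTER SYSTEM SET timed_statistics = TRUE;
```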
The first section shows basic configuration and database instance information and is really a header for the STATSPACK report.
STATSPACK report for

DB Name      DB Id       Instance  Inst Num  Release    Cluster  Host
-------      ----------  --------  --------  ---------  -------  ----------
TEST         1798396642  test      1         9.2.0.1.0  NO       2000SERVER

              Snap Id  Snap Time           Sessions  Curs/Sess  Comment
              -------  ------------------  --------  ---------  -------
Begin Snap:       191  08-Aug-03 16:15:40        21        7.5
  End Snap:       200  08-Aug-03 16:31:04        21        7.1
   Elapsed:              15.40 (mins)

Cache Sizes (end)
-----------------
    Buffer Cache:      32M    Std Block Size:    8K
Shared Pool Size:      64M        Log Buffer:  512K
The load profile section gives an overall health check and load information for the database server. The three most important figures are hard parses, which should be as low as possible, executions per second, and transactions per second. The other values in this example report are all typical of an OLTP database, which the Accounts schema attempts to simulate.
Load Profile
------------                      Per Second    Per Transaction
                                  ----------    ---------------
             Redo size:            69,010.88           8,372.64
         Logical reads:             1,945.87             236.08
         Block changes:               489.67              59.41
        Physical reads:               126.89              15.39
       Physical writes:                20.40               2.47
            User calls:                 8.54               1.04
                Parses:                60.15               7.30
           Hard parses:                 0.02               0.00
                 Sorts:                27.41               3.33
                Logons:                 2.76               0.34
              Executes:               120.65              14.64
          Transactions:                 8.24

  % Blocks changed per Read:   25.16    Recursive Call %:    98.68
 Rollback per transaction %:    0.33       Rows per Sort:    14.94
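The load profile rates are derived by differencing V$SYSSTAT values between the two snapshots and dividing by the elapsed time. The cumulative counters behind them can be inspected directly; a hedged sketch (values here are cumulative since instance startup, not per second):

```sql
-- Raw cumulative statistics underlying the load profile
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('redo size', 'session logical reads', 'physical reads',
                'execute count', 'user commits', 'parse count (hard)');
```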
The instance efficiency section simply shows various ratios. Remember that ratios can be symptoms or indications of possible problems elsewhere; do not attempt to tune the ratios themselves. The only possible problems in this case are significant re-parsing and perhaps slightly low memory usage.
Instance Efficiency Percentages (Target 100%)
---------------------------------------------
            Buffer Nowait %:   99.69       Redo NoWait %:  100.00
            Buffer  Hit   %:   93.65    In-memory Sort %:   99.85
            Library Hit   %:   99.99        Soft Parse %:   99.97
         Execute to Parse %:   50.15         Latch Hit %:  100.00
Parse CPU to Parse Elapsd %:   66.64     % Non-Parse CPU:   97.37

 Shared Pool Statistics        Begin   End
                               -----   -----
             Memory Usage %:   32.76   33.78
    % SQL with executions>1:   73.04   88.26
  % Memory for SQL w/exec>1:   54.99   89.13
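These ratios can also be computed directly from V$SYSSTAT between any two points in time. A minimal sketch for the soft parse percentage, using cumulative values since instance startup (STATSPACK itself uses the difference between the two snapshots):

```sql
-- Soft Parse % = (1 - hard parses / total parses) * 100
SELECT ROUND((1 - hard.value / total.value) * 100, 2) AS soft_parse_pct
FROM   v$sysstat total, v$sysstat hard
WHERE  total.name = 'parse count (total)'
AND    hard.name  = 'parse count (hard)';
```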
The top five timed wait events are the wait events that can potentially affect performance most significantly. In this example report sequential reads are much higher than scattered reads. This is normal for a small, transactional OLTP-type database. Buffer busy wait events are probably occurring due to conflict between many small and a few large transactions, or due to lack of concurrency. The Accounts schema database has a small amount of reporting functionality executing on it as well as a heavy load of small transactions.
Top 5 Timed Events
------------------
                                                              % Total
Event                              Waits      Time (s)       Ela Time
------------------------------  ----------  ----------  -------------
db file sequential read             79,028       7,261          80.74
db file scattered read               3,325         424           4.71
buffer busy waits                    5,451         366           4.06
CPU time                                           286           3.18
direct path read                     1,513         205           2.28
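The same figures, cumulative since startup, come from V$SYSTEM_EVENT. A hedged sketch (in Oracle 9i the TIME_WAITED and AVERAGE_WAIT columns are in centiseconds; the LIKE filter is a crude way of excluding idle SQL*Net events):

```sql
-- System-wide wait event totals, busiest first
SELECT event, total_waits, time_waited, average_wait
FROM   v$system_event
WHERE  event NOT LIKE 'SQL*Net%'
ORDER  BY time_waited DESC;
```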
Now we get a list of all wait events in order of descending wait time. Once again the most significant wait time by far is for sequential reads. Sequential reads indicate single-block reads. High single-block read wait times indicate concurrency issues at the block level, perhaps hot blocks; in the Accounts schema this is very likely.
Wait Events for DB: TEST  Instance: test  Snaps: 191-200
-> s  - second
-> cs - centisecond -     100th of a second
-> ms - millisecond -    1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)

                                                               Avg
                                                 Total Wait   wait  Waits
Event                          Waits  Timeouts     Time (s)   (ms)   /txn
---------------------------- ------- --------- ------------ ------ ------
db file sequential read       79,028         0        7,261     92   10.4
db file scattered read         3,325         0          424    127    0.4
buffer busy waits              5,451         0          366     67    0.7
direct path read               1,513         0          205    135    0.2
db file parallel write           820       408          105    128    0.1
enqueue                          447         2          103    230    0.1
control file sequential read     610         0           66    108    0.1
log file sync                  2,775         0           63     23    0.4
log file parallel write        7,017     4,957           57      8    0.9
control file parallel write      298         0           29     96    0.0
latch free                       321        59           17     54    0.0
log file switch completion         9         1            8    873    0.0
direct path write                 86         0            4     51    0.0
LGWR wait for redo copy           92        12            1      6    0.0
buffer deadlock                    6         6            0     21    0.0
log file sequential read           2         0            0     11    0.0
log file single write              2         0            0      9    0.0
SQL*Net break/reset to clien       4         0            0      0    0.0
SQL*Net more data to client        3         0            0      0    0.0
wakeup time manager               29        29          858  29601    0.0
SQL*Net message from client      154         0          242   1569    0.0
SQL*Net more data from clien      22         0            0      0    0.0
SQL*Net message to client        154         0            0      0    0.0
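To find the actual hot blocks behind the sequential read and buffer busy waits, the waits can be drilled into live via V$SESSION_WAIT, where P1 is the file number and P2 the block number for these events. A minimal sketch:

```sql
-- Sessions currently waiting on single-block reads or busy buffers;
-- P1/P2 identify the file and block being contended for
SELECT sid, event, p1 AS file#, p2 AS block#, state
FROM   v$session_wait
WHERE  event IN ('db file sequential read', 'buffer busy waits');
```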
The list of all wait events is followed by background processing wait events.
Background Wait Events for DB: TEST  Instance: test  Snaps: 191-200
-> ordered by wait time desc, waits desc (idle events last)

                                                               Avg
                                                 Total Wait   wait  Waits
Event                          Waits  Timeouts     Time (s)   (ms)   /txn
---------------------------- ------- --------- ------------ ------ ------
db file parallel write           822       409          105    128    0.1
log file parallel write        7,021     4,959           57      8    0.9
control file parallel write      298         0           29     96    0.0
control file sequential read     136         0           18    129    0.0
db file scattered read            71         0           12    166    0.0
latch free                        24         0            4    162    0.0
db file sequential read           28         0            4    133    0.0
LGWR wait for redo copy           92        12            1      6    0.0
direct path write                 10         0            0     11    0.0
direct path read                  10         0            0      8    0.0
log file sequential read           2         0            0     11    0.0
log file single write              2         0            0      9    0.0
buffer busy waits                  1         0            0      9    0.0
rdbms ipc reply                    4         0            0      2    0.0
rdbms ipc message             13,706     8,089        5,372    392    1.8
smon timer                         3         2          782 ######    0.0
The next sections contain parsed SQL code ordered in various ways. The first section orders SQL statements in descending database buffer gets order, where buffer gets exceed 10,000. It is my personal opinion that sequencing SQL code statements based on database buffer cache hits is not particularly useful to tuning in general, since it advocates "fiddling" with the size of the database buffer cache and shared pool.
Tip | Hash values can be used to access details of specific SQL code statements. |
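The hash value printed for each statement maps straight onto the shared pool views. For example, the full text of the second statement in the report below can be retrieved from V$SQLTEXT:

```sql
-- Retrieve the full text of a statement by its report hash value
SELECT sql_text
FROM   v$sqltext
WHERE  hash_value = 2026502321
ORDER  BY piece;
```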
SQL ordered by Gets for DB: TEST  Instance: test  Snaps: 191-200
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used
   by all SQL statements called within the PL/SQL code. As individual
   SQL statements are also reported, it is possible and valid for the
   summed total % to exceed 100
Tip | The number of executions is nearly always significant regardless of how much of a type of activity an SQL code statement is causing. A poorly tuned SQL code statement, even a simple statement, executed much more often than any other SQL code is always a problem. |
                                   Gets              CPU      Elapsd
  Buffer Gets   Executions     per Exec  %Total  Time (s)   Time (s)  Hash Value
-------------- ------------ ------------ ------ --------- ---------- ----------
     1,646,666        2,548        646.3   91.6    254.07    8266.96 1852168585
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate;
broken BOOLEAN := FALSE; BEGIN SIMULATE; :mydate := next_date;
IF broken THEN :b := 1; ELSE :b := 0; END IF; END;

       235,786          468        503.8   13.1    132.43     871.74 2026502321
SELECT min(order_id),max(order_id) from orders

       136,975          251        545.7    7.6      6.43    1642.38 3111014165
SELECT tl.* from transactions t join transactionsline tl
on(tl.transaction_id = t.transaction_id) where t.supplier_id = :b1

       132,626        8,416         15.8    7.4      4.81      70.13 2781724698
INSERT into transactionsline values(:b4,:b3,:b2,:b1)

       132,557        8,415         15.8    7.4      5.06      62.82 1787838546
Tip | All but the first five SQL code statements have been deleted from this section of this STATSPACK report in order to avoid large amounts of superfluous text, an equally large quantity of paper and a certain propensity to confusion on the part of the reader. |
Ordering SQL code statements by descending physical reads, actual I/O activity, is much more useful than using descending buffer gets order. I/O activity can be a performance hog but it can also be necessary depending on application and database type. For instance, full physical scans are preferable in data warehouses.
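Between report runs the same ordering can be obtained directly from V$SQLAREA. A hedged sketch (the 1,000-read cutoff is an arbitrary threshold for illustration):

```sql
-- Statements by physical disk reads, worst first
SELECT SUBSTR(sql_text, 1, 60) AS sql_text,
       executions, disk_reads, buffer_gets
FROM   v$sqlarea
WHERE  disk_reads > 1000
ORDER  BY disk_reads DESC;
```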
SQL ordered by Reads for DB: TEST  Instance: test  Snaps: 191-200
-> End Disk Reads Threshold: 1000
Tip | The number of executions is nearly always significant regardless of how much of a type of activity an SQL code statement is causing. A poorly tuned SQL code statement, even a simple statement, executed much more often than any other SQL code is always a problem. |
                                 Reads              CPU      Elapsd
Physical Reads  Executions    per Exec  %Total  Time (s)   Time (s)  Hash Value
-------------- ------------ ----------- ------ --------- ---------- ----------
       111,203        2,548        43.6   94.8    254.07    8266.96 1852168585
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate;
broken BOOLEAN := FALSE; BEGIN SIMULATE; :mydate := next_date;
IF broken THEN :b := 1; ELSE :b := 0; END IF; END;

        32,461          468        69.4   27.7    132.43     871.74 2026502321
SELECT min(order_id),max(order_id) from orders

        18,974          357        53.1   16.2      5.43    1622.03 3751280151
SELECT ol.* from orders o join ordersline ol
on(ol.order_id = o.order_id) where o.supplier_id = :b1

        18,891          251        75.3   16.1      6.43    1642.38 3111014165
SELECT tl.* from transactions t join transactionsline tl
on(tl.transaction_id = t.transaction_id) where t.supplier_id = :b1

         7,602          250        30.4    6.5      4.65     647.74   92829708
SELECT tl.* from transactions t join transactionsline tl
on(tl.transaction_id = t.transaction_id) where t.customer_id = :b1
Tip | All but the first five SQL code statements have been deleted from this section of this STATSPACK report in order to avoid large amounts of superfluous text, an equally large quantity of paper and a certain propensity to confusion on the part of the reader. |
In the next STATSPACK report snippet SQL code statements have been ordered in descending order of execution. Ordering by number of executions is generally the most useful sorting option. The most useful method for detecting problems for highly active OLTP databases is usually a combination of executions and physical reads.
SQL ordered by Executions for DB: TEST  Instance: test  Snaps: 191-200
-> End Executions Threshold: 100

                     Rows         Rows   CPU per  Elap per
 Executions     Processed     per Exec  Exec (s)  Exec (s)  Hash Value
------------ ------------- ------------ -------- --------- ----------
     13,313         13,313          1.0     0.00      0.00  2489014428
SELECT to_number(to_char(SYSTIMESTAMP,'FF3'))+1 from dual

      8,416          8,416          1.0     0.00      0.01  2781724698
INSERT into transactionsline values(:b4,:b3,:b2,:b1)

      8,416          8,416          1.0     0.00      0.00  2822097328
INSERT into cashbookline values(:b3,:b2,:b1)

      8,415          8,415          1.0     0.00      0.01  1787838546
INSERT into ordersline values(:b4,:b3,:b2,:b1)

      8,413          8,413          1.0     0.00      0.01    52940216
INSERT into stockmovement values(stockmovement_seq.nextval,:b4,
(:b3*-1),:b2,:b1) returning stockmovement_id INTO :b0
Tip | All but the first five SQL code statements have been deleted from this section of this STATSPACK report in order to avoid large amounts of superfluous text, an equally large quantity of paper and a certain propensity to confusion on the part of the reader. |
The number of parse calls for an SQL code statement is relevant because it denotes how often SQL code statements are re-parsed. Re-parsing is inefficient because it costs significant CPU processing time.
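Statements whose parse call count approaches their execution count are being re-parsed on nearly every execution, and are candidates for bind variables or cursor caching. A hedged sketch against V$SQLAREA (the 1,000-call cutoff is arbitrary, for illustration):

```sql
-- Statements with heavy re-parsing, worst first
SELECT parse_calls, executions, SUBSTR(sql_text, 1, 50) AS sql_text
FROM   v$sqlarea
WHERE  parse_calls > 1000
ORDER  BY parse_calls DESC;
```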
SQL ordered by Parse Calls for DB: TEST  Instance: test  Snaps: 191-200
-> End Parse Calls Threshold: 1000

                            % Total
 Parse Calls  Executions     Parses  Hash Value
------------ ------------ --------- ----------
       3,612        3,612      6.50 3615375148
COMMIT

       2,571        2,558      4.63  493392946
update sys.job$ set this_date=:1 where job=:2

       2,570        2,570      4.62  990457800
select u1.user#, u2.user#, u3.user#, failures, flag, interval#, what,
nlsenv, env, field1 from sys.job$ j, sys.user$ u1, sys.user$ u2,
sys.user$ u3 where job=:1 and (next_date < sysdate or :2 != 0) and
lowner = u1.name and powner = u2.name and cowner = u3.name

       2,570        2,557      4.62 3128358781
update sys.job$ set failures=0, this_date=null, flag=:1, last_date=:2,
next_date = greatest(:3, sysdate),
total=total+(sysdate-nvl(this_date,sysdate)) where job=:4

       2,558        2,558      4.60 3013728279
select privilege#,level from sysauth$ connect by grantee# = prior
privilege# and privilege#>0 start with (grantee# = :1 or grantee# = 1)
and privilege#>0
Tip | All but the first five SQL code statements have been deleted from this section of this STATSPACK report in order to avoid large amounts of superfluous text, an equally large quantity of paper and a certain propensity to confusion on the part of the reader. |
The next section lists general instance activity statistics, all contained in a single section of the STATSPACK report. There are a lot of different statistics to peruse, and a large amount of information can be gathered from them: CPU usage, database writer process (DBWR) activity, buffers, enqueue locking, indexing, tables, I/O, logging, session activity, rollback and undo, and transactional activity, amongst many other indicators.
Two statistics in the STATSPACK report snippet shown below, process last non-idle time and session connect time, are off the scale. The former refers to the last time a process was executed and the latter to session connection time. Both statistics are irrelevant given the way in which OLTP activity is executed against my Accounts schema.
Instance Activity Stats for DB: TEST  Instance: test  Snaps: 191-200

Statistic                                  Total   per Second   per Trans
--------------------------------- -------------- ------------ -----------
CPU used by this session                  28,578         30.9         3.8
CPU used when call started                28,578         30.9         3.8
CR blocks created                            644          0.7         0.1
DBWR buffers scanned                     108,190        117.1        14.2
DBWR checkpoint buffers written            1,040          1.1         0.1
DBWR checkpoints                               1          0.0         0.0
DBWR free buffers found                   94,700        102.5        12.4
DBWR lru scans                               429          0.5         0.1
DBWR make free requests                      429          0.5         0.1
DBWR revisited being-written buff              0          0.0         0.0
DBWR summed scan depth                   108,190        117.1        14.2
DBWR transaction table writes                 51          0.1         0.0
DBWR undo block writes                     3,800          4.1         0.5
SQL*Net roundtrips to/from client            151          0.2         0.0
active txn count during cleanout           6,271          6.8         0.8
background checkpoints completed               1          0.0         0.0
background checkpoints started                 1          0.0         0.0
background timeouts                        1,064          1.2         0.1
branch node splits                             3          0.0         0.0
buffer is not pinned count               902,275        976.5       118.5
buffer is pinned count                   130,645        141.4        17.2
bytes received via SQL*Net from c        138,570        150.0        18.2
bytes sent via SQL*Net to client          71,041         76.9         9.3
calls to get snapshot scn: kcmgss        194,194        210.2        25.5
calls to kcmgas                           11,457         12.4         1.5
calls to kcmgcs                              648          0.7         0.1
change write time                            955          1.0         0.1
cleanout - number of ktugct calls          2,753          3.0         0.4
cleanouts and rollbacks - consist            184          0.2         0.0
cleanouts only - consistent read             141          0.2         0.0
cluster key scan block gets                  241          0.3         0.0
cluster key scans                            214          0.2         0.0
commit cleanout failures: block l            113          0.1         0.0
commit cleanout failures: buffer               9          0.0         0.0
commit cleanout failures: callbac             57          0.1         0.0
commit cleanout failures: cannot               6          0.0         0.0
commit cleanouts                          46,281         50.1         6.1
commit cleanouts successfully com         46,096         49.9         6.1
commit txn count during cleanout           5,998          6.5         0.8
consistent changes                         2,968          3.2         0.4
consistent gets                        1,101,924      1,192.6       144.7
consistent gets - examination            108,479        117.4        14.2
current blocks converted for CR               26          0.0         0.0
cursor authentications                        82          0.1         0.0
data blocks consistent reads - un          2,363          2.6         0.3
db block changes                         452,457        489.7        59.4
db block gets                            696,070        753.3        91.4
deferred (CURRENT) block cleanout         10,492         11.4         1.4
dirty buffers inspected                    1,266          1.4         0.2
enqueue conversions                        4,213          4.6         0.6
enqueue deadlocks                              1          0.0         0.0
enqueue releases                          41,603         45.0         5.5
enqueue requests                          41,610         45.0         5.5
enqueue timeouts                              58          0.1         0.0
enqueue waits                                423          0.5         0.1
exchange deadlocks                             6          0.0         0.0
execute count                            111,483        120.7        14.6
free buffer inspected                      1,311          1.4         0.2
free buffer requested                    120,649        130.6        15.8
hot buffers moved to head of LRU          34,312         37.1         4.5
immediate (CR) block cleanout app            325          0.4         0.0
immediate (CURRENT) block cleanou          5,150          5.6         0.7
index fast full scans (full)                 463          0.5         0.1
index fetch by key                        25,431         27.5         3.3
index scans kdiixs1                      116,315        125.9        15.3
leaf node 90-10 splits                       251          0.3         0.0
leaf node splits                             547          0.6         0.1
logons cumulative                          2,552          2.8         0.3
messages received                          7,614          8.2         1.0
messages sent                              7,614          8.2         1.0
no buffer to keep pinned count                 0          0.0         0.0
no work - consistent read gets           858,628        929.3       112.7
opened cursors cumulative                 54,152         58.6         7.1
parse count (failures)                         1          0.0         0.0
parse count (hard)                            18          0.0         0.0
parse count (total)                       55,575         60.2         7.3
parse time cpu                               753          0.8         0.1
parse time elapsed                         1,130          1.2         0.2
physical reads                           117,242        126.9        15.4
physical reads direct                      3,116          3.4         0.4
physical writes                           18,848         20.4         2.5
physical writes direct                     3,125          3.4         0.4
physical writes non checkpoint            18,137         19.6         2.4
pinned buffers inspected                      44          0.1         0.0
prefetched blocks                         31,706         34.3         4.2
process last non-idle time     2,692,317,558,691 ############ ###########
recovery blocks read                           0          0.0         0.0
recursive calls                          590,362        638.9        77.5
recursive cpu usage                       28,269         30.6         3.7
redo blocks written                      132,490        143.4        17.4
redo buffer allocation retries                 7          0.0         0.0
redo entries                             229,257        248.1        30.1
redo log space requests                       10          0.0         0.0
redo log space wait time                     786          0.9         0.1
redo size                             63,766,056     69,010.9     8,372.6
redo synch time                            6,466          7.0         0.9
redo synch writes                          2,663          2.9         0.4
redo wastage                           1,924,580      2,082.9       252.7
redo write time                            8,476          9.2         1.1
redo writer latching time                     52          0.1         0.0
redo writes                                7,004          7.6         0.9
rollback changes - undo records a            518          0.6         0.1
rollbacks only - consistent read             529          0.6         0.1
rows fetched via callback                 17,855         19.3         2.3
session connect time           2,692,317,558,691 ############ ###########
session logical reads                  1,797,981      1,945.9       236.1
session pga memory max                   100,372        108.6        13.2
session uga memory                    29,372,040     31,787.9     3,856.6
session uga memory max               268,212,120    290,272.9    35,216.9
shared hash latch upgrades - no w         76,480         82.8        10.0
shared hash latch upgrades - wait              1          0.0         0.0
sorts (disk)                                  38          0.0         0.0
sorts (memory)                            25,287         27.4         3.3
sorts (rows)                             378,384        409.5        49.7
summed dirty queue length                 15,041         16.3         2.0
switch current to new buffer                 997          1.1         0.1
table fetch by rowid                     321,386        347.8        42.2
table fetch continued row                      0          0.0         0.0
table scan blocks gotten                  53,011         57.4         7.0
table scan rows gotten                 9,862,509     10,673.7     1,295.0
table scans (long tables)                      2          0.0         0.0
table scans (short tables)                24,593         26.6         3.2
transaction rollbacks                        123          0.1         0.0
transaction tables consistent rea             13          0.0         0.0
transaction tables consistent rea            604          0.7         0.1
user calls                                 7,890          8.5         1.0
user commits                               7,591          8.2         1.0
user rollbacks                                25          0.0         0.0
write clones created in backgroun              1          0.0         0.0
write clones created in foregroun             41          0.0         0.0
The next two sections show tablespace and datafile physical read and write I/O statistics. In general, user application tables are contained within the DATA tablespace and related indexes in the INDX tablespace. Table reads should probably be higher than index reads, since tables are read completely (full table scans) or as a result of index ROWID pointer searches and scans. Index writes are higher than table writes, probably indicating many indexes for each table and high insertion activity. In fact, high row insertion activity is probably causing contention and thus the significant buffer wait events already seen. Other values in the following section indicate that temporary sort space on disk, using the TEMP tablespace, is possibly significant; perhaps the sort area buffers could be increased in size, namely the SORT_AREA_SIZE and SORT_AREA_RETAINED_SIZE parameters. Rollback (automated undo) reads are likely caused by consistent transaction requirements. Write values for rollback space are much higher than read values, but the read values probably remain significant. In the Accounts schema, however, this type of rollback consistent-read requirement is normal for the application and database type.
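The per-file figures below come from V$FILESTAT; a hedged sketch joining it to DBA_DATA_FILES (in 9i the READTIM and WRITETIM columns are in centiseconds, hence the *10 to approximate milliseconds):

```sql
-- Physical I/O per datafile, busiest files first
SELECT df.tablespace_name, df.file_name,
       fs.phyrds, fs.phywrts,
       ROUND(fs.readtim / GREATEST(fs.phyrds, 1) * 10) AS avg_read_ms
FROM   v$filestat fs, dba_data_files df
WHERE  fs.file# = df.file_id
ORDER  BY fs.phyrds + fs.phywrts DESC;
```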
Tablespace IO Stats for DB: TEST  Instance: test  Snaps: 191-200
-> ordered by IOs (Reads + Writes) desc

                       Av      Av      Av              Av     Buffer  Av Buf
Tablespace    Reads  Reads/s  Rd(ms)  Blks/Rd  Writes  Writes/s Waits  Wt(ms)
---------- -------- ------- ------- -------- ------- -------- ------ ------
DATA         42,865      46   101.1      1.0   3,280        4  1,287   66.2
INDX         36,831      40    82.4      1.8   7,106        8  4,145   68.3
TEMP          2,849       3   133.9      1.1   3,115        3      0    0.0
UNDOTBS1        470       1   106.3      1.0   3,870        4      3    3.3
PERFSTAT      1,721       2   130.7      1.0   1,367        1      0    0.0
SYSTEM          453       0   131.1      2.6      95        0     75   43.2
OBJECTS           1       0   170.0      1.0       1        0      0    0.0
TOOLS             1       0   260.0      1.0       1        0      0    0.0
USERS             1       0   260.0      1.0       1        0      0    0.0
-----------------------------------------------------------
With respect to datafiles I/O activity appears to be spread well between the two datafiles for the DATA tablespace but not as well for the two datafiles in the INDX tablespace.
File IO Stats for DB: TEST  Instance: test  Snaps: 191-200
-> ordered by Tablespace, File

Tablespace  Filename
----------  -------------------------------------
                       Av      Av      Av              Av     Buffer  Av Buf
              Reads  Reads/s  Rd(ms)  Blks/Rd  Writes  Writes/s Waits  Wt(ms)
           -------- ------- ------- -------- ------- -------- ------ ------
DATA        E:\ORACLE\ORADATA\TEST\DATA01.DBF
             21,460      23   103.4      1.0   1,606        2    549   65.1
            E:\ORACLE\ORADATA\TEST\DATA02.DBF
             21,405      23    98.8      1.1   1,674        2    738   67.0
INDX        E:\ORACLE\ORADATA\TEST\INDX01.DBF
             20,039      22    86.2      1.7   4,046        4  2,029   65.7
            E:\ORACLE\ORADATA\TEST\INDX02.DBF
             16,792      18    77.9      1.9   3,060        3  2,116   70.8
OBJECTS     E:\ORACLE\ORADATA\TEST\OBJECTS01.DBF
                  1       0   170.0      1.0       1        0      0
PERFSTAT    E:\ORACLE\ORADATA\TEST\PERFSTAT01.DBF
              1,721       2   130.7      1.0   1,367        1      0
SYSTEM      E:\ORACLE\ORADATA\TEST\SYSTEM01.DBF
                453       0   131.1      2.6      95        0     75   43.2
TEMP        E:\ORACLE\ORADATA\TEST\TEMP01.DBF
              2,849       3   133.9      1.1   3,115        3      0
TOOLS       E:\ORACLE\ORADATA\TEST\TOOLS01.DBF
                  1       0   260.0      1.0       1        0      0
UNDOTBS1    E:\ORACLE\ORADATA\TEST\UNDOTBS01.DBF
                470       1   106.3      1.0   3,870        4      3    3.3
USERS       E:\ORACLE\ORADATA\TEST\USERS01.DBF
                  1       0   260.0      1.0       1        0      0
----------------------------------------------------------
Next we get database buffer cache statistics. Buffer busy waits are significant.
Buffer Pool Statistics for DB: TEST  Instance: test  Snaps: 191-200
-> Standard block size Pools  D: default,  K: keep,  R: recycle
-> Default Pools for other block sizes: 2k, 4k, 8k, 16k, 32k

                                                   Free   Write    Buffer
  Number of  Cache      Buffer  Physical Physical Buffer Complete   Busy
P   Buffers  Hit %        Gets     Reads   Writes  Waits   Waits    Waits
- ---------- ----- ----------- --------- -------- ------ -------- ------
D      4,000  95.7   2,667,765   113,687   15,693      0        0  5,473
------------------------------------------------------------
These are recovery requirement estimates.
Instance Recovery Stats for DB: TEST  Instance: test  Snaps: 191-200
-> B: Begin snapshot,  E: End snapshot

  Targt Estd                                    Log File  Log Ckpt  Log Ckpt
  MTTR  MTTR  Recovery    Actual     Target       Size     Timeout  Interval
   (s)   (s)  Estd Ios  Redo Blks  Redo Blks  Redo Blks  Redo Blks Redo Blks
- ----- ----- -------- ---------- ---------- ---------- ---------- ---------
B    35    27      496     25,653    184,320    184,320    249,736
E    35    27      614     25,742    184,320    184,320    221,989
---------------------------------------------------------
This is the database buffer cache advisory. Buffer advisories are covered in a previous chapter. I do not find advisories particularly useful; they tend to encourage frequent upward resizing of the database buffer cache and shared pool. Resizing buffer pools upwards does not solve problems, but perhaps sometimes alleviates symptoms of problems, and can even "help" to obscure significant performance problems.
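The same figures are available live in V$DB_CACHE_ADVICE, provided the DB_CACHE_ADVICE parameter is ON. A minimal sketch:

```sql
-- Estimated physical reads at candidate default-pool cache sizes
SELECT size_for_estimate, buffers_for_estimate,
       estd_physical_read_factor, estd_physical_reads
FROM   v$db_cache_advice
WHERE  name = 'DEFAULT'
AND    advice_status = 'ON'
ORDER  BY size_for_estimate;
```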
Buffer Pool Advisory for DB: TEST  Instance: test  End Snap: 200
-> Only rows with estimated physical reads > 0 are displayed
-> ordered by Block Size, Buffers For Estimate

      Size for  Size      Buffers for  Est Physical       Estimated
P  Estimate (M) Factr        Estimate   Read Factor  Physical Reads
- ------------- ----- --------------- ------------- ---------------
D             4    .1             500          4.83       8,118,743
D             8    .3           1,000          3.10       5,218,381
D            12    .4           1,500          1.99       3,340,086
D            16    .5           2,000          1.55       2,609,060
D            20    .6           2,500          1.35       2,275,264
D            24    .8           3,000          1.21       2,038,548
D            28    .9           3,500          1.10       1,851,173
D            32   1.0           4,000          1.00       1,681,112
D            36   1.1           4,500          0.91       1,533,069
D            40   1.3           5,000          0.83       1,401,561
D            44   1.4           5,500          0.76       1,281,701
D            48   1.5           6,000          0.70       1,171,290
D            52   1.6           6,500          0.64       1,070,867
D            56   1.8           7,000          0.58         979,671
D            60   1.9           7,500          0.53         892,039
D            64   2.0           8,000          0.48         808,529
D            68   2.1           8,500          0.44         737,110
D            72   2.3           9,000          0.40         673,978
D            76   2.4           9,500          0.37         618,692
D            80   2.5          10,000          0.34         569,011
Data block waits indicate hot block issues, common in highly active small transactional OLTP databases. Possible solutions are already well covered in this book.
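The per-class breakdown below comes from V$WAITSTAT, which can also be queried live between snapshots:

```sql
-- Buffer busy waits broken down by block class
SELECT class, count, time
FROM   v$waitstat
WHERE  count > 0
ORDER  BY time DESC;
```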
Buffer wait Statistics for DB: TEST  Instance: test  Snaps: 191-200
-> ordered by wait time desc, waits desc

                   Tot Wait      Avg
Class       Waits  Time (s)  Time (ms)
----------- ------ --------- ---------
data block   5,488       371        68
undo header      3         0         3
Enqueue activity is generally consistent with details covered already in this book. There is no need to reiterate.
Enqueue activity for DB: TEST  Instance: test  Snaps: 191-200
-> Enqueue stats gathered prior to 9i should not be compared with 9i data
-> ordered by Wait Time desc, Waits desc

                          Failed           Avg Wt      Wait
Eq  Requests   Succ Gets    Gets  Waits  Time (ms)  Time (s)
-- ---------- ---------- ------- ------ ---------- ---------
TX     10,107     10,105       0    362     295.01       107
SQ        992        992       0     60      39.73         2
HW         69         69       0      1        .00         0
Rollback segment statistics show some wrapping, with possibly significant consistent transaction read requirements, as already stated. In OLTP databases with high small-transaction activity this behavior is probably unavoidable. For automated undo the UNDO_RETENTION parameter could be altered; for manual rollback segments, create more, smaller rollback segments. Obviously, tuning rollback space does not solve the real problem but merely attempts to spread rollback more evenly and thus avoid rollback contention. The real problem is that the database is very busy. Possible solutions to this type of performance issue are better handling of concurrency at the block level by raising INITRANS values, using reverse key indexing, or perhaps even clustering. All these items and more have been covered in this book already.
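The underlying counters come from V$ROLLSTAT, which can be checked at any time for waits, wraps, and extends per segment:

```sql
-- Rollback/undo segment activity; non-zero WAITS suggests contention
SELECT usn, gets, waits, wraps, shrinks, extends
FROM   v$rollstat
ORDER  BY usn;
```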
Rollback Segment Stats for DB: TEST  Instance: test  Snaps: 191-200
-> A high value for "Pct Waits" suggests more rollback segments may be required
-> RBS stats may not be accurate between begin and end snaps when using
   Auto Undo managment, as RBS may be dynamically created and dropped as needed

        Trans Table   Pct    Undo Bytes
RBS No         Gets  Waits      Written  Wraps  Shrinks  Extends
------  -----------  -----  -----------  -----  -------  -------
     0         12.0   0.00            0      0        0        0
     1      2,275.0   0.00    2,240,820      4        0        0
     2      2,318.0   0.00    2,000,692      4        0        0
     3      2,080.0   0.00    2,109,416      4        0        0
     4      1,958.0   0.05    2,608,464      5        0        0
     5      2,182.0   0.00    1,984,204      2        0        0
     6      2,426.0   0.00    2,248,984      2        0        0
     7      1,912.0   0.00    2,053,060      4        0        0
     8      2,546.0   0.00    2,975,356      4        0        0
     9      2,345.0   0.00    2,350,076      3        0        0
    10      2,541.0   0.00    2,671,298      6        1        1
---------------------------------------------------

Rollback Segment Storage for DB: TEST  Instance: test  Snaps: 191-200
-> Optimal Size should be larger than Avg Active

RBS No  Segment Size  Avg Active  Optimal Size  Maximum Size
------  ------------  ----------  ------------  ------------
     0       385,024           0                     385,024
     1    35,840,000     860,252                  35,840,000
     2    34,725,888     866,432                  34,725,888
     3    35,774,464     850,308                  35,774,464
     4    33,677,312     884,223                  33,677,312
     5    36,823,040   1,035,943                  36,823,040
     6    43,114,496   4,062,790                  43,114,496
     7    35,774,464     849,535                  35,774,464
     8    35,774,464     959,362                  35,774,464
     9    35,774,464     987,376                  35,774,464
    10    32,759,808     697,641                  36,036,608
----------------------------------------------------

Undo Segment Summary for DB: TEST  Instance: test  Snaps: 191-200
-> Undo segment block stats:
-> uS - unexpired Stolen, uR - unexpired Released, uU - unexpired reUsed
-> eS - expired Stolen,   eR - expired Released,   eU - expired reUsed

      Undo    Num    Max Qry   Max Tx  Snapshot  Out of  uS/uR/uU/
TS#  Blocks   Trans  Len (s)  Concurcy  Too Old   Space  eS/eR/eU
---- ------ ------- -------- -------- --------- ------- -----------
   1  5,291 452,288      196        3         0       0 0/0/0/0/0/0

Undo Segment Stats for DB: TEST  Instance: test  Snaps: 191-200
-> ordered by Time desc

               Undo     Num  Max Qry  Max Tx   Snap   Out of  uS/uR/uU/
End Time     Blocks   Trans  Len (s)   Concy  Too Old  Space  eS/eR/eU
------------ ------ ------- -------- ------- -------- ------ -----------
08-Aug 16:36  2,378 157,059      196       2        0      0 0/0/0/0/0/0
08-Aug 16:26  2,334 150,680      169       2        0      0 0/0/0/0/0/0
08-Aug 16:16    579 144,549       28       3        0      0 0/0/0/0/0/0
-----------------------------------------------------------
Next is latch activity. There appear to be significant issues with the database buffer cache and with the library cache in the shared pool. These values are once again significant; the details have been covered previously in this book. The database buffer cache and shared pool are deliberately set to low values in my database containing the Accounts schema, in order to create tunable problems.
Latch Activity for DB: TEST  Instance: test  Snaps: 191-200
-> "Get Requests", "Pct Get Miss" and "Avg Slps/Miss" are statistics for
   willing-to-wait latch get requests
-> "NoWait Requests", "Pct NoWait Miss" are for no-wait latch get requests
-> "Pct Misses" for both should be very close to 0.0

                                  Pct   Avg  Wait               Pct
                            Get   Get   Slps Time    NoWait   NoWait
Latch                  Requests   Miss  Miss  (s)  Requests     Miss
------------------------- ---------- ----- ----- ---- --------- ------
Consistent RBA                  7,028   0.0          0         0
FIB s.o chain latch                 4   0.0          0         0
FOB s.o list latch                 57   0.0          0         0
SQL memory manager worka          603   0.0          0         0
active checkpoint queue           901   0.0          0         0
archive control                    12   0.0          0         0
cache buffer handles            1,556   0.0          0         0
cache buffers chains        4,761,717   0.0   1.0    0   154,473    0.0
cache buffers lru chain       153,530   0.0   1.0    0        84    0.0
channel handle pool latc            2   0.0          0         0
channel operations paren          587   0.0          0         0
checkpoint queue latch         66,578   0.0          0    15,155    0.0
child cursor hash table           293   0.0          0         0
dml lock allocation            43,873   0.0   1.0    0         0
dummy allocation                5,120   0.0          0         0
enqueue hash chains            88,018   0.0   6.0   15         0
enqueues                       23,702   0.0   1.0    0         0
event group latch                   1   0.0          0         0
file number translation         6,045   0.0          0         0
hash table column usage            10   0.0          0        30    0.0
job workq parent latch              0                0     3,122    0.0
job_queue_processes para          375   0.0          0         0
ktm global data                     5   0.0          0         0
kwqit: protect wakeup ti           29   0.0          0         0
lgwr LWN SCN                    7,029   0.0          0         0
library cache               1,022,754   0.0   1.1    1         0
library cache load lock            12   0.0          0         0
library cache pin             630,683   0.0   1.0    0         0
library cache pin alloca      323,891   0.0   1.1    0         0
list of block allocation        8,313   0.0          0         0
loader state object free          170   0.0          0         0
longop free list parent            19   0.0          0        19    0.0
messages                       35,912   0.0   1.0    0         0
mostly latch-free SCN           7,031   0.0          0         0
multiblock read objects        12,287   0.0          0         2    0.0
ncodef allocation latch            23   0.0          0         0
post/wait queue                14,181   0.0          0     2,790    0.0
process allocation                  1   0.0          0         1    0.0
process group creation              2   0.0          0         0
redo allocation               242,912   0.0   1.1    0         0
redo copy                          60   1.7   2.0    0   228,845    0.1
redo writing                   22,607   0.0          0         0
row cache enqueue latch        29,914   0.0          0         0
row cache objects              36,319   0.0          0         0
sequence cache                 57,781   0.0   1.0    0         0
session allocation             68,320   0.0          0         0
session idle bit               18,430   0.0          0         0
session switching                  23   0.0          0         0
session timer                     485   0.0          0         0
shared pool                   333,910   0.0   2.0    0         0
sim partition latch                 0                0        67    0.0
simulator hash latch          122,962   0.0          0         0
simulator lru latch            12,525   0.0          0       118    0.0
sort extent pool                  254   0.0          0         0
transaction allocation          6,507   0.0          0         0
transaction branch alloc           23   0.0          0         0
undo global data               36,910   0.0   1.0    0         0
user lock                       5,096   0.0          0         0
Latch sleeps occur when a latch acquisition attempt misses and the process must sleep before retrying, so missed latches are a direct indicator of contention. Once again the most significant contention is apparent in the database buffer cache and in the library cache within the shared pool.
Latch Sleep breakdown for DB: TEST Instance: test Snaps: 191–200 –> ordered by misses desc Get Spin & Latch Name Requests Misses Sleeps Sleeps 1->4 -------------------------- --------- ------ ------ ----------- library cache 1,022,754 115 121 0/109/6/0/0 cache buffers chains 4,761,717 58 58 0/0/0/0/0 cache buffers lru chain 153,530 30 31 0/29/1/0/0 redo allocation 242,912 17 18 0/16/1/0/0 enqueue hash chains 88,018 9 54 0/3/0/6/0 library cache pin allocati 323,891 9 10 0/8/1/0/0 library cache pin 630,683 8 8 0/8/0/0/0 dml lock allocation 43,873 7 7 0/7/0/0/0 sequence cache 57,781 4 4 0/4/0/0/0 enqueues 23,702 3 3 0/3/0/0/0 undo global data 36,910 2 2 0/2/0/0/0 messages 35,912 1 1 0/1/0/0/0 redo copy 60 1 2 0/0/1/0/0 shared pool 333,910 1 2 0/0/1/0/0
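The "Avg Slps/Miss" column in the earlier latch activity section is simply sleeps divided by misses, so it can be cross-checked against this breakdown. A minimal sketch, using the misses and sleeps values taken from the report above:

```python
def avg_sleeps_per_miss(misses, sleeps):
    """Average sleeps per missed latch acquisition; 0.0 when there were no misses."""
    return sleeps / misses if misses else 0.0

# (misses, sleeps) pairs from the Latch Sleep breakdown above.
latches = {
    "library cache":        (115, 121),
    "cache buffers chains": (58, 58),
    "enqueue hash chains":  (9, 54),
}
for name, (misses, sleeps) in latches.items():
    print(f"{name}: {avg_sleeps_per_miss(misses, sleeps):.1f}")
```

The 6.0 result for enqueue hash chains matches the "Avg Slps/Miss" column in the latch activity section, confirming that the two sections are consistent views of the same counters.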
The next section drills down to the kernel-level code sources of missed latches. From my perspective this level of detail is too deep to be generally useful, considering the other tuning tools available.
Latch Miss Sources for DB: TEST Instance: test Snaps: 191–200 –> only latches with sleeps are shown –> ordered by name, sleeps desc NoWait Waiter Latch Name Where Misses Sleeps Sleeps ----------------------- -------------------------- ------ ------ ------ cache buffers chains kcbgtcr: kslbegin excl 0 21 24 cache buffers chains kcbrls: kslbegin 0 13 9 cache buffers chains kcbgcur: kslbegin 0 7 11 cache buffers chains kcbchg: kslbegin: call CR 0 3 0 cache buffers chains kcbget: exchange rls 0 3 0 cache buffers chains kcbzwb 0 3 2 cache buffers chains kcbget: exchange 0 2 0 cache buffers chains kcbgtcr: fast path 0 2 3 cache buffers chains kcbget: pin buffer 0 2 8 cache buffers chains kcbchg: kslbegin: bufs not 0 1 0 cache buffers chains kcbnlc 0 1 0 cache buffers lru chain kcbzgb: wait 0 27 29 cache buffers lru chain kcbbiop: lru scan 0 2 0 cache buffers lru chain kcbbxsv: move to being wri 0 1 0 cache buffers lru chain kcbzgb: posted for free bu 0 1 1 dml lock allocation ktaiam 0 4 3 dml lock allocation ktaidm 0 3 4 enqueue hash chains ksqcmi: kslgpl 0 46 0 enqueue hash chains ksqrcl 0 5 10 enqueue hash chains ksqgtl3 0 2 35 enqueue hash chains ksqcmi: get hash chain lat 0 1 9 enqueues ksqgel: create enqueue 0 2 0 enqueues ksqdel 0 1 2 library cache kglupc: child 0 26 71 library cache kglpndl: child: before pro 0 21 3 library cache kglpndl: child: after proc 0 13 0 library cache kgllkdl: child: cleanup 0 10 2 library cache kglpin: child: heap process 0 8 1 library cache kglpnc: child 0 8 2 library cache kglhdgc: child: 0 7 0 library cache kglobpn: child: 0 7 1 library cache kglget: child: KGLDSBRD 0 6 13 library cache kglhdgn: child: 0 5 4 library cache kglic 0 5 5 library cache kglget: child: KGLDSBYD 0 3 0 library cache kgldte: child 0 0 1 19 library cache kglpin 0 1 0 library cache pin kglpnal: child: alloc spac 0 4 2 library cache pin kglpnc: child 0 4 0 library cache pin kglpndl 0 2 0 library cache pin kglupc 0 1 9 library cache pin alloca kgllkdl 0 6 5 library 
cache pin alloca kglpnal 0 4 5 messages ksaamb: after wakeup 0 1 0 redo allocation kcrfwr 0 16 11 redo allocation kcrfwi: before write 0 1 2 redo allocation kcrfwi: more space 0 1 5 redo copy kcrfwr: nowait 0 2 0 sequence cache kdnnxt: cached seq 0 3 0 sequence cache kdnss 0 1 4 shared pool kghalo 0 2 0 undo global data ktubnd:child 0 1 1 undo global data ktudba: KSLBEGIN 0 1 0 -------------------------------------------------------
This is another example of excessively deep analysis by STATSPACK, this time of the row cache in the shared pool, specifically the data dictionary cache activity drawn from the V$ROWCACHE performance view. The only interesting factor here is the occurrence of histograms. The ANALYZE command appears to generate histograms for all indexed columns by default when computing statistics on tables, at least on my database server running Oracle9i Release 2 (9.2) on NT/2000. The high value for dc_histogram_defs is the result.
Dictionary Cache Stats for DB: TEST Instance: test Snaps: 191–200 –>"Pct Misses" should be very low (< 2% in most cases) –>"Cache Usage" is the number of cache entries being used –>"Pct SGA" is the ratio of usage to allocated size for that cache Get Pct Scan Pct Mod Final Cache Requests Miss Reqs Miss Reqs Usage -------------------- -------- ---- ---- ---- ---- ----- dc_histogram_defs 98 80.6 0 0 194 dc_object_ids 76 2.6 0 0 293 dc_objects 212 0.0 0 0 447 dc_profiles 2,541 0.0 0 0 1 dc_rollback_segments 67 0.0 0 0 12 dc_segments 136 0.0 0 0 206 dc_sequences 931 0.0 0 931 8 dc_tablespace_quotas 5 0.0 0 5 4 dc_tablespaces 663 0.0 0 0 6 dc_user_grants 17 0.0 0 0 12 dc_usernames 25 0.0 0 0 4 dc_users 12,796 0.0 0 0 14 ------------------------------------
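The "Pct Miss" column is simply misses expressed as a percentage of get requests. A quick sanity check on the dc_histogram_defs figure, a sketch assuming 79 of the 98 get requests missed the row cache:

```python
def pct_miss(gets, misses):
    """Miss percentage for a dictionary cache entry; None when there were no gets."""
    return 100.0 * misses / gets if gets else None

# dc_histogram_defs: 98 get requests, of which 79 are assumed to have missed.
print(f"{pct_miss(98, 79):.1f}%")  # close to the 80.6% reported
```

A miss rate this high on dc_histogram_defs stands out sharply against the near-zero miss rates everywhere else in the cache, which is exactly why it draws attention in the report.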
This section shows library cache activity broken down by object type (namespace). There is far more SQL activity than anything else, plus a small number of reloads.
Library Cache Activity for DB: TEST Instance: test Snaps: 191–200 –>"Pct Misses" should be very low Get Pct Pin Pct Invali- Namespace Requests Miss Requests Miss Reloads dations --------------- -------- ---- -------- ---- ------- ------- BODY 5,092 0.0 5,092 0.0 0 0 CLUSTER 2 0.0 3 0.0 0 0 INDEX 116 0.0 116 0.0 0 0 SQL AREA 19,071 0.0 237,733 0.0 22 0 TABLE/PROCEDURE 4,237 0.0 55,925 0.0 0 0 TRIGGER 7,547 0.0 7,547 0.0 0 0 ----------------------------------------
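A reload occurs when a previously loaded library cache object has been aged out or invalidated and must be loaded again, so reloads as a fraction of pin requests is a rough indicator of shared pool pressure. A minimal sketch using the SQL AREA row from the report above:

```python
def reload_ratio(reloads, pins):
    """Reloads as a percentage of pin requests; high values suggest shared pool pressure."""
    return 100.0 * reloads / pins if pins else 0.0

# SQL AREA namespace from the report: 22 reloads against 237,733 pin requests.
ratio = reload_ratio(22, 237_733)
print(f"SQL AREA reload ratio: {ratio:.4f}%")  # a negligible fraction of one percent
```

At well under a hundredth of a percent, the reload rate here confirms the report's overall picture: the library cache itself is not under memory pressure.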
Once again this is a memory advisory, in this case for the shared pool. Taken at face value it can encourage excessive resizing of the shared pool, which obscures rather than solves underlying performance problems.
Shared Pool Advisory for DB: TEST Instance: test End Snap: 200 –> Note there is often a 1:Many correlation between a single logical object in the Library Cache, and the physical number of memory objects associated with it. Therefore comparing the number of Lib Cache objects (e.g. in v$librarycache), with the number of Lib Cache Memory Objects is invalid Estd Shared Pool SP Estd Estd Estd Lib LC Time Size for Size Lib Cache Lib Cache Cache Time Saved Estd Lib Cache Estim (M) Factr Size (M) Mem Obj Saved (s) Factr Mem Obj Hits ----------- ----- --------- --------- ---------- ------- -------------- 32 .5 11 2,813 35,757 1.0 2,269,393 40 .6 11 2,813 35,757 1.0 2,269,393 48 .8 11 2,813 35,757 1.0 2,269,393 56 .9 11 2,813 35,757 1.0 2,269,393 64 1.0 11 2,813 35,757 1.0 2,269,393 72 1.1 11 2,813 35,757 1.0 2,269,393 80 1.3 11 2,813 35,757 1.0 2,269,393 88 1.4 11 2,813 35,757 1.0 2,269,393 96 1.5 11 2,813 35,757 1.0 2,269,393 104 1.6 11 2,813 35,757 1.0 2,269,393 112 1.8 11 2,813 35,757 1.0 2,269,393 120 1.9 11 2,813 35,757 1.0 2,269,393 128 2.0 11 2,813 35,757 1.0 2,269,393 ---------------------------------------------------------------
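One way to read the advisory is to find the smallest estimated shared pool size whose time-saved factor still reaches 1.0; here that is the smallest size tested, which is why resizing the shared pool would gain nothing. A minimal sketch, with a subset of the (size, factor) pairs from the advisory above:

```python
# (size_mb, estd_lc_time_saved_factr) pairs taken from the Shared Pool Advisory.
advisory = [(32, 1.0), (40, 1.0), (48, 1.0), (56, 1.0), (64, 1.0),
            (80, 1.0), (96, 1.0), (112, 1.0), (128, 1.0)]

def smallest_adequate_size(rows, threshold=1.0):
    """Smallest estimated size (MB) whose time-saved factor reaches the threshold."""
    adequate = [size for size, factr in rows if factr >= threshold]
    return min(adequate) if adequate else None

print(smallest_adequate_size(advisory))  # 32: even the smallest size tested loses nothing
```

When every row shows the same factor, as here, the advisory is telling you that shared pool size is simply not the constraint, supporting the point that resizing would be a distraction.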
This is a summary of SGA (system global area) sizes.
SGA Memory Summary for DB: TEST Instance: test Snaps: 191–200 SGA regions Size in Bytes ---------------- ------------- Database Buffers 33,554,432 Fixed Size 453,452 Redo Buffers 667,648 Variable Size 92,274,688 ------------- Sum 126,950,220 -------------------------
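The Sum line is just the total of the four region sizes, which is easy to verify. A minimal sketch using the byte values from the summary above:

```python
# SGA region sizes in bytes, from the SGA Memory Summary above.
sga = {
    "Database Buffers": 33_554_432,
    "Fixed Size":          453_452,
    "Redo Buffers":        667_648,
    "Variable Size":    92_274_688,
}
total = sum(sga.values())
print(f"{total:,}")  # 126,950,220 -- matches the report's Sum line
```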
The next section contains a breakdown of the SGA.
SGA breakdown difference for DB: TEST Instance: test Snaps: 191–200 Pool Name Begin value End value % Diff ------ ------------------------- ----------- --------- ------ Large free memory 8,388,608 8,388,608 0.00 Shared 1M buffer 2,098,176 2,098,176 0.00 Shared Checkpoint queue 282,304 282,304 0.00 Shared FileIdentificatonBlock 323,292 323,292 0.00 Shared FileOpenBlock 695,504 695,504 0.00 Shared KGK heap 3,756 3,756 0.00 Shared KGLS heap 1,766,080 1,788,664 1.28 Shared KQR M PO 451,100 492,572 9.19 Shared KQR S PO 58,940 58,940 0.00 Shared KQR S SO 3,072 3,072 0.00 Shared KSXR large reply queue 166,104 166,104 0.00 Shared KSXR pending messages que 841,036 841,036 0.00 Shared KSXR receive buffers 1,033,000 1,033,000 0.00 Shared MTTR advisory 12,692 12,692 0.00 Shared PL/SQL DIANA 1,580,060 1,580,060 0.00 Shared PL/SQL MPCODE 497,840 507,844 2.01 Shared PLS non-lib hp 2,068 2,068 0.00 Shared character set object 274,508 274,508 0.00 Shared dictionary cache 1,610,880 1,610,880 0.00 Shared enqueue 171,860 171,860 0.00 Shared event statistics per sess 1,718,360 1,718,360 0.00 Shared fixed allocation callback 180 180 0.00 Shared free memory 56,403,656 55,548,136 -1.52 Shared joxs heap init 4,220 4,220 0.00 Shared kgl simulator 608,780 613,016 0.70 Shared ksm_file2sga region 148,652 148,652 0.00 Shared library cache 3,598,836 3,665,656 1.86 Shared message pool freequeue 834,752 834,752 0.00 Shared miscellaneous 4,250,564 4,250,564 0.00 Shared parameters 31,412 33,500 6.65 Shared processes 144,000 144,000 0.00 Shared sessions 410,720 410,720 0.00 Shared sim memory hea 25,400 25,400 0.00 Shared sql area 3,823,264 4,531,524 18.53 Shared table definiti 3,684 3,740 1.52 Shared trigger defini 6,732 6,732 0.00 Shared trigger inform 496 496 0.00 Shared trigger source 100 100 0.00 buffer_cache 33,554,432 33,554,432 0.00 fixed_sga 453,452 453,452 0.00 log_buffer 656,384 656,384 0.00 ------------------------------------------------
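The "% Diff" column is the percentage growth of each allocation between the begin and end snapshots. A minimal sketch checking the "sql area" row, which grew the most, using the begin and end values from the breakdown above:

```python
def pct_diff(begin, end):
    """Percentage growth of a pool allocation between begin and end snapshots."""
    return 100.0 * (end - begin) / begin if begin else 0.0

# Shared pool 'sql area' from the breakdown: 3,823,264 -> 4,531,524 bytes.
print(f"{pct_diff(3_823_264, 4_531_524):.2f}%")  # close to the 18.53% reported
```

Growth in the sql area across a snapshot range like this is normal for an active OLTP workload; it only becomes a concern when shared pool free memory is shrinking toward zero at the same time.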
And finally we get a listing of database configuration parameters with nondefault settings.
init.ora Parameters for DB: TEST Instance: test Snaps: 191–200 End value Parameter Name Begin value (if different) ------------------------------ --------------------------------- -------------- aq_tm_processes 1 background_dump_dest E:\oracle\admin\test\bdump compatible 9.2.0.0.0 control_files E:\oracle\oradata\test\control01. core_dump_dest E:\oracle\admin\test\cdump db_block_size 8192 db_cache_size 33554432 db_domain db_file_multiblock_read_count 16 db_name test fast_start_mttr_target 300 hash_join_enabled TRUE instance_name test java_pool_size 0 job_queue_processes 10 large_pool_size 8388608 open_cursors 300 pga_aggregate_target 0 processes 150 query_rewrite_enabled FALSE remote_login_passwordfile EXCLUSIVE shared_pool_size 67108864 sort_area_size 32768 star_transformation_enabled FALSE timed_statistics TRUE undo_management AUTO undo_retention 10800 undo_tablespace UNDOTBS1 user_dump_dest E:\oracle\admin\test\udump ------------------------------------------------------ End of Report
As you can now see, STATSPACK produces a large amount of information in a single report; this entire section on using STATSPACK is in fact one STATSPACK report. Finding a very specific problem amongst all that detail is probably more difficult than drilling down into the Oracle Database Wait Event Interface from the Oracle Enterprise Manager GUI Performance Overview tool. STATSPACK is best used to get an overall picture of the underlying physical structure; the Wait Event Interface makes it easier to isolate specific issues after gaining that general perspective with STATSPACK.
This chapter covering the Oracle Database Wait Event Interface and STATSPACK completes the third part of this book, on physical and configuration tuning for Oracle Database, and concludes the book itself. I hope you have enjoyed reading it as much as I have enjoyed writing it.