11.3 Eliminating Wasteful Work

Oracle Corporation's wait event documentation has come a long way since the old days when the Oracle7 Server Tuning guide provided its comically bad advice to users of the new V$SESSION_WAIT view (Chapter 8). There's one area to which I wish Oracle would devote more focus, however: workload reduction. I think one reason Oracle doesn't focus on workload reduction in its wait events reference is that the wait events reference appears in the chapter entitled "Instance Tuning." I think that in a chapter with such a name, an author feels constrained to limit his discussions to "tuning" activities that don't require application modifications.

You and I, however, live under no such restrictions. Even if you're using a third-party application that you cannot change without vendor participation, you still need to keep your mind open to the performance enhancement opportunities made possible by workload reduction. Of course, eliminating wasteful workload can sometimes call for application modification. Don't despair, however. It is often not as difficult as you might think to convince an application software vendor to improve the performance of a packaged application.

Your best chance of convincing an application software vendor to improve the performance of your purchased application is to provide irrefutable quantitative evidence that your performance improvement suggestions will help their product make you, and your vendor's other customers, happier.

A great benefit of using Method R is that no matter what the cause of a user action's performance problem, you'll find it. Will it be bad news to learn that your application vendor made a terrible design mistake that will prohibit your happiness until it is fixed? It may be. But if the truth is that the only path to good performance is through your application vendor, you need to learn that fact as quickly as possible, so that you'll stop wasting resources on activities that are destined to fail.

Each of the following sections augments the wait event references you'll find at Oracle Corporation web sites and elsewhere on the Internet. Each section describes a few ways to eliminate wasteful workload and how the waste at issue can show up as components of user action response time.

11.3.1 Logical I/O Optimization

Much of the data manipulation performed upon an Oracle database takes place in the database buffer cache region of the Oracle kernel's collection of shared memory segments called the system global area. Therefore, all performance analysts pay attention to what goes on in the database buffer cache. The Oracle kernel reportedly uses hundreds of different code paths to access buffers in the database buffer cache [Lewis (2003)]. The most expensive of those buffer visits are called Oracle logical I/O (LIO) operations. A database call's LIO count is the sum of its cr and cu statistic values from SQL trace data.
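For example, each database call line in raw extended SQL trace data carries cr (consistent-mode buffer gets) and cu (current-mode buffer gets) fields. In the following sketch of a fetch call, with hypothetical values in the Oracle9i trace format, the call's LIO count is cr + cu = 1,524 + 3 = 1,527:

    FETCH #3:c=40000,e=51234,p=12,cr=1524,cu=3,mis=0,r=100,dep=0,og=4,tim=1017039276349760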

At virtually every Oracle site I've ever visited, more than 50% of the total CPU capacity consumed by Oracle applications has been wasted on unnecessary LIO calls. In many cases, well over 90% of a system's total capacity usage can be eliminated with no loss of useful function whatsoever to the business.

Excessive buffer visits are the morbid obesity of the database. Just like carrying around an extra twenty pounds of body fat hurts virtually every subsystem in a human body (circulatory, renal, musculoskeletal, ocular, ...), extra LIO calls can degrade the performance of virtually every subsystem in an Oracle application.

Visiting too many buffers consumes unnecessary CPU service and causes time spent not executing that can show up as large unaccounted-for durations. Unnecessary LIO operations cause latch free waits for cache buffers chains latches, and they motivate unnecessary OS read calls that show up as db file sequential read or db file scattered read events.

Many Oracle wait events are capable of revealing harmful performance side effects of unnecessary buffer visits. For example, everybody's performance degrades as CPU run queues grow longer. Oracle's log file sync wait event is one of the first events to show increased latencies due to the time a process spends waiting in a CPU run queue. Negative effects of excessive buffer visits show up in places you might never have expected. For example, when unnecessary buffer visits motivate unnecessarily intense competition for disk I/O, DBWR writes can queue behind read requests. When DBWR fails to keep pace with the buffer change rate, applications become susceptible to free buffer waits, write complete waits, and even log file switch (checkpoint incomplete) events. The origin of buffer busy waits problems can often be traced back to an excessive number of LIO operations. Mistakes that lead to unnecessary LIO operations can even cause unnecessary SQL*Net message from client event executions.

11.3.1.1 Why LIO problems are so common

There are several reasons that so many systems suffer from excessive LIO processing. One reason is that we're all taught early and often that memory accesses are a lot faster than disk accesses, with the implication that lots of memory accesses are nothing really to worry about [Millsap (2001c)]. A deeper reason that so many Oracle applications suffer from excessive LIO processing is that there are so many ways that people can cause the problem. Here is a small sample:

Application users

There are several ways that application users can cause an application to make unnecessary buffer visits. They can run unconstrained queries instead of filtered queries; for example, they can search for the vendor named "Xerox" by performing a blind query instead of specifying X% as a constraint on the name. They can run reports without appropriate arguments; for example, they can run the accidental "whole company's sales since inception" report instead of the intended "this month's sales for my department" report. Especially when systems slow down because of other LIO excesses, users can resubmit the same job several times, executing the same LIOs several times over.
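Here is a minimal sketch of the difference, using a hypothetical VENDORS table; the constrained form lets an index on VENDOR_NAME reduce the number of buffers visited to a handful:

    -- Unconstrained: visits every block of the table and returns every
    -- vendor, leaving the user to scan the list by eye
    select vendor_id, vendor_name from vendors;

    -- Constrained: an index on vendor_name can satisfy this with a few LIOs
    select vendor_id, vendor_name from vendors where vendor_name like 'X%';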

Application administrators

Application administrators can make mistakes that result in unnecessary buffer visits, too. In configurable applications like the Oracle E-Business Suite, chart of accounts configuration and setup decisions can make a tremendous difference in the LIO counts generated by common business functions. Some applications, like the Oracle General Ledger product, have their own query optimization features built in. Using features like these without careful consideration of their performance impact can cause lots of unnecessary LIO calls. Application administrators who do a poor job of data archiving and purging can inflict millions of unnecessary LIO calls upon an application.

Instance administrators

Instance administrator mistakes that can cause unnecessary LIO processing include making poor choices for the dozens of instance parameters like HASH_AREA_SIZE and DB_FILE_MULTIBLOCK_READ_COUNT that influence the operation of the Oracle cost-based query optimizer.

Data administrators

Data administrators can cause unnecessary LIO processing in an immense number of ways. Perhaps the most common is to provide poor quality information about tables and indexes to the Oracle cost-based query optimizer (CBO) by botching the statistics gathering process. Tables afflicted with severe row migration or row chaining problems consume more LIO processing than necessary. Tables with poorly planned PCTFREE and PCTUSED values can drive unnecessary LIOs. Failing to use index-organized tables, clusters, and partitions in the right situations can result in unnecessary LIOs. Failing to declare constraints (like which columns are primary or foreign key columns, and which columns are NULL-able and which are not) can prevent the Oracle query optimizer from using reduced-LIO execution plans. Of course, having too few indexes, or simply the wrong indexes, can cause unnecessary LIO operations for queries, and having too many indexes can cause unnecessary LIOs for INSERT, UPDATE, and DELETE statements.
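As one illustration of the statistics gathering point, here is a minimal DBMS_STATS sketch (the schema and table names are hypothetical) that gives the CBO current statistics for a table and its indexes instead of leaving them stale or missing:

    begin
      dbms_stats.gather_table_stats(
        ownname          => 'AP',
        tabname          => 'INVOICES',
        estimate_percent => dbms_stats.auto_sample_size,
        cascade          => true   -- gather the table's index statistics too
      );
    end;
    /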

Application developers

Application development decisions of course have immense impact upon LIO counts. Several types of SQL design mistakes make it impossible for the Oracle kernel to use efficient query execution plans. For example, using a WHERE clause predicate like TRUNC(START_DATE) = TO_DATE(:b1,'mm/dd/rr') might prevent an Oracle kernel process from using an index upon START_DATE that might have provided excellent LIO count reduction. Application code can use perfect SQL and still execute too many LIO calls. For example, an application coded to fetch one row at a time from an Oracle cursor can execute a hundred times more LIO calls than the same application designed to use the Oracle array fetch mechanism to fetch 100 rows in a single database call. Neglecting to use Oracle's array features puts extra load not just on the database, but also on the network; the extra database calls required to process larger numbers of smaller row sets produce more SQL*Net message from client event executions, which contribute quickly to user action response times.
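For the predicate example above, a sketch of one common repair is to move the function off the indexed column, so that an ordinary index on START_DATE remains usable:

    -- Original form: applying TRUNC() to the column disables a plain index
    --   where trunc(start_date) = to_date(:b1, 'mm/dd/rr')

    -- Equivalent range predicate that an index on start_date can serve:
    where start_date >= to_date(:b1, 'mm/dd/rr')
      and start_date <  to_date(:b1, 'mm/dd/rr') + 1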

Application data designers

Application designers can also make it impossible to build an efficient, low-LIO application. One inventory tracking application that Jeff worked on a few years ago made it impossible to determine the location of an inventory item without constructing the entire history of where the item had been. Instead of a quick indexed "where is it?" lookup, the application required a complicated and long-running CONNECT BY query.

With so many people in the mix who have to do their jobs well to prevent LIO problems, it's no wonder that most sites generate excessive LIO calls.

11.3.1.2 How to optimize SQL

Optimizing inefficient SQL is easily the most important performance repair tactic that you'll need as an Oracle performance analyst. If a database call motivates more than about ten LIO calls per row returned per table listed in the FROM clause of the SQL text motivating the call, then the SQL statement's efficiency can probably be improved. For example, a three-table join operation that returns two rows should probably require fewer than about 60 LIO calls.

Any ratio is unreliable in certain circumstances. One such circumstance for this ratio occurs when a query's result set is the result of an aggregation. For example, a query returning a sum (one row) from a million-row table will legitimately require more than ten LIO calls.

Applications executing SQL resulting in large numbers of LIO calls create massive scalability barriers for systems with large user counts. Not only do unnecessary LIO calls motivate excessive CPU capacity consumption, they often drive large numbers of latch free waits for cache buffers chains latches [Millsap (2001c)]. Attempted latch acquisitions in and of themselves can cause excessive CPU capacity consumption, especially in environments where analysts have increased the value of _SPIN_COUNT beyond its default value (as a general rule, don't).

There are several good resources available today that explain how to optimize SQL: [Ensor and Stevenson (1997a, 1997b); Harrison (2000); Lewis (2001b, 2002); Kyte (2001); Adams (2003); Lawson (2003); Holt et al. (2003)].[2] Contributors to various mailing lists like Oracle-L (http://www.cybcon.com/~jkstill) do an excellent job of helping list users write efficient SQL. Each of these sources includes good advice about how to write efficient SQL using methods including (but by no means restricted to) the following:

[2] I'm also eager to see the new SQL optimization book that Dan Tow (http://www.singingsql.com) is reportedly writing as I finish this book.

  • Diagnosing the behavior of SQL statement executions with tools like tkprof, EXPLAIN PLAN, and debugging events like 10032, 10033, 10046, 10079, 10104, and 10241 (see the sketch after this list).

  • Diagnosing the behavior of the Oracle query optimizer using debugging events like 10053.

  • Manipulating SQL text to permit the use of more efficient execution plans.

  • Defining an efficient index strategy to aid in better data reduction for queries without producing excessive overhead for INSERT, UPDATE, MERGE, and DELETE operations.

  • Using the stored outlines feature to force the Oracle query optimizer to use the plan of your choosing.

  • Creating appropriate table, index, and database statistics to better inform the Oracle query optimizer about your data.

  • Designing physical data models that facilitate the storage and retrieval operations your application requires.

  • Designing logical data models that facilitate the storage and retrieval operations your application requires.
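The following sketch shows two of the diagnostic steps named in the first bullet, using syntax valid in Oracle release 9.2 (the statement being diagnosed is hypothetical): activating extended SQL trace with event 10046, and displaying an execution plan with EXPLAIN PLAN.

    alter session set timed_statistics = true;
    alter session set events '10046 trace name context forever, level 8';
    -- ...execute the user action you are diagnosing, then...
    alter session set events '10046 trace name context off';
    -- ...and process the resulting trace file with tkprof

    explain plan for
      select order_id, status from orders where customer_id = :b1;
    select * from table(dbms_xplan.display);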

11.3.2 Parse Optimization

Excessive parsing is a sure-fire way to ensure that an application will never scale to large user counts [Holt and Millsap (2000)]. The general sentiment that students seem to bring to our classrooms is that hard parses are huge scalability inhibitors for transaction processing systems, but "soft parses" are okay. More to the point, perhaps people believe that hard parses are avoidable, but "soft parses" are not. Both statements are only half true. Hard parses are as awful as people expect, and you can avoid them by using bind variables instead of literal values in your application SQL. However, so-called soft parses are awful in their own right, and you can avoid more of them than you might have thought.

Many authors use the term "soft parse" as a synonym for "parse call." I prefer the term "parse call," because it focuses your attention upon the application, where you can actually implement a remedy action. Using the term "soft parse" seems to draw people's focus to the database, which is not the stack layer where you can fix the problem. Here's why. Any time an Oracle kernel process receives a parse call from an application, that kernel process must consume CPU capacity on the database server. If the kernel finds an appropriately sharable cursor for the query either in the session cursor cache or the Oracle library cache, then the parse call never motivates a hard parse, and the parse call ends up being cheaper than it might have been. However, even cheaper than a soft parse is no parse. Applications scale best to large user counts when they parse as infrequently as possible. You should strive to eliminate unnecessary parse calls whenever you can.

In fact, applications scale best when they make the smallest number of database calls that they can. The evolution of the Oracle Call Interface (OCI) reflects this goal. For example, the release 8 OCI reduces client-server round trips in a number of clever ways (http://otn.oracle.com/tech/oci/htdocs/Developing_apps.html). The release 9.2 OCI goes even further to prevent many database calls in the application from ever even reaching the database (http://otn.oracle.com/tech/oci/htdocs/oci9ir2_new_features).

On high concurrency systems with unnecessarily high parse call counts, large CPU service numbers often correlate with large numbers of latch free waits for library cache, shared pool, and other latches. Attempted latch acquisitions in and of themselves can cause excessive CPU capacity consumption, especially in environments where analysts have increased the value of _SPIN_COUNT (again, as a general rule, don't). Furthermore, excessive parse calls cause unnecessary SQL*Net message from client latencies, which can add up to several seconds of response time waste for every second of actual work done inside the database. Finally, parse calls that use long SQL texts create unnecessary SQL*Net more data from client latencies, which can also add up to big response time numbers.

If your performance problem is caused by large numbers of parse calls, then consider the following workload reduction strategies:

  • Don't use string literals in SQL WHERE clauses. Use bind variables (placeholders) instead, especially when the literal string has high cardinality (that is, when the literal string has many possible values). Using string literals instead of bind variables consumes CPU service and, on high concurrency systems, causes unnecessary latch free waits for shared pool, library cache, and row cache object latches.

  • Extract parse calls from within loops so that an application can reuse the cursor prepared by a single parse call many times. The pseudocode of Example 11-2 shows how.

    Example 11-2. Parsing inside of a loop creates a dreadful scalability inhibitor
     # BAD, unscalable application code
     for each v in (897248, 897249, ...) {
       c = parse("select ... where orderid = ".v);
       execute(c);
       data = fetch(c);
       close_cursor(c);
     }

     # GOOD, scalable application code
     c = parse("select ... where orderid = :v1");
     for each v in (897248, 897249, ...) {
       execute(c, v);
       data = fetch(c);
     }
     close_cursor(c);
  • Deactivate application-to-database driver features that motivate more parse database calls than are apparent in the application source code. For example, the Perl DBI provides a prepare-level attribute called ora_check_sql whose default value of 1 motivates two parse calls per Perl prepare function call. The first parse call is performed to help the application SQL developer more quickly debug his application source code by providing more detailed diagnostic information in response to failed parse calls. However, on production systems, this feature should be deactivated because it motivates unnecessary parse calls.

  • Use a multi-tier application architecture in which each application service parses all of its SQL statements exactly one time and then reuses cursors for the duration of its uptime.

  • Don't send long SQL text strings in parse calls. Use stored procedure calls instead (see the sketch after this list). Sending long SQL text strings in parse calls consumes unnecessary CPU service on the server, even when they use bind variables. Even when the SQL text is completely shareable, the Oracle kernel must validate object permissions each time it receives a SQL text string from a new user ID (when the kernel receives a stored procedure call, the procedure executes in its owner's context, so the permissions on objects inside the package need to be checked only once, provided the application developer doesn't specify the use of invoker's rights) [Adams (2003) 371-372]. Passing long SQL text strings also causes unnecessary network load, which manifests as SQL*Net more data from client latency for the Oracle kernel process making the parse call, and as longer SQL*Net message from client latencies for everyone else.

  • Reduce the application's use of public synonyms if you have an extraordinarily large number of object references [Adams (2003) 373-375]. Search www.google.com with site:www.ixora.com.au "public synonym" for additional information.
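Here is a minimal sketch of the stored procedure idea from the list above, with hypothetical object names: the long query text lives in the database, and the application's parse call carries only a short procedure invocation.

    create or replace procedure get_open_orders (
      p_customer_id in  number,
      p_orders      out sys_refcursor
    ) as
    begin
      open p_orders for
        select order_id, order_date, status
          from orders
         where customer_id = p_customer_id
           and status = 'OPEN';
    end;
    /

    -- The application's parse call then contains only this short text:
    --   begin get_open_orders(:cust_id, :result); end;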

11.3.3 Write Optimization

The design of the Oracle kernel centralizes its writing tasks nicely into a small set of specialized background processes. The processes that do most of Oracle's write operations are DBWR, LGWR, and ARCH. Most databases do far more reading than they do writing. However, many systems have important service level agreements on business functions that require high-performance writing, and even on systems where writing takes a back seat to reading, slow writes can mess up performance for reads in indirect ways. For example, the poor performance behavior of a slow DBWR process can show up as free buffer waits events in query response times. Excessive write operations can queue at storage devices ahead of legitimate read requests, resulting in degraded db file sequential read, db file scattered read, or direct path read performance.

There are several ways that workload optimization can improve DBWR, LGWR, and ARCH write performance. The most commonly required optimization is actually to eliminate unnecessary LIO operations (see Section 11.3.1). Unnecessary LIO operations can motivate unnecessary OS read calls, which can queue ahead of DBWR writes, which can inspire longer-than-expected write latencies on db file single write and db file parallel write operations executed by a DBWR process.

Next, you should ensure that all the writes your application does are truly necessary. There are lots of sneaky ways that an application can generate more writes than it really needs to. For example:

  • The Oracle kernel generates redo and undo for every index block that is changed by an INSERT, UPDATE, or DELETE statement, so the presence of unnecessary indexes can generate lots of unnecessary redo and undo in transaction processing systems. For example, an insert into a table with three indexes generates roughly ten times more workload than an insert into an unindexed table [Ensor and Stevenson (1997a), 147].

  • Some applications generate unnecessary undo by updating columns to the same values they already had. For example, in a SQL statement that sets a status flag from N to Y based on some set of conditions, make sure that your WHERE clause includes a predicate that specifies AND STATUS='N' (see the sketch after this list). Automatic application generators often update columns to values they already had. They do it when they generate an UPDATE statement that updates every column that has a value shown on the current screen. Instead of updating all the columns with on-screen values, they should update only the columns that the user has changed.

  • Application users and database administrators can execute table and index operations that use the LOGGING designation by default, but that could have been performed just as well with the NOLOGGING designation. (The keywords LOGGING and NOLOGGING replace the deprecated keywords RECOVERABLE and UNRECOVERABLE.)

    Using NOLOGGING is not a good idea if you actually want an operation to be recoverable. For example, you don't want to use NOLOGGING operations on a database that participates in a hot standby architecture. Oracle9i provides a FORCE LOGGING mode to stop developers from successfully using the NOLOGGING option.
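The following sketch illustrates the status flag example from the list above, with a hypothetical ORDERS table; the AND STATUS = 'N' predicate keeps the statement from touching, and generating undo and redo for, rows that already hold the target value:

    update orders
       set status = 'Y'
     where order_date < trunc(sysdate)
       and status = 'N';   -- skip rows whose flag is already 'Y'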

Your system configuration decisions influence the amount of workload your system must endure, too. For example, RAID level 5 disk configurations are particularly vulnerable to write-induced waiting. Every write performed by an Oracle DBWR process is a single-block write, which RAID level 5 handles very inefficiently unless it has been configured with a sufficient amount of cache; each small write costs roughly four physical I/O operations (read data, read parity, write data, write parity). When sustained write rates overwhelm the storage capacity of the cache, the performance of a RAID level 5 disk group degrades to roughly four times worse than the array's expected operational throughput. Your best solution is to eliminate enough wasted workload that your sustained I/O rate to the device drops to a suitable level. Failing that, you can choose one of the following courses of action:

  • Increase the size of the cache (which will only defer the problem, but perhaps you can defer it long enough to suit your application for the duration of its peak I/O load).

  • Increase the number of RAID level 5 disk groups dedicated to servicing your database application's read and write requests.

  • Reconfigure your disks into a different RAID organization that allows you to achieve higher I/O throughput rates without the need to buy additional memory or disks. For example, use striping and mirroring.


   