Chapter 10

Overview

I am extremely familiar with tuning. I spend a lot of time tuning systems, in particular systems that I did not architect or code. This leaves me at a distinct disadvantage; it might take me a while to figure out not only where to look, but also where not to look. Tuning in this way is a very tricky thing, and it is usually done under extreme pressure. No one tunes when everything is going OK - they only seem to want to tune when it is all falling apart.

This chapter describes my approach to tuning and the tools I use. You will see that I adopt a 'defensive' approach - I try to make it so that I never have to tune at all; not after a rollout of an application, anyway. Tuning is a part of development that starts before the first line of code is ever written and ends the day before deployment; it is not a post-deployment activity. Unfortunately, most of the tuning I am called in to do affects production systems that are already developed. This means we are tuning in a somewhat hostile environment (angry end users) and under many undesirable constraints (it is a production system, and they don't like you to shut it down to change something). The best time to tune is well before you get to that point.

Specifically, this chapter will cover:

  • Bind variables and performance.

  • How to trace applications using SQL_TRACE and TIMED_STATISTICS and interpret the results using TKPROF.

  • How to set up and use Statspack to tune the database instance.

  • Some of the V$ tables that I use regularly and how I use them.

Identifying the Problem

Tuning an application can be a very interesting thing indeed. Not only can figuring out where the application is falling apart be difficult, but implementing the fixes can be even more painful. In some cases, the overall architecture must change. In Chapter 1 on Developing Successful Oracle Applications, I described a system that used Oracle's multi-threaded server to execute long-running stored procedures; that system needed an architectural overhaul to fix. In other cases, simply finding the worst-performing SQL queries and tuning them is all that it takes.

Not all tuning is database-related. I remember one particularly difficult tuning session where the client was using a commercially available time and attendance application built on top of Oracle. This particular application had quite a few installations at various locations, and all was going well elsewhere. At this one location, however, the biggest install to date, it was totally falling apart. Most of the time, it ran perfectly, as expected. At peak times, such as shift change and mealtime, a problem would arise - the entire system would periodically appear to hang. There seemed to be no reason why. I knew by observation that it was a locking/contention issue - I could see that clearly. Figuring out why we had locking and contention was the hard part. After spending two days kicking around in the database, looking at all of the V$ tables, reviewing code, and observing the problem, I asked to see the application in action. I understood conceptually what it did, but I didn't know how it did it. The clients took me to the warehouse floor where the end users (the factory workers) used the application, and I saw how they actually used it. They would pass, in single file, by one of the many terminals and swipe a bar code to clock in. The next person in line would hit the Enter key, swipe their card, and so on it went. Eventually, as I was standing there, the application hung. No one could swipe their cards anymore. Then someone walked up to an empty terminal, hit Enter, and swiped their card - the system was back! Looking closer at the application, it was clear what was happening. It turned out not to be a database tuning issue at all - it was purely a human-computer interface issue. The application interface was the cause of the hang. It turns out the system was designed to process a transaction like this:

  1. Swipe the card, which locked a user row in one table and inserted a row into another table.

  2. See the message on screen that said row inserted, please hit enter.

  3. You would hit Enter and then the application would COMMIT (until then, it would not have COMMITted).

  4. Next person swipes the card.

As it turned out, people were doing the following:

  1. Swipe the card, and leave the terminal.

  2. Next person hits Enter for the last person, committing their work, swipes their card and leaves.

When the last person swiped their card, they were leaving a transaction open, which had locked resources. A background process that was running every couple of minutes locked a resource and then tried to lock the same resource that this 'open' transaction had. The background process would halt, but not before it locked some of the resources needed by the interactive application. This would block all of the terminals and the system would 'hang' in this fashion until someone walked by and noticed the Please hit enter to continue message, and hit the Enter key. Then everything worked OK. A simple fix in the user interface to commit the transaction solved this issue.
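Conceptually, the fix was simply to make the swipe itself a complete transaction, so that no locks are ever held while waiting for a keystroke. The following is only a sketch of that idea - the procedure, table, and column names are hypothetical stand-ins, not the real application's objects:

create or replace procedure clock_in( p_badge_id in number )
as
begin
    -- lock the user's row and record the swipe, just as the original code did
    update emp_status
       set last_swipe = sysdate
     where badge_id = p_badge_id;

    insert into clock_events ( badge_id, event_time )
    values ( p_badge_id, sysdate );

    -- commit immediately; do not hold the locks while waiting for
    -- the user to hit Enter
    commit;
end;
/

With the COMMIT inside the swipe action, an abandoned terminal can no longer leave a transaction open and block the background process.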

In another case, on my last tuning foray, I was looking at a very large application implementation. There were literally hundreds of people involved in the development, implementation, and deployment of this application. The situation was reaching crisis point (it always is when you are tuning). The complaint: 'The Oracle application is slow.' After taking a look at the application, the environment, and the database, it was not at all obvious where the problem was. After many walkthroughs and application overviews, a couple of possible choke points in the system began to become clear. As it turned out, it wasn't the Oracle application at all - it was an interface to an existing system. The new application was using the existing system in a way no one had envisioned. The additional load on the existing system put it over the edge; the new application was killing the old application. It was very difficult to track this down because the code wasn't instrumented heavily (instrumentation meaning something as simple as 'debug' messages or application logs with timings, so you can see what is going on). We had to add a lot of 'after the fact' instrumentation to determine the cause of the slowdown (and eventual shutdown) of the system.

The moral of these stories is that tuning is tricky, and the solutions are not always intuitive. No two tuning exercises are going to be the same; the problems are not always in the database, they are not always in the application, and they are not always in the architecture. Finding them, especially if you do not understand how the application is to be used or how it works, is sometimes 'pure luck'. That's why tuning is considered 'black magic' by some. It is extremely hard to tell someone how to tune a system if the approach is to tune after the fact. Tuning after the fact requires the investigative skills of a detective; you are unraveling a mystery. It takes a fine mix of technical skills and people skills (no one ever wants to have the finger pointed at them, so you have to be careful in this respect). There is no hard and fast roadmap to follow in after-the-fact tuning. I can tell you, however, how to tune as you develop; that is the strategy I will encourage. System tuning, to me, is a design-time activity, something you build into the entire system. Tuning a system after the fact is not really tuning - that is recoding, re-architecting.

My Approach

In this section, I am going to introduce some of the general ideas I have about tuning. If there is one thing in Oracle that is considered something of a 'black art', it is tuning. Anything you do not understand looks like magic, and database tuning is something a lot of people do not understand. I find the art of tuning rather straightforward. I believe there are three levels of tuning we all need to be aware of and follow:

  • Application tuning part 1 - tuning your application in isolation; getting it to run as fast as it can in single-user mode.

  • Application tuning part 2 - tuning your application in multi-user mode; getting it to scale to the number of concurrent users you expect.

  • Database instance/server tuning - the last component in the tuning loop.

Application tuning parts 1 and 2 constitute a good 90 percent or more of the effort. That's right; before we even get the DBAs involved, we the developers have done 90 percent of the work. This is why I think most people do not understand database tuning. They are constantly asking me to come in and tune their database, without touching the applications! Except in the rarest of cases, this is physically impossible. Everyone is in search of what I call the fast=true init.ora parameter - the magic parameter that will make their database go faster. The sooner you accept the fact that it does not exist, the better off you'll be. It is extremely unlikely that we can make your queries go magically faster by flipping a switch in a parameter file. We may have to reorganize your data, and I'm not talking about moving data files around to speed up I/O; I'm talking about physically rearranging the columns in your tables, the number of tables you have, and what they contain. That is application redesign.

There are some issues that have to be tuned at the database level, but experience shows that most of the work is done by us, in the applications themselves. It is most likely that only you will be able to take your query that does 1,000,000 logical I/Os to get its answer and figure out a better query, or an alternative method of getting that answer. If you have architectural issues, only you will be able to correct, or work around, them.

Tuning is a Constant Thing

Once you realize that most tuning is done at the application level, the next step is to recognize that tuning is something done continuously. It doesn't have a start point and a stop point. It is part of the design phase, it is part of development, it is part of testing, and it is part of deployment and beyond.

Design for Performance

A database system is not something that can be approached with a 'build it now and tune it later' attitude. Far too many decisions will be made in the 'build it' part that will directly impact the ability of the system to perform and, more importantly, to scale. Remember, I said earlier that 'it is far easier to build a non-scalable system than it is to build a scalable system in Oracle'. Building a non-scalable application is trivial - anyone can do that. It takes work to build one that performs and scales well. It must be designed this way from day one.

True story: a group of people developed an application. It was an extension to an existing system. It came with its own set of tables, and used some of the existing tables. The new application was deployed and, immediately, the existing system became unusable - the entire system was just too slow to use. It turned out the developers of the new application had decided that they would figure out what indexes they needed 'in the field'. Short of primary keys, there wasn't a single index to be found. Virtually every query consisted of multiple full scans. Their question was 'how do we fix this quickly?' The only answer at that point was 'drop your tables and get out of your database.' To this group, tuning was an activity to be undertaken post-development: you get it working, and then you get it working faster. Needless to say, this project was a total failure. The database structures they had developed could not efficiently answer the questions they wanted to ask of them. It was back to square one, and they had to start over.

My approach starts from the very beginning of the application design phase and I would now like to relate a small story that illustrates my approach very well. There was once an internal system where I work called 'phone'. You could telnet into any e-mail machine (back when e-mail was character mode), and on the command line type phone <search string>. It would return data like this:

$ phone tkyte
TKYTE    Kyte, Tom            703/555 4567  Managing Technologies RESTON:

When the Web exploded in about 1995/1996, our group wrote a small web system that loaded this phone data into a table, and let people search it. Now that it was in a database and had a little GUI to go with it, it started becoming the de-facto standard within the company for looking up people. Over time, we started adding more data to it, and more fields. It really started to catch on. At some point, we decided to add a lot more fields to it, and rebuild the system with more features.

Our goal at this stage was to develop it from the very beginning to perform. We knew it would be a limiting factor in our total system performance. It might have been small in size (application-wise), but it would generate the most traffic on our site. The first thing we did, the very first, based on our knowledge and thoughts of how people would use this simple little system, was to design the tables to hold its data. We specifically designed these tables for what they were going to be used for. We had a little, read-only repository of data that people would be searching, and it needed to be fast. This system was growing in popularity every day, and was threatening to consume our machine's resources. We had a single 67-column, 75,000-row table against which we wanted to perform a simple string search of various fields. So, if I put in ABC, it would find ABC in the e-mail address, or the first name, or the last name, or the middle name, and so on. Even worse, it was to be interpreted as %ABC%, and the data was in mixed case.

There wasn't an index in the world that could help us here - so we built our own. Every night, as we refreshed the data from our HR system (a complete refresh of data), we would also issue:

CREATE TABLE FAST_EMP
PCTFREE 0
CACHE
AS
SELECT upper(last_name)||'/'||upper(first_name)||'/'|| ... || '/' ||
       substr( phone, length(phone)-4) SEARCH_STRING,
       rowid row_id
  FROM EMPLOYEES
/

after we were done. In effect, we built the densest, most compact table possible (PCTFREE 0), and asked that it be cached if possible. Now, we would query:

select *
  from employees
 where rowid in ( select row_id
                    from fast_emp
                   where search_string like :bv
                     and rownum <= 500 )

This query would always full scan the FAST_EMP table, but that is what we wanted. Given the types of questions we were asking, it was the only choice; there is no indexing scheme we could use to answer them. Our goal from the outset was to minimize the amount of data that would be scanned, to limit the amount of data people would get back, and to make it as fast as possible. The above accomplishes all three goals. The FAST_EMP table is almost always in the buffer cache. It is small (less than 8 percent of the size of the original table), and it scans very fast. It has already done the work of the case-insensitive searching for us once (instead of once per query) by storing the data in uppercase. It limits the number of hits to 500 (if your search is broader than that, refine it - you'll never look at 500 hits anyway). In effect, that table works a lot like an index for us, since it stores the rowid of each row in EMPLOYEES. There are no indexes employed on this system in order to do this search - just two tables.

This is a perfect example of an application with very particular design considerations - and of how taking these into account at the very beginning can lead to optimum performance.

Try Many Approaches

It is crucial to test, to experiment, to try different implementations. 'Theory' is great, but it is often wrong; implemented tests are much more accurate. Try out your ideas. See what the real-world performance is. The database has literally thousands of features. There is no single 'best solution' (if there were, a database vendor would only need to supply that one solution). Sometimes partitioning data will increase performance; at other times it will not. Sometimes interMedia Text can speed up your searches, sometimes not. Sometimes a hash cluster is the best thing in the world, sometimes not. There are no 'evil' features (features to be avoided at all costs). Likewise, there are no 'silver bullet' features that solve all problems.

Before settling on this design for the above application, we tried a couple of alternative approaches. We tried using a fast full scan on a function-based index (close, but not as fast), we tried interMedia Text (not useful due to the %ABC% requirement), and we tried having just an extra field in the base EMPLOYEES table (it wiped out the buffer cache, too big to full scan). It may seem funny to have spent so much time on this one detail. However, this one query is executed between 150,000 and 250,000 times a day. That is two to three times a second, every second, all day long, assuming a constant flow of traffic - and a constant flow is not something we can assume; we frequently have spikes in activity, as most systems do. If this single query performed poorly, our entire system would fall apart - and it is just one of thousands of queries we have to run. By determining where our weak points would be, the lowest hanging fruit so to speak, and concentrating on them, we were able to build an application that scales very well. If we had followed the 'tune after the fact' principle, we would have found ourselves re-writing after the fact.

Program Defensively

Instrument your code, and leave it instrumented in production. Instrumentation is the practice of having a way to trace your application from the 'outside'. SQL_TRACE (covered in detail later in this chapter) is an instrumentation tool. The EVENT facility within Oracle is an instrumentation tool (an example of using EVENTs in Oracle is given below). The Oracle database is heavily instrumented so that the Oracle kernel developers can diagnose performance issues without ever having to go onsite. You need this capability in your applications as well. The only way to make things go faster is to understand where they are going slow. If you just point at a process and say 'it is going slow', you'll have a hard time tuning it. If the process is heavily instrumented and has the ability to log itself on demand, you'll be able to figure out where things are going slow.
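As a sketch of the kind of instrumentation I mean (the table and procedure names here are made up for illustration; they are not part of any application discussed above), a trivial logging routine might look like this:

create table app_log
( ts             date,
  module         varchar2(30),
  message        varchar2(255),
  elapsed_hsecs  number
);

create or replace procedure log_msg
( p_module  in varchar2,
  p_message in varchar2,
  p_elapsed in number default null )  -- hundredths of seconds, if timed
as
begin
    -- make the current step visible to anyone looking at V$SESSION
    dbms_application_info.set_module( p_module, p_message );

    -- record the message and the optional timing in our own log table
    insert into app_log ( ts, module, message, elapsed_hsecs )
    values ( sysdate, p_module, p_message, p_elapsed );
end;
/

The application would call this at the start and end of each interesting step, passing deltas computed with DBMS_UTILITY.GET_TIME. In a real system you would probably make the routine an autonomous transaction and put the INSERT behind an on/off flag, so the logging can be enabled on demand in production without affecting the application's own transactions.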

Benchmark

Benchmarking periodically throughout the implementation is critical. Something that works fine with 10 users falls apart with 100 or 1000. Benchmarking is the only way to ensure you can meet your goals.

The first point to note here is that you should have identified your performance metrics from day one. This is my nightmare: someone wants to run a benchmark and the goal is simply 'go as fast as you can'. That is an open invitation to benchmark for the rest of your life; everything can always go a little faster. With the 'as fast as you can' goal, you are never done. You need to know what your constraints are, and develop to them. Additionally, if you are only constrained to 'go as fast as you can', you also have license to 'go as slow as you want'. You have nothing to measure towards, so whatever you develop is fast enough. That last point should raise some eyebrows - 'as fast as you can' is the same as 'as slow as you want'. Metrics are the only thing that will prevent that from happening.

The first step is to benchmark in isolation. If you cannot get it to run fast in single-user mode, it will only run slower in real life. Record these benchmarks, and compare them to past and future tests. You'll find a module that used to take 1 second and now takes 1 minute much more easily this way.
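One simple way to record these benchmarks, sketched below with made-up table and module names, is to capture each timed run into a history table you can compare against later:

create table benchmark_history
( run_date      date,
  module_name   varchar2(30),
  test_name     varchar2(30),
  elapsed_secs  number
);

variable start_time number
variable run_secs   number

exec :start_time := dbms_utility.get_time;

-- ... run the module or query being timed here ...

exec :run_secs := ( dbms_utility.get_time - :start_time ) / 100;

insert into benchmark_history
values ( sysdate, 'PAYROLL_LOAD', 'single user', :run_secs );
commit;

A query against this table over time makes the 'used to take 1 second, now takes 1 minute' module stand out immediately.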

The next step is to benchmark to scale, and test your application (or your ideas) under your expected load to ensure that what you are building scales up. In testing, many people test functionality. When I test, I test for scalability. Now that we have all of the modules of our application together for the first time, we need to take the time and the energy to scale it up, and see whether it will fly or not. It is at this point that you will discover the queries that do not use bind variables, the locking and contention introduced by the application, and the architectural flaws. In a scalability test, these things become obvious - painfully obvious. If you want to be successful when you deploy your application, you will test it for scale before deployment. One thing I hear time and time again is 'we developed against a subset of the data and it ran fine. When we deployed the system in production, everything ran really slowly. Help us, quick!' In this case, the only thing you can do quickly is to un-deploy the system and go back to the drawing board. You need to develop the system against the full set of data on which it will be used. That full scan of 100 rows was no problem on your development machine; now that it is 100,000 rows, and you have 100 people doing it simultaneously, it is something of a performance issue. You must develop with the real stuff: the real tables, the real security, everything. This is the only way to head off the really bad queries in the beginning. Again, record these benchmarks - you'll quickly discover the implementation that causes performance to nosedive by comparing current results to history. You'll also be able to justify that new piece of hardware or software down the road (it increased throughput by X) - you won't have to guess.

For example, the following sequence of statements works great in isolation:

declare
    l_rec t%rowtype;
begin
    select * into l_rec from t where rownum = 1 FOR UPDATE;
    process( l_rec );
    delete from t where t.pk = l_rec.pk;
    commit;
end;

It processes the first record in a table very quickly. You might time this in isolation and say, 'I can do 5 transactions per second (TPS)'. You then proceed to 'extrapolate' that number upwards and say, 'if we run 10 of these processes, we can do 50 TPS'. The problem is that you will still only do 5 TPS (if that) with 10 processes, as they will all serialize; they all try to lock the first record, and only one of them can do so at any one time. Getting something to run fast in isolation is great; getting it to run fast in an environment with many users is another thing altogether.

Even if you have a read-only system - in other words, no possibility of a locking issue like the one above - you must still benchmark to scale. Queries need resources: block buffers, I/O from disk, CPU to sort data with, and so on; anything that needs resources must be tested. The row lock from the example above is just one type of resource; there are hundreds of resources to contend for, regardless of the type of system you are building.
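During such a scaling test, one quick way to see which resources the sessions are contending for is to look at the system-wide wait events. This is only a rough sketch (the figures are cumulative since instance startup, so take a snapshot before and after the test run):

select event, total_waits, time_waited
  from v$system_event
 where event not like 'SQL*Net%'
 order by time_waited desc;

With TIMED_STATISTICS enabled, TIME_WAITED is reported in hundredths of seconds; the events at the top of this list are the resources your benchmark is fighting over.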

And finally, the last component in the tuning loop is database instance tuning. Most people are looking for what I like to call the FAST=TRUE setting in the init.ora file: some simple setting they can make in their database initialization file to make things go faster. This setting does not exist. In fact, in my experience, database instance tuning returns the smallest performance increase on a database that already has reasonable settings. There are extremely rare cases where the database setup is so bad that instance tuning will materially affect performance, but they are rare indeed (you might have 300 block buffers configured on a data warehouse when you really needed between 30,000 and 300,000, for example). Application tuning, such as changing the database structures and implementing better-performing algorithms, is where the low hanging fruit is, where the opportunity to really tune lies. Often, there is little to nothing that can be done at the database instance level to fix things.

Now, back to the real world again, where applications are built, never tested to scale, never held up to metrics during development, and not instrumented at all. What can we do with these (besides run away and hide)? In some cases, we'll be able to use some tools to diagnose and correct their issues. In many other cases, instrumentation of the code itself will be necessary, particularly in large applications with many interacting modules. In order to diagnose large application issues, just finding out where to start may be difficult. If you have a Java client talking to an application server, which makes a call to a CORBA object, which ultimately updates the database - finding out where things are taking time is going to be tedious (unless you've instrumented). Even after you find it, fixing it will be hard as well. Many times the simple solution is the correct answer; the fewer moving parts there are to consider, the easier it is to tune.

Bind Variables and Parsing (Again)

We have covered bind variables a couple of times already, from various perspectives. We have seen, for example, that by not using bind variables, you might spend 90 percent of your execution time parsing queries instead of actually running them. We have seen how this can affect the shared pool, a precious resource in Oracle. By now, you understand that they are critical to a system's performance. Don't use them and you will run many times slower than you should, and you will reduce the number of users that can ultimately be supported. Use them and your life will be much easier. At least you won't have to go back and fix your programs to use them.

Bind variables are important because one of the design features of the Oracle database is to reuse optimizer plans whenever possible. When you submit any SQL or PL/SQL to the database, Oracle will first search the shared pool to see if it already exists. For example, in the case of a SQL query, Oracle will look to see if that query has already been parsed and optimized. If it finds the query and it can be reused, you are ready to go. If it cannot find it, Oracle must go through the arduous process of parsing the query fully, optimizing the plan, performing the security checks, and so on. This not only consumes a lot of CPU power (typically many times more CPU than executing the query itself uses), it also tends to lock portions of the library cache for relatively long periods. The more people you have parsing queries, the longer the waits for the library cache latch become, and the system slowly grinds to a halt.

Bind variables and their use are a good example of why you have to test to scale. In a single-user system, one user parsing queries that do not use bind variables may not be noticeable. This single session will run slower than it could, but it will probably run 'fast enough'. It is when you start 10 or 100 of these sessions simultaneously that you discover the system grinds to a halt. You have scarce resources (CPU, the library cache, library cache latches) and you are overusing them all. By simply using bind variables, you reduce the demand on these resources many times over.
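One rough way to gauge how much of this parsing is going on, as a quick check rather than a definitive diagnosis, is to compare the cumulative parse statistics in V$SYSSTAT:

select name, value
  from v$sysstat
 where name in ( 'parse count (total)', 'parse count (hard)' );

If the hard parse count climbs nearly as fast as the total parse count while the application is running, bind variables are almost certainly not being used.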

I am going to demonstrate the power that bind variables have over the performance of your application once again. In Chapter 1 on Developing Successful Oracle Applications, I showed this for a single session - not using bind variables made the application run slower. In that section, we saw that a block of code written without bind variables took 15 seconds to execute; the same exact code rewritten to use bind variables ran in 1.5 seconds. Here, I will show you the effect of not using bind variables in a multi-user situation. We already understand that our code will definitely run slower without bind variables - now let's measure what it will do to our scalability.

For this test, I will use the following tables. Note that access to the V$SESSION_EVENT view must be available for this example to work - you may need to have SELECT on V$SESSION_EVENT granted to you. Additionally, the SESSION or SYSTEM parameter TIMED_STATISTICS must be enabled for the numbers to be meaningful (the times will be zero otherwise). This may be accomplished via the command ALTER SESSION SET TIMED_STATISTICS=TRUE. We'll start by creating a global temporary table, SESS_EVENT, that our session will use to hold 'before values' of the events our session has waited on. This SESS_EVENT table will be used to measure our session's wait events (what it waited on), how many times it waited for each event, and how long it waited in hundredths of seconds.

tkyte@TKYTE816> create global temporary table sess_event
  2  on commit preserve rows
  3  as
  4  select * from v$session_event where 1=0;

Table created.

Now, we'll create the 'application' table to test with:

tkyte@TKYTE816> create table t
  2  ( c1 int, c2 int, c3 int, c4 int )
  3  storage ( freelists 10 );

Table created.

I want to test the effects of multiple users inserting into this table concurrently. We saw in Chapter 6 on Tables the effect that multiple freelists can have on concurrent inserts, so we have already incorporated that into our design. Now we'll measure what sorts of wait events our 'application' experiences. We do this by taking a copy of our session's current wait events, running the block of code we wish to analyze, and then computing the waits that occurred during that block of code:

tkyte@TKYTE816> truncate table sess_event;

Table truncated.

tkyte@TKYTE816> insert into sess_event
  2  select * from v$session_event
  3  where sid = (select sid from v$mystat where rownum = 1);

3 rows created.

tkyte@TKYTE816> declare
  2      l_number number;
  3  begin
  4      for i in 1 .. 10000
  5      loop
  6          l_number := dbms_random.random;
  7
  8          execute immediate
  9          'insert into t values ( ' || l_number || ',' ||
 10                                       l_number || ',' ||
 11                                       l_number || ',' ||
 12                                       l_number || ')';
 13      end loop;
 14      commit;
 15  end;
 16  /

PL/SQL procedure successfully completed.

tkyte@TKYTE816> select a.event,
  2         (a.total_waits-nvl(b.total_waits,0)) total_waits,
  3         (a.time_waited-nvl(b.time_waited,0)) time_waited
  4    from ( select *
  5             from v$session_event
  6            where sid = (select sid from v$mystat where rownum = 1 )) a,
  7         sess_event b
  8   where a.event = b.event(+)
  9     and (a.total_waits-nvl(b.total_waits,0)) > 0
 10  /

EVENT                          TOTAL_WAITS TIME_WAITED
------------------------------ ----------- -----------
SQL*Net message from client              4          14
SQL*Net message to client                5           0
log file sync                            5           2

In this small test, we are building a unique INSERT statement that will look something like this:

insert into t values (12323, 12323, 12323, 12323);
insert into t values (632425, 632425, 632425, 632425);
...

The above results are from a single user run. When we run two of these at the same time, we'll see a wait report that looks more like this:

EVENT                                    TOTAL_WAITS TIME_WAITED
---------------------------------------- ----------- -----------
SQL*Net message from client                        4          18
SQL*Net message to client                          5           0
enqueue                                            2           0
latch free                                       142         235
log file sync                                      2           2

As you can see, this session waited many times for a latch free (and waited a long cumulative time). The other waits observed in this case - the SQL*Net messages, the enqueue wait, and the log file sync - were negligible by comparison.

The latch free event is the one we need to focus on in this case. This latch free is actually a latch on the shared SQL area - something I happen to know given the characteristics of the transaction we just executed. Further sleuthing in the V$ tables would confirm it (later in this section, we will discuss these V$ views in more detail). Since we now had two sessions doing 'hard parses' (parses of a query that has never been parsed before), contention for the shared SQL area was introduced. They both needed to modify a shared data structure, and only one session at a time can do this. The following chart shows the latch free wait event for 1, 2, 3, 4, and 5 sessions executing the above transaction simultaneously:

 

                    1 user    2 users    3 users    4 users    5 users
Waits                    0        102        267        385        542
Time (seconds)           0       1.56       5.92      10.72      16.67

You have to remember that the above information is per session - each of the sessions had that many waits and waited that long. With two users we had about three seconds of wait time in total, with three users about 18, with four about 40, and so on. At some point, as we add more users, we will spend more time waiting than actually processing. The more users we add, the more time we spend waiting, and eventually the waits become so long that adding users decreases not only our response times but also our overall throughput. How do we fix this? If we simply rewrite the block of code to use bind variables, like this:

tkyte@TKYTE816> declare
  2      l_number number;
  3  begin
  4      for i in 1 .. 10000
  5      loop
  6          l_number := dbms_random.random;
  7
  8          execute immediate
  9          'insert into t values ( :x1, :x2, :x3, :x4 )'
 10                  using l_number, l_number, l_number, l_number;
 11      end loop;
 12      commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.
Note 

To run this example on Oracle 8.1.5 please see the errata at the Apress web site, http://www.apress.com/

Using this, we notice a marked improvement:

 

                    1 user     2 users     3 users     4 users     5 users
Waits                    0    47 (46%)    65 (25%)    89 (23%)   113 (20%)
Time (seconds)           0  0.74 (47%)  1.29 (21%)  2.12 (19%)   3.0 (17%)

This is a dramatic improvement, but we can do better. With two users, the total number of waits for latch frees on the shared SQL area is down to 46 percent of the original count, as is the time waited. As we add more users, it gets even better. With five users, we are experiencing only 20 percent of the waits we had without bind variables and waiting only 17 percent of the time.

However, as I said, we can go one step better. In the above example, while we are avoiding the hard parse, we are still experiencing 'soft parses'. Each iteration through the loop must find the INSERT statement in the shared pool and verify that we can use it. The EXECUTE IMMEDIATE approach above is equivalent to coding:

loop
   parse
   bind
   execute
   close
end;

We would much prefer to code:

parse
loop
   bind
   execute
end;
close;
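For illustration only, here is a sketch of what that 'parse once, execute many' structure looks like when written procedurally with DBMS_SQL, using the same table T as above:

declare
    l_cursor  integer := dbms_sql.open_cursor;
    l_number  number;
    l_rows    integer;
begin
    -- parse exactly once, outside the loop
    dbms_sql.parse( l_cursor,
                    'insert into t values ( :x1, :x2, :x3, :x4 )',
                    dbms_sql.native );
    for i in 1 .. 10000
    loop
        l_number := dbms_random.random;
        -- bind and execute many times against the already parsed cursor
        dbms_sql.bind_variable( l_cursor, ':x1', l_number );
        dbms_sql.bind_variable( l_cursor, ':x2', l_number );
        dbms_sql.bind_variable( l_cursor, ':x3', l_number );
        dbms_sql.bind_variable( l_cursor, ':x4', l_number );
        l_rows := dbms_sql.execute( l_cursor );
    end loop;
    dbms_sql.close_cursor( l_cursor );
    commit;
end;
/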

In the above example, we could either use static SQL in PL/SQL, or use DBMS_SQL and do the dynamic SQL procedurally, as sketched above (see Chapter 16 on Dynamic SQL later for examples and all of the details on DBMS_SQL). I will use static SQL with bind variables, as follows:

tkyte@TKYTE816> declare
  2      l_number number;
  3  begin
  4      for i in 1 .. 10000
  5      loop
  6          l_number := dbms_random.random;
  7
  8          insert into t
  9                  values( l_number, l_number, l_number, l_number );
 10      end loop;
 11      commit;
 12  end;
 13  /

PL/SQL procedure successfully completed.

I know that PL/SQL will cache my cursor for me - that is one of the major advantages of PL/SQL. This insert would typically be soft parsed once per session for me if it were in a procedure; in an anonymous block it will be soft parsed once per block. Now, if we re-run the tests:

 

                    1 user    2 users    3 users    4 users    5 users
Waits                    0          1          1          7          7
Time (seconds)           0          0       0.01       0.10       0.08

For all intents and purposes, the latching is totally gone; it is barely measurable. The above demonstrates an important point:

Avoiding the soft parse of a SQL statement can pay back even more than using bind variables alone in some cases, as demonstrated above. Don't be so quick to close a cursor - if there is any chance you will reuse it, the overhead of keeping it open for the duration of your program is far outweighed by the performance gain of not having to re-parse it.

In Oracle 8.1.6, a new feature was introduced called CURSOR_SHARING. Cursor sharing is an 'auto binder' of sorts. It causes the database to rewrite your query using bind variables before parsing it. This feature will take a query such as:

scott@TKYTE816> select * from emp where ename = 'KING'; 

and will automatically rewrite it as:

select * from emp where ename =:SYS_B_0 

in 8.1.6, or as follows in 8.1.7:

select * from emp where ename =:"SYS_B_0" 
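If you want to see the rewritten text for yourself, it is visible in the shared pool after the query has been executed (assuming you have access to V$SQLAREA):

select sql_text
  from v$sqlarea
 where sql_text like 'select * from emp where ename =%';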

This is a step in the right direction, but it should not be the final solution to the problem, and should not be used long term. There are side effects of CURSOR_SHARING that you must be aware of as well. A poorly written program may experience a large gain in performance by setting CURSOR_SHARING=FORCE, but it will still be running slower than it should, and it will still be limited in the degree to which it can scale. As we observed above, we were able to reduce many of the waits, and the time spent waiting, by a large percentage simply by using bind variables; we could achieve the same exact results using CURSOR_SHARING=FORCE. It was not until we avoided the soft parses, however, that we were able to avoid 100 percent of the waits, and cursor sharing will not avoid those soft parses for us. It is a fact that if you can benefit greatly from CURSOR_SHARING, you are causing Oracle to parse many queries too frequently. If you are frequently causing many queries to be parsed, then adding CURSOR_SHARING will fix the bind variable issue, but you will still be left with the soft parse overhead. While not as egregious as the overhead of not using bind variables at all, high soft parse counts will limit your performance and scalability as well. The only correct solution is to use bind variables in the first place, and to reuse your cursors whenever possible. For example, if I were coding a Java application, I would never code a routine like this:

...
String getWordb(int ID, int IDLang, Connection conn) throws SQLException
{
    CallableStatement stmt = null;

    stmt = conn.prepareCall("{ call get.wordb (?,?,?)}");
    stmt.setInt(1,ID);
    stmt.setInt(2,IDLang);
    stmt.registerOutParameter (3, java.sql.Types.VARCHAR);
    stmt.execute();
    String word = stmt.getString (3);
    stmt.close();
    return word;
}
...

I would code it like this:

...
CallableStatement stmt = null;

String getWordb(int ID, int IDLang, Connection conn) throws SQLException
{
    if ( stmt == null )
    {
        stmt = conn.prepareCall("{call get.wordb (?,?,?)}");
        stmt.registerOutParameter (3, java.sql.Types.VARCHAR);
    }
    stmt.setInt(1,ID);
    stmt.setInt(2,IDLang);
    stmt.execute();
    return stmt.getString (3);
}
...

Here, I am ensuring the use of bind variables by using the placeholders in the statement. Additionally, I am parsing this statement at most once per program execution. This will make a huge difference in performance. In one test, I called the 'bad' and 'good' implementations 1000 times each. The bad implementation, with the soft parses for each call, took two and a half seconds to execute. The good implementation took one second. That was in single user mode. We know by now that adding additional users, each doing thousands of soft parses, will just slow each other down with latch waits during the soft parse. This is overhead we can, and must, avoid in our applications.

Now back to CURSOR_SHARING for a moment. I also said above that there are some side effects of CURSOR_SHARING that you must be aware of. They fall broadly into two categories: optimizer-related issues, where removing the constants deprives the optimizer of information it needs to choose the best plan (and can defeat features such as function-based indexes), and query output-related issues, where the sizes of the columns a query returns, as reported to the client, change unexpectedly.

We'll take a look at each of these issues in detail, but first let's take a closer look at the optimizer related issues. These will have direct performance related effects on your application. Consider the following simple example in which we will execute a query that uses a mixture of bind variables and character string constants. The performance of this query before CURSOR_SHARING is excellent. The performance after turning on CURSOR_SHARING is terrible. The reason for the performance hit is due to the fact that the optimizer has less information to work with. The optimizer, when it could see the character string constants, was able to determine that one index would be much more selective than another. When all of the constants were removed, the optimizer didn't have this foresight. This will be a common issue in many applications; it will not be a 'rare' occurrence. Most applications use a mixture of bind variables and constants in their SQL. Just as never using a bind variable is terrible, always using a bind variable can be just as bad. Here is the table setup for the example:

tkyte@TKYTE816> create table t as
  2  select * from all_objects;

Table created.

tkyte@TKYTE816> create index t_idx1 on t(OBJECT_NAME);

Index created.

tkyte@TKYTE816> create index t_idx2 on t(OBJECT_TYPE);

Index created.

tkyte@TKYTE816> analyze table t compute statistics
  2  for all indexed columns
  3  for table;

Table analyzed.

We have two indexes on this table. The index on OBJECT_TYPE is used for our query below. The index on OBJECT_NAME is used heavily by another application, so we need to have both indexes in place. The query we will execute against this table is:

select *
  from t
 where object_name like :search_str
   and object_type in( 'FUNCTION','PROCEDURE', 'TRIGGER' );

This query is designed to fill in a pop-up list in our application. From this list we would click on the procedure, function, or trigger we want to edit. This query is executed thousands of times a day.

If we take a look at the raw data in the table:

tkyte@TKYTE816> compute sum of cnt on report
tkyte@TKYTE816> break on report
tkyte@TKYTE816> select object_type, count(*) cnt from t group by
  2  object_type;

OBJECT_TYPE               CNT
------------------ ----------
CONSUMER GROUP              2
CONTEXT                     4
DIRECTORY                   1
FUNCTION                   27
INDEX                     301
INDEXTYPE                   3
JAVA CLASS               8926
JAVA DATA                 288
JAVA RESOURCE              69
JAVA SOURCE                 4
LIBRARY                    33
LOB                         3
OPERATOR                   15
PACKAGE                   229
PACKAGE BODY              212
PROCEDURE                  24
SEQUENCE                   35
SYNONYM                  9817
TABLE                     292
TRIGGER                     7
TYPE                      137
TYPE BODY                  12
UNDEFINED                   1
VIEW                     1340
                   ----------
sum                     21782

24 rows selected.

we can see that 58 rows would be retrieved by the IN clause (27 FUNCTIONs, 24 PROCEDUREs, and 7 TRIGGERs), and an indeterminate number of rows would be retrieved by the LIKE clause (anywhere from 0 to 21,782). If we run the query in SQL*PLUS like this:

tkyte@TKYTE816> variable search_str varchar2(25)
tkyte@TKYTE816> exec :search_str := '%';

PL/SQL procedure successfully completed.

tkyte@TKYTE816> set autotrace traceonly
tkyte@TKYTE816> select * from t t1 where object_name like :search_str
  2  and object_type in( 'FUNCTION','PROCEDURE', 'TRIGGER' );

58 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=3 Bytes=291)
   1    0   INLIST ITERATOR
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=5 Card=3 Bytes=291)
   3    2       INDEX (RANGE SCAN) OF 'T_IDX2' (NON-UNIQUE) (Cost=2 Card=3)

Statistics
----------------------------------------------------------
        222  recursive calls
          0  db block gets
         45  consistent gets
          3  physical reads
          0  redo size
       6930  bytes sent via SQL*Net to client
        762  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         58  rows processed

it is obvious, with our knowledge of the data, that the optimizer should definitely use the index on OBJECT_TYPE to find the 58 rows out of the 21,000 and apply the LIKE clause to them. It also shows there are times when using a constant is the right thing to do. In this particular case, we would rather not use a bind variable - we would rather use a constant. The optimizer can ascertain that this query will benefit greatly from one index over the other. If we do this:

tkyte@TKYTE816> alter session set cursor_sharing = force;

Session altered.

tkyte@TKYTE816> select * from t t2 where object_name like :search_str
  2  and object_type in( 'FUNCTION','PROCEDURE', 'TRIGGER' );

58 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=3 Bytes=291)
   1    0   INLIST ITERATOR
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=5 Card=3 Bytes=291)
   3    2       INDEX (RANGE SCAN) OF 'T_IDX2' (NON-UNIQUE) (Cost=2 Card=3)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      19256  consistent gets
        169  physical reads
          0  redo size
       7480  bytes sent via SQL*Net to client
        762  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
         58  rows processed

Although the optimizer plan does not appear to have changed (AUTOTRACE is showing the same exact plan), the difference in consistent gets (logical reads) is significant - an indication that something did change. In fact, the database really ran the query:

select *
  from t t2
 where object_name like :search_str
   and object_type in( :SYS_B_0,:SYS_B_1, :SYS_B_2 )

and it was no longer able to determine the number of rows it would access via the index on OBJECT_TYPE. The above example also demonstrates how CURSOR_SHARING can make a system harder to tune. The explain plan SQL*PLUS printed out leads me to believe we performed an index read on T_IDX2, but if you look at the consistent gets (logical reads), we see 19,256 of them. The first query, which really did do an index scan on T_IDX2, processed 45 blocks. Here, the explain plan used by AUTOTRACE is tricked into giving us the wrong plan; it is not aware of the true query being executed. I turned on SQL_TRACE instead (more on that in the next section), and for the two queries we can then clearly see:

select * from t t1 where object_name like :search_str and object_type
in( 'FUNCTION','PROCEDURE', 'TRIGGER' )

call     count       cpu    elapsed  disk      query    current        rows
------- ------  -------- ---------- ----- ---------- ----------  ----------
Parse        1      0.00       0.00     0          0          0           0
Execute      1      0.00       0.00     0          0          0           0
Fetch        5      0.01       0.09    14         34          0          58
------- ------  -------- ---------- ----- ---------- ----------  ----------
total        7      0.01       0.09    14         34          0          58

Rows     Row Source Operation
-------  ---------------------------------------------------
     58  INLIST ITERATOR
     58   TABLE ACCESS BY INDEX ROWID T
     61    INDEX RANGE SCAN (object id 25244)


select * from t t2 where object_name like :search_str and object_type
in( :SYS_B_0,:SYS_B_1, :SYS_B_2 )

call     count       cpu    elapsed  disk      query    current        rows
------- ------  -------- ---------- ----- ---------- ----------  ----------
Parse        1      0.00       0.00     0          0          0           0
Execute      1      0.00       0.00     0          0          0           0
Fetch        5      0.15       1.77   255      19256          0          58
------- ------  -------- ---------- ----- ---------- ----------  ----------
total        7      0.15       1.77   255      19256          0          58

Rows     Row Source Operation
-------  ---------------------------------------------------
     58  TABLE ACCESS BY INDEX ROWID T
  21783   INDEX RANGE SCAN (object id 25243)

SQL_TRACE and TKPROF are able to show us what really happened here. The second query was executed using the other index (as evidenced by the differing object IDs), which in this case was the wrong plan. Our query takes 15 to 20 times longer to execute and accesses an incredible amount of data. This problem of over binding will show up in many applications that use a mixture of constants and bind variables. Without the additional information provided by the constants in the query, the optimizer may make the wrong decision. Only by correctly using bind variables where we intend to, and using constants when we must, will we achieve a fine balance. CURSOR_SHARING is a temporary crutch that may get you started, but should not be relied on long term for this reason.
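For reference, the TKPROF report above was produced with nothing more exotic than session-level tracing; a typical sequence, sketched here (the trace file name varies by platform and instance), is:

alter session set timed_statistics = true;
alter session set sql_trace = true;

-- run the queries of interest, then:
alter session set sql_trace = false;

-- from the operating system, format the trace file written to
-- USER_DUMP_DEST, for example:
--   tkprof <trace file> report.txt

The mechanics of SQL_TRACE and TKPROF are covered in detail in the next section.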

In addition to the optimizer being tricked as above, the use of CURSOR_SHARING may affect other Oracle features. For example, if you refer back to Chapter 7 on Indexes, there is a feature called function-based indexes, which lets me effectively create an index on a function. Using the same example we used in that chapter, where we created an index like this:

tkyte@TKYTE816> create index test_soundex_idx on
  2             test_soundex( substr(my_soundex(name),1,6) )
  3  /

Index created.

We would find that CURSOR_SHARING would remove the ability of this query:

tkyte@TKYTE816> select name
  2    from test_soundex C
  3   where substr(my_soundex(name),1,6) = my_soundex( 'FILE$' )
  4  /

To use the index, since the literal values 1 and 6 would be replaced by bind variables. Here, we can overcome the issue by 'hiding' the constants in a view, but it is an issue to consider nonetheless. What other 'surprise' incompatibilities will be out there?
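As a sketch of what 'hiding' the constants in a view means here (the view name is made up for illustration), the literals 1 and 6 move into the stored view text, which is not subject to the literal replacement done by CURSOR_SHARING:

create or replace view test_soundex_v
as
select name,
       substr( my_soundex(name), 1, 6 ) name_soundex
  from test_soundex;

select name
  from test_soundex_v
 where name_soundex = my_soundex( 'FILE$' );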

Another side effect of CURSOR_SHARING is the possible unexpected change in the size of columns returned by a query. Consider the following example where I display the size of columns returned by a query before and after CURSOR_SHARING. This example uses DBMS_SQL to dynamically parse and describe a query. It will print out the sizes of the columns as reported to the application by Oracle:

tkyte@TKYTE816> declare
  2      l_theCursor     integer default dbms_sql.open_cursor;
  3      l_descTbl       dbms_sql.desc_tab;
  4      l_colCnt        number;
  5  begin
  6      execute immediate 'alter session set cursor_sharing=exact';
  7      dbms_output.put_line( 'Without Cursor Sharing:' );
  8      for i in 1 .. 2
  9      loop
 10          dbms_sql.parse(  l_theCursor,
 11                          'select substr( object_name, 1, 5 ) c1,
 12                                  55 c2,
 13                                 ''Hello'' c3
 14                            from all_objects t'||i,
 15                          dbms_sql.native );
 16
 17          dbms_sql.describe_columns( l_theCursor,
 18                                         l_colCnt, l_descTbl );
 19
 20          for i in 1 .. l_colCnt loop
 21              dbms_output.put_line( 'Column ' ||
 22                                    l_descTbl(i).col_name ||
 23                                    ' has a length of ' ||
 24                                    l_descTbl(i).col_max_len ) ;
 25          end loop;
 26          execute immediate 'alter session set cursor_sharing=force';
 27          dbms_output.put_line( 'With Cursor Sharing:' );
 28      end loop;
 29
 30      dbms_sql.close_cursor(l_theCursor);
 31      execute immediate 'alter session set cursor_sharing=exact';
 32  end;
 33  /
Without Cursor Sharing:
Column C1 has a length of 5
Column C2 has a length of 2
Column C3 has a length of 5
With Cursor Sharing:
Column C1 has a length of 30
Column C2 has a length of 22
Column C3 has a length of 32

PL/SQL procedure successfully completed.

The reason column one went from a length of 5 to a length of 30 is that the function SUBSTR(OBJECT_NAME, 1, 5) was rewritten as SUBSTR(OBJECT_NAME, :SYS_B_0, :SYS_B_1). The database no longer knew that the maximum length this function could possibly return was 5; as far as it was concerned, it could now be up to 30 (the length of OBJECT_NAME). The length of column two went from 2 to 22 because the database no longer knew that the number 55 would be returned - only that a number would be returned, and numbers can be up to 22 bytes in length. The last column simply defaulted to a value - if HELLO had been larger, the default would have been larger as well (for example, for a 35 byte string, the default would be 128 bytes).

You might say 'so what, the size of the returned data won't be different, just the reported length?' Well, the problem will come in with any SQL*PLUS scripts that you have, any reports you run using various tools, anything that relies on the database to tell it the length of the column for formatting will be adversely affected. The output from these applications will be different than it was without CURSOR_SHARING; your finely laid out reports will have their formatting destroyed in many cases. Just consider what impact this might have on an existing suite of applications! The side effect is very easy to observe:

tkyte@TKYTE816> select substr(object_name,1,2)
  2    from all_objects t1
  3   where rownum = 1
  4  /

SU
--
/1

tkyte@TKYTE816> alter session set cursor_sharing = force;

Session altered.

tkyte@TKYTE816> select substr(object_name,1,2)
  2    from all_objects t2
  3   where rownum = 1
  4  /

SUBSTR(OBJECT_NAME,1,2)
------------------------------
/1

SQL*PLUS went from 2 characters per column to 30 characters per column. This would adversely affect reports that have been running successfully.

Am I Using Bind Variables?

One question I get, usually right after I ask 'are you using bind variables?', is 'how can I tell if I am?' Fortunately, figuring this out is pretty straightforward; all of the information we need is in the shared pool.

I've set up a script that I use (and pass around) frequently, which points out statements that would look the same if only they had used bind variables. In order to show you how this script works, I'll artificially fill up my shared pool with 'bad' SQL that doesn't use bind variables:

tkyte@TKYTE816> create table t ( x int );

Table created.

tkyte@TKYTE816> begin
  2      for i in 1 .. 100
  3      loop
  4          execute immediate 'insert into t values ( ' || i || ')';
  5      end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

Now, I'm ready for the script. It starts by creating a function that removes constants from strings. It will take SQL statements such as:

insert into t values ( 'hello', 55 );
insert into t values ( 'world', 66 );

And turn them into:

insert into t values ( '#', @ ); 

All statements that could look the same if they used bind variables will be clearly visible to us - both of the above unique inserts become the same insert statement after this substitution. The function to do this transformation for us is the following:

tkyte@TKYTE816> create or replace
  2  function remove_constants( p_query in varchar2 )
  3  return varchar2
  4  as
  5      l_query long;
  6      l_char  varchar2(1);
  7      l_in_quotes boolean default FALSE;
  8  begin
  9      for i in 1 .. length( p_query )
 10      loop
 11          l_char := substr(p_query,i,1);
 12          if ( l_char = '''' and l_in_quotes )
 13          then
 14              l_in_quotes := FALSE;
 15          elsif ( l_char = '''' and NOT l_in_quotes )
 16          then
 17              l_in_quotes := TRUE;
 18              l_query := l_query || '''#';
 19          end if;
 20          if ( NOT l_in_quotes ) then
 21              l_query := l_query || l_char;
 22          end if;
 23      end loop;
 24      l_query := translate( l_query, '0123456789', '@@@@@@@@@@' );
 25      for i in 0 .. 8 loop
 26          l_query := replace( l_query, lpad('@',10-i,'@'), '@' );
 27          l_query := replace( l_query, lpad(' ',10-i,' '), ' ' );
 28      end loop;
 29      return upper(l_query);
 30  end;
 31  /

Function created.
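As a quick sanity check, you can call the function directly; for example:

select remove_constants( 'insert into t values ( ''hello'', 55 )' ) txt
  from dual;

should come back as INSERT INTO T VALUES ( '#', @ ), matching the transformation described above.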

For the main body of the script, we'll make a copy of the V$SQLAREA view - it is expensive to query, and we only want to query it once. We copy it into a temporary table, so we can work on its contents:

tkyte@TKYTE816> create global temporary table sql_area_tmp
  2  on commit preserve rows
  3  as
  4  select sql_text, sql_text sql_text_wo_constants
  5    from v$sqlarea
  6   where 1=0
  7  /

Table created.

tkyte@TKYTE816> insert into sql_area_tmp (sql_text)
  2  select sql_text from v$sqlarea
  3  /

436 rows created.

We go through and update each and every row in this table to compute the transformed SQL_TEXT, removing the constants:

tkyte@TKYTE816> update sql_area_tmp
  2     set sql_text_wo_constants = remove_constants(sql_text)
  3  /

436 rows updated.

and now we are ready to find the 'bad' SQL:

tkyte@TKYTE816> select sql_text_wo_constants, count(*)
  2    from sql_area_tmp
  3   group by sql_text_wo_constants
  4  having count(*) > 10
  5   order by 2
  6  /

SQL_TEXT_WO_CONSTANTS            COUNT(*)
------------------------------ ----------
INSERT INTO T VALUES ( @)             100

This clearly shows that there are 100 INSERT statements in my shared pool that differ only in one numeric value in the VALUES clause. This would most likely indicate someone forgot to use bind variables. There are some legitimate cases where you might have a reasonable number of copies of a SQL statement in the shared pool - for example, there might be five tables named T in the database. Once we determine there is no legitimate cause, we must track down that person, teach them the proper way, and make them fix it. I count this as a bug in the program, not as something we have to live with - it must be corrected.
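Once the report flags a suspicious pattern, it helps to see the literal statements behind it. The following query is a small follow-up sketch against the SQL_AREA_TMP table built above; the normalized text in the predicate must be copied exactly as it appears in the report, spacing included:

select sql_text
  from sql_area_tmp
 where sql_text_wo_constants = 'INSERT INTO T VALUES ( @)'
   and rownum <= 5;

A handful of the literal statements is usually enough to identify the offending module and the developer responsible for it.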

Bind Variables and Parsing Wrap-Up

In this section, we discussed the importance of using bind variables in your application, as well as the desire to minimize the number of times you parse a query. We looked at a feature, CURSOR_SHARING, which would appear to be a panacea for these issues, only to discover this is not entirely the case. CURSOR_SHARING may be used as a temporary stop-gap solution to particular application performance issues, but ultimately the only way to achieve all of the performance and scalability you can is to implement your applications correctly.

I cannot stress this point enough. I have personally seen more than one system fail due to not taking heed of the above facts. As I said in the very beginning of this book, if I were to write a book on how to build a non-scalable, non-performant Oracle application, it would have one chapter that simply said 'don't use bind variables.' Using bind variables and good parsing techniques in your applications won't guarantee scalability, but not using them will assure the opposite.
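To make the fix concrete, here is how the literal-building loop used earlier to flood the shared pool could be rewritten with a single bind variable; this is only a minimal sketch using the same table T:

begin
    for i in 1 .. 100
    loop
        -- one shared statement in the library cache; only the bind value changes
        execute immediate 'insert into t values ( :x )' using i;
    end loop;
end;
/

Run against the REMOVE_CONSTANTS report above, all 100 executions now collapse into a single entry in the shared pool.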

SQL_TRACE, TIMED_STATISTICS, and TKPROF

SQL_TRACE, TIMED_STATISTICS, and TKPROF are some of my favorite tools. I have used them countless times to determine where the performance issue in a system lies. Given that, many times, one is tuning a system they did not write, knowing where to look is difficult. These settings and tools are a great place to start.

In a nutshell, SQL_TRACE enables the logging of all the SQL your application performs, performance statistics regarding the execution of that SQL, and the query plans your SQL actually used. As demonstrated in the previous section on CURSOR_SHARING, AUTOTRACE showed the wrong plan while SQL_TRACE and TKPROF got it right. TIMED_STATISTICS is a parameter that enables the server to tell us how long each step takes. Finally, TKPROF is a simple program used to format the raw trace file into something more readable. What I will do in this section is show you how to use SQL_TRACE and TKPROF, and explain the meaning of what is included in the files used by these facilities. I will not be describing how to take a particular query and tune it so much as showing how to use these tools to find the queries to tune. For more information on tuning individual queries, I recommend the Oracle8i Designing and Tuning for Performance Manual. It covers the various access plans queries may use, how to use hints to tune queries, and so on.

The parameter TIMED_STATISTICS controls whether Oracle will collect timing information for various activities in the database. It has two settings, TRUE and FALSE. This feature is so useful, I generally leave it set to TRUE even when not tuning - its performance impact on a database is negligible in general (there is an issue in Oracle 8.1.5 whereby shared SQL might be defeated if TIMED_STATISTICS is set to TRUE). It may be set at either the SYSTEM or the SESSION level, and may be 'globally' set in the initialization file for the database. If you put it in the INIT.ORA file for your instance, you will simply add:

timed_statistics=true 

to the init.ora file; the next time you restart the database, it will be enabled. Alternatively, to enable it for your session, you may issue:

tkyte@TKYTE816> alter session set timed_statistics=true;

Session altered.

And to turn this on for the entire system:

tkyte@TKYTE816> alter system set timed_statistics=true;

System altered.

As I said, this is so useful to have, that I just leave it on all of the time by setting it to TRUE in my init.ora parameter file. The performance overhead due to this is not measurable and the effect of not having this information is that you cannot monitor performance at all.

Setting Up Tracing

SQL_TRACE may also be enabled at the system or session level. It generates so much output and is such a performance impact that you will almost always selectively enable it - you will rarely, if ever, enable it for the system in the init.ora file. SQL_TRACE has two settings as well, TRUE and FALSE. If set to TRUE, it will generate trace files in the directory specified by the init.ora parameter USER_DUMP_DEST when using dedicated servers to connect to Oracle, and BACKGROUND_DUMP_DEST when using a multi-threaded server (MTS) connection. I would recommend never attempting to use SQL_TRACE with MTS, however, as the output from your session's queries will be written to many different trace files as your session migrates from shared server to shared server. Under MTS, interpreting SQL_TRACE results is nearly impossible. Another important init.ora parameter is MAX_DUMP_FILE_SIZE. This limits the maximum size of a trace file Oracle will generate. If you discover that your trace files are truncated, you will need to increase this setting. It may be changed via an ALTER SYSTEM or ALTER SESSION command. MAX_DUMP_FILE_SIZE may be specified in one of three ways:

• A numerical value, interpreted as a number of operating system blocks.

• A number followed by a K or M suffix, specifying the size in kilobytes or megabytes.

• The string UNLIMITED, meaning there is no upper bound at all.

I do not recommend UNLIMITED - it is far too easy to completely fill up a file system in this manner; a setting of 50 to 100 MB should be more than sufficient.

What are the various ways to enable SQL_TRACE? There are quite a few, but the ones I use mostly are:

• ALTER SESSION SET SQL_TRACE=TRUE|FALSE - executing this SQL enables or disables tracing in your current session.

• SYS.DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION - this packaged procedure allows us to turn SQL_TRACE on and off in any existing session, given its SID and SERIAL# from V$SESSION.

• ALTER SESSION SET EVENTS - we can set an event to enable tracing with more information than is normally available via ALTER SESSION SET SQL_TRACE.

There are other methods as well; however, the above three are the ones I use and see used most often. The ALTER SESSION SET SQL_TRACE and SYS.DBMS_SYSTEM methods of setting SQL_TRACE on are straightforward and self-explanatory. The EVENT method, however, is a little more obscure. It uses an internal (and undocumented) event facility within Oracle. In short, the commands you would use look like this:

alter session set events '10046 trace name context forever, level <n>';
alter session set events '10046 trace name context off';

Where <n> is one of the following values:

• 1 - Enable the standard SQL_TRACE facility, no different from setting SQL_TRACE=TRUE.

• 4 - Enable standard SQL_TRACE, but also capture bind variable values in the trace file.

• 8 - Enable standard SQL_TRACE, but also capture wait events at the query level in the trace file.

• 12 - Enable standard SQL_TRACE and include both bind variables and wait events.

Using the DBMS_SUPPORT package is another method of setting SQL_TRACE with bind variable and wait event support. In order to get DBMS_SUPPORT, however, you must contact Oracle Support, as it is not delivered as part of the normal installation. Since it is simply an interface on top of the ALTER SESSION SET EVENTS command above, using the ALTER command is perhaps easier.

Now that we know how to turn SQL_TRACE on, the question becomes: how can we best make use of it? I like to have a switch I can send to my applications on the command line or via a URL (if they are on the web) that tells them to turn tracing on. This lets me easily capture SQL_TRACE information for a single session. Many Oracle tools allow you to do this as well. For example, if you use Oracle Forms, you can execute:

C:\> ifrun60 module=myform userid=scott/tiger statistics=yes 

The STATISTICS=YES is a flag to Forms telling it to issue ALTER SESSION SET SQL_TRACE=TRUE. If all of your applications do the same, you will find that tuning that application is easier. You can ask the single user who is having a performance issue to run with tracing enabled, and then ask them to reproduce the performance issue. You'll have all of the information you need to figure out why it is running slow. You won't have to ask them for the steps to reproduce the slowdown - you'll ask them to reproduce the slowdown and just analyze the results of it. If you trace with bind variables and wait events, you'll have more than enough information to figure out what is going wrong.
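One simple way to wire such a switch into your own applications is a small helper procedure that the application calls when it sees the flag or URL parameter. This is only a sketch - the procedure name is illustrative, and it assumes the schema has the ALTER SESSION privilege granted directly:

create or replace procedure set_trace( p_on in boolean )
as
begin
    if ( p_on )
    then
        execute immediate 'alter session set sql_trace=true';
    else
        execute immediate 'alter session set sql_trace=false';
    end if;
end;
/

The application simply calls set_trace(TRUE) at startup when the trace flag is present, and every statement it subsequently issues in that session is recorded.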

If you have an application provided by a third party, or you have existing applications that are not SQL_TRACE-enabled, what can you do to trace them? There are two approaches I take. One approach, if the application is a client-server application and stays connected to the database, is to have the session you want to trace start up the application and log into the database. Then, by querying V$SESSION, you would determine that session's SID and SERIAL#. Now you can call SYS.DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION to enable tracing in that single session. Today, however, many applications are web-based and this trick does not work as well. The sessions are very short and they come and go frequently. What we need is the ability to set SQL_TRACE on for a 'user' - whenever this user is in the database, we need SQL_TRACE set on for them. Fortunately, we can do that via the LOGON DDL trigger in the database. For example, a trigger I frequently use in Oracle 8i (database event triggers such as AFTER LOGON are a new feature of Oracle 8.1.5 and up) is:

create or replace trigger logon_trigger
after logon on database
begin
  if ( user = 'TKYTE' ) then
    execute immediate
    'ALTER SESSION SET EVENTS ''10046 TRACE NAME CONTEXT FOREVER, LEVEL 4''';
  end if;
end;
/

This will enable tracing any time I log on to the database. The application does not have to participate in the setting of SQL_TRACE - we'll do this ourselves.
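For the client-server case mentioned above, the session-level approach looks something like the following sketch; the SID and SERIAL# values here are purely illustrative and would come from your own V$SESSION query:

-- find the target session
select sid, serial#, username
  from v$session
 where username = 'SCOTT';

-- then, from a suitably privileged account, turn tracing on in that session
begin
    sys.dbms_system.set_sql_trace_in_session( 8, 13607, TRUE );
end;
/

Passing FALSE as the third parameter turns tracing off again once you have captured enough.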

Using and Interpreting TKPROF Output

TKPROF is nothing more than a simple command line tool to format a raw trace file into something we can read easily. It is an excellent utility that is not used enough. I believe this is mostly due to ignorance of its existence. Now that you know it exists, I would expect you would use it constantly.

What I'll do in this section is to run a query with tracing enabled. We'll take a detailed look at the TKPROF report and learn what we need to look for in these reports:

tkyte@TKYTE816> show parameter timed_statistics;

NAME                                 TYPE    VALUE
------------------------------------ ------- ------------------------------
timed_statistics                     boolean TRUE

tkyte@TKYTE816> alter session set sql_trace=true;

Session altered.

tkyte@TKYTE816> select owner, count(*)
  2  from all_objects
  3  group by owner;

OWNER                            COUNT(*)
------------------------------ ----------
CTXSYS                                185
DBSNMP                                  4
DEMO                                    5
DEMO11                                  3
MDSYS                                 176
MV_USER                                 5
ORDPLUGINS                             26
ORDSYS                                206
PUBLIC                               9796
SCOTT                                  18
SEAPARK                                 3
SYS                                 11279
SYSTEM                                 51
TKYTE                                  32
TYPES                                   3

15 rows selected.

tkyte@TKYTE816> select a.spid
  2    from v$process a, v$session b
  3   where a.addr = b.paddr
  4     and b.audsid = userenv('sessionid')
  5  /

SPID
---------
1124

Here I verified that timed statistics was enabled (it is almost useless to execute SQL_TRACE without it) and then enabled SQL_TRACE. I then ran the query I wanted to analyze. Finally, I ran a query to get my SPID (server process ID) - this is very useful for identifying my trace file. After I executed this query, I exited SQL*PLUS and went to the directory on the database server specified by the USER_DUMP_DEST init.ora parameter. You can retrieve the value of this parameter online by querying the V$PARAMETER view, or by using DBMS_UTILITY (which does not require access to the V$PARAMETER view):

tkyte@TKYTE816> declare
  2  l_intval number;
  3  l_strval varchar2(2000);
  4  l_type   number;
  5  begin
  6     l_type := dbms_utility.get_parameter_value
  7                         ('user_dump_dest', l_intval, l_strval);
  8     dbms_output.put_line(l_strval );
  9  end;
 10  /
C:\oracle\admin\tkyte816\udump

PL/SQL procedure successfully completed.

In this directory, I found:

C:\oracle\ADMIN\tkyte816\udump>dir
 Volume in drive C has no label.
 Volume Serial Number is F455-B3C3

 Directory of C:\oracle\ADMIN\tkyte816\udump

03/16/2001  02:55p      <DIR>          .
03/16/2001  02:55p      <DIR>          ..
03/16/2001  08:45a               5,114 ORA00860.TRC
03/16/2001  02:52p               3,630 ORA01112.TRC
03/16/2001  02:53p               6,183 ORA01124.TRC
               3 File(s)         14,927 bytes
               2 Dir(s)  13,383,999,488 bytes free

A few trace files - this is where the SPID comes in handy. My trace file is ORA01124.TRC; I know that because the SPID is part of the filename. On UNIX, a similar naming convention is used that also incorporates the SPID. One issue with trace files is that the files in this directory may not be readable by you if you are not in the administrative group for Oracle (for example, the DBA group in UNIX). If not, you should ask your DBA to set:

_trace_files_public = true 

in the init.ora file of your test and development machines. This will allow all users to read trace files on the server. You should not use this setting on a production machine, as these files can contain sensitive information; on a test or development platform, it is safe to use. Notice that this init.ora parameter starts with an underscore; it is undocumented and unsupported by Oracle Corporation. Again, like the EVENTS command we will use later, it is general knowledge and widely used - search Google or any other search engine for _trace_files_public, and you'll find many references to this parameter.

Now that we have identified our trace file, we need to format it. We can (and will, further on in this section) read the raw trace file. About 90 percent of the information we need is easily retrieved from a nicely formatted report however. The remaining 10 percent of the information is typically not needed, and when it is, we'll have to read it from the trace file itself. In order to format the trace file, we will use the TKPROF command line utility. In its simplest form we will just execute:

C:\oracle\ADMIN\tkyte816\udump>tkprof ora01124.trc report.txt

TKPROF: Release 8.1.6.0.0 - Production on Fri Mar 16 15:04:33 2001

(c) Copyright 1999 Oracle Corporation.  All rights reserved.

The parameters to the TKPROF command are the input file name and the output file name. Now, we just need to edit REPORT.TXT, and we will find the following information:

select owner, count(*)
from all_objects
group by owner

call     count       cpu    elapsed  disk      query    current        rows
------- ------  -------- ---------- ----- ---------- ----------  ----------
Parse        1      0.00       0.00     0          0          0           0
Execute      1      0.00       0.00     0          0          0           0
Fetch        2      1.20       1.21     0      86091          4          15
------- ------  -------- ---------- ----- ---------- ----------  ----------
total        4      1.20       1.21     0      86091          4          15

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 69

Rows     Row Source Operation
-------  ---------------------------------------------------
     15  SORT GROUP BY
  21792   FILTER
  21932    NESTED LOOPS
     46     TABLE ACCESS FULL USER$
  21976     TABLE ACCESS BY INDEX ROWID OBJ$
  21976      INDEX RANGE SCAN (object id 34)
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      0    FIXED TABLE FULL X$KZSPR
      0    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
  11777    NESTED LOOPS
  30159     FIXED TABLE FULL X$KZSRO
  28971     TABLE ACCESS BY INDEX ROWID OBJAUTH$
  28973      INDEX RANGE SCAN (object id 101)
    631    TABLE ACCESS BY INDEX ROWID IND$
    654     INDEX UNIQUE SCAN (object id 36)

TKPROF is showing us a lot of information here. We'll take it piece by piece:

select owner, count(*)
from all_objects
group by owner

First, we see the original query as the server received it. We should be able to recognize our queries easily in here. In this case, it is exactly as I had typed it in. Next comes the overall execution report for this query:

call     count       cpu    elapsed  disk      query    current        rows
------- ------  -------- ---------- ----- ---------- ----------  ----------
Parse        1      0.00       0.00     0          0          0           0
Execute      1      0.00       0.00     0          0          0           0
Fetch        2      1.20       1.21     0      86091          4          15
------- ------  -------- ---------- ----- ---------- ----------  ----------
total        4      1.20       1.21     0      86091          4          15

Here we see the three main phases of the query:

• PARSE - Oracle finds the query in the shared pool (a soft parse), or, failing that, builds a new plan for it (a hard parse).

• EXECUTE - the work done when the statement is opened or executed. For a SELECT this is often empty; for an UPDATE this is where all of the work is done.

• FETCH - for a SELECT, this is where most of the work is done and is visible; an UPDATE shows no fetches (you do not FETCH from an UPDATE).

The column headings in this section have the following meanings:

• CALL - the phase of query processing the row refers to: PARSE, EXECUTE, or FETCH, plus a total line.

• COUNT - how many times that phase occurred; for example, how many times the query was parsed or fetched from.

• CPU - the CPU time, in seconds, spent on that phase (only populated when TIMED_STATISTICS is enabled).

• ELAPSED - the wall clock time, in seconds, spent on that phase (again, only populated when TIMED_STATISTICS is enabled).

• DISK - the number of physical reads from disk performed during that phase.

• QUERY - the number of blocks accessed in consistent-read (query) mode.

• CURRENT - the number of blocks accessed in current mode, typically blocks read in order to be modified.

• ROWS - the number of rows processed or returned by that phase; a SELECT shows its rows in the FETCH phase, an UPDATE in the EXECUTE phase.

The important things to look for in this section of the report are as follows:

A high (near 100 percent) parse count to execute count ratio when the execute count is greater than one - here, you take the number of times you parsed this statement and divide it by the number of times you executed it. If this ratio is one, then you parsed this query each and every time you executed it, and that needs to be corrected. We would like this ratio to approach zero; ideally, the parse count would be one and the execute count would be higher than one. If we see a high parse count, this implies we performed many soft parses of this query. If you recall from the previous section, this can drastically reduce your scalability, and will impact your run-time performance even in a single user session. You should ensure that you parse a query once per session and execute it repeatedly - you never want to have to parse your SQL for each execution (see the DBMS_SQL sketch after this list for one way to structure that).

Execute count of one for all or nearly all SQL - if you have a TKPROF report in which all SQL statements are executed one time only, you are probably not using bind variables (the queries all look alike, except for some constant in them). In a real application trace, we would expect very little 'unique' SQL; the same SQL should be executed more than once. Too much unique SQL typically implies you are not using bind variables correctly.

A large disparity between CPU time and elapsed time - this indicates that you spent a lot of time waiting for something. If you see that it took one CPU second to execute, but ten seconds by the wall clock, it means you spent 90 percent of your run-time waiting for a resource. We'll see later in this section how we can use the raw trace file to determine the cause of the wait. This wait could be for any number of reasons - for example, an update that was blocked by another session would have a very large elapsed time versus CPU time. A SQL query that performs lots of physical disk I/O might have lots of wait time for the I/O to complete.

A large CPU or elapsed time number C These are your queries that represent the 'lowest hanging fruit'. If you can make them go faster, your program will go faster. Many times, there is one monster query gumming up the works; fix this and the application works just fine.

A high (FETCH COUNT)/(rows fetched) ratio - here we take the number of FETCH calls (two in our example) and the rows fetched count (15 in our example). If this ratio is near one and the number of rows fetched is greater than one, our application is not performing bulk fetches. Every language/API has the ability to do this - to fetch many rows at a time in a single call. If you do not utilize this ability, you will spend much more time performing round trips from the client to the server than you should. This excessive switching back and forth, in addition to generating a very 'chatty' network situation, is much slower than fetching many rows in one call. How you direct your application to bulk fetch is language/API-dependent. For example, in Pro*C you would pre-compile with prefetch=NN, in Java/JDBC you would call the SETROWPREFETCH method, in PL/SQL you would use the BULK COLLECT directive in a SELECT INTO (see the sketch after this list), and so on. The above example shows that SQL*PLUS (the client we used) called fetch twice in order to retrieve 15 rows, which means SQL*PLUS used an array size of at least eight rows. In fact, SQL*PLUS uses an array size of 15 rows by default - the second fetch it made returned zero records; it just got the end-of-file.

An excessively high disk count - this is harder to evaluate as a rule of thumb; however, if DISK COUNT = QUERY + CURRENT MODE BLOCK COUNT, then all blocks were read from disk. We would hope that if the same query were executed again, some of the blocks would be found in the SGA. You should consider a high disk count value to be a red flag, something to investigate. You might have to do some SGA resizing, or work on the query to develop one that requires fewer block reads.

An excessively high query or current count - this indicates your query does a lot of work. Whether this is an issue or not is subjective. Some queries just hit a lot of data, as our example above does. A query that is executed frequently, however, should have relatively small counts. If you add the query and current mode blocks from the total line and divide that by the count column from the execute line, you would expect a small number. In our example that works out to (86091 + 4) / 1, roughly 86,000 logical I/Os per execution - acceptable for an ad hoc query against ALL_OBJECTS, but far too high for a statement executed many times per second.
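To illustrate the 'parse once, execute many' point from the list above, here is a minimal sketch using the supplied DBMS_SQL package against the table T from earlier; the statement is parsed a single time and only the bind and execute steps repeat:

declare
    l_cursor  integer := dbms_sql.open_cursor;
    l_status  integer;
begin
    -- parse once ...
    dbms_sql.parse( l_cursor,
                    'insert into t values ( :x )',
                    dbms_sql.native );

    -- ... bind and execute many times
    for i in 1 .. 100
    loop
        dbms_sql.bind_variable( l_cursor, ':x', i );
        l_status := dbms_sql.execute( l_cursor );
    end loop;

    dbms_sql.close_cursor( l_cursor );
end;
/

A TKPROF report of this block would show a parse count of one and an execute count of 100 for the INSERT.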

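And to illustrate the bulk fetch point, here is a minimal PL/SQL sketch of the BULK COLLECT directive mentioned above; it assumes the familiar EMP table and pulls every ENAME back in a single fetch call:

declare
    type ename_array is table of emp.ename%type index by binary_integer;
    l_enames  ename_array;
begin
    select ename
      bulk collect into l_enames
      from emp;

    dbms_output.put_line( l_enames.count || ' rows fetched in one call' );
end;
/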
Let's move on to the next part of the report:

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 69

This is telling us that the query we executed was found in the shared pool - we did not generate a miss on the library cache during this parse. It indicates that we performed a soft parse of the query. The very first time a query is executed, we would expect this count to be one. If almost every query you execute has a one for this value, it indicates you are not using bind variables (and you need to fix that); you are not reusing SQL.

The second line informs us of the optimizer mode that was in place during the execution of this query. This is informational only - the query plan developed and used would be affected by this setting.

Finally, the USERID used to parse the query is presented. This can be resolved into a USERNAME via:

tkyte@TKYTE816> select * from all_users where user_id = 69;

USERNAME                          USER_ID CREATED
------------------------------ ---------- ---------
TKYTE                                  69 10-MAR-01

showing that I ran it. The last section of the TKPROF report for this query is the query plan. The query plan that appears by default is shown below:

Rows     Row Source Operation
-------  ---------------------------------------------------
     15  SORT GROUP BY
  21792   FILTER
  21932    NESTED LOOPS
     46     TABLE ACCESS FULL USER$
  21976     TABLE ACCESS BY INDEX ROWID OBJ$
  21976      INDEX RANGE SCAN (object id 34)
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      0    FIXED TABLE FULL X$KZSPR
      0    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
      1    FIXED TABLE FULL X$KZSPR
  11777    NESTED LOOPS
  30159     FIXED TABLE FULL X$KZSRO
  28971     TABLE ACCESS BY INDEX ROWID OBJAUTH$
  28973      INDEX RANGE SCAN (object id 101)
    631    TABLE ACCESS BY INDEX ROWID IND$
    654     INDEX UNIQUE SCAN (object id 36)

This is the actual query plan that was used by Oracle at run-time. The interesting thing about this plan is that the rows that flow through each step of the plan are visible. We can see, for example, that 28971 rows were fetched from OBJAUTH$. These counts are the row counts of the rows flowing out of that step of the execution plan (after any predicates that could be applied were applied to OBJAUTH$, it sent 28971 rows on to the next step in the plan). In Oracle 8.0 and before, this count was the count of rows inspected by that phase of the execution plan (the number of rows flowing into the step). For example, if 50,000 rows were considered in OBJAUTH$, but some WHERE clause was used to exclude them, a TKPROF report from Oracle 8.0 would have reported 50,000 instead. Using this sort of information, you can see what steps you might want to avoid in a query, and either rewrite the query or use HINTS to come up with a more desirable plan.

You'll notice that there is a mixture of object names (for example, TABLE ACCESS BY INDEX ROWID IND$) and object IDs (for example, INDEX UNIQUE SCAN (object id 36)). This is because the raw trace file does not record all of the object names, only the object IDs for some objects. Also, by default, TKPROF will not connect to the database to turn the object IDs into object names. We can easily turn this object ID into an object name via the query:

tkyte@TKYTE816> select owner, object_type, object_name
  2  from all_objects
  3  where object_id = 36;

OWNER                          OBJECT_TYPE        OBJECT_NAME
------------------------------ ------------------ --------------
SYS                            INDEX              I_IND1

Alternatively, we could add the EXPLAIN= parameter to TKPROF as follows:

C:\oracle\ADMIN\tkyte816\udump>tkprof ora01124.trc x.txt explain=tkyte/tkyte 

In this case, we would receive the following error in the output file:

error during parse of EXPLAIN PLAN statement
ORA-01039: insufficient privileges on underlying objects of the view

While we can run the query, the base tables that the view accesses are not visible to us. In order to get the explain plan for this query, we would use the SYS account, or some other account that does have access to the underlying objects.

Note 

I prefer never to use EXPLAIN=, however, and would recommend the same for you.

The reason is that the plan produced by EXPLAIN PLAN may differ radically from the plan actually used at run-time. The only plan that can be trusted is the plan saved in the trace file itself. Here is a simple example using TKPROF with explain=userid/password that demonstrates this:

select count(object_type)
from
 t where object_id > 0

call     count       cpu    elapsed  disk      query    current        rows
------- ------  -------- ---------- ----- ---------- ----------  ----------
Parse        1      0.00       0.00     0          0          0           0
Execute      1      0.00       0.00     0          0          0           0
Fetch        2      0.19       2.07   337      20498          0           1
------- ------  -------- ---------- ----- ---------- ----------  ----------
total        4      0.19       2.07   337      20498          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69  (TKYTE)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE
  21790   TABLE ACCESS BY INDEX ROWID T
  21791    INDEX RANGE SCAN (object id 25291)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: CHOOSE
      1   SORT (AGGREGATE)
  21790    TABLE ACCESS   GOAL: ANALYZED (FULL) OF 'T'

Obviously, one of the plans is incorrect; one shows an index range scan and table access by ROWID, the other a simple full scan. Unless you knew that I analyzed the table after you executed your query, but before you ran TKPROF, you would be unable to explain this disparity. After I analyzed the table, the default plan for that query changed dramatically. TKPROF just uses the EXPLAIN PLAN facility in Oracle. This returns the query plan that would be used at this point in time, not the plan that was actually used. Many features can impact the plan visible in the trace file versus the plan returned by EXPLAIN PLAN. For example, the application could have been using stored query outlines (see Chapter 11 on Optimizer Plan Stability for more details on this feature); the query plan would have been based on a stored outline at run-time, whereas the query plan returned by EXPLAIN PLAN would be some other plan. In general, if you do use the EXPLAIN= parameter to TKPROF, you must verify that the two plans agree with each other step by step.
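The sequence of events that produces a mismatch like the one above could look like the following; this is only a sketch, assuming a table T with an index on OBJECT_ID and no statistics at the time the query runs:

-- run the query with tracing enabled, while T has no statistics
alter session set sql_trace=true;
select count(object_type) from t where object_id > 0;

-- gather statistics AFTER the traced run but BEFORE running TKPROF
analyze table t compute statistics;

-- tkprof ... explain=... now explains the query against the new statistics,
-- so its plan no longer matches the Row Source Operation saved in the trace file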

TKPROF has many command line options and if you just type TKPROF on the command line, you'll get to see them all:

C:\Documents and Settings\Thomas Kyte\Desktop>tkprof

Usage: tkprof tracefile outputfile [explain= ] [table= ]
              [print= ] [insert= ] [sys= ] [sort= ]
  table=schema.tablename   Use 'schema.tablename' with 'explain=' option.
  explain=user/password    Connect to ORACLE and issue EXPLAIN PLAN.
  print=integer    List only the first 'integer' SQL statements.
  aggregate=yes|no
  insert=filename  List SQL statements and data inside INSERT statements.
  sys=no           TKPROF does not list SQL statements run as user SYS.
  record=filename  Record non-recursive statements found in the trace file.
  sort=option      Set of zero or more of the following sort options:
    prscnt  number of times parse was called
    prscpu  cpu time parsing
    prsela  elapsed time parsing
    prsdsk  number of disk reads during parse
    prsqry  number of buffers for consistent read during parse
    prscu   number of buffers for current read during parse
    prsmis  number of misses in library cache during parse
    execnt  number of times execute was called
    execpu  cpu time spent executing
    exeela  elapsed time executing
    exedsk  number of disk reads during execute
    exeqry  number of buffers for consistent read during execute
    execu   number of buffers for current read during execute
    exerow  number of rows processed during execute
    exemis  number of library cache misses during execute
    fchcnt  number of times fetch was called
    fchcpu  cpu time spent fetching
    fchela  elapsed time fetching
    fchdsk  number of disk reads during fetch
    fchqry  number of buffers for consistent read during fetch
    fchcu   number of buffers for current read during fetch
    fchrow  number of rows fetched
    userid  userid of user that parsed the cursor

The most useful option, in my opinion, is the sort= option. I like to sort by the various CPU and elapsed time metrics to get the 'worst' queries to pop up to the top of the report. You can also use this to find the queries that do too much I/O, and so on. The remaining options are self-explanatory. 99.9 percent of the time I simply use tkprof tracefilename reportfilename and nothing else. This shows the SQL more or less in the order it was executed at run-time. I might use a tool such as grep in UNIX or find in Windows to extract all of the total lines, so I know what queries to zoom in on. For example, using our report.txt:

C:\oracle\ADMIN\tkyte816\udump>find "total" report.txt

---------- REPORT.TXT
total        2      0.00       0.00     0          0          0           0
total        4      0.01       0.02     1          1          4           1
total        4      1.20       1.21     0      86091          4          15
total        6      0.01       0.01     0          4          0           2
total        4      0.00       0.00     0          0          0           1
total       14      1.21       1.23     1      86092          8          17
total        6      0.01       0.01     0          4          0           2

This shows me I should edit report.txt and search for 1.21 if I want to speed up this process. There are other statements in there, but obviously this is the one I need to focus on if I want this to go faster.
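Alternatively, the sort= option shown in the usage listing above can push the most expensive statements to the top of the report for you, for example:

C:\oracle\ADMIN\tkyte816\udump>tkprof ora01124.trc report.txt sort=(prsela,exeela,fchela)

With that, the statement that consumed the most elapsed time appears first, and there is no need to hunt through the total lines by hand.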

Using and Interpreting Raw Trace Files

There are two types of trace files in Oracle: those generated by SQL_TRACE (these are useful to us), and those generated as a result of a failed session (a bug in the database software). The second type, from a failed session, is not useful to us directly; it is only useful for sending to Oracle Support for analysis. The first type of trace file is very useful to us, especially if we know how to read and interpret it.

Most of the time, we will use TKPROF to format the trace files for use, however from time to time we need to dig into the trace file to understand more about what happened than TKPROF will tell us. For example, suppose someone gives you a TKPROF report with:

UPDATE EMP SET ENAME=LOWER(ENAME)
WHERE
 EMPNO = :b1

call     count       cpu    elapsed  disk      query    current        rows
------- ------  -------- ---------- ----- ---------- ----------  ----------
Parse        1      0.00       0.00     0          0          0           0
Execute      1      0.00      54.25     0         17          8           1
Fetch        0      0.00       0.00     0          0          0           0
------- ------  -------- ---------- ----- ---------- ----------  ----------
total        2      0.00      54.25     0         17          8           1

It is obvious we have a problem here - it took almost a minute to update a single row, even though it took no measurable CPU time. The problem is that we had some massive wait event on something, but TKPROF won't tell us what we were waiting on. Additionally, it would be nice if we knew which row it was that we were going after (such as what the EMPNO value in :b1 was). Information like that might help us track down how we got into this situation. Fortunately, the application was traced with the following command:

alter session set events '10046 trace name context forever, level 12'; 

so the trace file has both wait events and bind variables in it. Let's take a look at the raw trace file from top to bottom. The piece of code that I traced was:

scott@TKYTE816> alter session set events
  2  '10046 trace name context forever, level 12';

Session altered.

scott@TKYTE816> declare
  2  l_empno number default 7698;
  3  begin
  4     update emp set ename = lower(ename) where empno = l_empno;
  5  end;
  6  /

PL/SQL procedure successfully completed.

scott@TKYTE816> exit

We do know the EMPNO being used in this simple example but in general, you would not. Here are the contents of the trace file interspersed with comments:

Dump file C:\oracle\admin\tkyte816\udump\ORA01156.TRC
Sat Mar 17 12:16:38 2001
ORACLE V8.1.6.0.0 - Production vsnsta=0
vsnsql=e vsnxtr=3
Windows 2000 Version 5.0 , CPU type 586
Oracle8i Enterprise Edition Release 8.1.6.0.0 - Production
With the Partitioning option
JServer Release 8.1.6.0.0 - Production
Windows 2000 Version 5.0 , CPU type 586
Instance name: tkyte816

Redo thread mounted by this instance: 1

Oracle process number: 11

Windows thread id: 1156, image: ORACLE.EXE

This is a standard trace file header. It is useful to identify the exact system version and database version you are executing on. It also has the Oracle SID in it (Instance Name), which is useful for confirming you are looking at the right trace file in the first place.

*** 2001-03-17 12:16:38.407
*** SESSION ID:(7.74) 2001-03-17 12:16:38.407
APPNAME mod='SQL*PLUS' mh=3669949024 act='' ah=4029777240
=====================

The above APPNAME record was made by a call to the DBMS_APPLICATION_INFO package (see Appendix A on Necessary Supplied Packages for details on this package). Applications use this package to register themselves in the database, so that a query on V$SESSION can see 'who' they are. SQL*PLUS in particular makes use of this package. You may or may not see an APPNAME record in your trace file depending on the environment. It is an excellent idea for you to have all of your applications register themselves, so hopefully you will see this record with your own module name. The meaning of this record is:

APPNAME mod='%s' mh=%lu act='%s' ah=%lu 

Field   Meaning
------  ------------------------------------------------------
mod     Module name as passed to DBMS_APPLICATION_INFO
mh      Module hash value
act     Action as passed to DBMS_APPLICATION_INFO
ah      Action hash value

If you are a C programmer, you'll recognize the C printf format string. This can be used to tell us the data types we can expect to find in the APPNAME record; a %s is a string, the %lu is a long unsigned integer (a number). Continuing on, the next thing I see in my trace file is:

PARSING IN CURSOR #3 len=70 dep=0 uid=54 oct=42 lid=54 tim=6184206 hv=347037164
  ad='31883a4'
alter session set events '10046 trace name context forever, level 12'
END OF STMT
EXEC #3:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=4,tim=6184206
WAIT #3: nam='SQL*Net message to client' ela= 0 p1=1111838976 p2=1 p3=0
WAIT #3: nam='SQL*Net message from client' ela= 818 p1=1111838976 p2=1 p3=0
=====================

Here we can see the actual statement used to enable tracing. It is preceded by a CURSOR record, which will always be the case (all SQL in the trace file will be preceded by a cursor record). The meaning of the fields in the cursor record is:

Parsing in Cursor #%d len=%d dep=%d uid=%ld oct=%d lid=%ld tim=%ld hv=%ld ad='%s' 

Field     Meaning
--------  ----------------------------------------------------------------------
Cursor #  The cursor number. You can also use this to find the maximum number of open cursors in your application, since this number is incremented and decremented each time you open a new cursor and close an existing one.
len       Length of the SQL statement to follow.
dep       The recursive depth of the SQL statement. Recursive SQL is SQL performed on behalf of some other SQL. Typically, this is SQL executed by Oracle in order to parse a query or perform space management. It can also be caused by PL/SQL (which is SQL) running SQL, so you may find your own SQL is 'recursive'.
uid       User ID of the current schema. Note, this may be different from the lid below, especially if you use alter session set current_schema to change the parsing schema.
oct       Oracle Command Type. Numeric code indicating the type of SQL command being parsed.
lid       The user ID used to security-check the access privileges against.
tim       A timer. Its resolution is hundredths of seconds. You can subtract these times from each other as you encounter them in the trace file to see how far apart they occurred.
hv        Hash ID of the SQL statement.
ad        ADDR column of V$SQLAREA that refers to this SQL statement.

Next in the trace file, we can see the statement was actually executed right after parsing. The meanings of the values found in the EXEC record are:

EXEC Cursor#:c=%d,e=%d,p=%d,cr=%d,cu=%d,mis=%d,r=%d,dep=%d,og=%d,tim=%d 

Field     Meaning
--------  ----------------------------------------------------------------------
Cursor #  The cursor number.
c         CPU time in hundredths of seconds.
e         Elapsed time in hundredths of seconds.
p         Number of physical reads performed.
cr        Consistent (query mode) reads (logical I/O).
cu        Current mode reads (logical I/O).
mis       Cursor miss in the library cache; indicates we had to parse the statement because it was aged out of the shared pool, was never in the shared pool, or was otherwise invalidated.
r         Number of rows processed.
dep       Recursive depth of the SQL statement.
og        Optimizer goal: 1 = all rows, 2 = first rows, 3 = rule, 4 = choose.
tim       A timer.

There are other variations of the EXEC record. Instead of the keyword EXEC, we might also find:

Keyword     Meaning
----------  ----------------------------------------------------------------
PARSE       Parsing a statement.
FETCH       Fetching rows from a cursor.
UNMAP       Shows the freeing of temporary segments holding intermediate results when they are no longer needed.
SORT UNMAP  Same as UNMAP, but for sort segments.

The records for each of PARSE, FETCH, UNMAP, and SORT UNMAP contain the same information as an EXEC record does in the same order.

The last part of this section has our first reports of wait events. In this case they were:

WAIT #3: nam='SQL*Net message to client' ela= 0 p1=1111838976 p2=1 p3=0
WAIT #3: nam='SQL*Net message from client' ela= 818 p1=1111838976 p2=1 p3=0

These are typical 'busy' wait events that we discussed earlier in this chapter. The message to client is the server sending the client a message and waiting for a response. The message from client is the server waiting for the client to send a request over. In this case, the elapsed time (ela) on that event was 8.18 seconds. This just means I waited 8.18 seconds after sending the ALTER SESSION command to the database to send the next command in this example. Unless you are feeding a constant and continuous stream of requests to the server, 'message from client' waits will be unavoidable and normal. The wait record includes these fields:

WAIT Cursor#: nam='%s' ela=%d p1=%ul p2=%ul p3=%ul 

Field       Meaning
----------  --------------------------------------------------------------------
Cursor #    The cursor number.
nam         The name of the wait event. The Oracle Server Reference has a complete list of wait events and details on each one.
ela         Elapsed time in hundredths of seconds for the event.
p1, p2, p3  The parameters specific to the wait event. Each event has its own set of parameters; refer to the Oracle Server Reference for the meaning of p1, p2, p3 for a specific wait event.

Now we are ready to look at our first real statement in our trace file:

PARSING IN CURSOR #3 len=110 dep=0 uid=54 oct=47 lid=54 tim=6185026 hv=2018962105
  ad='31991c8'
declare
l_empno number default 7698;
begin
   update emp set ename = lower(ename) where empno = l_empno;
end;
END OF STMT
PARSE #3:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=4,tim=6185026
BINDS #3:
=====================

Here we see our PL/SQL block of code as we submitted it. We can see from the PARSE record that it parsed very quickly, even though it was not in the library cache yet (MIS=1). We see a BINDS record with nothing after it. That's because this particular block of code has no bind variables. We will revisit this bind record further on down. We'll continue onto the next statement in the trace file where things get really interesting:

PARSING IN CURSOR #4 len=51 dep=1 uid=54 oct=6 lid=54 tim=6185026 hv=2518517322
  ad='318e29c'
UPDATE EMP SET ENAME=LOWER(ENAME) WHERE EMPNO = :b1
END OF STMT
PARSE #4:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=0,tim=6185026
BINDS #4:
 bind 0: dty=2 mxl=22(21) mal=00 scl=00 pre=00 oacflg=03 oacfl2=1 size=24 offset=0
   bfp=07425360 bln=22 avl=03 flg=05
   value=7698
WAIT #4: nam='enqueue' ela= 308 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 308 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 308 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 308 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 308 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 308 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 308 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 308 p1=1415053318 p2=393290 p3=2947
WAIT #4: nam='enqueue' ela= 198 p1=1415053318 p2=393290 p3=2947
EXEC #4:c=0,e=5425,p=0,cr=17,cu=8,mis=0,r=1,dep=1,og=4,tim=6190451
EXEC #3:c=0,e=5425,p=0,cr=17,cu=8,mis=0,r=1,dep=0,og=4,tim=6190451
WAIT #3: nam='SQL*Net message to client' ela= 0 p1=1111838976 p2=1 p3=0
WAIT #3: nam='SQL*Net message from client' ela= 0 p1=1111838976 p2=1 p3=0
=====================

Here, we see our UPDATE statement as Oracle sees it. It is different from what it was in the PL/SQL block of code; specifically, our reference to l_empno (a variable) has been replaced with a bind variable. PL/SQL replaces all references to local variables with bind variables before executing the statement. We can also see from the PARSING IN CURSOR record that the recursive depth (dep) is now one, not zero as it was in the original PL/SQL block. We can tell that this is SQL executed by some other SQL (or PL/SQL); it was not submitted by the client application to the database. This is a flag we can use to tell us where to look for this SQL. If the dep is not zero, we know the SQL 'lives' in the database, not in a client application. We can use this to our advantage when tuning. SQL that lives in the database can be changed easily, without affecting the application. SQL that lives in the client application requires us to find the client, change the code, recompile it, and re-deploy it. The more SQL I can get to live in my database, the better; I can fix it without touching the application itself.

We see a BINDS record here again, this time with more information. This UPDATE statement has a bind value and we can clearly see what it is - the value of the first bind variable is 7,698. Now, if this were a problem query we were looking at (a poorly performing query), we would have almost all of the information we need to tune it. We have the text of the query exactly. We have the values of all bind variables (so we can run it with the same inputs). We even have the wait events that slowed us down during the execution. The only thing missing is the query plan, but that is only because we haven't paged down far enough.

The BIND record in the trace file contains this information:

Field     Meaning
--------  ----------------------------------------------------------------------
cursor #  The cursor number.
bind N    The bind position, starting from 0 (0 being the first bind variable).
dty       Datatype (see below).
mxl       Maximum length of the bind variable.
mal       Maximum array length (when using array binds or bulk operations).
scl       Scale.
pre       Precision.
oacflg    Internal flags. If this number is odd, the bind variable might be null (allows nulls).
oacfl2    Continuation of internal flags.
size      Size of buffer.
offset    Used in piecewise binds.
bfp       Bind address.
bln       Bind buffer length.
avl       Actual value length.
flg       Internal flags.
value     The character string representation (if possible; it might be a 'hex' dump) of the bind value - this is what we really want!

The dty (data type) number may be decoded using information from the USER_TAB_COLUMNS view. If you select text from all_views where view_name = 'USER_TAB_COLUMNS', you'll see a DECODE function call that maps the dty numbers to their character string representations. In the records above, for example, dty=2 is a NUMBER and dty=1 is a VARCHAR2.

The really interesting information, what we were after ultimately, is right here: the wait information. We can clearly see why the update took almost a minute of wall clock time to complete, even though it took no CPU time. The resource we were waiting on was a lock - if you recall from Chapters 3 and 4, where we discussed Locking and Concurrency and Transactions, an enqueue is one of the two locking mechanisms Oracle employs internally to serialize access to shared resources. The trace file shows us we were waiting on a lock; we were not waiting on I/O, we were not waiting on a log file sync, we were not waiting for buffers to become free - we were enqueued on some resource. In fact, the eighteen enqueue waits of roughly three seconds each add up to about 54 seconds, which matches the 54.25 seconds of elapsed time in the TKPROF report. Going further, we can take the p1 parameter and decode it to see what type of lock we were waiting on. The process for doing that is:

tkyte@TKYTE816> create or replace
  2  function enqueue_decode( l_p1 in number ) return varchar2
  3  as
  4      l_str varchar2(25);
  5  begin
  6      select chr(bitand(l_p1,-16777216)/16777215)||
  7             chr(bitand(l_p1, 16711680)/65535)  || ' ' ||
  8             decode( bitand(l_p1, 65535),
  9                       0, 'No lock',
 10                       1, 'No lock',
 11                       2, 'Row-Share',
 12                       3, 'Row-Exclusive',
 13                       4, 'Share',
 14                       5, 'Share Row-Excl',
 15                       6, 'Exclusive' )
 16        into l_str
 17        from dual;
 18
 19      return l_str;
 20  end;
 21  /

Function created.

tkyte@TKYTE816> select enqueue_decode( 1415053318 ) from dual;

ENQUEUE_DECODE(1415053318)
---------------------------------------------------------------------
TX Exclusive

This shows that we were waiting on an exclusive row level lock. The answer to why it took one minute to update this row is clear now. There was another session holding a lock on the row we wanted, and they held it for one minute while we waited for it. What we do about this is application-specific. For example, in the above application, I am doing a 'blind update'. If I do not want the application to block on the update, I might process it as:

select ename from emp where empno = :bv for update NOWAIT;
update emp set ename = lower(ename) where empno = :bv;

This would avoid the locking issue. At least now we are able to absolutely determine what caused this update to take so long and we are able to diagnose it in a post-mortem fashion. We don't have to 'be there' in order to diagnose this, we just need the relevant trace information.
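In PL/SQL, the failed NOWAIT attempt surfaces as ORA-00054, which the application could catch and turn into a friendly 'try again' message. The following is just a sketch of that idea, reusing the EMPNO from the example above:

declare
    resource_busy exception;
    pragma exception_init( resource_busy, -54 );
    l_empno  number default 7698;
    l_ename  emp.ename%type;
begin
    select ename into l_ename
      from emp
     where empno = l_empno
       for update nowait;

    update emp set ename = lower(ename) where empno = l_empno;
exception
    when resource_busy then
        dbms_output.put_line( 'row is locked by another session, try again later' );
end;
/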

Finishing up the trace file we see:

PARSING IN CURSOR #5 len=52 dep=0 uid=54 oct=47 lid=54 tim=6190451 hv=1697159799
  ad='3532750'
BEGIN DBMS_OUTPUT.GET_LINES(:LINES, :NUMLINES); END;
END OF STMT
PARSE #5:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=6190451
BINDS #5:
 bind 0: dty=1 mxl=2000(255) mal=25 scl=00 pre=00 oacflg=43 oacfl2=10 size=2000
  offset=0
   bfp=07448dd4 bln=255 avl=00 flg=05
 bind 1: dty=2 mxl=22(02) mal=00 scl=00 pre=00 oacflg=01 oacfl2=0 size=24 offset=0
   bfp=0741c7e8 bln=22 avl=02 flg=05
   value=25
WAIT #5: nam='SQL*Net message to client' ela= 0 p1=1111838976 p2=1 p3=0
EXEC #5:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=4,tim=6190451
WAIT #5: nam='SQL*Net message from client' ela= 273 p1=1111838976 p2=1 p3=0

Now, this statement is a surprise. We never executed it ourselves and it isn't recursive SQL (dep=0); it came from the client application. This actually gives us insight into how SQL*PLUS and DBMS_OUTPUT work. I have set serveroutput on in my login.sql file, so whenever I log into SQL*PLUS, DBMS_OUTPUT is enabled by default. After every statement we execute that can generate DBMS_OUTPUT data, SQL*PLUS must call the GET_LINES procedure to get the data and dump it to the screen (see the section on DBMS_OUTPUT in Appendix A on Necessary Supplied Packages for many more details on DBMS_OUTPUT). Here, we can see SQL*PLUS making that call. Further, we can see that the first parameter, :LINES, is in fact an array with 25 slots (mal=25). So, now we know that SQL*PLUS retrieves the DBMS_OUTPUT buffer 25 lines at a time and dumps it to the screen. The fact that we can trace the behavior of SQL*PLUS is interesting; in fact, we can trace the behavior of any piece of software that executes against Oracle, to see what it does and how it does it.

Lastly, at the bottom of our trace file we see:

XCTEND rlbk=0, rd_only=0
WAIT #0: nam='log file sync' ela= 0 p1=988 p2=0 p3=0
STAT #4 id=1 cnt=1 pid=0 pos=0 obj=0 op='UPDATE EMP '
STAT #4 id=2 cnt=2 pid=1 pos=1 obj=24767 op='TABLE ACCESS FULL EMP '

The XCTEND (transaction boundary) record is a record of our commit, but we didn't issue the commit ourselves - SQL*PLUS did that for us silently when we exited. The values of the XCTEND record are:

Field    Meaning
-------  -----------------------------------------------------------------------
rlbk     Rollback flag. If 0, we committed. If 1, we rolled back.
rd_only  Read-only flag. If 1, the transaction was read-only. If 0, changes were made and committed (or rolled back).

Immediately after the XCTEND record, we can see we had another wait event - this time for a log file sync. If we refer to the Oracle Server Reference and look up that wait event, we'll discover that the 988 in the p1 value means that buffer 988 in the redo log buffer had to be written out, and that is what we were waiting on. The wait was less than a hundredth of a second, as evidenced by ela=0.

The last records we observe in this trace file are the STAT records. These are the actual query plans used at run-time by our SQL; this is the plan we can trust. Not only that, but the plan will have the row counts correctly associated with each step. These records are only produced after the cursor they represent is closed. Generally, this means the client application must exit in order to see these records; simply executing ALTER SESSION SET SQL_TRACE=FALSE will not necessarily produce them in the trace file. The values in this record are:

Field     Meaning
--------  ----------------------------------------------------------------------
cursor #  The cursor number.
id        The line of the explain plan, from 1 to the number of lines in the plan.
cnt       The number of rows flowing through this phase of the query plan.
pid       Parent ID of this step in the plan. Used to correctly reflect the hierarchy of the plan with indentation.
pos       Position in the explain plan.
obj       Object ID of the referenced object, if applicable.
op        The textual description of the operation performed.

There are only two other record types that we should find in a SQL_TRACE trace file. They represent errors encountered during the execution of a query. The errors will be one of:

• PARSE ERROR records - the SQL statement itself could not be parsed (for example, it referenced a non-existent table or contained a syntax error).

• ERROR records - the statement parsed successfully, but an error occurred while executing or fetching from it (for example, a constraint violation).

I have used the fact that these errors are recorded to the trace file many times in problem solving. If you are using some off-the-shelf application, a third party tool, or even many Oracle commands and you get back a less than helpful error message, it may be very useful to run the command with SQL_TRACE enabled and see what happens. In many cases, the root cause of the error can be gleaned from the trace file since all of the SQL performed on your behalf is recorded there.

In order to demonstrate what these records look like, I executed the following SQL:

tkyte@TKYTE816> create table t ( x int primary key );

Table created.

tkyte@TKYTE816> alter session set sql_trace=true;

Session altered.

tkyte@TKYTE816> select * from;
select * from
            *
ERROR at line 1:
ORA-00903: invalid table name

tkyte@TKYTE816> insert into t values ( 1 );

1 row created.

tkyte@TKYTE816> insert into t values ( 1 );
insert into t values ( 1 )
*
ERROR at line 1:
ORA-00001: unique constraint (TKYTE.SYS_C002207) violated

tkyte@TKYTE816> exit

Upon reviewing the trace file I discovered:

=====================
PARSE ERROR #3:len=15 dep=0 uid=69 oct=3 lid=69 tim=7160573 err=903
select * from
=====================
PARSING IN CURSOR #3 len=27 dep=0 uid=69 oct=2 lid=69 tim=7161010 hv=1601248092
  ad='32306c0'
insert into t values ( 1 )
END OF STMT
PARSE #3:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=7161010
=====================
...
EXEC #3:c=1,e=9,p=0,cr=9,cu=7,mis=0,r=0,dep=0,og=4,tim=7161019
ERROR #3:err=1 tim=7161019

As you can see, finding the problem SQL (the root cause of the error) is trivial with this method; we can clearly see what went wrong. This is extremely useful if you are debugging an error that is happening deep inside of a stored procedure, for example. I've seen more than one case where an error occurs in a deeply nested stored procedure, and someone has a WHEN OTHERS exception handler that catches and ignores all exceptions. It is my opinion that the WHEN OTHERS exception handler should never be used, and all applications that use it and do not re-raise the error immediately should be erased immediately - they are bugs waiting to happen. In any case, the error happens, but is caught and ignored, and no one is ever notified of it. The procedure appears to have worked, but it did not actually work. Here, a simple SQL_TRACE will show us what the procedure actually does, and we can see that there was an error. Then all we need do is figure out why in the world it is being ignored. I also use this when commands return less than useful error messages. For example, if a snapshot (materialized view) refresh returns ORA-00942: table or view does not exist, using SQL_TRACE will be extremely useful. You might not be aware of all of the SQL that is executed on your behalf to refresh a materialized view, and the number of tables that are 'touched'. Using SQL_TRACE you'll actually be able to figure out which table or view does not exist, and put the correct permissions in place.
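To make the WHEN OTHERS point concrete, here is a sketch of the kind of procedure I am describing; the names are illustrative. The UPDATE can fail for any reason, the handler silently swallows the error, and the caller believes everything worked - yet the failed statement and its ORA error code still appear as ERROR records in the trace file:

create or replace procedure process_emp( p_empno in number )
as
begin
    update emp set ename = lower(ename) where empno = p_empno;
    -- ... more work ...
exception
    when others then
        -- the error vanishes here; nothing is re-raised, nothing is logged
        null;
end;
/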

The format of the PARSE ERROR record is:

len - Length of the SQL statement

dep - Recursive depth of the SQL statement

uid - Parsing schema (may be different from the privilege schema)

oct - Oracle command type

lid - Privilege schema ID, whose privilege set the statement is actually executing under

tim - A timer

err - The ORA error code. If you execute:

tkyte@TKYTE816> EXEC DBMS_OUTPUT.PUT_LINE(SQLERRM(-903));
ORA-00903: invalid table name

You can get the text of the error.

The contents of the ERROR record are simply:

cursor # - The cursor number

err - The ORA error code

tim - A timer

SQL_TRACE, TIMED_STATISTICS, and TKPROF Wrap-Up

In this section we took a fairly in depth look at a set of tools - a set of invaluable tools that work in all environments, and are always available to us. SQL_TRACE and the resulting reports available via TKPROF are among the most powerful tuning and debugging tools you have available. I have used them to successfully debug and tune applications on countless occasions. Their ubiquity (there is no Oracle installation anywhere that does not have these tools), coupled with their power, makes them second to none when it comes to tuning.

If you understand how to trace your application and how to interpret the results, you'll be halfway done with the tuning of your application. The other half is left to you - that is where you figure out why your query needs to access a million block buffers, or why it is waiting five minutes on enqueue waits. SQL_TRACE will tell you what to look at - fixing the problem is typically easier than finding it, especially if you are unfamiliar with the application in the first place.

DBMS_PROFILER

If you are a heavy PL/SQL user, you'll welcome the addition of a source code profiler in the database. With Oracle 8i, we have a full-blown source code profiler for PL/SQL integrated into the database. This package, DBMS_PROFILER, allows us to zero in on the routines and packages where we spend the largest amount of our execution time. Not only that, but it provides code coverage analysis, allowing us to tell if we have exercised 10 percent or 99 percent of our code with a given test run.

In Appendix A on Necessary Supplied Packages, I go into some detail on how to use the DBMS_PROFILER package and how to interpret the resulting reports and data. It is very straightforward to use and is nicely integrated in with PL/SQL itself. During the course of a PL/SQL stored procedure you can turn profiling on and off as you narrow down the section of code you would like to tune.
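
As a quick illustration (the full treatment is in Appendix A), a minimal sketch of wrapping a run with the profiler might look like the following. It assumes the profiler tables have already been created (the proftab.sql script shipped in the rdbms/admin directory), and my_procedure_to_tune is simply a hypothetical stand-in for the code you want profiled:

declare
    l_status binary_integer;
begin
    -- begin collecting line-by-line timing data for this run
    l_status := dbms_profiler.start_profiler( 'tuning run 1' );

    my_procedure_to_tune;   -- hypothetical: the code to be profiled

    -- stop collecting; the data lands in the plsql_profiler_* tables
    l_status := dbms_profiler.stop_profiler;
end;
/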

My stages of tuning an application would have me using SQL_TRACE and TKPROF first to identify and correct poorly tuned SQL in an application. After all of the SQL was executing as fast as it possibly could, I would start in with the source code profiler, if need be, to tune further. With the exception of a really bad algorithm, the biggest bang for the buck, performance-wise, will come from tuning SQL. I find only moderate performance increases from tuning PL/SQL source code; worthy of the effort, but not as large as fixing a bad query. Of course, if the algorithm was terrible to begin with, this might not be true.

The other use of the profiler is to provide code coverage reports. This can be extremely useful during the testing phase of your application. Utilizing this feature, you can set up test cases and show that 100 percent of your code has been exercised during the test runs. While this does not guarantee bug free code, it does make sure nothing is obviously wrong.

Instrumentation

This is crucial to do, especially in a large application with many moving pieces. As applications get more complex and have more interacting pieces, finding where the performance issue actually is has become more difficult than actually fixing it. By way of definition, instrumentation means including enough logging capability in your code, which can be selectively enabled at will, to figure out a) what the program is doing, and b) how long it takes to do it. Everything from the simplest process to the most complex algorithm should be instrumented thoroughly. Yes, instrumentation adds overhead, even when you are not logging the messages; however, the cost of not being able to determine where an issue is far outweighs any trivial performance overhead of having it.

In the beginning of this chapter, I related a story of my last tuning expedition. The architecture was complex to say the least. There was a browser front-end coming into a web server farm, across a very restrictive firewall. The web servers were executing Java Server Pages (JSPs). The JSPs used a connection pooling mechanism to get to the database to run some SQL and PL/SQL. They also made CORBA callouts to yet another application server, to interface with the legacy system. They were just a piece of a much larger application altogether. There were batch jobs, background processes, queue processing, other off-the-shelf software, and so on. I came into that environment without knowing anything about the application, and unfortunately, not too many people there knew something about everything - they all knew their own bits and pieces. Getting the big picture is always hard.

Since nothing was instrumented, not even a little, we had to use the tools listed above to try and isolate the issue. SQL_TRACE didn't show anything obvious (in fact it showed the SQL was executing OK). The profiler didn't supply any information beyond the fact that the PL/SQL was running OK. The V$ dynamic performance tables in the database showed us that it was OK. We could only use these tools to prove that it was not the database causing the slowdown. We could not use them this time to find the problem. It was outside the database, somewhere between the browser and the database. Unfortunately, this was a maze of JSPs, EJBs, connection pools, and CORBA callouts. The only thing we could do was instrument the code to find out where the slow part was. It wasn't like we had a single program we could run in a debugger like in the 'old days'. We had dozens of bits and pieces of code here and there glued together with a network - those bits and pieces of code had to tell us which one of them was slow.

We finally discovered it - it was a callout over CORBA to a legacy system that had to be done on almost every page. Only by creating log files with timestamps were we able to isolate this issue. It turned out to not even be a database problem, let alone an application problem, but we still looked pretty bad in the eyes of the end user. They really didn't care whose code was slow; they were just really upset that it took weeks to figure it all out.

How you instrument will vary based upon the language you choose. For example, in PL/SQL I use a custom developed package DEBUG. It uses a standardized logging mechanism to create a log file for any PL/SQL routine. We 'litter' our code with calls to debug.f like this:

create function foo ... as
    ...
begin
    debug.f( 'Enter procedure foo' );

    if ( some_condition ) then
        l_predicate := 'x=1';
    end if;

    debug.f( 'Going to return the predicate "%s"', l_predicate );
    return l_predicate;
end;

and it created log entries like this for us:

011101 145055 ( FOO,  6) Enter procedure foo
011101 145056 ( FOO, 11) Going to return the predicate "x=1"

It automatically added the date (01/11/2001) and time (14:50:55), as well as what procedure/package called it and the line number to the log. We have the ability to turn tracing on and off for any given module or set of modules. Not only is this a useful debugging tool, but we also can look at the trace files and see right away what is taking a long time. We'll take another look at this DEBUG utility in Chapter 21 on Fine Grained Access Control where we use it heavily.

If every application, and every piece of every application, had such a facility, finding where to start looking for performance-related issues would be easier to say the least. In a non-instrumented system, it is very much like looking for a needle in a haystack. It might even be worse; at least with the needle in a haystack, you don't have lots of people with pre-conceived notions of where the problem is guiding you down the wrong paths initially.

In short, all application components, even the non-database ones, need to be instrumented, especially in today's environment where you don't have a single client application talking to a single server. With the explosion of web-based applications, particularly ones that use complex, distributed architectures, determining where the problems are, is infinitely harder than fixing them. By instrumenting your code from the very beginning, you'll be practicing a good defensive technique for problems that will come up down the road. I can guarantee you that you will never be sorry you instrumented; you'll only be sorry if you don't.

I encourage you to leave this instrumented code as is for production. Do not remove it for production. That is where it is most useful! My approach has been to implement an empty DEBUG package body (one where the functions all just immediately return). In this fashion, if I need to generate a trace - I just put the real DEBUG package body in, trace what I need to and then put back the empty one later. Removing the code from production 'for performance reasons' will obviate its usefulness. Consider the database itself - by setting a myriad of events, Oracle support can elicit a large amount of diagnostic data from your production database - the kernel developers recognize that the cost of not having the trace code in the production code far outweighs any perceived overhead.
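
To make the 'empty body' idea concrete, a minimal sketch of such a stub follows. The real DEBUG package is covered in Chapter 21, so the signature shown here (a message plus a couple of optional substitution arguments) is only an assumption for illustration:

create or replace package body debug
as
    -- stub implementation: every call returns immediately, so the
    -- debug.f calls can stay in production code at almost no cost
    procedure f( p_message in varchar2,
                 p_arg1    in varchar2 default null,
                 p_arg2    in varchar2 default null )
    is
    begin
        null;   -- swap in the 'real' body when a trace is needed
    end f;
end debug;
/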

StatsPack

So far, we have looked at tools that are useful to tune an application. SQL_TRACE, DBMS_PROFILER, instrumenting code - these are all application-level tuning tools. Once we believe we have an application working as well as it can, we should also look at the entire database instance and see how well it is doing. This is where the job duties between the DBA and the developer start to overlap and blur a little. The DBA might find the cause of the slowdown, but it will be the developer who has to fix it in many cases. Working together at this point is mandatory.

In the beginning, the tools one might use were called UTLBSTAT and UTLESTAT (begin statistics and end statistics). The UTLBSTAT script would take a snapshot of many V$ performance tables. At a later date, UTLESTAT would create a report based on the 'before' and 'after' values in the V$ tables. The statistics would then be 'wiped out' and you would be left with a simple textual report to review. Starting with Oracle8i, BSTAT/ESTAT have been formally replaced with the StatsPack set of utilities. This is a much more robust set of tools than BSTAT/ESTAT ever was. The most important new piece of functionality is the ability to have a history of the V$ tables stored for you. That is, instead of wiping out the statistics upon the generation of a report, StatsPack allows you to save the data and generate the reports at will, at a later date. With BSTAT/ESTAT for example, it would be impossible to generate a report every day during the week and a report that covered each hour of each day during the week. With StatsPack, I can simply set it up to collect statistics every hour (with hardly any measurable impact on my system), and then generate reports that compare any two of the 'snapshots'. In that fashion, I can create a report to cover any hour, as well as any day.

In addition to the reporting flexibility, StatsPack is more comprehensive in the data it displays. In this section, what I would like to do is cover how to install StatsPack, how to collect data, and then most importantly, how to read the resulting report.

Setting up StatsPack

StatsPack is designed to be installed when connected as INTERNAL or more appropriately, as SYSDBA (CONNECT SYS/CHANGE_ON_INSTALL AS SYSDBA), although it will execute a CONNECT INTERNAL. In order to install, you must be able to perform that operation. In many installations, this will be a task that you must ask the DBA or administrators to perform.

Once you have the ability to connect INTERNAL, installing StatsPack is trivial. You simply run statscre.sql in 8.1.6 or spcreate.sql in 8.1.7. These will be found in [ORACLE_HOME]\rdbms\admin and should be run when connected as INTERNAL via SQL*PLUS. It would look something like this:

C:\oracle\RDBMS\ADMIN>sqlplus internal

SQL*PLUS: Release 8.1.6.0.0 - Production on Sun Mar 18 11:52:32 2001

(c) Copyright 1999 Oracle Corporation.  All rights reserved.

Connected to:
Oracle8i Enterprise Edition Release 8.1.6.0.0 - Production
With the Partitioning option
JServer Release 8.1.6.0.0 - Production

sys@TKYTE816> @statscre
... Installing Required Packages

You'll need to know three pieces of information before running the statscre.sql script. They are:

The script will prompt you for this information as it executes. In the event you make a typo or inadvertently cancel the installation, you should use spdrop.sql (8.1.7 and up) or statsdrp.sql (8.1.6 and before) to remove the user and installed views prior to attempting another install of StatsPack. The StatsPack installation will create three .lis files (the names of which will be displayed to you during the installation). You should review these for any possible errors that might have occurred. They should install cleanly however, as long as you supplied valid tablespace names (and didn't already have a user PERFSTAT).

Now that StatsPack is installed, all we need to do is collect at least two points of data. The simplest way to do this is to use the STATSPACK package now owned by PERFSTAT as follows:

perfstat@DEV2.THINK.COM> exec statspack.snap

PL/SQL procedure successfully completed.
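
If you want snapshots collected automatically (say, once an hour as described above), DBMS_JOB can do the scheduling. The following is only a minimal sketch, assuming you are connected as PERFSTAT and that job queue processes are enabled (JOB_QUEUE_PROCESSES set to a non-zero value):

declare
    l_job binary_integer;
begin
    -- take a statspack snapshot at the top of every hour
    dbms_job.submit( job       => l_job,
                     what      => 'statspack.snap;',
                     next_date => trunc(sysdate+1/24,'HH'),
                     interval  => 'trunc(SYSDATE+1/24,''HH'')' );
    commit;
end;
/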

Now, we just need to wait a while, let the system run 'normally', and then take another snapshot. Once we have two data points collected, running a report is just as easy. The script statsrep.sql (8.1.6) or spreport.sql (8.1.7) is available to be run for this purpose. It is a SQL*PLUS script designed to be run when connected as PERFSTAT (whose password by default is PERFSTAT - this should be immediately changed after installation!). The report format between version 8.1.6 and 8.1.7 changed slightly, and as I prefer the 8.1.7 format to the 8.1.6 format, this is the report we will run. To run it, we will simply execute:

perfstat@ORA8I.WORLD> @spreport

   DB Id    DB Name      Inst Num Instance
----------- ------------ -------- ------------
 4080044148 ORA8I               1 ora8i

Completed Snapshots

                                Snap                    Snap
Instance     DB Name         Id   Snap Started    Level Comment
------------ ------------ ----- ----------------- ----- ----------------------
ora8i        ORA8I            1 18 Mar 2001 12:44    10
                              2 18 Mar 2001 12:47    10

Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap:

We are presented with a list of data points that we have collected, and will be asked to pick any two to compare. Then a default report name will be generated and we are prompted to accept it or to supply a new name. The report is then generated. The following is the 8.1.7 version of a StatsPack report section-by-section with comments describing what to look for, and how to interpret the results.

STATSPACK report for

DB Name         DB Id    Instance     Inst Num Release     OPS Host
------------ ----------- ------------ -------- ----------- --- ------------
ORA8I         4080044148 ora8i               1 8.1.6.2.0   NO  aria

                Snap Id     Snap Time      Sessions
                ------- ------------------ --------
 Begin Snap:          1 18-Mar-01 12:44:41       22
   End Snap:          3 18-Mar-01 12:57:23       22
    Elapsed:                  12.70 (mins)

Cache Sizes
~~~~~~~~~~~
           db_block_buffers:      16384          log_buffer:     512000
              db_block_size:       8192    shared_pool_size:  102400000

The first part of the report is purely informational. It shows what database this report was generated against, including the DB Name and DB Id. These should be unique in your environment. The Instance variable is the Oracle SID of the database. These three bits of data should help you identify exactly which database this report was generated from. That was one of the issues with the old BSTAT/ESTAT reports - they did not show this identifying information. More than once, I was given a report to critique, but it turned out the report wasn't run on the server with the problem. This should never happen again. The other information here is the timing information; when were these data points collected and how far apart they were. It is surprising to many that the data points do not have to be far apart from each other - the above report covers a 13 minute window. They only have to span a normal range of activity. A StatsPack report generated with a 15 minute window is as valid as one with a one-hour window (or an even larger timeframe). In fact, as the window gets larger, it may be more difficult to come to definitive conclusions given the raw numeric data. Lastly, in this section, we see some high-level configuration information for the server. The main components of the SGA are visible for review:

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:              5,982.09             13,446.47
              Logical reads:              1,664.37              3,741.15
              Block changes:                 17.83                 40.09
             Physical reads:                 15.25                 34.29
            Physical writes:                  5.66                 12.73
                 User calls:                  3.28                  7.37
                     Parses:                 16.44                 36.96
                Hard parses:                  0.17                  0.37
                      Sorts:                  2.95                  6.64
                     Logons:                  0.14                  0.32
                   Executes:                 30.23                 67.95
               Transactions:                  0.44

This section shows a great deal of information, in a very small amount of space. We can see how much REDO is generated on average every second and for every transaction. Here, I can see that I generate about 5 to 6 KB of redo per second. My average transaction generates just 13 KB of redo. The next bit of information has to do with logical and physical I/O. I can see here that about 1 percent of my logical reads resulted in physical I/O - that is pretty good. I can also see that on average, my transactions perform almost 4,000 logical reads. Whether that is high or not depends on the type of system you have. In my case, there were some large background jobs executing, so a high read count is acceptable.

Now for the really important information: my parse-related statistics. Here I can see that I do about 16 parses per second and about 0.17 of those are hard parses (SQL that never existed before). Every six seconds or so, my system is parsing some bit of SQL for the very first time. That is not bad. However, I would prefer a count of zero in this column in a finely tuned system that has been running for a couple of days. All SQL should be in the shared pool after some point in time.
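
One way to hunt down the statements responsible for those hard parses is to look in V$SQLAREA for many statements that are identical except for their literal values - a tell-tale sign of missing bind variables. A minimal sketch follows; the prefix length and threshold are arbitrary and would need adjusting for your own system:

select substr( sql_text, 1, 60 ) sql_prefix, count(*)
  from v$sqlarea
 group by substr( sql_text, 1, 60 )
having count(*) > 10
 order by count(*) desc;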

  % Blocks changed per Read:    1.07    Recursive Call %:   97.39
 Rollback per transaction %:    0.29       Rows per Sort:  151.30

The next section in the above shows us some interesting numbers. The % Blocks Changed per Read shows us that in this case, 99 percent of the logical reads we do are for blocks that are only read, not updated. This system updates only about 1 percent of the blocks retrieved. The Recursive Call % is very high - over 97 percent. This does not mean that 97 percent of the SQL executed on my system is due to 'space management' or parsing. If you recall from our analysis of the raw trace file earlier from SQL_TRACE, SQL executed from PL/SQL is considered 'recursive SQL'. On my system, virtually all work is performed using PL/SQL; other than mod_plsql (an Apache web server module) and an occasional background job, everything is written in PL/SQL. I would be surprised if the Recursive Call % were low in this case.

The percentage of transactions that rolled back (Rollback per transaction %) is very low, and that is a good thing. Rolling back is extremely expensive. First, we did the work, which was expensive. Then, we undid the work and again, this is expensive. We did a lot of work for nothing. If you find that most of your transactions roll back, you are spending too much time doing work and then immediately undoing it. You should investigate why you roll back so much, and how you can rework your application to avoid that. On the system reported on, one out of every 345 transactions resulted in a rollback - this is acceptable.

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:  100.00       Redo NoWait %:  100.00
            Buffer  Hit   %:   99.08    In-memory Sort %:   99.60
            Library Hit   %:   99.46        Soft Parse %:   98.99
         Execute to Parse %:   45.61         Latch Hit %:  100.00
Parse CPU to Parse Elapsd %:   87.88     % Non-Parse CPU:  100.00

Next, we have the Instance Efficiency Percentages. Here, they state the target is 100 percent, and this is mostly true. I think the one exception to that is the Execute to Parse ratio. This is a measure of how many times a statement was executed as opposed to being parsed. In a system where we parse, and then execute the statement, and never execute it again in the same session, this ratio will be zero. The above shows that for every one parse, I had about 1.8 executes (almost a two to one ratio). It will depend on the nature of the system as to whether this is good or bad. On my particular system, we use mod_plsql to do all of our applications. A session is created, a stored procedure is executed, a web page is formed, and the session is destroyed. Unless we execute the same exact SQL many times in a single stored procedure, our execute to parse ratio will be low. On the other hand, if I had a client server or a stateful connection to the database (perhaps via a servlet interface), I would expect this ratio to be much closer to the target of 100 percent. I understand however, that given the architecture I am using, achieving an extremely high execute to parse ratio is not something I can do.
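
As a sanity check, that 45.61 percent can be reproduced (to within rounding of the per-second figures shown in the Load Profile above) from the parse and execute rates, since the ratio is effectively 100*(1 - parses/executes):

tkyte@TKYTE816> select round( 100*(1-16.44/30.23), 2 ) from dual;

ROUND(100*(1-16.44/30.23),2)
----------------------------
                       45.62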

In my mind, the most important ratios are the parse ratios - they get my attention immediately. The Soft Parse % is the percentage of parses that were soft parses (reused from the shared pool) as opposed to hard parses. 99 percent of the parses on this system are soft parses. That is good. If we see a low soft parse ratio, this would be indicative of a system that did not use bind variables. I would expect to see a very high ratio in this field regardless of tools or techniques used. A low number means you are wasting resources and introducing contention. The next number to look at is the Parse CPU to Parse Elapsd. Here, I show about 88 percent. This is a little low; I should work on that. In this case, for every CPU second spent parsing we spent about 1.13 seconds of wall clock time. This means we spent some time waiting for a resource - if the ratio were 100 percent, it would imply CPU time was equal to elapsed time and we processed without any waits. Lastly, when we look at Non-Parse CPU, this is a comparison of time spent doing real work versus time spent parsing queries. The report computes this ratio with round(100*(1-PARSE_CPU/TOT_CPU),2). If the TOT_CPU is very high compared to the PARSE_CPU (as it should be), this ratio will be very near 100 percent, as mine is. This is good, and indicates most of the work performed by the computer was work done to execute the queries, and not to parse them.
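
Similarly, the 87.88 percent Parse CPU to Parse Elapsd figure falls straight out of the parse time cpu (203) and parse time elapsed (231) statistics that appear later in the Instance Activity Stats section of this same report:

tkyte@TKYTE816> select round( 100*203/231, 2 ) from dual;

ROUND(100*203/231,2)
--------------------
               87.88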

All in all, in looking at the above section, my recommendation would be to reduce the hard parses even further. There obviously are a couple of statements still not using bind variables somewhere in the system (every six seconds a new query is introduced). This in turn would reduce the overall number of parses done, because a hard parse has to perform a lot of recursive SQL itself. By simply removing a single hard parse call, we'll reduce the number of soft parses we perform as well. Everything else in that section looked acceptable. This first section we just reviewed is my favorite part of the StatsPack report; at a glance, it gives a good overview of the relative 'health' of your system. Now, onto the rest of the report:

Shared Pool Statistics        Begin   End
                              ------  ------
             Memory Usage %:   75.03   75.26
    % SQL with executions>1:   79.18   78.72
  % Memory for SQL w/exec>1:   74.15   73.33

This little snippet gives us some insight into our shared pool utilization. The details shown above are:

  • Memory Usage % - the percentage of the shared pool currently in use.

  • % SQL with executions>1 - the percentage of SQL statements in the shared pool that have been executed more than once.

  • % Memory for SQL w/exec>1 - the percentage of shared pool memory consumed by SQL that has been executed more than once.

So, in general you would like to see about 75 to 85 percent of the shared pool being utilized over time in a steady state. The percentage of SQL with executions greater than one should be near 100 percent if the time window for the StatsPack report is big enough to cover all of your cycles. This is one statistic that is affected by the duration of time between the observations. You would expect it to increase as the amount of time between observations increases.

Now, onto the next section:

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                            Wait     % Total
Event                                               Waits  Time (cs)   Wt Time
-------------------------------------------- ------------ ------------ -------
SQL*Net more data from dblink                       1,861          836   35.86
control file parallel write                           245          644   27.63
log file sync                                         150          280   12.01
db file scattered read                              1,020          275   11.80
db file sequential read                               483          165    7.08
          -------------------------------------------------------------
Wait Events for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> cs - centisecond -  100th of a second
-> ms - millisecond - 1000th of a second
-> ordered by wait time desc, waits desc (idle events last)

Here is your 'low hanging fruit', the events that are slowing you down more than anything else. The first step is to look at the Wait Time and see if it is even worth your time to bother tuning anything based on them. For example, I spent 8.36 seconds in a 13 minute time period waiting on data from a dblink. Is that worth my time to investigate and 'fix'? In this case, I would say no, the average wait was 0.004 seconds. Further, I know I have a background process running that is doing a large operation over a database link, the wait time is actually pretty small, all things considered.

So, suppose I did have something that needed attention. Here, what you need to do is first find out what the events mean. For example, if you look up 'log file sync' in the Oracle Reference Manual, you'll discover that it is:

Note 

When a user session COMMITs, the session's redo information needs to be flushed to the redo logfile. The user session will post the LGWR to write the log buffer to the redo log file. When the LGWR has finished writing, it will post the user session.

Wait Time: The wait time includes the writing of the log buffer and the post.

Now that I understand what the wait is, I would try to come up with a way to make it 'go away'. In the case of a log file sync, this means tuning LGWR. Basically, to make that wait go away you need to make the disks go faster, generate less redo, reduce contention for the log disks, and so on. Discovering the wait event is one thing, making it go away is another. There are some 200+ events that are timed in Oracle - none of them really have a cut and dried 'do this and they go away' solution.

One thing to keep in mind is that you will always be waiting on something. If you remove one roadblock, another will pop up. You will never get this list of large wait events to go away - you'll always have some. If you find that you are always 'tuning to go as fast as possible', you'll never be done. You can always make the system go one percent faster, but the amount of work and time you have to spend to get that one percent will increase exponentially over time. Tuning is something that must be done with a goal in mind, an end point. If you cannot say 'I am done when X is true', and X is something measurable, you are just wasting time.

The next section in the report:

Wait Events for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> cs - centisecond -  100th of a second
-> ms - millisecond - 1000th of a second
-> ordered by wait time desc, waits desc (idle events last)
                                                                   Avg
                                                      Total Wait    wait  Waits
Event                               Waits   Timeouts  Time (cs)    (ms)   /txn
---------------------------- ------------ ---------- ----------- ------ ------
SQL*Net more data from dblin        1,861          0         836      4    5.5
control file parallel write           245          0         644     26    0.7
log file sync                         150          0         280     19    0.4
db file scattered read              1,020          0         275      3    3.0
db file sequential read               483          0         165      3    1.4
control file sequential read          206          0          44      2    0.6
SQL*Net message from dblink            51          0          35      7    0.2
refresh controlfile command            21          0          28     13    0.1
log file parallel write               374          0          14      0    1.1
latch free                             13         10           3      2    0.0
SQL*Net more data to client           586          0           2      0    1.7
single-task message                     1          0           2     20    0.0
direct path read                      716          0           1      0    2.1
direct path write                      28          0           1      0    0.1
file open                              28          0           1      0    0.1
SQL*Net message to dblink              51          0           0      0    0.2
db file parallel write                 24          0           0      0    0.1
LGWR wait for redo copy                 3          0           0      0    0.0
file identify                           1          0           0      0    0.0
SQL*Net message from client         2,470          0   1,021,740   4137    7.3
virtual circuit status                 25         25      76,825  30730    0.1
pipe get                              739        739      76,106   1030    2.2
SQL*Net more data from clien          259          0           3      0    0.8
SQL*Net message to client           2,473          0           0      0    7.3
          -------------------------------------------------------------

shows all of the wait events that occurred for clients of the database in the window of measurement. In addition to what was available in the Top 5 report, it shows the average wait time in milliseconds (thousandths of a second), and the number of times a transaction waited on it. You can use this to identify wait events that are relevant. I should point out in this listing, there are lots of events you should just ignore. For example, SQL*Net message from client - ignore it in systems where the client has think time. It represents the time the client sat there and didn't ask the database to do anything (on the other hand, if you see this during a data load - then the client is not feeding the database fast enough and it does indicate a problem). In this case, however, this just means the client was connected and didn't make any requests. The report heading notes that idle events are listed last. Everything from SQL*Net message from client on down is an 'idle' event; some process was waiting to be told to do something. Ignore them for the most part.

Background Wait Events for DB: ORA8I  Instance: ora8i  Snaps: 1  -3
-> ordered by wait time desc, waits desc (idle events last)
                                                                   Avg
                                                      Total Wait    wait  Waits
Event                               Waits   Timeouts  Time (cs)    (ms)   /txn
---------------------------- ------------ ---------- ----------- ------ ------
control file parallel write           245          0         644     26    0.7
control file sequential read           42          0          25      6    0.1
log file parallel write               374          0          14      0    1.1
db file parallel write                 24          0           0      0    0.1
LGWR wait for redo copy                 3          0           0      0    0.0
rdbms ipc message                   1,379        741     564,886   4096    4.1
smon timer                              3          3      92,163 ######    0.0
pmon timer                            248        248      76,076   3068    0.7
          -------------------------------------------------------------

The above section of the StatsPack report shows the wait events that occurred for the database 'background' processes (DBWR, LGWR, and so on). Once again, idle waits are listed at the bottom of the report and generally should be ignored. These are useful for instance-wide tuning, to find out what the background processes are waiting on. It is easy for you to determine what is holding your own session up; we've done that many times in the examples on bind variables and in other places by querying V$SESSION_EVENT. This report snippet shows the wait events for the background processes in much the same way that we showed them for our individual session.
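
For reference, a minimal sketch of that kind of per-session query follows; :sid is simply a placeholder for the session ID you are interested in (taken from V$SESSION):

select event, total_waits, time_waited, average_wait
  from v$session_event
 where sid = :sid
 order by time_waited desc;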

SQL ordered by Gets for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> End Buffer Gets Threshold:   10000
-> Note that resources reported for PL/SQL includes the resources used by
   all SQL statements called within the PL/SQL code.  As individual SQL
   statements are also reported, it is possible and valid for the summed
   total % to exceed 100

  Buffer Gets    Executions  Gets per Exec  % Total  Hash Value
--------------- ------------ -------------- ------- ------------
        713,388            1      713,388.0    56.2   1907729738
BEGIN sys.sync_users.do_it; END;

        485,161            1      485,161.0    38.3   1989876028
SELECT DECODE(SA.GRANTEE#,1,'PUBLIC',U1.NAME) "GRANTEE",U2.NAME
"GRANTED_ROLE",DECODE(OPTION$,1,'YES','NO') "ADMIN_OPTION"   FRO
M SYSAUTH$@ORACLE8.WORLD SA,DEFROLE$@ORACLE8.WORLD UD,USER$@ORAC
LE8.WORLD U1,USER$@ORACLE8.WORLD U2  WHERE SA.GRANTEE# = UD.USER
# (+)    AND SA.PRIVILEGE# = UD.ROLE# (+)    AND U1.USER# = SA.G

        239,778            2      119,889.0    18.9    617705294
BEGIN statspack.snap(10); END;

...

This section of the report shows us our 'TOP' SQL. In this section of the report, the SQL is ordered by Buffer Gets, in other words, how many logical I/Os it performed. As noted in the comment at the top, the buffer gets for a PL/SQL unit include the buffer gets for all SQL executed by that block of code. Therefore, you will frequently see PL/SQL procedures at the top of this list, since the buffer gets of the individual statements executed by a stored procedure are summed together under it.

In this particular example, I can see that there was a PL/SQL routine sync_users.do_it. It hit almost three quarters of a million block buffers in its single execution. Whether that is bad or not isn't clear from this report. The only thing this report can do is report on the facts - it makes no judgements based on those facts. In this case, I know that sync_users is a large batch job that is synchronizing the data dictionaries on two databases, making sure that a user created on one database is created on the other, that all of the roles are the same, and that the passwords stay in sync. I expect it to be large - as it turns out, this is also the job that was waiting on the dblink wait event noticed above.
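
You do not have to wait for a StatsPack report to see this kind of 'top SQL' list; a similar view is available at any time straight from V$SQL. A minimal sketch (the inline view is only there so that ROWNUM is applied after the ordering):

select sql_text, executions, buffer_gets, disk_reads
  from ( select sql_text, executions, buffer_gets, disk_reads
           from v$sql
          order by buffer_gets desc )
 where rownum <= 10;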

SQL ordered by Reads for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> End Disk Reads Threshold:    1000

 Physical Reads  Executions  Reads per Exec % Total  Hash Value
--------------- ------------ -------------- ------- ------------
          8,484            1        8,484.0    73.0   1907729738
BEGIN sys.sync_users.do_it; END;

          2,810            2        1,405.0    24.2    617705294
BEGIN statspack.snap(10); END;
...

This next section is very similar to the one above, but instead of reporting on logical I/O, it is reporting on physical I/O. This is showing me the SQL that incurs the most read activity on the system, the physical I/O. These are the queries and processes you might want to look at if you are finding yourself to be I/O bound. The sync_users routine might be in need of a little tuning - it is by far the largest consumer of disk resources on this particular system.

SQL ordered by Executions for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> End Executions Threshold:     100

 Executions   Rows Processed    Rows per Exec   Hash Value
------------ ---------------- ---------------- ------------
       2,583                0              0.0   4044433098
SELECT TRANSLATE_TO_TEXT   FROM WWV_FLOW_DYNAMIC_TRANSLATIONS$
 WHERE TRANSLATE_FROM_TEXT = :b1  AND TRANSLATE_TO_LANG_CODE = :b
2

       2,065            2,065              1.0   2573952486
SELECT DISPLAY_NAME   FROM WWC_PEOPLE_LABEL_NAMES  WHERE LABEL_N
AME = :b1

This portion of the 'TOP' SQL report shows us the SQL that was executed the most during this period of time. This might be useful in order to isolate some very frequently executed queries to see if there is some way to change the logic, to avoid having to execute them so frequently. Perhaps a query is being executed inside a loop and it could be executed once outside the loop instead - simple algorithm changes might be made to reduce the number of times you have to execute that query. Even if they run blindingly fast, anything that is executed millions of times will start to eat up a large chunk of time.

SQL ordered by Version Count for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> End Version Count Threshold:        20

 Version
   Count  Executions   Hash Value
-------- ------------ ------------
      21          415    451919557
SELECT SHORTCUT_NAME,ID   FROM WWV_FLOW_SHORTCUTS  WHERE FLOW_ID
 = :b1  AND (:b2 IS NULL  OR SHORTCUT_NAME = :b2 ) AND NOT EXIST
S  (SELECT 1   FROM WWV_FLOW_PATCHES  WHERE FLOW_ID = :b1  AND I
D = BUILD_OPTION  AND PATCH_STATUS = 'EXCLUDE' )  ORDER BY SHORT
CUT_NAME,SHORTCUT_CONSIDERATION_SEQ

      21          110   1510890808
SELECT DECODE(:b1,1,ICON_IMAGE,2,ICON_IMAGE2,3,ICON_IMAGE3) ICON
_IMAGE,DECODE(:b1,1,ICON_SUBTEXT,2,ICON_SUBTEXT2,3,ICON_SUBTEXT3
) ICON_SUBTEXT,ICON_TARGET,ICON_IMAGE_ALT,DECODE(:b1,1,ICON_HEIG
HT,2,NVL(ICON_HEIGHT2,ICON_HEIGHT),3,NVL(ICON_HEIGHT3,ICON_HEIGH
T)) ICON_HEIGHT,DECODE(:b1,1,ICON_WIDTH,2,NVL(ICON_WIDTH2,ICON_H

This report shows the SQL ordered by how many versions of the SQL appear in the shared pool. There can be many reasons why there is more than one version of the same exact SQL statement in the shared pool. Two of the major causes, both demonstrated in the example that follows, are the same SQL text being parsed under different session environments (a different optimizer goal, for example), and the same SQL text referring to different underlying objects (different accounts each querying their own copy of a table).

The following example shows you how you can get many versions of the same SQL query in the shared pool. We start by flushing the shared pool to remove all statements, and then we'll get three versions of the same query loaded in there:

tkyte@TKYTE816> connect tkyte/tkyte

tkyte@TKYTE816> alter system flush shared_pool;

System altered.

tkyte@TKYTE816> select * from t where x = 5;

no rows selected

tkyte@TKYTE816> alter session set optimizer_goal=first_rows;

Session altered.

tkyte@TKYTE816> select * from t where x = 5;

no rows selected

tkyte@TKYTE816> connect scott/tiger

scott@TKYTE816> select * from t where x = 5;

no rows selected

scott@TKYTE816> connect tkyte/tkyte

tkyte@TKYTE816> select sql_text, version_count
  2    from v$sqlarea
  3   where sql_text like 'select * from t where x = 5%'
  4  /

SQL_TEXT             VERSION_COUNT
-------------------- -------------
select * from t wher             3
e x = 5

tkyte@TKYTE816> select loaded_versions, optimizer_mode,
  2         parsing_user_id, parsing_schema_id
  3    from v$sql
  4   where sql_text like 'select * from t where x = 5%'
  5  /

LOADED_VERSIONS OPTIMIZER_ PARSING_USER_ID PARSING_SCHEMA_ID
--------------- ---------- --------------- -----------------
              1 CHOOSE                  69                69
              1 FIRST_ROWS              69                69
              1 CHOOSE                  54                54

This shows why we have multiple versions in the shared pool. The first two are in there because even though the same user ID parsed them, they were parsed in different environments. The first time the optimizer goal was CHOOSE, the next time it was FIRST_ROWS. Since a different optimizer mode can result in a different plan, we need two versions of that query. The last row is in there because it is a totally different query; only the text is the same. This query selects from SCOTT.T, not TKYTE.T - it is a totally separate query altogether.

A high version count should be avoided for the same reason that you should use bind variables and avoid soft parsing queries whenever possible; you are doing more work than you need to. Sometimes multiple versions of a query must be kept, especially when it is caused by different accounts executing the same SQL against different tables, as was the case above with TKYTE.T and SCOTT.T. The other case, where you have different environments causing multiple versions, should typically be avoided whenever possible.

In my case above, the 21 versions were caused by 21 different user accounts parsing the same query against different tables.

Instance Activity Stats for DB: ORA8I  Instance: ora8i  Snaps: 1 -3

Statistic                                    Total   per Second    per Trans
--------------------------------- ---------------- ------------ ------------
CPU used by this session                14,196,226     18,630.2     41,876.8
...
parse count (hard)                             127          0.2          0.4
parse count (total)                         12,530         16.4         37.0
parse time cpu                                 203          0.3          0.6
parse time elapsed                             231          0.3          0.7
...
sorts (disk)                                     9          0.0          0.0
sorts (memory)                               2,242          2.9          6.6
sorts (rows)                               340,568        446.9      1,004.6
...
          -------------------------------------------------------------

This part of the report, Instance Activity Stats, contains lots of very detailed numbers. We've already seen many of these numbers - they were used to compute the ratios and statistics at the beginning of the report. For example, looking at parse count (hard), and (total) we find:

tkyte@TKYTE816> select round( 100 *(1-127/12530),2 ) from dual;

ROUND(100*(1-127/12530),2)
--------------------------
                     98.99

which is exactly our Soft Parse % from the beginning of the report. This detailed data was used to compute many of the ratios and summaries above.

Tablespace IO Stats for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
->ordered by IOs (Reads + Writes) desc

Tablespace
------------------------------
                 Av      Av     Av                    Av        Buffer Av Buf
         Reads Reads/s Rd(ms) Blks/Rd       Writes Writes/s      Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
TEMP
         1,221       2    0.0     2.3          628        1          0    0.0

File IO Stats for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
->ordered by Tablespace, File

Tablespace               Filename
------------------------ ----------------------------------------------------
                 Av      Av     Av                    Av        Buffer Av Buf
         Reads Reads/s Rd(ms) Blks/Rd       Writes Writes/s      Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
DRSYS                    /d02/oradata/ora8i/drsys01.dbf
            14       0    7.9     2.4            0        0          0
...

The above two reports are I/O-oriented. In general, you are looking for an even distribution of reads and writes across devices here. You want to find out what files might be 'hot'. Once your DBA understands how the data is read and written, they may be able to get some performance gain by distributing I/O across disks more evenly.

Buffer Pool Statistics for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> Pools   D: default pool,  K: keep pool,  R: recycle pool

                                          Free   Write    Buffer
       Buffer    Consistent    Physical   Physical  Buffer Complete       Busy
P        Gets          Gets       Reads     Writes   Waits    Waits      Waits
- ----------- ------------- ----------- ---------- ------- -------- ----------
D       9,183       721,865       7,586        118       0        0          0

If we were using the multiple buffer pool feature, the above would show us the breakdown of usage by buffer pool. As it is, this is just reiterating information we saw at the beginning of the report.

Rollback Segment Stats for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
->A high value for "Pct Waits" suggests more rollback segments may be required

           Trans Table      Pct   Undo Bytes
RBS No     Gets       Waits     Written        Wraps  Shrinks  Extends
------ ------------ ------- --------------- -------- -------- --------
     0          5.0    0.00               0        0        0        0
     1        866.0    0.00         447,312        1        0        0

Rollback Segment Storage for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
->Optimal Size should be larger than Avg Active

RBS No    Segment Size      Avg Active    Optimal Size    Maximum Size
------ --------------- --------------- --------------- ---------------
     0         663,552           7,372                         663,552
     1      26,206,208         526,774                      26,206,208
     2      26,206,208         649,805                      26,206,208
...

The above shows rollback segment activity. Again, you are looking for an even distribution across rollback segments (with the exception of the SYSTEM rollback segment of course). Also, the report headings have the most useful information to keep in mind while inspecting this section of the report - especially the advice about Optimal being larger than Avg Active, if you use an optimal setting at all (the report above shows this database does not use optimal on the rollback segment size). As this is mostly DBA-related activity (the I/O and rollback segment information) we'll continue onto the next section:

Latch Activity for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
->"Get Requests", "Pct Get Miss" and "Avg Slps/Miss" are statistics for
  willing-to-wait latch get requests
->"NoWait Requests", "Pct NoWait Miss" are for no-wait latch get requests
->"Pct Misses" for both should be very close to 0.0

                                               Pct    Avg                 Pct
                                 Get           Get   Slps       NoWait NoWait
Latch Name                       Requests      Miss  /Miss     Requests   Miss
----------------------------- -------------- ------ ------ ------------ ------
Active checkpoint queue latch            271    0.0                   0
virtual circuit queues                    37    0.0                   0

Latch Sleep breakdown for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> ordered by misses desc
                                     Get                                  Spin &
Latch Name                    Requests         Misses      Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
library cache                     202,907          82          12 72/8/2/0/0
cache buffers chains            2,082,767          26           1 25/1/0/0/0

Latch Miss Sources for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> only latches with sleeps are shown
-> ordered by name, sleeps desc
                                                          NoWait             Waiter
Latch Name               Where                       Misses     Sleeps  Sleeps
------------------------ -------------------------- ------- ---------- -------
cache buffers chains     kcbgtcr: kslbegin                0          1       1
library cache            kglic                            0          7       0
library cache            kglhdgn: child:                  0          3       1
library cache            kglget: child: KGLDSBYD          0          2       0

Child Latch Statistics DB: ORA8I  Instance: ora8i  Snaps: 1 -3
-> only latches with sleeps are shown
-> ordered by name, gets desc
                            Child      Get                            Spin &
Latch Name                    Num    Requests      Misses     Sleeps  Sleeps 1->4
---------------------- ------ -------------- ----------- ---------- -------------
cache buffers chains      930         93,800          21          1 20/1/0/0/0
library cache               2         48,412          34          6 29/4/1/0/0
library cache               1         42,069          10          3 8/1/1/0/0
library cache               5         37,334          10          2 8/2/0/0/0
library cache               4         36,007          13          1 12/1/0/0/0

If you recall from Chapter 3 on Locking and Concurrency, latches are lightweight serialization devices used by Oracle. They are always either 'gotten' or not; they are not like enqueues where a session requests a lock and is put to sleep until it is available. With a latch, the requestor is told immediately 'you got it' or 'you didn't get it'. The requestor then 'spins' (consuming CPU) for a bit and tries again. If that doesn't work, it goes to 'sleep' for a bit, and tries again. The reports above show you this activity. For example, I can see that the library cache latch was missed 82 times out of 202,907 attempts. Further, 72 of the 82 attempts were successful the next time they tried, 8 on the second and 2 on the third. The ratio of Gets to Misses is very near 100 percent in this system (almost 100 percent of the gets were immediately successful) so there is nothing to do. On a system that does not use bind variables, or parses queries too frequently, you'll see lots of contention for the library cache latch. One other piece of information I can derive from the latch report above is that about 4.5 percent (93,800/2,082,767) of my cache buffers chains latch requests were for one child latch out of (at least) 930. This probably indicates that I have a hot block - one that many processes are trying to access simultaneously. They all need a latch in order to access this block, and this is resulting in some contention. That would be something to look into. The latch report is useful to identify what latch contention you are experiencing. We will need to go back to application-level tuning to fix it. Latch contention is a symptom, not a cause, of problems. In order to get rid of the symptom we must determine the cause. Unfortunately, you cannot generate a list of recommendations in the form of 'if you have contention for this latch, you need to do this' (if only it was so easy!). Rather, once you have identified that you in fact do have latch contention, you must go back to the application design and determine why you are contending for that resource.
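
If you wanted to chase that hot child latch outside of StatsPack, the child-level latch statistics are available in V$LATCH_CHILDREN; a minimal sketch:

select child#, gets, misses, sleeps
  from v$latch_children
 where name = 'cache buffers chains'
 order by gets desc;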

Dictionary Cache Stats for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
->"Pct Misses"  should be very low (< 2% in most cases)
->"Cache Usage" is the number of cache entries being used
->"Pct SGA"     is the ratio of usage to allocated size for that cache

                                Get    Pct     Scan   Pct      Mod  Final  Pct
Cache                      Requests   Miss Requests  Miss      Req  Usage  SGA
---------------------- ------------ ------ -------- ----- -------- ------ ----
dc_constraints                    0               0              0    227   99
dc_database_links                 9    0.0        0              0      7   88
dc_files                          0               0              0     69   88
dc_free_extents                 747   55.0      336   0.0      672     90   98
dc_global_oids                   14    0.0        0              0     95   86
dc_histogram_data                 0               0              0      0    0
dc_histogram_data_valu            0               0              0      0    0
dc_histogram_defs                94   21.3        0              0  1,902  100
dc_object_ids                   190    0.0        0              0  2,392  100
dc_objects                      345    2.9        0              0  6,092  100
dc_outlines                       0               0              0      0    0
dc_profiles                     132    0.0        0              0      2   33
dc_rollback_segments            192    0.0        0              0     33   77
dc_segments                     637    0.6        0            340  3,028  100
dc_sequence_grants                6    0.0        0              0      6    6
dc_sequences                      8    0.0        0              5     23   82
dc_synonyms                      28   10.7        0              0     96   96
dc_tablespace_quotas              0               0              0     14   12
dc_tablespaces                1,033    0.0        0              0    100   94
dc_used_extents                 672   50.0        0            672      5    6
dc_user_grants                1,296    0.2        0              0    756   82
dc_usernames                    337    0.9        0              0  1,318   99
dc_users                      9,892    0.0        0              1    865  100

This is a report on the dictionary cache. I don't like this report too much, since there is not very much I can do about the numbers reported back by it. The dictionary cache is totally controlled by Oracle; we cannot size its components individually. We can only size the shared pool, and as long as the shared pool is sized correctly, the dictionary cache is supposed to take care of itself. As my shared pool is about 75 percent utilized, it is large enough. If my shared pool were 'full' and the hit ratios here were bad, increasing the shared pool would improve the hit ratios.
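
A quick way to check how much head room the shared pool has at any given moment is V$SGASTAT; a minimal sketch:

select pool, name, bytes
  from v$sgastat
 where pool = 'shared pool'
   and name = 'free memory';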

Library Cache Activity for DB: ORA8I  Instance: ora8i  Snaps: 1 -3
->"Pct Misses"  should be very low

                         Get       Pct        Pin        Pct             Invali-
Namespace         Requests    Miss     Requests     Miss     Reloads  dations
--------------- ------------ ------ -------------- ------ ---------- --------
BODY                   5,018    0.0          5,018    0.0          0        0
CLUSTER                    0                     0                 0        0
INDEX                      1    0.0              1    0.0          0        0
OBJECT                     0                     0                 0        0
PIPE                     765    0.0            765    0.0          0        0
SQL AREA               1,283    6.9         38,321    0.6         39        0
TABLE/PROCEDURE        3,005    0.3         11,488    0.6          1        0
TRIGGER                   21    0.0             21    0.0          0        0

Here we are seeing the breakdown of the library cache hit ratio. On a finely tuned system the Pct Misses will be very near zero. On a development system, or a system where objects are frequently dropped and recreated, some of these ratios will be high, as will the Invalidations column. Sizing the shared pool and ensuring that you minimize hard parses by using bind variables, is the way to ensure good ratios in the above.

SGA Memory Summary for DB: ORA8I  Instance: ora8i  Snaps: 1 -3

SGA regions                       Size in Bytes
------------------------------ ----------------
Database Buffers                    134,217,728
Fixed Size                               69,616
Redo Buffers                            532,480
Variable Size                       130,187,264
                               ----------------
sum                                 265,007,088

SGA breakdown difference for DB: ORA8I  Instance: ora8i  Snaps: 1 -3

Pool        Name                        Begin value      End value  Difference
----------- ------------------------ -------------- -------------- -----------
java pool   free memory                  17,838,080     17,838,080           0
java pool   memory in use                 3,133,440      3,133,440           0
shared pool KGFF heap                        54,128         54,128           0
shared pool KGK heap                          5,840          5,840           0
shared pool KQLS heap                     3,253,784      3,231,844     -21,940
shared pool PL/SQL DIANA                  4,436,044      4,413,960     -22,084
shared pool PL/SQL MPCODE                15,378,764     15,546,652     167,888
shared pool PLS non-lib hp                    2,096          2,096           0
shared pool State objects                   291,304        291,304           0
shared pool VIRTUAL CIRCUITS                484,632        484,632           0
shared pool db_block_buffers              2,228,224      2,228,224           0
shared pool db_block_hash_buckets           393,240        393,240           0
shared pool dictionary cache              6,547,348      6,586,032      38,684
shared pool event statistics per ses      1,017,600      1,017,600           0
shared pool fixed allocation callbac            960            960           0
shared pool free memory                  27,266,500     27,013,928    -252,572
shared pool joxlod: in ehe                   71,344         71,344           0
shared pool joxs heap init                      244            244           0
shared pool library cache                28,105,460     28,168,952      63,492
shared pool message pool freequeue          231,152        231,152           0
shared pool miscellaneous                 1,788,284      1,800,404      12,120
shared pool pl/sql source                    42,536         42,536           0
shared pool processes                       153,600        153,600           0
shared pool sessions                        633,600        633,600           0
shared pool sql area                     16,377,404     16,390,124      12,720
shared pool table columns                    45,264         45,936         672
shared pool table definiti                   11,984         12,944         960
shared pool transactions                    294,360        294,360           0
shared pool trigger defini                    8,216          8,216           0
shared pool trigger inform                    5,004          5,064          60
shared pool type object de                   48,040         48,040           0
shared pool view columns d                    1,072          1,072           0
            db_block_buffers            134,217,728    134,217,728           0
            fixed_sga                        69,616         69,616           0
            log_buffer                      512,000        512,000           0

This part of the report shows the utilization of the shared pool in some detail. You can see how, over time, memory usage shifts from component to component, with some sections giving up memory and others allocating more. I've found this section of the report most useful as a way to explain why something was reported in another section. For example, I once received a series of StatsPack reports for analysis. They showed fairly constant hard and soft parse counts and then, all of a sudden, the hard parses went through the roof for about an hour before everything went back to normal. Using the reports, I was able to determine that at the same time the hard parse count shot up, the shared pool sql area memory usage decreased by a large amount - many tens of MB. Curious about this, I asked 'did someone flush the shared pool?' and the answer was 'of course, yes'. It was part of their standard operating procedure: every six hours they flushed the shared pool. Why? No one really knew; they just always had, and in fact a job was set up to do it. Disabling this job fixed the periodic performance issues, which were entirely self-induced by flushing the shared pool (and hence discarding all of the plans they had spent the previous six hours accumulating).
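Had anyone wanted to watch this happening live rather than after the fact, a simple query against V$SGASTAT, run periodically, would have shown the 'sql area' shrinking every time the pool was flushed. A minimal sketch:

select name, bytes
  from v$sgastat
 where pool = 'shared pool'
   and name in ( 'sql area', 'library cache', 'free memory' )
/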

init.ora Parameters for DB: ORA8I  Instance: ora8i  Snaps: 1 -3

                                                                   End value
Parameter Name                Begin value                     (if different)
----------------------------- ---------------------------- --------------
background_dump_dest          /export/home/ora816/admin/ora8i/b
compatible                    8.1.0, 8.1.6.0.0
control_files                 /d01/oradata/ora8i/control01.ctl,
core_dump_dest                /export/home/ora816/admin/ora8i/c
db_block_buffers              16384

End of Report

The last part of the report is a listing of the init.ora parameters that differ from the default. These are useful in determining why something is occurring, similar to the shared pool memory report above. You can see quickly which parameters have been explicitly set and determine their effects on the system.
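You can get the same list interactively from V$PARAMETER; its ISDEFAULT column indicates whether a parameter was explicitly set. For example:

select name, value
  from v$parameter
 where isdefault = 'FALSE'
 order by name
/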

StatsPack Wrap-Up

StatsPack is a great tool to use on both production and development systems. It will help you to determine the overall health of your database, as well as pinpoint potential bottlenecks. I strongly encourage people to use it continuously, as part of the day-to-day running of the system. There are plans to add additional analysis tools around the information collected by StatsPack, in particular some trend analysis, so you can not only see the delta between two points, but also the trend over some arbitrary period of time.

Understanding the contents of the report is the first step. Understanding what to do about it is the next. You cannot look at any one number or statistic in isolation; interpreting it takes some understanding of the system and its goals. In the report we reviewed above, unless I knew about the batch job sync_users - that it uses a database link heavily and runs as a background process - I might think I had a problem in the system. I don't, however; sync_users runs in the background, and the fact that it waits a little on a dblink is perfectly acceptable and within the constraints for my system. The report did point out that there is an application out there that is not making optimal use of bind variables - we are hard parsing too frequently. I'll use the other tools presented in the earlier sections to troubleshoot that particular issue.

V$ Tables

In this section, I would like to review the primary V$ tables you will use in your application tuning efforts. There are over 180 of these tables and, even though they are called 'dynamic performance tables', not all of them are performance-related. This section will show you the ones I use on a recurring basis. There are others, such as the V$ views used to monitor and tune the multi-threaded server, which I won't be covering. All of these tables are documented in the Oracle8i Reference manual as well. I'm not going to repeat that material here, but would rather draw your attention to the ones I think you should know about.

I am simply going to introduce these views to make you aware of them. Many of them we have used in various examples already. Some are new. When appropriate, a small example of using these views is included.

V$EVENT_NAME

We've seen the names of many 'events' in this section. We have seen how the event has a name and then up to three parameters p1, p2, and p3. We can either flip to the manuals every time we want to look up the meanings of p1, p2, and p3 for a given event, or we can simply query V$EVENT_NAME. For example, two of the events we've encountered in this chapter are latch free and enqueue. Using this view:

tkyte@TKYTE816> select * from v$event_name where name = 'latch free'
  2  /

    EVENT# NAME                 PARAMETER1 PARAMETER2 PARAMETER3
---------- -------------------- ---------- ---------- ----------
         2 latch free           address    number     tries

tkyte@TKYTE816> select * from v$event_name where name = 'enqueue'
  2  /

    EVENT# NAME                 PARAMETER1 PARAMETER2 PARAMETER3
---------- -------------------- ---------- ---------- ----------
        11 enqueue              name|mode  id1        id2

we can quickly get a handle on the meaning of these parameters, especially if we've already read about them once or twice, and just need a quick reminder as to what is available.

V$FILESTAT and V$TEMPSTAT

V$FILESTAT and V$TEMPSTAT can give you a quick glimpse at the I/O done on your system, and how long Oracle has spent reading and writing any given file. You can either use StatsPack to get snapshots of this usage, or you can take a quick snapshot of this table yourself, wait a while, and then compare the differences.
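For example, a query like the following (a sketch; it joins V$FILESTAT to V$DATAFILE to pick up file names) shows where the read and write time is going. The times are in hundredths of seconds and are only populated when TIMED_STATISTICS is enabled:

select d.name, f.phyrds, f.phywrts, f.readtim, f.writetim
  from v$filestat f, v$datafile d
 where f.file# = d.file#
 order by f.readtim + f.writetim desc
/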

V$LOCK

This is a view we've used a couple of times already in Chapter 3 on Locking and Concurrency. This view is used to tell you who has what locked. Remember, Oracle does not store row level locks externally from the data itself, so don't go looking for them in here. In this view, you'll be able to see who has TM (DML Enqueue) locks on tables, so you'll be able to tell that session 'x,y' has some rows locked in a given table but you cannot determine which rows they have locked.
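For example, to see which sessions hold TM locks and on which tables, a query along these lines will do (a sketch; it assumes access to DBA_OBJECTS to translate the object ID stored in ID1 into a name):

select s.username, s.sid, s.serial#, o.object_name, l.lmode
  from v$lock l, v$session s, dba_objects o
 where l.type = 'TM'
   and l.sid  = s.sid
   and l.id1  = o.object_id
/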

V$MYSTAT

This view contains statistics for your session only. This is very useful to diagnose how your session is doing. The schema that creates this view must have direct access to the V$STATNAME and V$MYSTAT objects, for example:

sys@TKYTE816> grant select on v_$statname to tkyte;

Grant succeeded.

sys@TKYTE816> grant select on v_$mystat to tkyte;

Grant succeeded.

Notice that we used V_$STATNAME and not V$STATNAME. This is because V$STATNAME is really just a synonym for the view V_$STATNAME.

This view contains the statistic number, an internal code, rather than the name of the statistic being tracked. I typically install a view like this:

ops$tkyte@ORA8I.WORLD> create view my_stats
  2  as
  3  select a.name, b.value
  4    from v$statname a, v$mystat b
  5   where a.statistic# = b.statistic#
  6  /

View created.

ops$tkyte@ORA8I.WORLD> SELECT * FROM MY_STATS WHERE VALUE > 0;

NAME                                VALUE
------------------------------ ----------
logons cumulative                       1
logons current                          1
opened cursors cumulative             160
opened cursors current                  1
...

in my systems to make it easier to query. Once we have that set up, you can create queries that give you StatsPack-like information for your session. For example, we can compute the all-important Soft Parse Ratio like this:

ops$tkyte@ORA8I.WORLD> select round(100 *
  2           (1-max(decode(name,'parse count (hard)',value,null))/
  3              max(decode(name,'parse count (total)',value,null))), 2
  4                          ) "Soft Parse Ratio"
  5    from my_stats
  6  /

Soft Parse Ratio
----------------
           84.03

If you create a set of queries like this, and put them in a 'logoff' trigger or embed them directly in your application, you can monitor the performance of your individual applications to see how many commits they perform, how many rollbacks, and so on by session and application.
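As a sketch of what such a logoff trigger might look like (the table and trigger names here are purely illustrative, and the trigger owner needs the direct grants on V_$STATNAME and V_$MYSTAT shown above):

create table session_stats_log
( username  varchar2(30),
  logoff_ts date,
  name      varchar2(64),
  value     number
);

create or replace trigger log_my_stats
before logoff on database
begin
    -- record the non-zero statistics for the session that is logging off
    insert into session_stats_log
    select user, sysdate, a.name, b.value
      from v$statname a, v$mystat b
     where a.statistic# = b.statistic#
       and b.value > 0;
end;
/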

V$OPEN_CURSOR

This view contains a list of every open cursor for all sessions. This is very useful for tracking down 'cursor leaks' and to see what SQL your session has been executing. Oracle will cache cursors even after you explicitly close them so don't be surprised to see cursors in there that you thought you closed (you might have). For example, in the same SQL*PLUS session I was using to compute the Soft Parse Ratio above I found:

ops$tkyte@ORA8I.WORLD> select * from v$open_cursor
  2  where sid = ( select sid from v$mystat where rownum = 1 );

SADDR    SID USER_NAME ADDRESS  HASH_VALUE SQL_TEXT
-------- --- --------- -------- ---------- ------------------------------
8C1706A0  92 OPS$TKYTE 8AD80D18  607327990 BEGIN DBMS_OUTPUT.DISABLE;
                                           END;

8C1706A0  92 OPS$TKYTE 8AD6BB54  130268528 select lower(user) || '@' ||
                                           decode(global_name,
                                           'ORACLE8.WO

8C1706A0  92 OPS$TKYTE 8AD8EDB4  230633120 select round(100 *
                                           (1-max(decode(name,'parse
                                           count (hard

8C1706A0  92 OPS$TKYTE 8AD7DEC0 1592329314 SELECT
                                           ATTRIBUTE,SCOPE,NUMERIC_VALUE,
                                           CHAR_VALUE,DATE_VALUE F

8C1706A0  92 OPS$TKYTE 8E16AC30 3347301380 select round( 100 *
                                           (1-max(decode(name,'parse
                                           count (hard)',

8C1706A0  92 OPS$TKYTE 8AD7AD70 1280991272 SELECT CHAR_VALUE FROM
                                           SYSTEM.PRODUCT_PRIVS WHERE
                                           (UPPER('

8C1706A0  92 OPS$TKYTE 8AD62080 1585371720 BEGIN
                                           DBMS_OUTPUT.ENABLE(1000000);
                                           END;

8C1706A0  92 OPS$TKYTE 8AD816B8 3441224864 SELECT USER FROM DUAL

8C1706A0  92 OPS$TKYTE 8ADF4D3C 1948987396 SELECT DECODE('A','A','1','2')
                                           FROM DUAL

8C1706A0  92 OPS$TKYTE 89D30A18 2728523820 select round(100 *
                                           (1-max(decode(name,'parse
                                           count (hard

8C1706A0  92 OPS$TKYTE 8865AB90 3507933882 select * from v$open_cursor
                                           where sid = ( select sid from
                                           v$

8C1706A0  92 OPS$TKYTE 8AD637B0  242587281 commit

8C1706A0  92 OPS$TKYTE 8AD70660 3759542639 BEGIN
                                           DBMS_APPLICATION_INFO.SET_MODU
                                           LE(:1,NULL); END;


13 rows selected.

As you can see, there are a number of cursors apparently open. However:

ops$tkyte@ORA8I.WORLD> select * from my_stats where name = 'opened cursors current';

NAME                                VALUE
------------------------------ ----------
opened cursors current                  1

I really only have one cursor truly opened (and that is actually the cursor used to query how many open cursors I have). Oracle is caching the other cursors in the hope that I will execute them again.

V$PARAMETER

V$PARAMETER is useful for finding various settings relevant to tuning, such as the block size, the sort area size, and so on. It contains all of the init.ora parameter values, and many of these have a bearing on performance.
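For example, to pull back just a handful of parameters of interest (the list of names here is merely illustrative):

select name, value
  from v$parameter
 where name in ( 'db_block_size', 'sort_area_size',
                 'sort_area_retained_size', 'shared_pool_size' )
/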

V$SESSION

V$SESSION contains a row for every session in the database. As with V$STATNAME shown earlier, you will need to be granted permissions by your DBA to use this view:

sys@TKYTE816> grant select on v_$session to tkyte;

Grant succeeded.

To find your session's specific row, you can simply query:

ops$tkyte@ORA8I.WORLD> select * from v$session
  2  where sid = ( select sid from v$mystat where rownum = 1 )
  3  /

I usually use this view to see what else is going on in the database. For example, I frequently use a script called showsql that shows me a listing of every session, its status (active or not), the module, action, and client_info settings, and finally for active sessions, what SQL they happen to be running.

The MODULE, ACTION, and CLIENT_INFO fields are settable by you in your applications via a call to the procedures in the DBMS_APPLICATION_INFO package (see the Appendix A on Necessary Supplied Packages for details on this package). I would encourage you to instrument every application you build with calls to this package to set this field. It can really save time trying to figure out what session is running which application - if you put that information into the V$ view, it'll be obvious.
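A minimal sketch of this sort of instrumentation (the module, action, and client info values are of course whatever makes sense for your application):

begin
    -- identify the program and what it is currently doing
    dbms_application_info.set_module
    ( module_name => 'Payroll',
      action_name => 'Processing checks' );

    -- optionally record something specific to this run
    dbms_application_info.set_client_info( 'batch run #42' );

    -- ... do the work ...

    -- clear the settings when this unit of work is finished
    dbms_application_info.set_module( null, null );
end;
/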

My showsql script is simply:

column username format a15 word_wrapped
column module format a15 word_wrapped
column action format a15 word_wrapped
column client_info format a15 word_wrapped
column status format a10
column sid_serial format a15
set feedback off
set serveroutput on

select username, ''''||sid||','||serial#||'''' sid_serial, status,
       module, action, client_info
  from v$session
 where username is not null
/

column username format a20
column sql_text format a55 word_wrapped

set serveroutput on size 1000000
declare
    x number;
    procedure p ( p_str in varchar2 )
    is
       l_str   long := p_str;
    begin
       loop
          exit when l_str is null;
          dbms_output.put_line( substr( l_str, 1, 250 ) );
          l_str := substr( l_str, 251 );
       end loop;
    end;
begin
    for x in
    ( select username||'('||sid||','||serial#||
                ') ospid = ' ||  process ||
                ' program = ' || program username,
             to_char(LOGON_TIME,' Day HH24:MI') logon_time,
             to_char(sysdate,' Day HH24:MI') current_time,
             sql_address, LAST_CALL_ET
        from v$session
       where status = 'ACTIVE'
         and rawtohex(sql_address) <> '00'
         and username is not null
       order by last_call_et )
    loop
        dbms_output.put_line( '--------------------' );
        dbms_output.put_line( x.username );
        dbms_output.put_line( x.logon_time || ' ' ||
                              x.current_time ||
                              ' last et = ' ||
                              x.LAST_CALL_ET);
        for y in ( select sql_text
                     from v$sqltext_with_newlines
                    where address = x.sql_address
                    order by piece )
        loop
            p( y.sql_text );
        end loop;
    end loop;
end;
/

and it produces output like this:

ops$tkyte@ORA8I.WORLD> @showsql

USERNAME        SID_SERIAL    STATUS     MODULE          ACTION CLIENT_INFO
--------------- ------------- ---------- --------------- ------ -----------
OPS$TKYTE       '30,23483'    ACTIVE     01@ showsql.sql
CTXSYS          '56,32'       ACTIVE
--------------------
OPS$TKYTE(30,23483) ospid = 29832 program = sqlplus@aria (TNS V1-V3)
Sunday    20:34  Sunday    20:40 last et = 0
SELECT USERNAME || '('  || SID  || ','  || SERIAL#  || ') ospid = '
|| PROCESS  || ' program = '  || PROGRAM  USERNAME,TO_CHAR(
LOGON_TIME,' Day HH24:MI') LOGON_TIME,TO_CHAR(SYSDATE,' Day HH24
:MI') CURRENT_TIME,SQL_ADDRESS,LAST_CALL_ET   FROM V$SESSION  WH
ERE STATUS = 'ACTIVE'  AND RAWTOHEX(SQL_ADDRESS) != '00'  AND US
ERNAME IS NOT NULL  ORDER BY LAST_CALL_ET
--------------------
CTXSYS(56,32) ospid = 15610 program = ctxsrv@aria (TNS V1-V3)
Monday    11:52  Sunday    20:40 last et = 20
BEGIN     drilist.get_cmd( :sid, :mbox, :pmask, :cmd_type,:disp_
id, :disp_return_address, :disp_user, :disp_command, :disp_arg1,
:disp_arg2, :disp_arg3, :disp_arg4, :disp_arg5, :disp_arg6, :di
sp_arg7, :disp_arg8, :disp_arg9, :disp_arg10 ); :error_stack :=
drue.get_stack; exception when dr_def.textile_error then :error_
stack := drue.get_stack; when others then drue.text_on_stack(sql
errm); :error_stack := drue.get_stack; END;

ops$tkyte@ORA8I.WORLD>

As you can see, SQL*PLUS has filled in the MODULE column in V$SESSION with the name of the script that is being executed currently. This can be very useful, especially when your applications take advantage of it to show their progress.

V$SESSION_EVENT

We have used this view a number of times already. I use it frequently to see what is causing a procedure or query to 'wait' for a resource. You can get similar information from a trace file with the appropriate event set, but this view makes it easy to capture the current state of your session's events, run a process, and then display the differences. This is much easier than reviewing a trace file for the same information.

This view contains the wait events for all sessions in the system, so it is useful for seeing the major wait events for sessions, other than your own as well. Similar to the way we can turn SQL_TRACE on in another session using DBMS_SYSTEM, we can use V$SESSION_EVENT to see what wait events other sessions are experiencing.
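The snapshot-and-diff approach is straightforward; here is a sketch (the holding table name and the SID of 42 are placeholders):

create table sess_event_before
as
select sid, event, total_waits, time_waited
  from v$session_event
 where sid = 42;

-- ... run the process in the session being watched ...

select a.event,
       a.total_waits - nvl( b.total_waits, 0 ) waits,
       a.time_waited - nvl( b.time_waited, 0 ) time_waited
  from v$session_event a, sess_event_before b
 where a.sid = 42
   and a.event = b.event (+)
 order by 3 desc
/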

V$SESSION_LONGOPS

We will look at this view in some detail in Appendix A on Necessary Supplied Packages. This view is used by long running processes such as index creates, backups, restores, and anything the cost based optimizer thinks will take more than six seconds of execution time, to report their progress. Your applications can use this view as well, via the DBMS_APPLICATION_INFO package. If you have a long running process or job, it should be instrumented with calls to DBMS_APPLICATION_INFO to report on its current progress. In this fashion, you can monitor your job easily and see if it is really 'stuck' or just taking its time to complete its task.
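As a sketch of how a long running job might report its progress via DBMS_APPLICATION_INFO.SET_SESSION_LONGOPS (the operation name and the loop are placeholders; in real code each iteration would be one meaningful unit of work):

declare
    l_rindex binary_integer := dbms_application_info.set_session_longops_nohint;
    l_slno   binary_integer;
begin
    for i in 1 .. 100
    loop
        -- ... process one unit of work here ...

        -- report progress; a row appears in v$session_longops
        dbms_application_info.set_session_longops
        ( rindex    => l_rindex,
          slno      => l_slno,
          op_name   => 'batch load',
          sofar     => i,
          totalwork => 100,
          units     => 'units of work' );
    end loop;
end;
/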

V$SESSION_WAIT

This view shows all sessions currently waiting for something and how long they have been waiting. This is typically used to monitor systems that appear to have 'hung' or are running very slowly.
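A sketch of the sort of query to run against it, filtering out some of the common 'idle' waits so only real waits remain (the idle-event list here is abbreviated):

select sid, event, state, wait_time, seconds_in_wait
  from v$session_wait
 where event not in ( 'SQL*Net message from client', 'pmon timer',
                      'smon timer', 'rdbms ipc message' )
 order by seconds_in_wait desc
/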

V$SESSTAT

V$SESSTAT is similar to V$MYSTAT but it shows statistics for all sessions, not just yours. This is useful for monitoring another session that you have identified to see how it is doing.

For example, you can use this to monitor the soft parse ratio for a third party application you have installed on your machine. You might do this after observing the hard parse count increasing on your finely tuned system. By monitoring the soft parse ratio of just the third party application, you could quickly determine if they are the culprits of introducing lots of unique, non-bind variable SQL into your system or not.
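For example, the Soft Parse Ratio query shown earlier against V$MYSTAT can be pointed at any session by joining V$SESSTAT to V$STATNAME; in this sketch the SID of 42 is a placeholder for the session being watched:

select round( 100 *
        ( 1 - max( decode( b.name, 'parse count (hard)',  a.value, null ) ) /
              max( decode( b.name, 'parse count (total)', a.value, null ) ) ), 2
            ) "Soft Parse Ratio"
  from v$sesstat a, v$statname b
 where a.statistic# = b.statistic#
   and a.sid = 42
/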

V$SESS_IO

This is useful to see how much I/O your (or any other) session has done. I use this view in a fashion similar to V$MYSTAT and V$SESSION_EVENT. I take a snapshot, run some processes, and then review the 'difference' between the two points. This shows how much I/O the process performs. I could get this information from a TKPROF report as well, but this lets me aggregate the counts up easily. TKPROF will show the I/O performed by statement. This allows me to run an arbitrary set of statements and collect the I/O statistics for them.
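A sketch of the before/after query; run it once before and once after the process and subtract the two sets of numbers:

select block_gets, consistent_gets, physical_reads
  from v$sess_io
 where sid = ( select sid from v$mystat where rownum = 1 )
/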

V$SQL, V$SQLAREA

These views show us all of the SQL that is parsed and stored in the shared pool. We have already used both of these in various places in this section.

V$SQLAREA is an aggregate view. There will be one row per SQL query in this view. The VERSION_COUNT column tells us how many rows we could expect to find in the V$SQL table for that query. Try to avoid querying this view and use V$SQL instead. V$SQLAREA can be quite expensive, especially on an already busy system.

We can use the V$SQLAREA and V$SQL views to see what SQL our system executes, how many times this SQL is executed, parsed, how many logical and physical I/Os it does, and so on. We can use these views to find the SQL that is missing bind variables as well.
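A crude but effective sketch for spotting SQL that is probably missing bind variables is to group statements by their leading characters and look for large families that differ only in their literal values (the prefix length of 60 and the threshold of 10 are arbitrary choices):

select substr( sql_text, 1, 60 ) sql_prefix, count(*) copies
  from v$sql
 group by substr( sql_text, 1, 60 )
having count(*) > 10
 order by 2 desc
/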

V$STATNAME

V$STATNAME contains the mapping from the statistic number to the statistic name. It is useful when joined with V$MYSTAT and V$SESSTAT to turn the statistic number into the readable name.

V$SYSSTAT

Whereas V$SESSTAT keeps statistics by session, V$SYSSTAT keeps them for the instance as a whole. As sessions come and go, their entries are added to and removed from the V$SESSTAT view; the data in V$SYSSTAT persists until the database is shut down. This is the information StatsPack uses to derive many of its ratios.

V$SYSTEM_EVENT

This is to events what the V$SYSSTAT view is to statistics. It contains instance-level wait event information. It is what StatsPack uses to derive many of its reports as well.

Summary

Tuning, when done 'after the fact', is a bit of luck and a lot of sleuthing. When done as part of the development process, it is straightforward and easy to do. I prefer straightforward and easy any day, especially when the other alternative involves 'luck'. Tuning after the fact is one of the hardest things to do. You have to figure out why the system is slow, where the system is slow, and how to make it go faster without ripping everything apart and doing it the way it should have been done in the first place. That last bit is what makes after-the-fact tuning so hard.

If you make performance part of the application from day one, you will find tuning to be a science, not an art. Anyone can do it if it is part of the process. In order to achieve this goal, you will need to do things like set metrics against which applications can be tested. You will also need instrumented code, so you can diagnose where slowdowns are occurring. The instrumentation part is very important; the database itself is heavily instrumented, as this chapter has shown, and I can use that instrumentation to prove that the database is not the slow part. At that point we still have a 'slow' system - the only thing we now know is that the database is not the slow part. If the code outside the database is not instrumented, we'll have to guess where the problem might be. Even for code running inside the database, application-level instrumentation is extremely useful for tracking down where things are running slow.

And just in case you missed it above, bind variables are important. I cannot tell you how many times I have seen systems just grind to a halt because the developers found it easier to use string concatenation for all of their queries instead of bind variables. These systems just do not work. It is just too easy to do it the right way in this case; don't make the mistake of not using them. Don't rely on crutches like CURSOR_SHARING as they introduce their own issues.


