3.2 Oracle components

3.2.1 Instance and database

Every Oracle database is associated with a certain memory area. This area is preallocated from the system's memory and is used by the database to retain and process user and system requests. Oracle calls this area of memory the system global area (SGA). Once it is established, Oracle also starts various background processes to complete user requests. The SGA and the corresponding Oracle background processes are referred to as the Oracle instance.

An Oracle instance is composed of various memory components, classified by the functions they perform. These components are broadly classified as the program global area (PGA) and the system global area (SGA).

In Figure 3.1, the various tiers of SGA and PGA are represented. The figure contains the shared pool, database buffer cache, Java pool, large pool, streams pool, and redo log buffer areas that constitute the SGA, and the private SQL area, SQL area, and cursors that constitute the PGA. All of these are areas of memory based on parameter values in a parameter file. When user requests are received, they utilize the memory areas to communicate with the Oracle background processes to accomplish their tasks.

Figure 3.1: Oracle instance memory structures.

Oracle 10g 

New Feature: The streams pool is a new memory area introduced in Oracle 10g. This memory area is used for the Streams functionality of Oracle.

PGA

The program (process) global area (PGA) stores session-specific information. The PGA is a memory area that contains data and control information pertaining to a single user process.

To accomplish a task or request against the database, the user has to establish a connection. Oracle provides two methods of operation for this purpose: the shared connection and the dedicated connection. The shared connection is implemented using the shared server feature (previously called the multithreaded server, or MTS). The dedicated connection is established when the dedicated server option is configured, which is the default configuration. With a dedicated connection, every user process has a dedicated connection to the server, and the PGA contains session-related information specific to that user process; the data contained in this area is not shareable by other processes. As illustrated in Figure 3.1, the PGA in a dedicated connection is outside of the SGA.

In a dedicated configuration, the PGA contains the following:

  • Sort area for sorting data before returning it to the user

  • Session information, including user privileges pertaining to the session

  • Current state of various cursors used by the session

  • Area, or stack space, containing variables used in the session

The PGA is created for a single user session when the connection to the server is established, and it is deallocated when the session is terminated. This occurs when the user aborts his or her session or when the session has completed its task through a commit or rollback operation.

Oracle 9i 

New Feature: In Oracle 9i, in a dedicated server configuration, the size of the SQL work areas that are part of the PGA can be automatically and globally managed. This is accomplished by specifying the parameter PGA_AGGREGATE_TARGET. When this parameter is set, Oracle tries to ensure that the total amount of PGA memory allocated across all database server processes never exceeds this target.

In earlier versions, DBAs controlled the maximum size of SQL work areas by setting the SORT_AREA_SIZE, HASH_AREA_SIZE, BITMAP_MERGE_AREA_SIZE, and CREATE_BITMAP_AREA_SIZE parameters.
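As a minimal sketch of this feature (the 200 MB target is an assumed, illustrative value, not a recommendation), automatic PGA management could be enabled either in the parameter file or dynamically:

    # init.ora entries (illustrative values)
    workarea_size_policy = auto
    pga_aggregate_target = 200M

    -- or dynamically, from a SQL*Plus session:
    ALTER SYSTEM SET pga_aggregate_target = 200M;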

In a shared server setup, user requests are queued to the server, and user processes share a pool of pre-established connections. When a session requires a connection to the database, a connection is allocated from this pool. Since the connections are taken from a pool of pre-established connections, the time and resources required to create them are saved. The number of sessions allowed is based on the configuration parameter SHARED_SERVER_SESSIONS, defined in the parameter file. In the case of the shared server configuration, the PGA is part of the SGA.

Oracle 9i 

New Feature: SHARED_SERVER_SESSIONS replaces the MTS_SESSIONS parameter that existed in Oracle 8i and earlier.

The shared server option is useful where a large number of users connect to the database almost simultaneously and/or join and leave the connection quite frequently. To illustrate this, consider a large organization, such as an assembly plant of a truck manufacturer. The manufacturer operates in two shifts: the first-shift users start up their computer systems, and the second-shift users turn them off at the end of their shift. When the first-shift users come into work, the first thing they do is connect to the network to access assembly line information from the database. If the server is configured to use dedicated connections, these several thousand simultaneous user session requests (depending on the configuration) could potentially cause a traffic jam in the network system. The connections requested by the users all require memory, time, and resources and could cause network blockage. The blockage occurs when the initial requests have not completed while others are still trying to establish a connection. The end result is a large number of waits encountered by these users, with user session requests ultimately timing out.

In a scenario such as this, where a large number of users are connecting or requesting connection to the system at the same time, the shared server option helps balance the resources. Balance can be achieved because user requests are placed in a queue and a connection is provided from a pool of connections already established. When one user session is making a connection, the request from another user session waits in the queue, and behind this is another user, and so on. If the system regularly has a large number of users waiting in queues, the shared server option can be configured with multiple shared server processes, defined by the SHARED_SERVERS parameter in the initialization file.

Oracle 9i 

New Feature: The init<SID>.ora file used in Oracle 8i and previous versions of Oracle has been replaced by spfile<SID>.ora (the server parameter file). The spfile is a binary file, and changes to the parameters are made online through an active instance. In Oracle 9i most of the parameters used for management of the instance are dynamic and can be changed while the instance is up.

In Oracle 9i the multithreaded server (MTS) option has been renamed to SHARED SERVER.

The SHARED_SERVERS parameter replaces the MTS_SERVERS parameter that existed in prior versions of Oracle (Oracle 8i and before). This parameter specifies the number of shared server processes to be created when an instance is started. Another parameter, MAX_SHARED_SERVERS (called MTS_MAX_SERVERS in prior versions of Oracle), specifies the maximum number of shared server processes allowed to run simultaneously.
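A minimal parameter file sketch for a shared server configuration might look like the following; the values shown are assumptions for illustration only, and the DISPATCHERS parameter is included because shared servers rely on dispatchers to route user requests:

    # Shared server configuration (illustrative values)
    shared_servers         = 5      # shared server processes started with the instance
    max_shared_servers     = 20     # upper limit on concurrently running shared servers
    shared_server_sessions = 300    # total shared server user sessions allowed
    dispatchers            = "(PROTOCOL=TCP)(DISPATCHERS=3)"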

Oracle 10g 

New Feature: With Oracle 10g, Shared Servers do not have to be preconfigured in the parameter file. They can be dynamically configured.

SGA

The system global area (SGA) is the basic memory structure of an Oracle instance and contains data and control information for the Oracle server. The SGA is allocated from the virtual memory of the computer during database startup, based on one or more parameters defined in the parameter file. Unlike the PGA (in a dedicated connection), the SGA contains shared information from the various processes accessing the Oracle server. As discussed in the previous section, in the case of a shared server configuration, the PGA is part of the SGA and contains shared connections.

The SGA is further divided into the shared pool, Java pool, large pool, streams pool, database buffer cache, and redo log buffer. Each of these components, or areas, within the SGA has its own functional responsibility.

Shared pool

The shared pool consists of the data dictionary cache and the library cache. Its function is to keep the most recently used statements and metadata in memory for reuse by other users executing the same statements.

The transaction life cycle, covered in more detail in Chapter 5 (Transaction Management), is an operation comprising multiple phases and covers the entire span of the transaction, from the point at which the user makes a request via a query or data manipulation language (DML) statement through the phase in which the data is retrieved and returned to the user. The initial phase of this operation is the parse phase, the most expensive phase of processing a statement. To keep this overhead low, the database administrator and the developer have to thoroughly validate and tune the SQL statements. It is also advisable to encourage reusability of SQL statements; this can be accomplished by writing queries against a specific table generically enough that they can be shared. Sharing SQL statements helps lessen the utilization of resources.

Oracle computes a hash value for every statement that is parsed by a user and this hash value is stored in the hash bucket when the statement is placed in the library cache. Once the first statement is stored in the library cache, the next time another query is executed, Oracle computes a hash value and then compares it with the one in the bucket. If a match is found, the statement from the library cache is reused and Oracle will not have to reparse the query. (A similar algorithm is also used to find the actual row of data.) However, if a match is not found, Oracle performs the following steps to execute the query:

  1. Parses the statement

  2. Validates the user's authorization to access the tables referenced in the query

  3. Loads metadata information into the dictionary cache

  4. Retrieves the row (if all the validations and authenticity checks were successful)

If any of the validation steps fails, then Oracle cancels the request and returns an error message to the user.

Because the hash value is generated from the contents of the statement, any variation in the statement structure causes a variance in the hash value. A variance in the hash value causes Oracle to maintain another copy of the same statement in memory. (It is important to group all SQL queries used inside the application code to maintain a lower variance between the queries.) Another situation in which a variance is created is when a bind variable is used in the WHERE clause of an SQL query. Bind variables have values that change based on the application or process, and the memory area allocated for one value may not be sufficient for another statement.

Oracle, by default, assigns a 50-byte memory area for the initial bind value in the bind variable. Suppose the bind value of the initial query is a small string value, for example ''SMITH.'' When another user executes the same query with a value longer than 50 bytes, Oracle finds that the existing memory structure cannot hold this new value and allocates another memory structure for this new WHERE clause and bind value. This variation causes SQL query flooding in applications with several queries from various users. Users can see these variations by querying the SQL_TEXT column in the V$SQLAREA data dictionary view. Therefore, tuning or validation of SQL statements during the design and development phase of the development cycle can help to achieve a more efficient system.
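The effect is easy to see with a sketch such as the following, which assumes the familiar SCOTT.EMP demonstration table; statements that differ only in a literal each get their own library cache entry, while a bind variable allows the cursor to be shared:

    -- Each literal produces a different hash value, hence a separate cursor:
    SELECT ename FROM emp WHERE empno = 7369;
    SELECT ename FROM emp WHERE empno = 7499;

    -- Written with a bind variable, the statement is parsed once and shared:
    SELECT ename FROM emp WHERE empno = :empno;

    -- The duplication is visible in the shared SQL area:
    SELECT sql_text, version_count, executions
      FROM v$sqlarea
     WHERE sql_text LIKE 'SELECT ename FROM emp%';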

The variance of SQL queries brings about another complexity. Every query (cursor) is defined in two parts: the cursor header and the cursor body. The cursor header is fixed; its contents do not change, and it remains in the cache until it is aged out. The cursor body contains the bind variables and the execution plan for the statement and can have multiple versions. There are several reasons for having multiple versions of the cursor body, the primary one being heavy use of literals instead of bind variables. Another reason is insufficient length being allocated when the bind variables are defined, for example, in PL/SQL definitions.

Java pool

The Java pool is used to configure memory for session-specific Java code and the Java virtual machine (JVM). Depending on whether the dedicated server option or the shared server option is used, the memory allocated via the Java pool is used differently.

Large pool

The database administrator can configure an optional memory area, called the large pool, to provide large memory allocations for:

  • Session memory for the shared server option of establishing connections to the database and the Oracle XA interface (used where transactions interact with more than one database)

  • I/O server processes

  • Oracle backup and restore operations, when RMAN is used

  • Parallel execution message buffers, if the initialization parameter PARALLEL_AUTOMATIC_TUNING is set to TRUE (otherwise, these buffers are allocated to the shared pool)

Streams Pool

This pool is new in 10g and is used by the Oracle Streams component to store captured events, in addition to its use for internal communication during parallel capture and apply. The pool is part of the SGA, like the other pool areas, and is defined using the parameter STREAMS_POOL_SIZE. If STREAMS_POOL_SIZE is not defined or is set to zero, up to 10% of memory from the shared pool may be used instead.

Database buffer cache

The database buffer cache is the portion of the SGA that holds copies of data blocks read from data files. All user processes concurrently connected to the instance share access to the database buffer cache.

As illustrated in Figure 3.2, the buffers in the cache are organized into two areas, the most recently used (MRU) area and the least recently used (LRU) area. In Version 8i, Oracle introduced a new way of managing the data in the buffer cache based on an algorithm called the touch count algorithm (TCA). Over the years, Oracle has steadily improved the buffer management algorithm to provide better performance. Unlike previous versions of buffer management, under the TCA the determination of where the row will reside (the MRU area or the MRU end of the LRU chain) is based on the number of times the buffer is touched (the touch count). Prior to Version 8i, data retrieved was placed in the MRU area of the buffer. Based on the number of times it was used, the data was retained at the top of the MRU area. However, if the data was not used, it was moved to the LRU end of the chain before being marked as dirty and eventually getting flushed out of the cache.

Figure 3.2: Database buffer cache (shared pool).

Under the TCA, the buffer cache is equally divided into two basic areas, the hot buffer area and the cold buffer area. Between the hot buffers and the cold buffers is a midpoint, called the insertion point, or buffer midpoint. When a block is read from disk and placed into the buffer cache, the buffer is placed at this midpoint between the hot and cold buffers with an initial touch count value of one. During this process, Oracle looks for a free buffer in which to place the block read from disk. If it finds that a block's touch count is greater than two, the block is moved to the LRU end of the MRU chain. (This value of two is the default value used to determine the initial and future placement of blocks; it is controlled by a hidden parameter.[1]) When a buffer is moved to the MRU end of the LRU chain (i.e., closer to the insertion point), its touch count is usually set to zero. Consequently, if the touch count is not incremented quickly enough, the buffer could get pushed back to the LRU area of the buffer cache.

Figure 3.2 represents the database buffer cache and how the data is moved down the buffer space to the LRU end of the buffer. This is based on the data being reused by other processes as well as on the modifications finally written to their respective data files.

How does a buffer move from the MRU end of the LRU chain beyond the midpoint? When a buffer has reached a touch count high enough to enable movement from the cold buffer region to the hot buffer region, crossing the midpoint, another buffer from the hot region has to make room for this new hot buffer. When this happens, a hot buffer that has, for example, a touch count value of two is reset to a touch count value of one as it is moved to the cold region. Now, if that buffer is not frequently touched and the value not incremented, it will potentially be aged out of the cache.

As discussed earlier, based on how many times the buffer is touched, the buffer earns its way to the top of the buffer chain. However, the increase in touch count and the movement of buffers do not happen at the same time; they are independent of each other. Additionally, incrementing a touch count does not by itself move the buffer up the chain. By default, Oracle only allows a buffer's touch count to be increased once every 3 seconds. Modifying some of the hidden parameters can change this default value of 3 seconds. This process of pushing rows down the LRU continues until a buffer reaches the edge of the structure and finally gets pushed off; at this point, any changed data is written to disk. The modified buffers at the end of the chain are called dirty buffers.
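For readers who want to observe the touch count directly, the internal fixed table x$bh exposes a TCH column. This is an undocumented interface, queryable only as SYS; the column names used below are an assumption and can vary between releases, so treat the sketch as exploratory rather than supported:

    -- Run as SYS; x$bh is an internal fixed table and its layout may change
    SELECT obj, dbarfil, dbablk, tch     -- object id, relative file, block, touch count
      FROM x$bh
     WHERE tch > 2                       -- buffers touched more than the default threshold
     ORDER BY tch DESC;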

The database buffer cache contains the data buffers that keep the latest data changes in memory before writing them to disk. As mentioned previously, the changes are kept in memory and pushed down the buffer chain until they reach the end, when the data is written to disk. This component helps the Oracle engine defer the write operation to disk, reducing I/O and diverting the available resources to other parts of the system where they can be used more efficiently. The data buffers contain data blocks, index blocks, rollback segment blocks, and temporary segment blocks. The database buffer cache contains a buffer for the data and a buffer for the rollback block. The data in the database buffer cache is written to the data files by the database writer process (DBWR).

For efficient functioning of the database, the database buffer cache should be sized correctly. The size of the database buffer cache is defined by the DB_CACHE_SIZE parameter in the parameter file. The size of each buffer is based on the DB_BLOCK_SIZE parameter, which is also defined in the parameter file; DB_BLOCK_SIZE specifies the size (in bytes) of Oracle database blocks.

Oracle 9i 

New Feature: The DB_CACHE_SIZE parameter replaces the DB_BLOCK_BUFFERS parameter that existed in prior versions of Oracle. Whereas DB_BLOCK_BUFFERS specified the number of database buffers, DB_CACHE_SIZE specifies the size of the default buffer cache in bytes.

Oracle 9i supports multiple block-sized databases; the block size can now be set at the tablespace definition level, thus allowing tablespaces with different block sizes to be created. In addition, up to five non-standard block sizes may be specified. The data block sizes should be a multiple of the operating system's block size, within the maximum limit to avoid unnecessary I/O. Oracle data blocks are the smallest units of storage that Oracle can use or allocate. However, a default DB_BLOCK_SIZE is defined in the parameter file that is used for primary DB_CACHE_SIZE calculation. Additional subcaches could be defined using each of the additional block sizes. This allows for creation of databases supporting multiple purposes, such as an OLTP system and a data warehouse.
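A hedged sketch of a non-standard block size follows; the subcache size, tablespace name, and data file path are assumptions made for illustration, and the non-standard size must differ from the database's standard block size:

    -- Configure a 16K subcache alongside the standard block size cache
    ALTER SYSTEM SET db_16k_cache_size = 32M;

    -- Create a tablespace that uses the non-standard 16K block size
    CREATE TABLESPACE dwh_data
      DATAFILE '/u02/oradata/prod/dwh_data01.dbf' SIZE 500M
      BLOCKSIZE 16K;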

When tuning the buffer cache area, note that while a 95-99% hit ratio is desirable, it does not indicate that the database is performing at its peak. This percentage value purely means that 95-99% of the time the system found a matching buffer in the cache. It does not indicate that all buffer requests are met in a sufficiently good time frame. For example, when a user executes a query to retrieve 100,000 rows from a table, a 2- or 3-table join could show a 95% or higher hit ratio (depending on the size of the SGA), despite the fact that the response time for the query to return the result set may not be satisfactory. The best method for determining the reasons is to look at Oracle's wait interface by querying the data dictionary views provided by Oracle, such as V$SESSION_EVENT, V$SESSION_WAIT, and V$SYSTEM_EVENT.
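As a brief sketch of the wait interface (the session identifier 42 is an arbitrary example), these views can be queried as follows; in these views the times are reported in hundredths of a second:

    -- System-wide wait events, worst first
    SELECT event, total_waits, time_waited, average_wait
      FROM v$system_event
     ORDER BY time_waited DESC;

    -- Waits for a single session, identified by its SID
    SELECT event, total_waits, time_waited
      FROM v$session_event
     WHERE sid = 42
     ORDER BY time_waited DESC;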

A more detailed discussion of these views, and other performance-related views, is given later in Chapter 13 (Performance Tuning).

Redo log buffers

The redo log buffer keeps all changes, or redo entries, for the data being modified. In the case of a system failure, these entries allow the changes to be regenerated for recovery purposes. This circular buffer records changes to the database along with whether the changes have been committed. The changes are made by end users issuing the various DML operations (e.g., INSERT, UPDATE, or DELETE). When the Oracle server moves data from the user's memory space, it writes to the redo log buffer. Along with the after image and before image, the redo log buffer also contains a flag that indicates whether the data has been committed. The log writer (LGWR) writes the data in the redo log buffer to the redo log files. The size of the buffer is specified by the parameter LOG_BUFFER; this value sizes only the in-memory buffer, while the redo log files themselves are sized independently when they are created. Redo log files are explained later in this section.

The redo entry records the block that has changed, the location of the change, and the new value. The contents of the redo log buffer reflect the state of both the old and the new data. Oracle maintains a flag that indicates whether the data has been committed. Once the user has issued the commit statement, the flag is updated and the data is written to the redo log file on the disk.

It is the generation of redo, and the subsequent writing of that redo to the log file, that makes the change real. Once an entry is successfully written to the log file, the change can be considered to have occurred, regardless of what happens to the database system afterward.

The redo log buffer is used sequentially; that is, all changes made to the database are recorded sequentially in the redo log buffer. Hence, the changes made by one transaction may intertwine with changes made by other transactions. Only after the redo record has been built in the redo buffer can the data block buffers be changed.

When the redo log buffer is one-third full, the data from the redo log buffers is flushed into the redo log files.

Other circumstances in which the redo buffer information is written to disk include the following (a brief configuration sketch follows the list):

  • When the user issues a commit statement

  • When an LGWR process timeout occurs (which defaults to every 3 seconds)

  • Before the DBWR process writes the modified blocks to the data files
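The sketch below shows how the buffer is sized and how its activity can be observed; the LOG_BUFFER value is illustrative only, and the statistic names are queried from V$SYSSTAT:

    # Parameter file entry sizing the redo log buffer (illustrative value)
    log_buffer = 1048576        # 1 MB

    -- Redo activity and buffer pressure are visible in V$SYSSTAT; a steadily growing
    -- 'redo buffer allocation retries' value suggests the buffer is too small
    SELECT name, value
      FROM v$sysstat
     WHERE name IN ('redo size', 'redo entries', 'redo buffer allocation retries');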

Oracle 9i 

New Feature: In Oracle 9i, the SGA can be dynamically managed while the instance is running. The values of the buffer cache, shared pool, and large pool can be changed dynamically. However, to protect against uncontrolled growth of these areas, Oracle has introduced a new parameter, SGA_MAX_SIZE, which defines a maximum overall size within which these values may be changed.

The size of the SGA is determined by several parameters, illustrated in the sketch following this list:

  • DB_CACHE_SIZE

  • LOG_BUFFER

  • SHARED_POOL_SIZE

  • LARGE_POOL_SIZE
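A minimal parameter file sketch follows; all sizes are assumptions chosen only to show the relationships between the parameters:

    # SGA sizing (illustrative values only)
    sga_max_size     = 600M      # upper bound for dynamic growth of the SGA
    db_cache_size    = 256M      # default database buffer cache
    shared_pool_size = 128M      # library cache and dictionary cache
    large_pool_size  = 32M       # RMAN, shared server session memory, parallel execution
    log_buffer       = 1048576   # redo log buffer, in bytes

    -- Individual components can then be resized while the instance is running,
    -- provided the total stays within SGA_MAX_SIZE:
    ALTER SYSTEM SET db_cache_size = 320M;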

3.2.2 Files

Figure 3.3 provides an overview of the various types of files that are part of the Oracle architecture. Other files, such as trace files and archive log files, are also part of the architecture.

Figure 3.3: Files.

Control file

The control file contains the information needed to maintain and verify database integrity as well as information related to the database, the location of data files, tracking of the system change number (SCN), location of archive log information, etc. Since most of the critical information relating to the status of the database is stored in the control file, it is used during the recovery process. The existence of the control file is important for database startup. For this reason, the control file is sometimes considered the backbone of the database. A control file contains information such as:

  • Database name

  • Timestamp of database creation

  • Names and locations of associated data files and online redo log files

  • Tablespace information

  • Data file offline ranges

  • Log history

  • Archived log information

  • Backup set and backup piece information

  • Backup data file and redo log information

  • Data file copy information

  • Current log sequence number

  • Checkpoint information

Every time an instance of an Oracle database is started, the control file identifies the database and redo log files that must be opened for database operation to proceed. If the physical makeup of the database is altered (e.g., if a new data file or redo log file is created), then the control file is automatically modified by Oracle to reflect the change. A control file is also used in database recovery.

Using the human spinal cord as an analogy: the spinal cord is important to the human body because, without it, the body would take a different form and might have to move like a caterpillar. Similarly, the control file is important to the system because if a control file is corrupted and no backup is available to re-create it, the database may be left in an unrecoverable state. Due to its importance, it is normal practice to keep a minimum of three copies of the control file, and also to keep backup copies of the control file in text format by performing a backup to trace.
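A short sketch of these practices follows; the trace backup command is standard, and the binary backup path shown is hypothetical:

    -- Write a text (SQL script) backup of the control file to the trace directory
    ALTER DATABASE BACKUP CONTROLFILE TO TRACE;

    -- A binary copy can also be taken (the target path is hypothetical)
    ALTER DATABASE BACKUP CONTROLFILE TO '/u04/backup/prod_control.bkp';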

Table 3.1 illustrates the differences in the control file layout between Oracle 8i and 9i. The layout has changed considerably, with Oracle storing more database-related information in the control file. This additional information indicates that the control file sees more activity compared with previous versions, and its importance also increases because more critical database information is now stored in it.

Table 3.1: Control File Layout

Oracle 8i                           Oracle 9i
---------                           ---------
Generic file header                 Generic file header
Database entry                      Database entry
Checkpoint records                  Checkpoint progress records
Redo thread records                 Extended database entry
Log file records                    Redo thread records
Data file records                   Log file records
Temp file records                   Data file records
Tablespace records                  Temp file records
Log file history records            Tablespace records
Offline range records               RMAN configuration records
Archived log records                Log file history records
RMAN information                    Offline range records
                                    Archived log records
                                    Backup set records
                                    Backup piece records
                                    Backup data file records
                                    Backup log records
                                    Data file copy records
                                    Backup data file corruption records
                                    Data file copy corruption records
                                    Deletion records
                                    Proxy copy records
                                    Incarnation records

Redo log files

The redo log files contain the record of all transactions made against the database. These records, or logs of transactions, are used by Oracle for recovery in case of a database crash. As redo logs are critical to maintaining data integrity, Oracle writes to these files on a continuous basis. Based on certain conditions, Oracle writes to the database to make the changes permanent; this write operation is performed in batch mode and helps improve overall performance of the database. To allow continuous writing to the redo log files and, at the same time, writing of the existing data to the database, Oracle's architecture requires a minimum of two redo log files. The redo logs are used in a round-robin, or cyclical, manner, switching between the files when a file is full. Having just two files for rotation could be dangerous, especially because the redo log files are overwritten: if the disk containing one redo log file fails, the data in that log file is lost, and with it the ability to recover changes made during the period covered by that log file. In other words, there is a single point of failure.

To protect the redo log file from a single point of failure, Oracle recommends the creation of redo log groups. Each group should contain two or more redo log files. In this situation, all files belonging to the group are written to simultaneously.

Figure 3.4 represents a configuration with three redo log groups containing two redo log files each. Data is written simultaneously to both files that belong to a specific group. When the files in Group 1 become full, data is written to files in Group 2, and so on. When data in Group 3 is full, data is written to files in Group 1 and again to files in Group 2, in a cyclical manner between the various groups. When a group is full and data starts getting written to the next group, the existing contents that are contained in the redo log files in the redo log group are overwritten.

Figure 3.4: Redo log groups.

When performing the log switch, the next log to be used must not contain changes that would be required in the event of instance recovery. Oracle ensures that the current thread checkpoint is beyond that log file; in other words, the thread checkpoint SCN must be beyond the highest SCN allocated in the log to be switched.

Oracle recommends that the files belonging to each group be placed on different disks. The advantage of this placement is that when one disk containing one of the redo log files fails, Oracle continues writing to the other redo log file that is part of the same redo log group. This protects the database from a single point of failure in the redo logs and helps to improve performance. If only one file per group were set up, then the database would hang or crash when the disk containing that redo log file was not available for the write operation. On the other hand, in a group containing two files, when one file is not available, Oracle continues its operation by writing to the other available file in the same group. Before the files in Group 1 are overwritten, the files are copied to archive media as archive redo logs, if archiving is enabled. The message ''checkpoint not complete'' in the alert log indicates that LGWR is waiting to switch into an active log file. Under these circumstances, increasing the number of log files, the size of the log files, or the frequency of checkpointing will help alleviate the issue.

Redo log files are a core component of the Oracle file system; consequently, it is important that considerable effort is given to locate them on high-performance disks for throughput.

While configuring the redo log files it is important that all files in a group, and all groups, are created with the same size. This aids in easy maintenance and consistency in the amount of information stored.
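A hedged sketch of adding a group with two equally sized members on separate disks follows; the group number, file paths, and size are assumptions:

    -- Add a redo log group with two members of the same size on different disks
    ALTER DATABASE ADD LOGFILE GROUP 4
      ('/u01/oradata/prod/redo04a.log',
       '/u02/oradata/prod/redo04b.log') SIZE 100M;

    -- Add a second member to an existing group
    ALTER DATABASE ADD LOGFILE MEMBER '/u02/oradata/prod/redo01b.log' TO GROUP 1;

    -- Verify the configuration
    SELECT group#, members, bytes, status FROM v$log;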

Redo log files are instance-specific because they contain information pertaining to the redo log buffers, which are part of the SGA. Because of this, every instance in a RAC configuration will have its own copy of the redo log files.

Archive redo log files

Archive redo log files, or archive log files, as mentioned above, are offline copies of redo log files that may be necessary to recover the database from media failures. An archive log file is a copy of the redo log files made before being overwritten. For example, when a redo log file becomes full, the redo information starts getting written to the next set of redo log files. (The archive log file is only created if the archive log mode is enabled in the parameter file.) Redo log files help in instance recovery. Archive log files provide for recovery of data not backed up to external media.

For example, when a database needs to be restored to a specific point in time, the following occurs:

  1. The latest backup is restored from the backup media.

  2. All the archive log files from the point that the backup was completed are applied one after the other up to the last archive log file.

In everyday operations, it is quite common to have to restore the database and recover lost data. To protect the archive log files from any unrecoverable circumstances, it is important to create multiple copies of the archive log files at different locations, to avoid a single point of failure. When one archive log file is corrupt, or is lost due to disk failure, the data from an available copy of the archive log file is used for the restore operation. Apart from the situations explained above, archive log files are required for certain backup operations using RMAN; backup and recovery operations using RMAN are explained later in Chapter 12 (Backup and Recovery).

In Figure 3.5, when the redo log files are switched from Group 1 to Group 2, a copy of the redo log files from Group 1 is copied to the archive redo log files. When the files in the next group are filled, the next group (Group 2) is active and the data from Group 2 is written to the archive log directories. Unlike the redo log files, where the files are overwritten in a cyclical manner, the archive redo log files are not. New archive redo log files are created during every redo log switch.

Figure 3.5: Archive redo log groups.

Since the archive logs are copies of the redo log files, they inherit the characteristics (such as the file size) of the redo log files. However, care should be taken in placement of these files. For example, archive log files are critical to the operation of the database and, because of this, they should not be placed on the same disk as the redo log files.

With RAC on systems where the data files must be on a raw partition, the redo logs could be placed on either a file system or on a raw partition, whereas the archive logs must be placed on a file system. Archive redo logs are instance-specific; therefore, every instance will have its own copy of the archive redo log files.

Parameter files

Parameter files are used to define the characteristics of an Oracle instance. With every release, certain parameters become obsolete and new ones are added to improve the efficiency and to introduce new features. The parameter file also contains pointers to important files that Oracle needs to be aware of during startup and shutdown of the database.

When an instance is started, the parameter file is identified. The parameter file contains the location of the control file, and the control file contains the location of all other data files, tablespaces, and system files. The control file also maintains the current state of the database. The parameter file contains details of the destination of the archived log files and certain format parameters for the archive log files. Among other important parameters that are defined in the parameter file is the database name used to identify the database.
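A minimal parameter file sketch follows; every name, path, and value shown is hypothetical and intended only to show the kind of information the file carries:

    # Minimal init.ora sketch (all names and paths are hypothetical)
    db_name            = prod
    db_block_size      = 8192
    control_files      = ('/u01/oradata/prod/control01.ctl',
                          '/u02/oradata/prod/control02.ctl',
                          '/u03/oradata/prod/control03.ctl')
    log_archive_dest_1 = 'LOCATION=/u04/arch/prod'
    log_archive_format = 'prod_%t_%s.arc'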

Password file

The password file contains information relevant to database authentication. This file is stored in binary format, like most other Oracle files. The file is required to be present during database startup; however, the file can only be updated when the database and instance are up. The password file contains information for remote authentication of users. When a user, through an SQL*Plus client or the Oracle Enterprise Manager[2] (OEM), requires access to the database to perform certain privileged operations, the user is required to have an entry in this file. Privileges such as SYSOPER and SYSDBA require the user name to be present in the password file; the file stores the user and the special privileges that he or she has. Like the parameter file discussed above, this file needs to be present for database startup. However, no entries are required for users who connect locally and are authenticated by the operating system; entries are needed when privileged access from a remote location is required. The password file is another file that the control file has no knowledge of, but its inclusion in the regular backup strategy of the database is crucial.
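The sketch below shows the usual life cycle of the file; the file name, password, and user are hypothetical, and the exact file naming convention is platform dependent. The file is created with the orapwd operating-system utility:

    orapwd file=$ORACLE_HOME/dbs/orapwPROD password=secret entries=10

Privileged users are then granted the role from SQL*Plus and can be listed from the V$PWFILE_USERS view (REMOTE_LOGIN_PASSWORDFILE must be set to EXCLUSIVE in the parameter file for such grants):

    GRANT SYSDBA TO scott;
    SELECT username, sysdba, sysoper FROM v$pwfile_users;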

Alert files

Alert files (also called alert log files) contain messages generated by the Oracle kernel to report any events or changes that happen to the database from time to time. Some of the entries found in alert files include:

  • Database startup and shutdown information

  • Information regarding the starting and completion of archive operations by the archive processes

  • Any errors encountered by the background processes

  • Any errors, with error numbers, encountered by the Oracle kernel

Unlike the other files mentioned above, this file contains messages that help the database and system administrators monitor and diagnose any problems that the database or instance has encountered. Upon observing these errors, the database administrator can take measures to resolve them. This file is written to on a continuous basis and consequently grows to a considerably large size; however, it is normally not included in the backup strategy. To avoid having to maintain a large file, it is advisable to copy and delete this file periodically and let the database create a new one.

3.2.3 Background processes

Figure 3.6 displays the various background processes used by Oracle. Each background process performs a specific function. In doing so, interaction takes place between the logical storage area of the database and one or more physical storage areas of the database, and these background processes interact with the SGA. For example, the DBWR process writes data from the database buffer cache in memory to disk, while the LGWR process writes information from the redo log buffers to disk. Other background processes are the process monitor (PMON), responsible for cleaning up unused resources, and the system monitor (SMON), responsible for cleaning up unused temporary segments.

Figure 3.6: Oracle background processes.

An Oracle instance comprises background and foreground processes that are required to service requests from various users. Every instance has several background processes. The background processes essential for the existence of an instance are:

  • System monitor (SMON)

  • Database writer (DBWn)

  • Process monitor (PMON)

  • Log writer (LGWR)

  • Checkpoint process (CKPT)

These background processes interact as a bridge between the memory (or SGA) of the database and the various permanent storage media (or files). The SGA is an area of memory predefined during database startup to store and share user or process related information.

Process monitor

The process monitor, identified by PMON, is responsible for cleaning up resources when one of the processes fails. Typically, users interrupt their existing process by aborting a session, for example by exiting an SQL*Plus client session without a commit or rollback operation, or by shutting down the window before a query completes. These situations can leave a process on the server machine hanging without a client connection. PMON takes responsibility for cleaning up such processes by rolling back uncommitted transactions and freeing up resources for other processes to use. PMON also releases all locks held by the process and restarts processes that have a failed status during this recovery operation.

System monitor

The system monitor, identified by SMON, initiates recovery of the database when a failed instance is started again. SMON is responsible for cleaning up temporary segments that are no longer required, for coalescing free extents in data files to create larger extents, and for recovering terminated transactions skipped during recovery; these transactions are eventually recovered by SMON when the tablespace is brought back online. SMON also coalesces free extents in dictionary-managed tablespaces to make free space contiguous and easier to allocate. Most of the activities performed by SMON occur during periods of low activity.

SMON works in a cyclical fashion: it performs its recovery operation in multiple passes over the transactions that require recovery, applying only a certain number of undo records per transaction during each pass.

SMON is also responsible for the recovery of transactions that have not been committed when the database is started after a database crash or a shutdown abort operation.

With RAC, the SMON process of one instance can perform instance recovery for other instances that have failed.

Database writer

The database writer, identified by DBWR, is responsible for writing changed data to the database. In the section on the database buffer cache, we discussed the movement of data to the bottom of the LRU stack when its touch count was low and data with a higher touch count required space to move up. The modified buffers at the end of the buffer cache are called dirty buffers. The DBWR writes the dirty buffers to disk. By writing these dirty buffers from the buffer cache to the data files, it ensures that sufficient free buffers are available in the database buffer cache. The free buffers allow new data to be stored in the buffer cache, while removing old data no longer used or required. This writing of dirty buffers from the database buffer cache to disk happens only if space is needed to store new data in the buffer.

This behavior of the DBWR, writing information from the buffer cache only when needed, provides performance benefits: I/O activity on the database is reduced because Oracle does not write committed rows immediately but instead batches them together.

The actual writing of changed data to disk is done at a later time by the DBWR process. This happens when any of the following events occur:

  • The number of dirty buffers reaches a threshold value: This value is reached when there is not sufficient room in the database buffer cache for more data. In this situation, Oracle writes the dirty buffers to disk, freeing up space for new data.

  • A process is unable to find free buffers in the database buffer cache while scanning for blocks: When a process reads data from the disk and does not find any free space in the buffer, this triggers the least recently used data in the buffer cache to be pushed down the stack and finally written to disk.

  • A timeout occurs: The timeout is configured by setting the required timeout interval (LOG_CHECKPOINT_TIMEOUT) through a parameter defined in the parameter file. At every preset interval, the timeout is triggered, causing the DBWR to write the dirty buffers to disk. This mechanism is used by Oracle for transaction recoverability in case of failure. In a relatively idle system, where data is occasionally modified but not written to disk (because there is not enough activity to trigger the other write mechanisms), the timeout is helpful.

  • The checkpoint process: During a predefined interval defined by the LOG_CHECKPOINT_INTERVAL parameter, when the CKPT process is triggered, it causes the DBWR and the LGWR process to write the data from their respective buffer cache to disk.

Configuring multiple database writer processes or using I/O slaves is useful, especially in systems where there is a high transaction rate.

The DBWR process is defined using the DB_WRITER_PROCESSES parameter in the parameter file. Using this parameter, multiple DBWn processes (from DBW0 to DBW9 and from DBWa to DBWj) could be configured. Multiple DBWn processes distribute the work required to identify buffers to be written and also distribute the I/O load over these processes. Selection of the number of DBWn processes required for any system should not be arbitrary, but should be based on analysis of the volume of transactions and the number of buffers required to be scanned before writing to disk. It is also dependent on the number of CPUs that the system currently has; the rule of thumb is to configure one DBWn for every eight CPUs.

Another parameter that supports the DBWn process is the DBWR_IO_SLAVES; configuring this parameter allows I/O load to be distributed over multiple slave processes. I/O slave processes are used when the system has only one CPU available and there is high transactional activity causing slowdown in database writes; under other circumstances, multiple DBWR processes should be preferred over the I/O slaves.
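A brief parameter sketch follows; the values are assumptions, and the two approaches are shown as alternatives rather than something to combine:

    # Multiple database writers, preferred on multi-CPU servers
    # (illustrative value: roughly one DBWn per eight CPUs)
    db_writer_processes = 4

    # Alternatively, on a single-CPU server with heavy write activity, I/O slaves
    # dbwr_io_slaves = 4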

Checkpoint process

The checkpoint process, identified by CKPT, is responsible for updating the database status information whenever changes in the buffer cache are permanently recorded in the database. All of the data, regardless of whether it has been committed or not, is written to the data files. During this event, all the headers of these data files are updated by a checkpoint. The process of writing this checkpoint information into the header ensures that the data files are synchronized. At specific intervals, all modified database buffers in the SGA are written to the data files. This event is called a checkpoint. Though the DBWR processes perform the actual writing of data to the data files, the checkpoint process is responsible for signaling the DBWR processes.

The checkpoint process is activated in the following circumstances:

  • The redo log switch occurs: The redo log switch occurs under various circumstances described in the LGWR process (below). When a specific redo log file is full, the CKPT processes are activated before the process starts writing to the next available redo log file.

  • The LOG_CHECKPOINT_INTERVAL parameter is set: The LOG_CHECKPOINT_INTERVAL parameter is defined in the parameter file. Giving this parameter an interval value causes the CKPT process to be activated at the specified interval. This value should be set high, typically so that checkpoints are not forced more often than redo log switches. Setting it to a small value causes frequent activation of the CKPT process and creates a bottleneck for database operation.

  • The beginning of a tablespace backup: In the case of a tablespace-level backup, when the system is active and in use, the CKPT process is activated at the beginning of the tablespace backup. The tablespace-level backup mentioned in this section is also referred to as a hot backup; hot backups are taken while the system is alive and in use. Because the data files remain active during this process, the CKPT request ensures that all outstanding changes to the files are completed, producing a cleaner backup copy.

  • The tablespace is taken offline: When a specific tablespace is taken offline, either manually or through an automated process, the CKPT process is activated to write the current tablespace information to the header of the tablespace and to the control file.

  • The instance is shut down through a normal shutdown process: When the system is shut down through a normal shutdown process, Oracle RDBMS performs the following:

    1. Activates CKPT processes

    2. Activates the DBWR process in order to update all header files with the latest information

    3. Activates the LGWR processes in order to flush all buffer information to disk

This writing of information to disk ensures synchronization of the files, so that no recovery is required during the subsequent database startup operation.

In cases where the database is shut down by an abort or crash operation this information is not written. Therefore, when the database is started again, it goes into a recovery mode.
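The checkpointing behavior described above is driven by a few parameters, and a checkpoint can also be requested manually; the values below are illustrative only:

    # Checkpoint-related parameters (illustrative values)
    log_checkpoint_interval = 100000   # redo blocks written before a checkpoint is forced
    log_checkpoint_timeout  = 1800     # maximum seconds between checkpoints

    -- A checkpoint can also be requested on demand:
    ALTER SYSTEM CHECKPOINT;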

Log writer

The log writer process, identified by LGWR, is responsible for recording changes registered in the redo log buffer into the redo log files. This is a sequential process and is triggered by the following situations:

  • The redo log buffer is one-third full: When the redo log buffer is continuously written to and it reaches one-third of its buffer capacity, it triggers the LGWR processes to write the information to disk. This mechanism is to allow for new redo information to be saved in the redo log buffer.

  • A timeout occurs (every 3 seconds): A default timeout interval is defined at the database level, occurring every 3 seconds. During this timeout, the redo log buffer information is written to disk. If the redo logs are configured with two files per group, then both files are written to at the same time. If one of the files is not available, due to a bad disk or being taken offline, then the LGWR writes to the available redo log file and continues its operation. However, if none of the redo log files belonging to the group is available, then the database hangs. A hang can also occur when the database is configured in ARCHIVELOG mode and the redo log file that is about to be reused has not yet been copied (archived).

  • The DBWR process writes the changed blocks from the database buffer cache to the data files: When the database buffer cache is being flushed to disk it also triggers the LGWR process to write the redo buffer information to the redo log files. The DBWR process writes the dirty buffers to disk when the MRU area is required for new data. During the operation of writing the dirty buffers from the database buffer cache, the LGWR process is triggered.

  • A user or process commits the transaction: When the user process issues a commit to the transaction, the LGWR writes the redo log buffer to disk. During the commit processes, all redo log buffer information currently in the redo log buffer is flushed to disk. This writing of redo log entries to disk is synchronized for multiple committed transactions, saving I/O and other system resources, after the first transaction's entries are written to the redo log files. When the user commits a transaction, a system change number (or SCN) is assigned to the transaction that is recorded with the transaction's redo entries in the redo log.

Archive process

The archive process, identified by ARCn, is responsible for writing the redo log information to archive log files before allowing the redo log files to be reused. The previous section discussed the various types of files used by Oracle and how archive log files are created during redo log file switch activity; the ARCn background process performs this copy of the redo log to the archive log area.

If the archiving process does not complete before the next redo log file is full and the switch occurs, the LGWR process could stall, as the log files are not available for the write operation. This can happen when the log file size is too large, causing the archiver to require a significant amount of time to write the information to its destination. Another reason for this delay could be slowness of the disk used to store the archive log files, which cannot write at the speed at which the data is being read from the redo log file.

The ARCn process is optional and is created when the database is running in ARCHIVELOG mode. The database is placed in ARCHIVELOG mode by issuing an ALTER DATABASE ARCHIVELOG command, and automatic archiving is enabled by setting the LOG_ARCHIVE_START parameter to TRUE in the parameter file.
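A hedged sketch of enabling archiving follows; the destination path is hypothetical:

    # Automatic archiving and destination in the parameter file (path is hypothetical)
    log_archive_start  = TRUE
    log_archive_dest_1 = 'LOCATION=/u04/arch/prod'

    -- Place the database in ARCHIVELOG mode (the database must be mounted, not open)
    STARTUP MOUNT;
    ALTER DATABASE ARCHIVELOG;
    ALTER DATABASE OPEN;

    -- Confirm the archiving configuration
    ARCHIVE LOG LIST;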

An instance can have up to 10 ARCn processes (ARC0 to ARC9). The LGWR process starts a new ARCn process whenever the current number of ARCn processes is insufficient to handle the workload. The starting and stopping of the various archive operations and starting of additional ARCn processes is recorded in the alert file.

Snapshot process

The snapshot process, identified by SNPn, is responsible for refreshing snapshots. The number of SNPn processes is defined by setting the JOB_QUEUE_PROCESSES parameter in the parameter file. The maximum value for this parameter is 36, meaning that up to 36 processes (SNP0 through SNP9 and SNPA through SNPZ) could be started to automatically refresh table snapshots. The processes are automatically activated at a predefined interval to refresh any snapshots that are scheduled to be refreshed automatically. With multiple SNPn processes, the workload is shared among them.

Lock manager server process

The lock manager server process, identified by LMS, is found in a RAC environment and helps the instances communicate with one another and access shared resources in the database.

Dispatcher process

The dispatcher process, identified by Dnnn, is present in a shared server configuration. There is a dispatcher process for every communication protocol and it is responsible for routing requests from users to available shared server processes and returning results back to the appropriate users.

Recovery process

The recovery process, identified by RECO, is responsible for resolving failures when the database is running in a distributed environment. RECO automatically connects to the other databases involved in an in-doubt distributed transaction and automatically resolves the transaction. During this process it removes, from each database's pending transaction table, any rows that correspond to the resolved in-doubt transactions.

Queue monitor

The queue monitor process, identified by QMNn, is an optional process that is used with the Advanced Queuing option. The function of the QMNn process is to monitor the message queues. Up to 10 QMNn processes (QMN0 to QMN9) can be configured to manage the message queues.

File mapping monitor

The file mapping monitor process, identified by FMON, is a background process started by Oracle when the FILE_MAPPING initialization parameter is set to TRUE. The functionality of this process is to help monitor I/O by mapping files to intermediate storage layers and physical devices. FMON is responsible for:

  1. Building mapping information, which is stored in the SGA. The information is composed of the following structures:

    1. Files

    2. File system extents

    3. Elements

    4. Subelements

  2. Refreshing mapping information when a change occurs because of:

    1. Change to size of Oracle data files

    2. Addition or deletion of data files

    3. Changes to the storage configuration

  3. Saving mapping information in the data dictionary to maintain a view of the information that is persistent across startup and shutdown operations.

  4. Restoring mapping information into the SGA at instance startup. This avoids the need for a potentially expensive complete rebuild of the mapping information on every instance startup.

FMON spawns an external non-Oracle process called FMPUTL. The functionality of this process is to communicate with the vendor-supplied mapping libraries. Based on the availability of the mapping libraries, it obtains information from all levels of the I/O stack.
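A minimal sketch of turning the feature on follows; querying the resulting mapping information is done through the DBMS_STORAGE_MAP package and the V$MAP_* views, whose exact arguments and columns vary by release and are not shown here:

    # Parameter file entry
    file_mapping = TRUE

    -- The parameter is dynamic and can also be enabled on a running instance:
    ALTER SYSTEM SET file_mapping = TRUE;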

Oracle 9iR2 

New Feature: The FMON feature is new in Oracle 9i Release 2. Prior to this version there existed no such process or functionality to provide mapping of external files to Oracle.

Data guard

The Data Guard broker monitor process, identified by DMON, is a background process started by Oracle when the DG_BROKER_START initialization parameter is set to TRUE.

[1]Hidden parameters are not visible to users via the V$PARAMETER view or the SHOW PARAMETERS command. The hidden parameters are used by Oracle to control certain behavior. Hidden parameters are not guaranteed to be available in a specific future release and hence should be used with caution.

[2]Oracle Enterprise Manager is a database management tool available from Oracle Corporation. More discussion on the usage of this tool is forthcoming in Chapter 13 (Performance Tuning: Tools and Utilities).


