Isolation Levels | DB2 9 Fundamentals: Certification Study Guide

In Chapter 5, "Working with DB2 Data Using SQL and XQuery," we saw that a transaction (otherwise known as a unit of work) is a recoverable sequence of one or more SQL operations grouped together as a single unit, usually within an application process. The initiation and termination of a single transaction defines points of data consistency within a database-either the effects of all SQL operations performed within a transaction are applied to the database and made permanent (committed) or the effects of all SQL operations performed are completely "undone" and thrown away (rolled back).

In single-user, single-application environments, each transaction runs serially and does not have to contend with interference from other transactions. However in multi-user environments, transactions can execute simultaneously, and each transaction has the potential to interfere with any other transaction that has been started but not yet terminated. Transactions that have the potential of interfering with one another are said to be interleaved, or parallel, whereas transactions that run isolated from each other are said to be serializable, which means that the results of running them simultaneously will be no different from the results of running them one right after another (serially). Ideally, every transaction should be serializable.

Why is it important that transactions be serializable? Suppose a salesperson is entering orders into a database system at the same time an accountant is using the system to generate bills. Now, suppose the salesperson enters an order for Company X to get a price quote but does not commit the entry. While the salesperson is relaying the price quote information to an individual from Company X, the accountant queries the database for a list of all unpaid orders, sees an unpaid order for Company X, and generates a bill. Now, suppose the individual from Company X decides not to place the order because the quoted price is higher than anticipated. The salesperson rolls back the transaction because no order was placed, and the order information used to produce the price quote is removed from the database. However, a week later, Company X receives a bill for an order it never placed. If the salesperson's transaction and the accountant's transaction had been isolated from each other (serialized), this situation wouldn't have occurred-either the salesperson's transaction would have finished before the accountant's transaction started or the accountant's transaction would have finished before the salesperson's transaction started. In either case, Company X would not have received a bill.

When transactions are not serializable (which is often the case in multi-user environments), the following types of events (or phenomena) can occur:

Lost Updates: This event occurs when two transactions read the same data and both attempt to update that data, resulting in the loss of one of the updates. For example: Transaction 1 and Transaction 2 read the same row of data and calculate new values for that row based upon the original values read. If Transaction 1 updates the row with its new value and Transaction 2 then updates the same row, the update operation performed by Transaction 1 is lost.
Dirty Reads: This event occurs when a transaction reads data that has not yet been committed. For example: Transaction 1 changes a row of data, and Transaction 2 reads the changed row before Transaction 1 commits the change. If Transaction 1 rolls back the change, Transaction 2 will have read data that never really existed.
Nonrepeatable Reads: This event occurs when a transaction reads the same row of data twice and gets different results each time. For example: Transaction 1 reads a row of data, then Transaction 2 modifies or deletes that row and commits the change. When Transaction 1 attempts to reread the row, it will retrieve different data values (if the row was updated) or discover that the row no longer exists (if the row was deleted).
Phantoms: This event occurs when a row of data matches some search criteria but isn't seen initially. For example: Transaction 1 retrieves a set of rows that satisfy some search criteria, then Transaction 2 inserts a new row that contains matching search criteria for Transaction 1's query. If Transaction 1 re-executes the query that produced the original set of rows, a different set of rows will be returned (the new row added by Transaction 2 will now be included in the set of rows produced).

Because several different users can access and modify data stored in a DB2 database at the same time, the DB2 Database Manager must be able to allow users to make necessary changes while ensuring that data integrity is never compromised. The sharing of resources by multiple interactive users or application programs at the same time is known as concurrency. One of the ways DB2 enforces concurrency is through the use of isolation levels, which determine how data accessed and/or modified by one transaction is "isolated from" other transactions. DB2 9 recognizes and supports the following isolation levels:

Repeatable Read
Read Stability
Cursor Stability
Uncommitted Read

Table 7-1 shows the various phenomena that can occur when each of these isolation levels are used.

Table 7-1: DB2 9's Isolation Levels and the Phenomena That Can Occur When Each Is Used
Open table as spreadsheet
Isolation Level	Phenomena
	Lost Updates	Dirty Reads	Nonrepeatable Reads	Phantoms
Repeatable Read	No	No	No	No
Read Stability	No	No	No	Yes
Cursor Stability	No	No	Yes	Yes
Uncommitted Read	No	Yes	Yes	Yes
Adapted from Table 2 on page 55 of the IBM DB2 Version 9 for Linux, UNIX, and Windows Performance Guide.

The Repeatable Read Isolation Level

The Repeatable Read isolation level is the most restrictive isolation level available. When it's used, the effects of one transaction are completely isolated from the effects of other concurrent transactions. Lost updates, dirty reads, nonrepeatable reads, and phantoms cannot occur.

When this isolation level is used, every row that's referenced in any manner by the owning transaction is locked for the duration of that transaction. As a result, if the same SELECT SQL statement is issued multiple times within the same transaction, the result data sets produced are guaranteed to be identical. In fact, transactions running under this isolation level can retrieve the same set of rows any number of times and perform any number of operations on them until terminated, either by a commit or a rollback operation. However, other transactions are prohibited from performing insert, update, or delete operations that would affect any row that has been accessed by the owning transaction as long as that transaction remains active.

To ensure that the data being accessed by a transaction running under the Repeatable Read isolation level is not adversely affected by other transactions, each row referenced by the isolating transaction is locked-not just the rows that are actually retrieved or modified. Thus, if a transaction scans 1,000 rows in order to retrieve 10, locks are acquired and held on all 1,000 rows scanned-not just on the 10 rows retrieved.

Tip

If an entire table or view is scanned in response to a query, the entire table or all table rows referenced by the view are locked. This greatly reduces concurrency, especially when large tables are used.

So how does this isolation level work in a real-world situation? Suppose you use a DB2 database to keep track of hotel records that consist of reservation and room rate information, and you have a Web-based application that allows individuals to book rooms on a first-come, first-served basis. If your reservation application runs under the Repeatable Read isolation level, a customer scanning the database for a list of rooms available for a given date range will prevent you (the manager) from changing the room rate for any of the room records that were scanned to resolve the customer's query. Similarly, other customers won't be able to make or cancel reservations that would cause the first customer's list of available rooms to change if the same query were to be run again (provided the first customer's transaction remained active). However, you would be allowed to change room rates for any room record that was not read when the first customer's list was produced; likewise, other customers would be able to make or cancel room reservations for any room whose record was not read in order to produce a response to the first customer's query. Figure 7-1 illustrates this behavior.

image from book
Figure 7-1: Example of how the Repeatable Read isolation level can affect application behavior.

The Read Stability Isolation Level

The Read Stability isolation level is not quite as restrictive as the Repeatable Read isolation level; therefore, it does not completely isolate one transaction from the effects of other, concurrent transactions. When this isolation level is used, lost updates, dirty reads, and nonrepeatable reads cannot occur; phantoms, however, can and may be seen. That's because when the Read Stability isolation level is used, only rows that are actually retrieved or modified by the owning transaction are locked. Thus, if a transaction scans 1,000 rows in order to retrieve 10, locks are only acquired and held on the 10 rows retrieved, not on the 1,000 rows scanned. Because fewer locks are acquired, more transactions can run concurrently. As a result, if the same SELECT SQL statement is issued two or more times within the same transaction, the result data set produced may not be the same each time.

As with the Repeatable Read isolation level, transactions running under the Read Stability isolation level can retrieve a set of rows and perform any number of operations on them until terminated. Other transactions are prohibited from performing update or delete operations that would affect the set of rows retrieved by the owning transaction as long as that transaction exists; however, other transactions can perform insert operations. (If rows inserted match the selection criteria of a query issued by the owning transaction, these rows may appear as phantoms in subsequent result data sets produced.)

So how does this isolation level change the way our hotel reservation application works? Now, when a customer scans the database to obtain a list of rooms available for a given date range, you (the manager) will be able to change the rate for any room that does not appear on the customer's list. Likewise, other customers will be able to make or cancel reservations that would cause the first customer's list of available rooms to change if the same query were to be run again. As a result, if the first customer queries the database for available rooms for the same date range again, the list produced may contain new room rates and/or rooms that were not available the first time the list was generated. Figure 7-2 illustrates this behavior.

image from book
Figure 7-2: Example of how the Read Stability isolation level can affect application behavior.

The Cursor Stability Isolation Level

The Cursor Stability isolation level is even more relaxed than the Read Stability isolation level in the way it isolates one transaction from the effects of other concurrent transactions. When this isolation level is used, lost updates and dirty reads cannot occur; nonrepeatable reads and phantoms, on the other hand, can and may be seen. That's because in most cases, the cursor stability isolation level only locks the row that is currently referenced by a cursor that was declared and opened by the owning transaction. (The moment a record is retrieved from a result data set, a pointer-known as a cursor-will be positioned on the corresponding row in the underlying table, and that row will be locked. The lock acquired will remain in effect until the cursor is repositioned-more often than not by executing the FETCH SQL statement-or until the owning transaction terminates.) And because only one row-level lock is acquired, more transactions can run concurrently. The Cursor Stability isolation level is the isolation level used by default.

When a transaction using the Cursor Stability isolation level retrieves a row from a table via a cursor, no other transaction is allowed to update or delete that row while the cursor is positioned on it. Other transactions, however, can add new rows to the table as well as perform update and/or delete operations on rows positioned on either side of the locked row-provided the locked row itself wasn't accessed using an index. Once acquired, the lock remains in effect until the cursor is repositioned or until the owning transaction is terminated. (If the cursor is repositioned, the lock being held is released and a new lock is acquired for the row to which the cursor is moved.) Furthermore, if the owning transaction modifies any row it retrieves, no other transaction is allowed to update or delete that row until the owning transaction is terminated, even though the cursor may no longer be positioned on the modified row.

As you might imagine, when the Cursor Stability isolation level is used, if the same SELECT SQL statement is issued two or more times within the same transaction, the results returned may not always be the same. In addition, transactions using the Cursor Stability isolation level will not see changes made to other rows by other transactions until those changes have been committed.

Once again, let us see how this isolation level affects our hotel reservation application. Now, when a customer scans the database for a list of rooms available for a given date range and then views information about each room on the list produced (one room at a time), you (the manager) will be able to change the room rates for any room in the hotel except the room the customer is currently looking at (for the date range specified). Likewise, other customers will be able to make or cancel reservations for any room in the hotel except the room the customer is currently looking at (for the date range specified). However, neither you nor other customers will be able to do anything with the room record the first customer is currently looking at. When the first customer views information about another room in the list, you and other customers will be able to modify the room record the first customer was just looking at (provided the customer did not reserve it for himself). Again, neither you nor other customers will be able to do anything with the room record at which the first customer is currently looking. Figure 7-3 illustrates this behavior.

image from book
Figure 7-3: Example of how the Cursor Stability isolation level can affect application behavior.

The Uncommitted Read Isolation Level

The Uncommitted Read isolation level is the least restrictive isolation level available. In fact, when the Uncommitted Read isolation level is used, rows retrieved by a transaction are only locked if the transaction modifies data associated with one or more rows retrieved or if another transaction attempts to drop or alter the table the rows were retrieved from. Because rows usually remain unlocked when this isolation level is used, dirty reads, nonrepeatable reads, and phantoms can occur. Thus, this isolation level is typically used for transactions that access read-only tables and views and for transactions that execute SELECT SQL statements for which uncommitted data from other transactions will have no adverse affect.

As the name implies, transactions running under the uncommitted read isolation level can see changes made to rows by other transactions before those changes have been committed. However, such transactions can neither see nor access tables, views, and indexes that are created by other transactions until those transactions themselves have been committed. The same applies to existing tables, views, or indexes that have been dropped; transactions using the uncommitted read will learn that these objects no longer exist only when the transaction that dropped them is committed. (It's important to note that when a transaction running under this isolation level uses an updatable cursor, the transaction will behave as if it is running under the Cursor Stability isolation level, and the constraints of the Cursor Stability isolation level will apply.)

So how does the Uncommitted Read isolation level affect our hotel reservation application? Now, when a customer scans the database to obtain a list of available rooms for a given date range, you (the manager) will be able to change the room rates for any room in the hotel over any date range. Likewise, other customers will be able to make or cancel reservations for any room in the hotel, including the room at which the customer is currently looking. In addition, the list of rooms produced for the first customer may contain records for rooms for which other customers are in the processing of reserving or canceling reservations. Figure 7-4 illustrates this behavior.

image from book
Figure 7-4: Example of how the Uncommitted Read isolation level can affect application behavior.

Choosing the Proper Isolation Level

In addition to controlling how well the DB2 Database Manager provides concurrency, the isolation level used determines how well applications running concurrently will perform. Typically, the more restrictive the isolation level used, the less concurrency is possible.

So how do you decide which isolation level to use? The best way is to identify which types of phenomena are unacceptable, and then select an isolation level that will prevent those phenomena from occurring. A good rule of thumb is:

Use the Repeatable Read isolation level if you're executing large queries and you don't want concurrent transactions to have the ability to make changes that could cause the query to return different results if run more than once.
Use the Read Stability isolation level when you want some level of concurrency between applications, yet you also want qualified rows to remain stable for the duration of an individual transaction.
Use the Cursor Stability isolation level when you want maximum concurrency between applications, yet you don't want queries to see uncommitted data.
Use the Uncommitted Read isolation level if you're executing queries on read-only tables/views/databases or if it doesn't matter whether a query returns uncommitted data values.

Always keep in mind that choosing the wrong isolation level for a given situation can have a significant negative impact on both concurrency and performance-performance for some applications may be degraded as they wait for locks on resources to be released.

Specifying the Isolation Level to Use

Although isolation levels control concurrency at the transaction level, they are actually set at the application level. Therefore in most cases, the isolation level specified for a particular application is applicable to every transaction initiated by that application. (It is important to note that an application can be constructed in several different parts, and each part can be assigned a different isolation level, in which case the isolation level specified for a particular part is applicable to every transaction that is created within that part.)

For embedded SQL applications, the isolation level is specified at precompile time or when the application is bound to a database (if deferred binding is used). In this case, the isolation level is set using the ISOLATION [RR | RS | CS | UR] option of the PRECOMPILE and BIND commands.

The isolation level for Call Level Interface (CLI) and Open Database Connectivity (ODBC) applications is set at application run time by calling the SQLSetConnectAttr() function with the SQL_ATTR_TXN_ISOLATION connection attribute specified. (Alternatively, the isolation level for CLI/ODBC applications can be set by assigning a value to the TXNISOLATION keyword in the db2cli.ini configuration file; however, this approach does not provide the flexibility of changing isolation levels for different transactions within the application that the first approach does.)

Finally, the isolation level for Java Database Connectivity (JDBC) and SQLJ applications is set at application run time by calling the setTransactionIsolation() method that resides within DB2's java.sql connection interface.

When the isolation level for an application isn't explicitly set using one of these methods, the Cursor Stability isolation level is used as the default. This holds true for DB2 commands, SQL statements, and scripts executed from the Command Line Processor (CLP) as well as to Embedded SQL, CLI/ODBC, JDBC, and SQLJ applications. Therefore, it's also possible to specify the isolation level for operations that are to be performed from the DB2 Command Line Processor (as well as for scripts that are to be passed to the DB2 CLP for processing). In this case, the isolation level is set by executing the CHANGE ISOLATION command before a connection to a database is established.

DB2 Version 8.1 and later provides a WITH clause (WITH [RR | RS | CS | UR]) that can be appended to a SELECT statement to set a specific query's isolation level to Repeatable Read (RR), Read Stability (RS), Cursor Stability (CS), or Uncommitted Read (UR). A simple SELECT statement that uses this clause looks something like this:

 SELECT * FROM employee WHERE empid = '001' WITH RR

If you have an application that needs to run in a less-restrictive isolation level the majority of the time (to support maximum concurrency), but contains one or two queries that must not see some phenomena, this clause provides an excellent way for you to meet your objective.