Transactions | Microsoft SQL Server 2008 Bible

A transaction is one of the mechanisms provided within SQL to enforce database integrity and maintain data consistency. The details of implementation differ among the RDBMS vendors, though the SQL92/99 spirit is generally preserved.

What is a transaction?

A transaction complements the concept of the session with additional granularity — it divides every operation that occurs within the session into logical units of work. In this way, database operations — those involving data and structure modifications — are performed step-by-step and can be rolled back at any time, or committed if every step is successful. The idea of the transaction is to provide a mechanism for ensuring that a multistep operation is performed as a single unit. If any of the steps involved in a transaction fails, the whole transaction is rolled back. If all steps have been completed successfully, the transaction can be either committed (to save all the changes into a database) or rolled back to undo all the changes.

The SQL standard defined transactions from the very beginning and enhanced the concept during subsequent iterations. According to the standard, a transaction is started automatically by RDBMS and continues until COMMIT or ROLLBACK statements are issued; the details were left for the vendors to implement.

A transaction must pass the ACID test:

Atomicity. Either all the changes are made or none.
Consistency. All the data involved into an operation must be left in a consistent state upon completion or rollback of the transaction; database integrity cannot be compromised.
Isolation. One transaction should not be aware of the modifications made to the data by any other transaction unless it was committed to the database. Different isolation levels can be set to modify this default behavior.
Durability. The results of a transaction that has been successfully committed to the database remain there.

One of the classic real-life example of a transaction involves an ATM (bank machine) withdrawal operation. Suppose you need $20 and you decide to withdraw this money from the nearest bank machine; you put in your bank card (User ID) and enter your PIN (personal identification number) to initiate the session. Once the bank confirms your identity, you are allowed to proceed; you ask for a money withdrawal operation in the amount of $20. That's where the transaction begins. There are several operations involved: the machine needs to check your account to verify that you have enough money to cover the transaction, subtract the money from your account, and release the money to you. If any of these steps (and some others, depending on the given bank policies) fails, the transaction must be aborted, and everything must revert to a state where it was before the transaction even began.

Explicit and Implicit Transactions

An implicit transaction has been chosen as the default in SQL92/99 standard. Whenever certain statements (of DDL and DML type) are executed within a session, they start (or continue) a transaction. A transaction is terminated by issuing either a COMMIT statement or a ROLLBACK statement.

An explicit transaction is started by the client application with a BEGIN TRANSACTION statement and is terminated in a manner similar to the implicit transaction protocol. This is a Microsoft SQL Server 2000–only feature, which is the default setting. Microsoft SQL Server 2000 provides a statement SET IMPLICIT_TRANSACTIONS {ON | OFF} to configure the default behavior of the transaction. When the option is ON, the SQL Server automatically starts a transaction when one of the following statements is specified: ALTER TABLE, CREATE, DELETE, DROP, FETCH, GRANT, INSERT, OPEN, REVOKE, SELECT, TRUNCATE TABLE and UPDATE. The transaction must be explicitly committed or rolled back, though; a new transaction is started once any of the listed statements gets executed. Turning the IMPLICIT_TRANSACTIONS option OFF returns the transaction to its default autocommit transaction mode.

While not required by the SQL standard, in every RDBMS implementation COMMIT is issued implicitly before and after any DDL statement.

This means that you cannot get your cash, unless it was subtracted from your balance; the bank cannot subtract the money from your balance unless you have enough money to cover the transaction and you actually received your cash.

The transaction model, as it is defined in the ANSI/ISO standard, utilizes the implicit start of a transaction, with an explicit COMMIT, in the case of successful execution of all transactions logical units, or an explicit ROLLBACK, when the noncommitted changes need to be rolled back (e.g., when program terminates abnormally). Most vendors follow this model, while some — Microsoft SQL Server 2000 is one example — allow for explicit start of a transaction.

Transactions COMMIT and ROLLBACK

The COMMIT statement ends the current transaction and makes all changes made to the data during transaction permanent. The syntax is virtually identical for all three RDBMS vendors, as well as for the SQL99 standard, and is very straightforward:

COMMIT [WORK]

The keyword WORK is not required, though it might be added for clarity; a simple COMMIT is usually all that is required.

Oracle 9i syntax looks like follows

COMMIT [WORK] [COMMENT   (<text>)] [FORCE (<text>), [<int>]] ;

Here the COMMENT clause enables you to specify a comment (up to 255 bytes long) that is recorded for every pending transaction and can be viewed through DBA2_PC_PENDING dictionary view (see Chapter 13 for more information on system catalogs). The FORCE clause allows you to commit an in-doubt distributed (see more about distributed transactions later in the chapter) transaction manually; it commits only a named transaction and has no effect on all other transactions.

The IBM DB2 UDB syntax is identical to the standard. In IBM terminology, transaction is a unit of work (UOW). No authorization is required to issue the statement; all locks held by the transaction are released. Named transactions are not supported.

The following syntax will work both for Oracle 9i and IBM BDF2 UDB:

UPDATE customer SET   cust_status_s = 'N'; COMMIT;

Microsoft SQL Server 2000 does support the SQL99 standard syntax — in addition to its own. The Microsoft syntax allows for committing named transaction whereas the standard one does not.

COMMIT [ TRAN [ SACTION ]   [<transaction name>]]

As you can see, only COMMIT is required, everything else is optional, and the keywords can be shortened (i.e., TRAN instead of TRANSACTION). Alternatively COMMIT WORK can be used.

The following example illustrates the COMMIT statement using Microsoft SQL Server 2000 explicit transactions mode.

BEGIN TRAN SELECT * FROM   customer UPDATE customer SET cust_status_s = 'N' COMMIT TRAN

No changes are taking place until the last COMMIT is executed. Only Microsoft requires a BEGIN TRANSACTION statement to start an explicit transaction; in both Oracle and DB2 UDB, transaction are always started implicitly for every DML or DDL statement.

Nested Transactions

Named transactions are especially handy for nested transactions. This concept is not implemented by either Oracle or IBM DB2UDB. The idea is to have a transaction within a transaction within a transaction — ad infinitum. At any time you can check the total number of pending transactions using the @@TRANSCOUNT unary function. Nested transactions in Microsoft SQL Server 2000 are introduced for readability purposes only; committing an internal transaction does not really commit anything, only the outermost COMMIT actually commits the changes; all other commits just decrement the transaction counter. Here is an example illustrating the concept:

BEGIN  TRANSACTION trans1 -- the transaction counter @@TRANSCOUNT = 1 INSERT INTO  <table> VALUES <values> BEGIN TRANSACTION trans2 -- the transaction  counter @@TRANSCOUNT = 2 INSERT INTO <table> VALUES <values> BEGIN  TRANSACTION trans3 -- the transaction counter @@TRANSCOUNT = 3 INSERT INTO  <table> VALUES <values> COMMIT TRANSACTION trans3 -- Nothing  committed at this point but the transaction -- counter is decremented by 1;  @@TRANSACOUNT = 2 COMMIT TRANSACTION trans2 -- Nothing committed at this point  but the transaction counter -- is decremented by 1; @@TRANSACOUNT = 1 COMMIT  TRANSACTION trans1 -- All INSERTs are committed to the database -- the  transaction counter is decremented by 1; @@TRANSACOUNT =0

In this case, three transactions were initiated to insert three records into a table; only the very last COMMIT actually made the changes to the table.

When COMMIT is executed, SQL Server must start a transaction either implicitly or explicitly for another COMMIT to execute successfully; if no transaction is started, issuing this command will result in an error:

Server: Msg   3902, Level 16, State 1, Line 1 The COMMIT TRANSACTION request has no   corresponding BEGIN TRANSACTION.

Neither Oracle nor DB2 UDB will complain, no matter how many times you execute COMMIT.

When changes made to the data in the databases need to be "undone" the ROLLBACK should be used. It may be issued anytime before the last COMMIT and results in automatic rollback of all changes made since the controlling transaction had started.

The syntax is identical in all RDBMS and SQL99 standards (see Table 7-4), save for using named transactions in Microsoft SQL Server 2000 and some Oracle-specific optional clauses. The following statement will attempt to update column CUST_STATUS_S in the CUSTOMER table of the ACME database, but all changes will be rolled back:

UPDATE customer SET   cust_status_s = 'N' ROLLBACK WORK

Table 7-4: Vendor-Specific ROLLBACK Statements
RDBMS	ROLLBACK Syntax
Oracle 9i	ROLLBACK [WORK] [TO SAVEPOINT <savepoint name>] \| [FORCE <text>]
IBM DB2 UDB	ROLLBACK [WORK] [TO SAVEPOINT <savepoint name>]
Microsoft SQL Server 2000	ROLLBACK [TRAN[SACTION]] [<transaction name>] [<savepoint name>]

As with a COMMIT statement, all the locks are released if the ROLLBACK command is issued.

The Oracle 9i WORK clause is optional and the TO SAVEPOINT clause is explained later in this chapter; the FORCE clause pertains to distributed transactions, acting very much the same as in the COMMIT transaction case; Microsoft SQL Server has an optional transaction name clause.

Note

Because certain statements (like DDL) automatically issue a COMMIT before and after, every change to data that happened prior to the DDL statement would be committed as well.

Here is an example that is valid for all three RDBMS (assuming the IMPLICIT_TRANSACTIONS option is set to ON in Microsoft SQL Server 2000):

UPDATE customer SET   cust_status_s = 'N' WHERE   cust_id_n = 1 DELETE customer WHERE cust_id_n = 1 ROLLBACK   WORK

Neither UPDATE nor DELETE will be committed to the database, as the whole transaction is rolled back.

Usually, a transaction consists of more than one SQL statement that you may want to either COMMIT or ROLLBACK. To add granularity to the transaction processing, the SAVEPOINT concept was introduced. It allows you to specify a named point within the transaction, usually after the last successful statement, and, if any error occurs after that, roll all the changes back not to the beginning of the transaction but to that particular SAVEPOINT. An explicit (or implicit, like the one issued after a DDL statement) COMMIT releases all SAVEPOINTs declared within a transaction.

Oracle 9i has the most straightforward syntax for the SAVEPOINT:

SAVEPOINT <savepoint   name>;

Here is an example of using the SAVEPOINTs in Oracle:

UPDATE customer SET   cust_status_s = 'N' WHERE cust_id_n = 1; SAVEPOINT first_upadate; DELETE   customer WHERE cust_id_n = 2; SAVEPOINT first_delete; DELETE customer WHERE   cust_id_n = 10; ROLLBACK first_update; COMMIT;

In the example above, only UPDATE gets committed to the database, all DELETEs are rolled back, and the SAVEPOINT first_delete is erased.

The savepoint name must be unique within the current transaction; if a new savepoint uses the same name, the previous savepoint is destroyed.

Here is the IBM DB2 UDB syntax for SAVEPOINT:

SAVEPOINT <savepoint name   > [UNIQUE] [ON ROLLBACK RETAIN CURSORS] [ON ROLLBACK RETAIN   LOCKS]

Several optional clauses can be specified with the standard SAVEPOINT statement. The UNIQUE clause indicates that the session does not intend to reuse the name, rendering it therefore unique; if this statement is omitted and the same name is used later in the transaction, the previous SAVEPOINT with that name will be destroyed and a new one created.

The ON ROLLBACK RETAIN CURSORS clause specifies what the system will do with implicit or explicit cursors opened after the SAVEPOINT statement in the case of a rollback; the last clause — ON ROLLBACK RETAIN LOCKS — changes the default behavior that instructs RDBMS not to release locks acquired after the SAVEPOINT statement.

Cross-References

See Chapter 14 for more information on explicit cursors. Both IBM and Oracle employ a concept of an implicit cursor — a special structure for manipulating data, when virtually every select statement opens one. The discussion of implicit cursors is beyond the scope of this book.

DB2 UDB also has RELEASE SAVEPOINT statement that destroys all the SAVEPONTS created after that named savepoint.

Microsoft SQL Server 2000 has the most unorthodox syntax, when it comes to establishing the SAVEPOINTs.

SAVE TRAN[SACTION]   <savepoint name>

When rolling back to a specific SAVEPOINT, all data changes become undone, but all the locks are held until COMMIT or full ROLLBACK commands are issued. The SAVE TRAN [SACTION] statement is not supported in distributed transactions.

Here is an example illustrating use of the SAVE TRANSACTION statement in Microsoft SQL Server 2000:

BEGIN TRANSACTION trans1 UPDATE   customer SET cust_status_s = 'N' WHERE   cust_id_n = 1 SAVE TRANSACTION cust_1 UPDATE customer SET cust_status_s = 'N'   WHERE cust_id_n = 2 ROLLBACK TRANSACTION cust_1 COMMIT   TRANSACTION

Distributed Transactions

Transactions that involve more than one database are referred to as distributed transactions. Such transactions are by their very nature complex and require advanced skills and knowledge.

In Oracle 9i, a distributed query uses dblinks to qualify the object, and there are several restrictions for such transactions. The RDBMS server manages these transactions and ensures data consistency; a special ADVISE statement issued within the session determines whether the transaction needs to be rolled back or committed whenever its status is set in doubt by the database.

IBM DB2 UDB labels distributed transactions as DUOW (Distributed Unit Of Work) and uses the Database Manager to coordinate distributed transactions.

In Microsoft SQL Server 2000, the task of managing the distributed transactions belongs with MSDTC (Microsoft Distributed Transaction Coordinator). (Other transaction managers complying to the X/Open XA specification could be employed instead.) The transaction can be explicitly started with the BEGIN DISTRIBUTED TRANS[ACTION] statement.

A distributed transaction must minimize the risk of data loss in case of a network failure. The two-phase commit protocol is employed in distributed transactions, and while details of the implementation are different between the vendors, they generally follow the same phases.

Prepare Phase. When the transaction manager receives a COMMIT request, it communicates it to all resource managers involved in the transaction, and they prepare to do a COMMIT
Commit Phase. In this phase, they actually issue COMMIT and report to the coordinator; when all COMMITs are successful, the coordinator sends notification to the client application. If any of the resource managers fails to notify the coordinator, a ROLLBACK command is issued to all resource managers. To perform a ROLLBACK after a COMMIT is executed, log files are normally used.

This code begins a named transaction TRANS1, updates field CUST_STATUS_S for the customer whose ID is 1, then creates a SAVEPOINT with the name CUST_1. It then proceeds to update another customer's status, and then it rolls back the changes made for customer 2 by rolling back the transaction to the savepoint. The transaction is finally committed, and only the first update actually takes place.

Transaction isolation levels

There are different transaction isolation levels. Isolation levels refer to the ability of the transaction to see the world (data) outside its own scope, i.e., data modified by any other transaction. The SQL99 standard isolation levels are listed in Table 7-5.

Table 7-5: SQL99 Transaction Isolation Levels
Isolation Level	Description
READ UNCOMMITED	This level is the lowest of all isolation levels, permitting dirty reads (i.e., able to see uncommitted data). No locks are issued, none honored.
READ COMMITED	This level specifies that shared locks will be held while data is being read. No dirty reads (containing uncommitted data) are permitted; though phantom reads (when row number changes between the reads) may occur.
REPEATABLE READ	No changes will be allowed for the data selected by a query (locked for updates, deletes, etc.), but phantom rows may appear.
SERIALIZABLE	The highest level of transaction isolation; places a lock for the whole dataset; no modifications from outside are allowed until the end of the transaction.

Oracle 9i has two transaction isolation levels — SERIALIZABLE and READ COMMITED. The SET TRANSACTION syntax for Oracle can be complicated:

SET TRANSACTION [READ ONLY] |   [READ WRITE] [ISOLATION LEVEL [SERIALIZABLE | READ COMMITTED]] [USE ROLLBACK   SEGMENT <segment name>] [NAME <transaction name>]

As you can see, the statement can be used to set many parameters, though it cannot be done all at once. To set a transaction as READ ONLY, the following statement could be used:

SET TRANSACTION READ ONLY NAME   'trans1'; SELECT * FROM CUSTOMER ; COMMIT;

After the transaction was set as READ ONLY, you cannot modify any data within this transaction either with UPDATE or INSERT statements.

Oracle is the only one among the "big three" RDBMS that provides for READ ONLY mode of a transaction. In full compliance with the SQL99 standard, this clause sets the transaction for read-only mode, and an error is generated if an attempt to change data is made. It establishes statement-level behavior, which becomes the default for the session.

There is some terminology confusion in how DB2 UDB defines transaction isolation levels. What SQL99 specifies as SERIALIZABLE, it names REPEATABLE READ (RR), which is the highest isolation level in DB2 UDB.

SQL99 REPEATABLE READ becomes READ STABILITY (RS), and a new level — CURSOR STABILITY — is introduced.

The last one, CURSOR STABILITY (CS), is the default for IBM DB2 UDB and resembles the READ COMMITTED level of the SQL99 standard. Essentially, it guarantees that a row of data will remain unchanged.

The UNCOMMITED READ (UR) level is the same as it is defined by the standard: no locks are acquired, so dirty reads are possible.

DB2 UDB also has NO COMMIT (NC) as the isolation level, which is not supported by its mainframe big brother DB2.

When establishing connection from within an application, the isolation level can be specified using PREP or BIND API directives, from the command-line processor the following statement may be used:

db2 => CHANGE ISOLATION TO   UR DB20000I The CHANGE ISOLATION command completed successfully

Tip

You cannot change isolation levels while connected to DB2 UDB; the isolation level is specified before the connection is established. Use the TERMINATE command to disconnect from the DB2 UDB database.

Microsoft SQL Server 2000 supports all four levels of isolation. The isolation level is set for the whole session, not just a single transaction. To specify a level within the session, the following statement is used:

SET TRANSACTION ISOLATION LEVEL   <level>

Here is an example, illustrating the importance of the transaction isolation level to manipulate consistent data using Microsoft SQL Server 2000. (The example, with minor modifications, is applicable to Oracle and DB2 UDB as well.) This example performs an update, selects the updated value, and then rolls back the transaction (OSQL interface, see Appendix E for more information):

1> SELECT cust_status_s   2> FROM customer 3> WHERE cust_id_n = 1 4> GO cust_status_s   ------------- N (1 row affected) 1> SET TRANSACTION ISOLATION LEVEL READ   COMMITTED 2> GO 1> BEGIN TRAN TRAN1 2> UPDATE customer 3> SET   cust_status_s = 'Y' 4> WHERE cust_id_n = 1 5> GO (1 row affected) 1>   SELECT cust_status_s 2> FROM customer 3> WHERE cust_id_n = 1 4> GO   cust_status_s ------------- Y (1 row affected) 1> ROLLBACK TRAN TRAN1 2>   GO 1> SELECT cust_status_s 2> FROM customer 3> WHERE   cust_id_n = 1 4> GO cust_status_s ------------- N (1 row   affected)

The transaction TRANS1 updates the field CUST_STATUS_S, changing it from Y to N, and then issues a SELECT statement that shows the changed data. The transaction isolation level for the session is READ COMMITED, so only changes committed to the database are supposed to be selected. Since the SELECT was issued within the same transaction, it will be able to see uncommitted changes made by this transaction update. The data changes will be visible to other transactions that attempt to select it within the sessions with transaction isolation level set to READ UNCOMMITED; but they are invisible for transactions with other levels of isolation — if they were issued prior to the ROLLBACK TRANSACTION statement. The example also shows that the data, after the transaction was rolled back, remain unchanged.