Transactions | SQL Server 2000 Stored Procedures Handbook (Experts Voice)

When it comes to writing stored procedures, transactions are an important topic. Most of the procedures we write, for modifying data in any SQL Server database table, will require transactions. In this section, we will briefly define what a transaction is, and cover the basics of the syntax.

Transactions are used to keep the data that we write to and read from tables consistent. A transaction guarantees that one or more statements either complete successfully or fail totally. Every statement that is executed in SQL Server is run within the control of a transaction. This includes INSERT, UPDATE, DELETE, and even SELECT statements. For modification statements, like INSERT, UPDATE, and DELETE, locks are placed, and all changes are recorded in the transaction log. If any operation fails, or if the user asks for an undo operation, the transaction log is used to undo the operations already performed.

We have said that we log whenever we modify data in the database. Since every modification operation is logged, we have to be aware of this when considering how large to make our transaction log. How long this change log is stored is based on the recovery model that your database is operating under.

There are three models:

Simple: The log is maintained only until your operation is executed.
Full: The log is maintained until you explicitly clear out the log.
Bulk-Logged: It keeps a log much like the full mode, but does not log some operations, like SELECT INTO, bulk-loads, index creations, or text operations.

Even in simple mode, we have to be careful about log space, since if we make large numbers of changes we have to hold the log rows until we commit our transaction. This is just a taste of transaction log management; for a more full explanation, please see SQL Server 2000 Programming by Robert Viera, ISBN: 1-86100-448-6.

There is a common acronym which explains the purpose of transactions. This acronym is ACID, and it stands for:

Atomicity

Every operation within a transaction is treated as a singular operation; either all of its data modifications are performed, or none of them.
Consistency

Once the transaction is completed, the system must be left in a consistent state.
Isolation

This means that the operations within a transaction must be suitably isolated from other transactions. In other words, no other transactions should see data in the intermediate state within the transaction until it is finalized. This is done by using locks (for details on locks refer to the Concurrency Controls section).
Durability

Once a transaction is complete, all changes must be persisted, as requested. The modifications should persist even in the event of a system failure.

The key is that, when we write statements to modify data by using one or more SQL statements, we need to make use of transactions to ensure that data is written securely. A typical problem with procedures and operations in T-SQL code using transactions is that when unexpected errors (like security problems, constraint failures, and so on) occur in code, we end up leaving orphaned data.

Syntax

The syntax for transactions is simple. There are three variants that we will need to look at: simple transactions, savepoints, and distributed transactions.

Simple Transactions

In the simplest form, there are three commands that are required: BEGIN TRANSACTION (to start the transaction), COMMIT TRANSACTION (to save the data), or ROLLBACK TRANSACTION (to undo the changes that were made).

For example, consider the case where we are building a stored procedure to modify two tables. Let's call these tables table1 and table2. We will modify table1, check the error status, and then modify table2:

     BEGIN TRANSACTION     UPDATE table1     --Code for updating table1     ...     --check error status     if @@error <> 0       ROLLBACK TRANSACTION       RETURN -100     UPDATE table2     --Code for updating table1     ...     if @@error <> 0       ROLLBACK TRANSACTION       RETURN -100     COMMIT TRANSACTION

Now, in case some unforeseen error occurs while updating either table1 or table2, we will not be stranded. Be careful not to forget to close the transaction (either commit it, or roll it back) because it can cause many problems. The biggest is that you can freeze resources that other users may wish to use (covered in detail in the Concurrency section of this chapter), and nearly as bad is that, if you forget to close the transaction, the status of the changes made will be in limbo.

You can extend the functionality of transactions by adding a transaction name, as shown:

     BEGIN TRANSACTION <tranName> OR <@tranvariable> WITH MARK

This will make a mark in the transaction log, which we can use as a placeholder when restoring a transaction log. A good example of its use might be to build a process that marks the transaction log every day before some daily batch process, especially where we put the database in a single-user mode. We mark the transaction log, run the process, and if there are any troubles, we can restore the database to the mark in the log, no matter when the process was executed.

WITH MARK is optional, but without it, the code gets complicated. It allows us to match our BEGIN TRANSACTION to a ROLLBACK TRANSACTION, for example:

     BEGIN TRANSACTION one     ROLLBACK TRANSACTION one

Note that only the first mark is registered, so the following code will return an error:

     BEGIN TRANSACTION one     BEGIN TRANSACTION two     ROLLBACK TRANSACTION two

The error message will be:

     Server: Msg 6401, Level 16, State 1, Line 3     Cannot roll back two. No transaction or savepoint of that name was found.

This will leave the transaction open, and so named transactions are seldom used in our stored procedure transaction code.

Nesting Transactions

While coding transactions, we can nest them. Note that we will have to execute the same number of COMMIT TRANSACTION commands as BEGIN TRANSACTION commands. We can tell how many BEGIN TRANSACTION commands have been executed without being committed by using the @@TRANCOUNT global variable, which gives the number of BEGIN TRANSACTIONs. When @@TRANCOUNT equals zero, we are not in a transaction.

There does not seem to be any functional limit to the number of transactions that can be nested. Testing ten million nest levels on my 1.8 GHz laptop test machine with 512MB of RAM took fifty seconds, which is clearly more than any transaction will ever need.

As an example, execute the following:

     SELECT @@TRANCOUNT AS zeroDeep     BEGIN TRANSACTION     SELECT @@TRANCOUNT AS oneDeep

It returns:

     zeroDeep     --------     0     oneDeep     --------     1

Then, we can nest another transaction, and check @@TRANCOUNT to see whether it has incremented. Afterwards, we commit that transaction and check @@TRANCOUNT again:

     BEGIN TRANSACTION     SELECT @@TRANCOUNT AS twoDeep     COMMIT TRANSACTION     SELECT @@TRANCOUNT AS oneDeep

This returns:

     twoDeep     -------     2     oneDeep     -------     1

Finally, we close the final transaction:

     COMMIT TRANSACTION     SELECT @@TRANCOUNT AS zeroDeep

It returns:

     zeroDeep     --------     0

One additional feature of transactions is that it only takes one ROLLBACK TRANSACTION command to rollback as many transactions as you have nested. Therefore, if you have nested 100 transactions, and you issue one rollback transaction, all transactions are rolled back. For example:

     BEGIN TRANSACTION     BEGIN TRANSACTION     BEGIN TRANSACTION     BEGIN TRANSACTION     BEGIN TRANSACTION     BEGIN TRANSACTION     BEGIN TRANSACTION     SELECT @@trancount AS InTran     ROLLBACK TRANSACTION     SELECT @@trancount AS OutTran

will return:

     InTran     --------     7     OutTran     --------     0

This is the trickiest part of using transactions in our code. We cannot issue ROLLBACK TRANSACTION command without being aware of what its influence will be on the code.

In the above example, had we issued an UPDATE statement immediately after the rollback command, it would not have been executed within an explicit transaction. If we issue a COMMIT TRANSACTION, after the rollback command, we will get the following error:

     Server: Msg 3902, Level 16, State 1, Line 12     The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.

Savepoints

A tool is available to rollback only certain parts of a transaction:

     SAVE TRANSACTION <savePointName>

For example:

     CREATE TABLE performer       (         performerId INT identity,         name VARCHAR(100)       )     BEGIN TRANSACTION     INSERT INTO performer(name) VALUES ('Simple Minds')     SAVE TRANSACTION savePoint     INSERT INTO performer(name) VALUES ('Air Supply')     ROLLBACK TRANSACTION savePoint     COMMIT TRANSACTION     SELECT * FROM performer

The output of this listing is:

     performerId name     --------- -----------     1         Simple Minds

In the code, we have two INSERT statements within our transaction, but there is only one row in the output! Obviously, the row that we rolled back in the savepoint was not persisted.

Note that you do not have to commit a savepoint; it will not harm anything if not used. It simply places a mark in the transaction log, if the user asks for a rollback to the savepoint, it tells where to rollback to. Note also that the rest of the operations in the overall transaction are not affected. Savepoints give us the power to reflect changes on only a part of the transaction, giving us more control of what to do if we are handling a large number of operations.

Savepoints do not affect the value of @@TRANCOUNT, nor do they release any locks that might have been held by the operations that are rolled back.

Distributed Transactions

Occasionally, we may need to update data on a server different from the one on which our code resides. Microsoft Distributed Transaction Co-ordinator service (MSDTC) gives us this ability.

If our server is running the MSDTC service, we can use the BEGIN DISTRIBUTED TRANSACTION command to start a transaction that covers the code residing on our server, as well as the remote server, for example:

     BEGIN DISTRIBUTED TRANSACTION     --remote server     UPDATE remoteServer.pubs.dbo.authors     SET phone = '555 555-5555'     WHERE au_id = '267-41-2394'     If @@error <> 0       BEGIN         ROLLBACK TRANSACTION         RETURN 0       END     --local server     UPDATE pubs.dbo.authors     SET phone = '555 555-5555'     WHERE au_id = '267-41-2394'     IF @@error <> 0       BEGIN         ROLLBACK TRANSACTION         RETURN 0       END     COMMIT TRANSACTION

The distributed transaction syntax also covers the local transaction. There is a configuration option, REMOTE PROC TRANS, which will automatically upgrade a BEGIN TRANSACTION command to a BEGIN DISTRIBUTED TRANSACTION command. This is useful if you frequently use distributed transactions.

Explicit versus Implicit Transactions

Before we finish our discussion on transactions, we should cover one further thing. We have alluded to the fact that every single statement is executed in a transaction. This is an important point, which we should understand when writing code. Internally, SQL Server starts a transaction every time we execute any SQL statement. Therefore, even if we do not explicitly start a transaction, SQL Server automatically starts a new transaction whenever a statement starts, and commits or rolls it back depending on whether or not any errors occur. Just as we can code nested transactions, SQL Server can also nest these transactions.

There is a setting that SQL Server provides us to changes this behavior – SET IMPLICIT_TRANSACTION. When this setting is turned on, and we are not already within an explicitly executed BEGIN TRANSACTION, then a BEGIN TRANSACTION is automatically executed when any of ALTER TABLE, FETCH, REVOKE, CREATE, GRANT, SELECT, DELETE, INSERT, TRUNCATE TABLE, DROP, OPEN, or UPDATE are executed.

Note that, in such a scenario, we will have to explicitly execute a COMMIT TRANSACTION or ROLLBACK TRANSACTION command to end the transaction, otherwise if the connection terminates, all data will be lost.

It is not a typical setting used by SQL Server programmers, but is worth mentioning because if you set ANSI_DEFAULTS ON, IMPLICIT_TRANSACTION will be enabled.

Because every SELECT statement is executed within a transaction, the rows are moved from the server to the client inside a transaction; the SELECT statement is not finished until the entire result set is exhausted, and the transaction has not ended. Bear this in mind while studying the Concurrency section, as we will discuss all of the resources and settings for managing concurrent SQL operations.

Compiled SQL Server Code

Now that we have discussed the basics of transactions, we need to discuss the slight differences involved in using them in our compiled code. It is nearly the same, but there are some subtle differences. We cannot use transactions in user defined functions, but we can use them in stored procedures and triggers.

Stored Procedures

Stored procedures, being simply compiled batches of code, use transactions just like we discussed. We cannot affect the transaction nesting level within a procedure. In other words, if we start a transaction, we must also commit the transaction.

It is better not to execute a rollback transaction in a stored procedure, so that there is no chance that we will rollback to a transaction count different from that at the beginning of the procedure, protecting us from the situation where our code is executed in another transaction. Rather, it is always better to start a transaction, and then follow it by a savepoint. Later, if we need to rollback the transaction, we can rollback to the savepoint, and finally commit the transaction. It is then up to the stored procedure to signal any caller that it has failed.

As an example, let's build the following simple procedure that does nothing but execute a BEGIN TRANSACTION and a ROLLBACK TRANSACTION:

     CREATE PROCEDURE tranTest     AS     BEGIN       SELECT @@TRANCOUNT AS trancount       BEGIN TRANSACTION       ROLLBACK TRANSACTION     END

If we execute this procedure outside a transaction, it returns a single row with a 0 value. However,

     BEGIN TRANSACTION     EXECUTE tranTest     COMMIT TRANSACTION

However, if we execute this, it returns:

     Server: Msg 266, Level 16, State 2, Procedure tranTest, Line 5     Transaction count after EXECUTE indicates that a COMMIT or ROLLBACK     TRANSACTION statement is missing. Previous count = 1, current count = 0.     Server: Msg 3902, Level 16, State 1, Line 3     The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.

The errors occur as we affect the transaction depth when we roll back the transaction inside the procedure.

We can recode our procedure as the following:

     CREATE PROCEDURE tranTest     AS     BEGIN       DECLARE @savepoint VARCHAR(30)       SET @savepoint = cast(object_name(@@procId) AS VARCHAR(27)) +                        cast(@@nestlevel AS VARCHAR(3))       SELECT @savepoint AS savepointName, @@TRANCOUNT AS trancount       BEGIN TRANSACTION         SAVE TRANSACTION @savepoint       ROLLBACK TRANSACTION @savepoint       COMMIT TRANSACTION     END

Now, we can execute it from within any number of transactions, and it will never fail. Since we are able to call our procedures from other procedures (even recursively from the same procedure), or even external programs, it is important that we take these precautions to make sure that our code is safe under any calling circumstances.

In addition, naming of savepoints is important. As savepoints are not scoped to a procedure, we need to ensure that they are always unique. I tend to use the procedure name (retrieved here by using the object_name function called for the @@procId, but you could just enter it textually) and the current transaction nesting level. This guarantees that I can never have the same savepoint active. It will be also possible to use @@nestLevel, as it will be always unique in the calling chain, for a given connection.

Triggers

Triggers are part of a special transaction that is started and completed for every data modification operation. We discussed this in the Implicit and Explicit Transactions section. As we mentioned, every operation is done within a transaction, and the trigger is implicitly part of that transaction.

Just as in stored procedures, we can start transactions, set savepoints, and rollback to a savepoint. However, if you execute a rollback transaction, instead of an error, the entire batch is cancelled. This is a common practice and SQL Server does this to protect our data. For more details on triggers, please refer to Chapter 7.

Best Practices

Transactions are an extremely valuable tool for maintaining the consistency of our database. Any place where we need to modify data is a likely candidate for a transaction. They give us a great deal of control over what goes in our database and how.

Consider the following best practices when using transactions in your stored procedure code:

Don't call rollback transaction without a savepoint name in a procedure

Since you aren't allowed to change the transaction nesting level (@@TRANCOUNT), in cases where you need to call the code from a different procedure this practice will prevent us from causing problems.
Use transactions as liberally as needed

It is very important to protect our data, one hundred percent of the time. Each time you modify the data, you need to enclose it in a transaction. This gives you a chance to check the status, the number of rows modified, and so on; and, even rollback the modification if you want to.

Best practices (or actually worst practices) for transactions are highly tied to the concepts that we will cover while discussing concurrency. Quite often, they are painted as bad things, due to some of the problems involved with bloating of the transaction log, and blocking of other users – a topic that will be covered in detail in the next section. For now, we will keep our focus narrow, and say that transactions are a positive thing, as they protect our data from spurious problems arising from partial sets of data being entered.