Introducing Transactions | Beginning Transact-SQL With SQL Server 2000 and 2005

When I count money for a purchase and hand it to a store clerk, I have confidence that it's all going to make it to the cash drawer. If the power goes out or the cash register malfunctions, this shouldn't greatly affect our transaction because one of us has the money physically in our hand. As data is processed and moved from one place to another, electronic impulses cause memory registers to "remember" our data. In simple computer processes, this exchange can take place thousands of times. Until it is written to some form of permanent storage, the process can be easily interrupted and the data lost. Most of us have learned that the most effective way to prevent data loss is to make sure you always have at least two copies of your data. For example, if you intend to move an important file from one stand-alone computer's hard disk to another, you copy it from the hard disk to a portable disk or memory device; but you leave the original copy on the hard disk. Only after you verify that it has been copied to the target computer's hard disk would you consider deleting it from the source. And, you always make a backup copy of important files from the hard disk in case it fails, right?

In a database, a transaction is simply a mechanism to ensure and verify that data gets to its intended destination. Just like a purchase or bank transaction, both parties must be satisfied with the results. They must agree on the anticipated outcome. After the transaction has been completed, they should agree that all of the conditions of the transaction have been met and that everything is in the proper place.

Transaction Types

All modifications to data take place in the context of a transaction. SQL Server uses three types of transactions to manage data modification:

Explicit Transaction. The explicit transaction is defined by the presence of an explicit BEGIN TRANSACTION statement followed by one or more dependent data modification statements and completed with an explicit COMMIT TRANSACTION statement. Error checking is added prior to the COMMIT TRANSACTION statement so that if an error occurred the transaction can be reversed with a ROLLBACK TRANSACTION statement.
Implicit Transaction. The implicit transaction follows the behavior of some other database products in that whenever a data modification is executed it implicitly begins a transaction. However, it does not complete the transaction and release the modified data until an explicit COMMIT TRANSACTION or ROLLBACK TRANSACTION statement is issued. Implicit transactions are enabled on a connection basis with the SET IMPLICIT_TRANSACTIONS ON command.
Auto-Commit Transaction. If a data modification statement is executed against the database without an explicit or implicit transaction, it is considered an auto-commit transaction. The modification contained in an auto-commit transaction follows the same pattern as other transactions as described in the next section.

The ACID Test

Most of us have been burned enough by data loss problems to realize that steps must be taken to ensure that data gets from one place to another. Although there are a number of benefits, this is what transactions are all about. A bona fide transaction must meet the following criteria:

Atomic — All steps and operations that are part of a transaction are treated as an atomic unit. Either all succeed or all fail together.
Consistent — The outcome of any transaction is always predictable; all of the operations either fail or succeed. All operations abide by consistency rules and checks to ensure data integrity within the database.
Isolated — Any operations performed before, during, or after the transaction will see related data in a consistent state, rather than in a state of partial completion. Any user or operation that queries data affected by a transaction will perceive that the entire transaction was committed instantaneously.
Durable — If a transaction succeeds, data is written to disk and does not revert to its previous state. Data can survive system failure.

The Transaction Log

Meeting all of these criteria may seem like a rather tall order, but the way it works is actually quite elegant. SQL Server pulls this off with redundancy. When any data modification is executed, the following actions take place:

The appropriate data pages are located in memory. If they are not in memory, they are placed in memory from the disk.
The modifications (insert, update, or delete) are made to the applicable pages in memory.
The modifications are written to the transaction log.
The server issues a checkpoint that causes the changed (dirty) pages in memory to be written back to the hard disk. The pages in memory then have their "dirty" flag removed. If the transaction making the changes has been committed, the pages are released and other requests or transactions have access to them. If the checkpoint occurs prior to the transaction being committed, the pages are still locked until the transaction is committed.

The transaction log is a separate file used to collect all successful data modification requests from all users and applications. During the execution of a data modification request, the transaction exists in the buffer cache and log cache. During the changes to the data pages in memory, the pages (and the pages on disk representing the data to be changed) are locked (or isolated) from access by other requests or transactions. The pages remain locked until they are released from the transaction. Figure 9-1 illustrates the transaction process.

image from book
Figure 9-1:

Depending on a database setting, the transaction log either continues to fill with archived transaction processes or gets truncated (removing all check-pointed transaction processes by setting the area of the transaction log up for overwrite). In a production database, the transaction log is typically allowed to grow until a scheduled backup cycle. This way, if the main database is damaged or lost, the transaction log becomes a short-term backup solution for new data.

Logged Operations

SQL Server's default behavior is to log all Insert, Update, and Delete operations. For the reasons I explained in the previous section, this generally makes sense. However, the transaction log and its related processes can burden the server with unnecessary work if you don't need this level of protection. For example, if I had exported several thousand records to a text file, intending to load this data into my SQL database, this would be an exception to the day-to-day inserts normally performed by applications. Because I have a copy of the data in the export file and have backed up the data in my database, if this operation failed, I would simply correct the problem and try again. In this case, I could speed things up by performing a non-logged insert and bypass the transaction log. This is usually done using a command-line utility or a feature of Data Transformation Services (DTS). Non-logged operations are really the exception to the rule. Bulk operations are not a core feature of SQL and are beyond the scope of this book. Two options for performing bulk inserts from external files include the bulk copy utility (BCP) and the Bulk Copy Task within Data Transformation Services (DTS), or Data Integration Services in SQL Server 2005. For more information, use the Books Online to read about the BCP utility and DTS.