Chapter 10: Using Database Transactions | Perl Database Programming

This chapter introduces the concept of transactions, one of the core features of SQL and an easy way to increase the reliability of any database-backed system. Transactions are a set of SQL queries that are 'grouped' together. Then if any, or all, of the SQL queries fails for any reason, the entire group can be reversed so the database is exactly the way we left it before we began the transaction. We'll be discussing transactions, their methods, and finally look at a small example that shows a transaction in action.

Using Transactions

MySQL has only recently introduced transactions, and they apply only to some table types (InnoDB and BDB); these are slower than other table types and take more memory and disc space. Other SQL databases such as PostgreSQL or Oracle support transactions; even in these cases, however, the code becomes somewhat more complex. Why, then, you might be wondering, should you introduce complexity and overhead?

Transactions are a means of ensuring data integrity. Data integrity is just a fancy term that means keeping your data intact, and correct. Many database operations are implemented as multiple SQL statements; for example, a payment system for online purchase might include the following:

Decrement customer's account by purchase amount
Increment merchant's account by purchase amount
Order goods to be shipped

Obviously, if the first step fails, the error-checking routine should abort the remaining steps. However, what if the latter steps fail? Even assuming the system is perfectly reliable, perhaps the merchant's account has been closed or the goods are no longer available. In this case, the customer must receive a refund; but undoing all of these steps is time-consuming and prone to error. (What happens if an error occurs during the undo process?) If only there were a way to perform these tasks atomically: if an error occurs, all changes are rejected.

Naturally, there is such a way: the entire operation is wrapped in transaction statements and treated as a single process. If any stage fails, the whole procedure can be aborted and the database reset to its initial state.

In essence, transactions allow a multistatement process to be treated by the database as if it contains only a single statement. Everything works, or everything is rejected.

As shown, failures can occur even with a perfectly reliable system. In practice, system failures also occur. The front-end querying machine or the database server itself may fail, or communication between them may be lost. In any of these cases, the database server aborts the entire transaction and the database is restored to the state it was in before the transaction began.

Even if no failures occur, transactions can still be useful in a system with multiple simultaneous client accesses; in the modern computing environment, this includes almost any Web-based system. Use of transactions can effectively create a 'snapshot' of the database for the exclusive use of a client. This is unaffected by changes made elsewhere. This allows the client software to treat the database as a single consistent data source, rather than constantly checking to avoid race conditions. Race conditions occur when two or more processes try to read or write the same record/table/data at the same time. The degree of isolation does vary somewhat among databases, and the client often controls it. Check the database system's documentation for specifics.The ideal solution for a multiuser system is the heavy use of both transactions and explicit locks (which enable you to lock others from modifying the data); neither is sufficient alone.

In SQL, transactions are controlled with three keywords:

BEGIN initiates a transaction.
COMMIT ends a transaction, requiring the database to store all changes that have been made since the beginning of that transaction.
ROLLBACK aborts a transaction, restoring the database to the state that the database was in before the first statement in the transaction. If connectivity to the client is lost before a COMMIT is issued, ROLLBACK is implicitly executed.

Many databases allow transactions to be nested: one can BEGIN a transaction; perform operations on the database; BEGIN a subsequence of statements that are part of the main transaction; COMMIT that, then continue with the main transaction. Consult your database server documentation to see if the database you are using supports nested transactions.