Chapter 3. Transactions


One of the issues we run into when differentiating among the technologies used in software fortress architectures is transactions. So before I look too closely at software fortresses, I will spend some time on transaction basics.

According to Microsoft's Encarta computer dictionary, a transaction is "a discrete activity within a computer system, such as an entry of a customer order." This is an overly simplistic definition, making a transaction sound like little more than a read request to a disk drive. Transactions are much more important and a lot more complex than Microsoft's Encarta, or even most database folks, would have you believe.

Transactions come in three standard varieties: (1) tightly coupled single-resource, (2) tightly coupled multiple-resource, and (3) loosely coupled multiple-resource. These three varieties differ in how they coordinate updates across different transactionally aware resources.


3.1 Transactionally Aware Resources

A transactionally aware resource is a system with all of these characteristics:

  • It can accept some update requests.

  • It can group a collection of update requests into a set, called a transaction.

  • It can guarantee that when processing a transaction's worth of update requests, either the entire transaction collection is processed or none of the transaction collection is processed.

  • It can guarantee that, once that transaction has been processed, there is no likely scenario under which any of the updates included in the transaction will be lost.

The most common transactionally aware resource is a database. We'll run into other transactionally aware resources shortly, but for now I'll limit my discussion to databases.


3.2 Tightly Coupled Single-Resource Transactions

I'll start with the simplest transaction variety, the tightly coupled single-resource transaction. Because all single-resource transactions are tightly coupled, I'll simplify the terminology by calling them single-resource transactions. Let me take you through a typical single-resource transaction involving a database.

Consider writing an application to withdraw money from a checking account. Let's say that "withdrawing money from a checking account" means the following:

  1. Finding the specific account record in the CheckingAccount table.

  2. Reading the CurrentBalance field.

  3. Deducting the amount from the current balance.

  4. Logging the withdrawal in a Logging table.

Imagine the problems that would be caused by any of the following events:

  • We read the balance from the account record, but before we have a chance to deduct the withdrawal amount, the account is closed by another program.

  • We deduct the amount from the current balance, but we are unable to log the withdrawal in the Logging table.

  • We log the withdrawal, but we are unable to deduct the amount.

This is just a sample of the things that can go wrong. When all of these updates are heading toward a single transactionally aware resource (i.e., a database), the resource itself can prevent any of these scenarios.

With a transactionally aware resource, we can enclose all of these withdrawal database activities inside a single transaction. We can then ask the database to process that transaction as a whole. The database will then process either all or none of the transaction updates. The database gets to choose which. Further, the database guarantees that if it does process the entire transaction, it will do so without consistency errors (such as the account being closed after the balance has been read) and without any chance of losing the updates after the fact (say, because the disk drive happened to choose that inopportune moment to disintegrate).
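To see the all-or-nothing behavior in action, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. Only the CheckingAccount and Logging tables and the CurrentBalance field come from the discussion above; the AccountId and Amount columns, the starting balance, and the way the logging step is forced to fail are my assumptions for illustration.

```python
import sqlite3

# In-memory database; the column layout is a hypothetical schema for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CheckingAccount ("
    "AccountId INTEGER PRIMARY KEY, CurrentBalance INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE Logging (AccountId INTEGER NOT NULL, Amount INTEGER NOT NULL)"
)
conn.execute("INSERT INTO CheckingAccount VALUES (1, 100)")
conn.commit()

failed = False
try:
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if the block raises.
    with conn:
        # Deduct the withdrawal from the balance.
        conn.execute(
            "UPDATE CheckingAccount "
            "SET CurrentBalance = CurrentBalance - 30 WHERE AccountId = 1"
        )
        # Simulated logging failure: a NULL amount violates NOT NULL.
        conn.execute("INSERT INTO Logging VALUES (1, NULL)")
except sqlite3.IntegrityError:
    failed = True

final_balance = conn.execute(
    "SELECT CurrentBalance FROM CheckingAccount WHERE AccountId = 1"
).fetchone()[0]
log_rows = conn.execute("SELECT COUNT(*) FROM Logging").fetchone()[0]

print(failed, final_balance, log_rows)
```

Because the deduction and the log entry sit inside one transaction, the failed log entry takes the deduction down with it: the balance is still 100 and the Logging table is still empty.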

How the database makes these guarantees is a secret known only to the database. We don't care. We care only about deciding which updates to bunch together into the transaction and about letting the database know where the transaction boundary begins and ends.

Several techniques are available for letting the database know about the transaction boundaries, but most use transaction boundary-marking APIs, such as BeginTransaction and Commit (the standard name for EndTransaction). Any database updates issued after BeginTransaction and before Commit are assumed to be part of the same transaction collection. The checking account system then looks something like this:

  1. Issue a BeginTransaction request.

  2. Issue a Read request to get the CurrentBalance field from the appropriate account record.

  3. Do whatever programming is necessary to update the account balance.

  4. Issue an Update request to the database to store the new account balance.

  5. Issue an Append request to the database to log the withdrawal in the Logging table.

  6. Issue a Commit request, signaling to the database that you have reached the end of the transaction.

Once the database knows that the transaction has been concluded, it will attempt to apply all of the Update and Append requests en masse. If, by some horrible chance, the database concludes that one or more of the updates or appends is impossible, it will toss out the whole collection and notify you, giving you time to drown your sorrows at the local pub or take other appropriate remedial action.
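The six steps above might look roughly like this against SQLite, using Python's built-in sqlite3 module. Treat it as a sketch under assumptions: the withdraw function, the AccountId and Amount columns, and the starting balance are hypothetical, and BEGIN IMMEDIATE is a SQLite-specific way of claiming the write lock before the balance is read. Other databases mark boundaries and handle locking differently. Note also that with SQLite the application issues the ROLLBACK itself when a step fails.

```python
import sqlite3

# isolation_level=None disables the module's implicit transaction handling,
# so BEGIN, COMMIT, and ROLLBACK are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE CheckingAccount ("
    "AccountId INTEGER PRIMARY KEY, CurrentBalance INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE Logging (AccountId INTEGER NOT NULL, Amount INTEGER NOT NULL)"
)
conn.execute("INSERT INTO CheckingAccount VALUES (1, 100)")


def withdraw(conn, account_id, amount):
    """Hypothetical helper that follows the six steps in order."""
    conn.execute("BEGIN IMMEDIATE")  # 1. mark the start of the transaction
    try:
        row = conn.execute(  # 2. read the current balance
            "SELECT CurrentBalance FROM CheckingAccount WHERE AccountId = ?",
            (account_id,),
        ).fetchone()
        if row is None:
            raise LookupError(f"no account {account_id}")
        new_balance = row[0] - amount  # 3. compute the new balance
        conn.execute(  # 4. store the new balance
            "UPDATE CheckingAccount SET CurrentBalance = ? WHERE AccountId = ?",
            (new_balance, account_id),
        )
        conn.execute(  # 5. log the withdrawal
            "INSERT INTO Logging (AccountId, Amount) VALUES (?, ?)",
            (account_id, amount),
        )
        conn.execute("COMMIT")  # 6. mark the end of the transaction
    except Exception:
        # Discard every update made since BEGIN, then notify the caller.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    return new_balance


new_balance = withdraw(conn, 1, 30)
print(new_balance)
```

The caller learns of a failure through the re-raised exception, which is this sketch's stand-in for the database notifying you that the collection was tossed out.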

Figure 3.1 illustrates the flow in a tightly coupled single-resource transaction.

Figure 3.1. Flow in a Tightly Coupled Single-Resource Transaction