Transaction processing in J2EE applications is a large topic that merits an entire book for thorough coverage. For a more in-depth discussion we recommend Java Transaction Processing: Design and Implementation by Mark Little, Jon Maron, and Greg Pavlik (Prentice Hall PTR, 2004).
The following discussion will give you enough background to be able to understand Spring's transaction support.
For the discussion here, we will define a transaction as a unit of work that is made up of a set of operations, against one or more resources, that must be completed in its entirety. One example of a transaction is when you go to the bank and transfer money from your savings account to your checking account. First your savings account is debited and then your checking account is credited. If the credit to your checking account failed for some reason, you would want the debit to the savings account undone. If this did not happen, then you would have lost the transferred amount. The entire transfer should take place as a single unit of work, which we call a transaction.
For a Java application, transactions behave in a similar manner. All the individual steps of the transaction must complete; if one step fails, they must all fail or be undone. This is the requirement that a transaction be atomic.
Transactions must also leave any affected resources in a consistent state. A consistent state is defined by a set of rules for the resource you are interacting with, and those rules should also be enforced by the resource itself. One example: if you transfer funds between two accounts, the deposit to one account can't be more or less than the amount that you withdrew from the other account.
When you withdraw money from an account, it could be bad for the bank if your spouse were allowed to withdraw money at the same time, because that could leave the account with insufficient funds. Transactions should operate in isolation from one another, and in the required sequence. If your withdrawal happened first, then your spouse's withdrawal transaction should be required to recheck the balance of the account.
Finally, we want the transactions to be durable. If you deposit money into your account, you expect the balance of your account to be maintained until you perform another transaction against the same account. Even if your bank experiences a hardware failure, you would expect them to keep track of every single transaction.
In combination, these four central requirements for a transaction are referred to as the ACID (Atomic, Consistent, Isolated, Durable) properties.
What problems do transactions solve for the application developer? The two ACID properties that most commonly affect application development are atomicity and isolation. Consistency and durability relate more to the transactional resource itself: they are necessary for a complete transactional environment, but they don't really affect how we code our applications.
Let's look at what this means for us as Java developers.
We need to guarantee that all steps complete or none of them complete. This is done by first declaring the start of a transaction, and then making a decision whether to commit all changes or roll them back. The sequence in your code would look something like this:
// start the transaction
begin work
// do your work
read ...
update ...
read ...
update ...
...
// decide on the outcome
if (everythingIsOk())
    commit
else
    rollback
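The begin/commit/rollback flow above can be sketched in plain Java. The following in-memory `Ledger` class is illustrative only (the class and all of its names are ours, not from any library): updates are buffered as pending work and are either applied as a unit on commit or discarded on rollback, which is exactly the atomicity guarantee described above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative in-memory ledger: buffers updates until commit, discards them on rollback.
public class Ledger {
    private final Map<String, Integer> balances = new HashMap<>(); // committed state
    private final Map<String, Integer> pending  = new HashMap<>(); // uncommitted work

    public Ledger(Map<String, Integer> initial) { balances.putAll(initial); }

    // Reads see our own pending updates first, then committed state.
    public int read(String account) {
        return pending.getOrDefault(account, balances.getOrDefault(account, 0));
    }

    public void update(String account, int newBalance) {
        pending.put(account, newBalance); // buffered, not yet committed
    }

    public void commit()   { balances.putAll(pending); pending.clear(); }
    public void rollback() { pending.clear(); }

    public int committedBalance(String account) { return balances.getOrDefault(account, 0); }

    public static void main(String[] args) {
        Ledger ledger = new Ledger(Map.of("savings", 100, "checking", 50));
        // begin work: debit savings, credit checking
        ledger.update("savings", ledger.read("savings") - 30);
        ledger.update("checking", ledger.read("checking") + 30);
        // decide on the outcome
        boolean everythingIsOk = ledger.read("savings") >= 0;
        if (everythingIsOk) ledger.commit(); else ledger.rollback();
        System.out.println(ledger.committedBalance("savings"));  // 70
        System.out.println(ledger.committedBalance("checking")); // 80
    }
}
```

In a real application this bookkeeping is done by the database: with JDBC you would call `setAutoCommit(false)` on the connection and then `commit()` or `rollback()`.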
Here we included explicit transaction handling statements in our code. It is also possible to use declarative transaction management, where you declare which methods should be surrounded by transaction management, and how specific events would trigger a commit or rollback of the transaction. The J2EE platform provides container-managed transactions (CMT) for EJBs where a runtime or system exception would trigger a rollback and any other outcome would trigger a commit. Spring expands on this concept and provides its own version of declarative transactions for POJOs. It has the same basic features as the J2EE CMT, but it also adds more control for when the transactions should be rolled back. You can specify in detail which exceptions should trigger a rollback. We will see much more of this later in this chapter.
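As a sketch of what Spring's declarative rollback rules look like in XML configuration (the bean names and the `InsufficientFundsException` are our own, hypothetical examples; `TransactionProxyFactoryBean` and the minus-prefixed rollback-rule syntax are the actual Spring mechanism):

```xml
<bean id="accountService"
      class="org.springframework.transaction.interceptor.TransactionProxyFactoryBean">
  <property name="transactionManager" ref="transactionManager"/>
  <property name="target" ref="accountServiceTarget"/>
  <property name="transactionAttributes">
    <props>
      <!-- roll back on the (hypothetical) InsufficientFundsException
           in addition to runtime exceptions; commit otherwise -->
      <prop key="transfer*">PROPAGATION_REQUIRED,-InsufficientFundsException</prop>
      <prop key="get*">PROPAGATION_REQUIRED,readOnly</prop>
    </props>
  </property>
</bean>
```

Note the extra control compared to EJB CMT: a checked exception, which would normally trigger a commit, can be declared as a rollback trigger with the minus prefix.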
We need to make sure our changes are not affected by other concurrent changes. Transactions are one way of solving this. Optimistic concurrency control using a version number (defined in the next section) can help here, too: it effectively lets a unit of work span transactions while still avoiding lost updates. Isolation is one area where we usually don't achieve 100 percent success. To guarantee isolation, you have to restrict processing and not allow multiple processes to operate on the same data simultaneously. This is usually achieved by locking resources, which sooner or later will lead to some process being blocked for a period of time. This blocking reduces throughput and makes your application less scalable. One way to solve this problem is to relax the isolation requirement: maybe we can allow certain types of concurrent access to avoid excessive locking. It comes down to a tradeoff. Most database resources allow you to specify a number of different isolation levels:
SERIALIZABLE: This is the most restrictive level. Transactions should appear to run as if they were executed one by one after each other. This means that two transactions are not allowed to read or write a piece of data that the other one has changed or will change during the entire life span of the transactions. This is the isolation level that gives you the best protection against interference by concurrent changes. It is also the level that is the most expensive to maintain in terms of resource usage.
REPEATABLE READ: Now we have to guarantee that we will not see any updates made by other transactions to any data that we have accessed within a transaction. If we read the data again, it should always be unchanged. There could, however, be additional data that has been added by another transaction. This is called a phantom read.
READ COMMITTED: Here we relax the isolation a bit further. We will not see any changes made by other transactions while they are active. Once they finish and commit their work we will be able to see the changes. This means that we can't guarantee repeatable reads; instead we get unrepeatable reads — data we have already read can change while our transaction is running and a later read or update could operate on modified data.
READ UNCOMMITTED: All bets are off. There is practically no isolation. Any transaction can see changes made by other transactions even before they are committed. These types of reads of uncommitted data are called dirty reads. You are, however, not able to update data that has been modified by another transaction until the other transaction has completed.
NONE: This level indicates that there is no transaction support. This level is not provided by most databases.
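These five levels map directly onto constants defined in the standard `java.sql.Connection` interface, which is how you would select an isolation level programmatically with plain JDBC:

```java
import java.sql.Connection;

// The ANSI isolation levels correspond to constants on java.sql.Connection.
public class IsolationLevels {
    public static void main(String[] args) {
        System.out.println("NONE             = " + Connection.TRANSACTION_NONE);
        System.out.println("READ_UNCOMMITTED = " + Connection.TRANSACTION_READ_UNCOMMITTED);
        System.out.println("READ_COMMITTED   = " + Connection.TRANSACTION_READ_COMMITTED);
        System.out.println("REPEATABLE_READ  = " + Connection.TRANSACTION_REPEATABLE_READ);
        System.out.println("SERIALIZABLE     = " + Connection.TRANSACTION_SERIALIZABLE);
        // On a live connection you would select a level with:
        // conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
    }
}
```

Spring exposes the same choice declaratively, for example as an `ISOLATION_READ_COMMITTED` transaction attribute, so you rarely set it on the connection yourself.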
Most database systems are delivered with READ COMMITTED as the default isolation level. HSQLDB supports only READ UNCOMMITTED, so any use of HSQLDB is fine in terms of providing atomic changes, but for transaction isolation it is not the best choice. This is why we use MySQL/Oracle for all examples in this chapter.
When multiple processes are accessing the same transactional resource concurrently, we need a way to control the access to this resource. To provide the required isolation, we need to ensure that the same object is not updated by two processes running concurrently.
The most common way of doing this is by locking the data to prevent others from updating or accessing it for as long as the lock is held. The amount of time that we hold on to this lock will affect the performance of our applications because it limits how many processes can access the data concurrently. This strategy is referred to as pessimistic locking because we make the pessimistic assumption that another transaction will try to modify the resource.
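As an in-memory analogy of pessimistic locking (the class and all names are ours, purely for illustration): whoever holds the lock on the data may update it, and everyone else is shut out until the lock is released. In a relational database the same effect is typically achieved with a statement such as `SELECT ... FOR UPDATE`.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative pessimistic locking: the lock must be held to touch the balance.
public class PessimisticAccount {
    private final ReentrantLock lock = new ReentrantLock();
    private int balance;

    public PessimisticAccount(int balance) { this.balance = balance; }

    // Returns false if the withdrawal cannot proceed: either another
    // transaction holds the lock, or there are insufficient funds.
    public boolean tryWithdraw(int amount) {
        if (!lock.tryLock()) return false; // data is locked by someone else
        try {
            if (balance < amount) return false;
            balance -= amount;
            return true;
        } finally {
            lock.unlock(); // holding the lock longer would reduce concurrency
        }
    }

    public int balance() { return balance; }
}
```

The tradeoff described above is visible in the `finally` block: the shorter the lock is held, the more concurrent access the data allows.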
One way to avoid the need to lock resources is to check at update time whether the object that is being changed is in the same state as it was when we started our transaction. We are hoping that the object has not changed, so this is called optimistic locking. This means we need some way of detecting changes. The most common way this is done is via a timestamp or a version number. If none of these is available, then we would have to check the entire object against a copy made when the object was read the first time.
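The version-number scheme can be sketched as follows (again an illustrative in-memory class with names of our own choosing): each update must present the version it originally read, and a stale version means another transaction got there first, mirroring a SQL statement like `UPDATE account SET balance = ?, version = version + 1 WHERE id = ? AND version = ?`.

```java
// Illustrative optimistic locking with a version number.
public class VersionedAccount {
    private int balance;
    private int version;

    public VersionedAccount(int balance) { this.balance = balance; this.version = 0; }

    public int balance() { return balance; }
    public int version() { return version; }

    // Succeeds only if the caller read the current version; otherwise
    // the data changed underneath us and the update is rejected.
    public synchronized boolean update(int newBalance, int expectedVersion) {
        if (version != expectedVersion) return false; // lost the optimistic race
        balance = newBalance;
        version++;
        return true;
    }

    public static void main(String[] args) {
        VersionedAccount acct = new VersionedAccount(100);
        int v = acct.version();               // read: balance 100, version 0
        boolean first  = acct.update(70, v);  // our update wins
        boolean second = acct.update(40, v);  // stale version: rejected
        System.out.println(first + " " + second + " " + acct.balance()); // true false 70
    }
}
```

The rejected caller would then re-read the data and retry, rather than having been blocked by a lock the whole time.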