SQL Server Transaction Architecture


Every DBMS needs to be a good OLTP system. This means that the integrity of the data must be ensured by the transactions that take place in the database whenever data is read, inserted, deleted, or updated.

A transaction is a complete unit of work, from the start of a query to its end. A transaction cannot be broken, suspended, or resumed at some later date without risking the integrity of the data, because it cannot lock up or command the exclusive service of the entire DBMS. Other transactions also need to do work in the DBMS (concurrency); that’s the purpose of a DBMS or an OLTP system. A transaction cannot own the DBMS, exclusively holding or locking it until its work completes at some unknown point in the future.

Integrity is ensured by an all or nothing philosophy (atomicity). In other words, a transaction must complete without errors or not at all. A typical transaction is represented by the following code:

 BEGIN TRANSACTION
     INSERT INTO CustDetails (CustID) VALUES ('15')
 COMMIT TRANSACTION

A database table is assumed to be in a consistent or plausible state at the beginning of a transaction. The client signals the start of a transaction; this can be done explicitly by enclosing the transaction steps between, typically, the BEGIN TRANSACTION and COMMIT TRANSACTION statements. Transactions can also be started implicitly in SQL Server, without BEGIN and COMMIT, by merely sending a T-SQL query to the server, which runs in autocommit mode by default.
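The difference between the modes can be sketched as follows; the table and column names are carried over from the earlier example and are purely illustrative:

 -- Autocommit (the default): each statement is its own transaction
 INSERT INTO CustDetails (CustID) VALUES ('16')

 -- Implicit transactions: the first statement opens a transaction
 -- that stays open until an explicit COMMIT or ROLLBACK
 SET IMPLICIT_TRANSACTIONS ON
 INSERT INTO CustDetails (CustID) VALUES ('17')
 COMMIT TRANSACTION
 SET IMPLICIT_TRANSACTIONS OFF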

During the transaction, the DBMS takes whatever steps it needs to take in the execution of a query to maintain the integrity of the data under control of the transaction. At this point, the transaction “owns” the data it is currently working with, but at any point in the transaction the data may be in an inconsistent state. If multiple tables are being referenced or updated in the transaction, they naturally cannot all be updated at exactly the same time. During the course of the transaction, some rows may be updated and others not.

If an error occurs during the transaction, a ROLLBACK TRANSACTION is required to restore all data affected by the transaction to its previous state; tables the transaction has not yet touched are simply left unchanged. If no error occurs and the transaction completes successfully, the COMMIT TRANSACTION can be issued. After the issuance of a commit, all modifications become a permanent part of the database.
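The commit-or-roll-back pattern can be sketched with the TRY/CATCH construct introduced in SQL Server 2005 (the second table here, CustOrders, is hypothetical):

 BEGIN TRY
     BEGIN TRANSACTION
         INSERT INTO CustDetails (CustID) VALUES ('15')
         INSERT INTO CustOrders (CustID, OrderTotal) VALUES ('15', 99.00)
     COMMIT TRANSACTION
 END TRY
 BEGIN CATCH
     -- Any error undoes both inserts, leaving the tables consistent
     IF @@TRANCOUNT > 0
         ROLLBACK TRANSACTION
 END CATCH

Either both rows are committed or neither is, which is the all-or-nothing (atomicity) guarantee described above.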

As discussed in Chapter 2, the transaction log provides us with the ability to roll back the database to a former state, even if the transactions were successful. The transaction logs record the sequence of events in a transaction from start to finish. The log contains sufficient information to either redo, or roll forward, a transaction or undo, or roll back, a transaction. The subject of recovery and the work performed by transaction logs is fully covered in Chapters 7 through 9.

Isolation, Concurrency, and Locking Architecture

Isolation and concurrency management are part and parcel of transaction management. All DBMSs implement a system of isolation and concurrency mechanisms to ensure that multiple transactions, representing at least two and perhaps many thousands of users all working on the same database at the same time, are able to access the same data at (almost) the same time without crashing into each other. The isolation algorithms implemented by a DBMS, SQL Server included, also ensure that the transactions being instantiated by concurrent sessions in the DBMS do not interfere with one another.

Two concurrency specifications underpin the methods for ensuring concurrency: pessimistic concurrency control and optimistic concurrency control. They are defined as follows:

  • Pessimistic concurrency control consists of a system of locking mechanisms that prevents more than one application from accessing the same data at the same time. The reason it is called pessimistic concurrency control is that it is deployed, usually, in high-transaction situations in which it is highly likely that many connections are going to contend for the same data in an uncontrolled state. Most transaction applications fall into this category. Shopping carts on the Web are a good example of a high-transaction environment in which the middle tier is required to open a huge number of connections to the same database and access the same rows of data for all the connections.

  • Optimistic concurrency control involves situations in which connections or client applications do not lock the data they are accessing. The system simply checks the data, and if a contention arises, someone has to stand down from his or her transaction and start over. Optimistic control is possible in low-traffic or highly controlled environments.
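A common optimistic pattern in SQL Server is a rowversion (timestamp) column: the client remembers the row’s version when it reads, and the update succeeds only if no one changed the row in the meantime. In this sketch, the RowVer column on CustDetails is a hypothetical rowversion column:

 -- Read the row and remember its version
 DECLARE @ver binary(8)
 SELECT @ver = RowVer FROM CustDetails WHERE CustID = '15'

 -- Later, update only if the row is unchanged
 UPDATE CustDetails
 SET    CreditLimit = 5000
 WHERE  CustID = '15' AND RowVer = @ver

 IF @@ROWCOUNT = 0
     PRINT 'Conflict: row was changed by another session; retry'

No lock is held between the read and the update; the cost of a conflict is simply that one party must start over, as described above.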

Your applications or solutions will have to take into account the cost of locking database objects during transactions. Many high-stakes or mission-critical applications, such as a PBX system, are controlled by a single scheduler that is fully aware of all the transaction threads it establishes to a database server. Locking and unlocking are resource intensive, and the milliseconds add up and might cause serious problems for the application. A single scheduler needs to be aware of how and when contentions arise and deal with them using optimistic concurrency control algorithms.

On the other hand, most transaction processing DBMS environments will use locking mechanisms, especially when all sessions or connections are established by independent applications on the client, most likely initiated by human users who have no idea who else is in the databases they are working in.

Locks and Latches

SQL Server 2005 makes use of a sophisticated locking architecture to implement pessimistic concurrency control. Locking is essential in a system that can potentially have tens of thousands of users all working with the same pool of data. Locking is a means of enforcing or ensuring data integrity. It is a system of assurances that allow a process to safely work with data and to be assured that data is not going to be changed “behind its back,” so to speak.

Locks are managed on a connection basis. In other words, a connection AA that requires a lock cannot use the lock created by connection AB, even if both connections were initiated by the same client application. Each connection must establish a lock created exclusively within the context of its connection. An exception can be found in the concept of bound connections, but I leave that subject for discussion in Chapter 16.

SQL Server also supports several types of locking modes. The modes include shared, update, exclusive, intent, and schema. Lock modes indicate the level of dependency the connection obtains on the locked object. SQL Server also controls how the lock modes relate to one another. You cannot, for example, obtain an exclusive lock on an object if shared locks are already acquired on the same object.

SQL Server threads or fibers place locks on databases, tables, rows, indexes, keys, key ranges, and pages. This range of locking levels is known as locking granularity. The locking granularity is dynamically determined by SQL Server (during formulation of the query plan discussed later) and needs no overt action on the part of the application or you, although this does not preclude the application from requesting a specific lock and lock mode. SQL Server determines exactly what locking level is needed for each T-SQL query it receives. One query may generate a row-level lock, while another may generate a lock that takes down the entire database. The connections also respect the locks, no matter what the levels. In other words, if connection AA has a table lock on data, connection AB is prevented from establishing a row lock on the same table.
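When an application does need to override the engine’s choice, it can request a specific lock with table hints, as in this sketch (table and column names are the hypothetical ones used earlier):

 -- Ask for row-level locks only
 SELECT * FROM CustDetails WITH (ROWLOCK) WHERE CustID = '15'

 -- Ask for a single table-level lock instead of many row locks
 UPDATE CustDetails WITH (TABLOCK)
 SET    Region = 'West'

Hints should be a last resort; in most cases the granularity chosen by the query processor is the right one.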

From time to time, SQL Server may decide to escalate a lock. For example, if row-level locks come to cover most of a table, the relational engine may escalate them to a single lock on the whole table. The query processor usually determines the correct lock required by the query.

Locks are maintained only for the length of time needed to protect data at the level requested by the client. For example, shared locks in SQL Server are held for a duration that depends on the transaction isolation level. The default transaction isolation level is READ COMMITTED, under which a shared lock persists only as long as it takes to read the data. Scans also result in locks, but the duration of the lock in a scan is much shorter: essentially as long as it takes to scan the page and place a lock on the next page. Other locks are held for the duration of transactions, such as when the transaction isolation level is set to REPEATABLE READ or SERIALIZABLE (which are “clean” reads requiring noninterference by any other process).
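The isolation level, and thus the shared-lock duration, is set per connection. A sketch, again using the hypothetical CustDetails table:

 -- Hold shared locks until the end of the transaction
 SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

 BEGIN TRANSACTION
     SELECT CreditLimit FROM CustDetails WHERE CustID = '15'
     -- No other connection can change this row until we commit
 COMMIT TRANSACTION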

Cursors are also protected from concurrency operations. A cursor may acquire a share-mode scroll lock to protect a fetch. Scroll locks are typically held until the end of a transaction. And updates also require locks on data, for the duration of the update transaction.

Like all modern DBMS systems, SQL Server includes mechanisms to prevent lock conflicts. If a connection attempts to acquire a lock on data that is already locked, the late connection is blocked from obtaining the lock and essentially waits until the first connection has completed the transaction. You will have to write your own “lock wait” handler in the client or stored procedure to decide how long your application’s thread will wait for the opportunity to lock data. Applications obtain their locks on a first-come, first-served basis.
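Rather than writing a full “lock wait” handler, a connection can cap how long it is willing to wait with SET LOCK_TIMEOUT; a statement blocked past the limit fails with error 1222, which the client can then handle:

 -- Wait at most 5 seconds (5,000 ms) for any lock,
 -- then raise error 1222 to the caller
 SET LOCK_TIMEOUT 5000

 -- -1 restores the default behavior: wait indefinitely
 SET LOCK_TIMEOUT -1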

SQL Server also supports deadlock detection. The detection is proactive. In other words, an algorithm is employed to ensure that threads are not frozen in a deadlock, a condition in a DBMS in which two or more connections (usually two) have blocked each other in a deadly embrace. If SQL Server detects such a condition, it terminates one of the transaction threads, which allows the remaining thread to continue (see “Deadlocking” in Chapters 16 and 17).
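The terminated connection, the deadlock victim, receives error 1205, which a client or stored procedure can trap and retry. A sketch of that retry loop, using the hypothetical CustDetails table and TRY/CATCH:

 DECLARE @retries int
 SET @retries = 3
 WHILE @retries > 0
 BEGIN
     BEGIN TRY
         BEGIN TRANSACTION
             UPDATE CustDetails SET Region = 'West' WHERE CustID = '15'
         COMMIT TRANSACTION
         BREAK                          -- success; leave the loop
     END TRY
     BEGIN CATCH
         IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
         IF ERROR_NUMBER() = 1205       -- chosen as the deadlock victim
             SET @retries = @retries - 1 -- try the transaction again
         ELSE
             BREAK                      -- some other error; give up
     END CATCH
 END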

Latching and the Storage Engine

The storage engine manages another concurrency mechanism known as a latch. Latches occur when the relational engine asks the storage engine to return a row during the processing of a query. The storage engine latches the data page to ensure that no other process modifies the page during a transaction. For example, a latch will ensure that the page offset table entry pointing to the row does not get changed until the transaction has completed.




Microsoft SQL Server 2005: The Complete Reference
ISBN: 0072261528
Year: 2006
Pages: 239
