Concurrency


If you have studied queuing theory, or if you have ever stood in line to buy your favorite flavor of ice-cream behind a hundred other patrons because a single cashier was working the only active register, you understand the basics of concurrency. Concurrency is all about multi-tasking, or being able to do more than one thing at a time.

The answer to the concurrency problem in the ice-cream store is easy – get more cashiers to operate the other registers. If there are 100 patrons and 10 lines rather than 1, things will speed up by a factor of 10. If all the users of a database never accessed the same resources, database system design would be as simple as this. However, in building database solutions, there is always more to it.

When users are working on different areas of the database, there is not a big problem (though there are issues involved in sharing the same hardware subsystems). However, this is not always the case, and we certainly do not want our users to make wrong decisions based on the data we show them. Sometimes, users will use our database for storing time-sensitive data. For example, if we are writing a shipping program, we certainly do not want to accidentally ship the same item to multiple customers because two users happened to use the same data at the same time. It is possible to write triggers that cover such situations, but we still have concurrency issues that allow data anomalies to slip through, caused by multiple users modifying our data at the same time. For example, what if a user checks that a situation is correct and then performs some other action, like checking a balance and then making a withdrawal? Another user might do the same thing at the same time, and both could withdraw the full amount from the same account, leaving the final balance below zero. Whether this is acceptable depends on the individual situation.

In the rest of this section, we will look at some of the different issues surrounding concurrency in SQL Server applications:

  • OS and Hardware Issues

    We will briefly discuss various issues that are out of the control of SQL code

  • SQL Server Concurrency Controls

    Here, we will explain locks and isolation levels

  • Balancing Safety and Concurrency

    Here, we will discuss ways of protecting our data from users simultaneously making changes and placing it into less-than-adequate situations, while also maximizing concurrency

OS and Hardware Issues

SQL Server is designed to run on a variety of hardware. The same code will run on a low-end PC and even on a clustered array of servers that rivals some supercomputers. There is even a version that runs on Pocket PC handhelds (though, in its current version, it does not support stored procedures). Every machine running a version of SQL Server, from Personal to Enterprise edition, can have a vastly different concurrency profile. In this section, we will briefly touch on some of the issues governing concurrency that our stored procedures do not need to concern themselves with.

Our stored procedure code executes in a single thread; in other words, we cannot spawn additional threads from our procedure code. Meanwhile, SQL Server and the OS are balancing all the requests from multiple users: servicing requests for memory (both RAM and disk), networking, and video, running all of the operating system functionality, including all the services loaded on our server, and executing our stored procedure code.

It is beyond the scope of this book to delve deeply into these details, but it is important to mention that concurrency is heavily tied to our hardware architecture. The number of CPUs, disk channels and disk drives, network cards, and the amount of RAM we have, will all affect concurrency. However, for the rest of this chapter, we will ignore these types of issues, and leave them to hardware-oriented books, such as Kalen Delaney's Inside SQL Server 2000, from Microsoft Press.

We will focus only on software-related issues pertaining to how SQL Server manages concurrency between SQL Server processes. We will also look at ways of maximizing concurrency in our SQL code.

Concurrency Controls

In this section, we will look at the different controls used internally by SQL Server, as well as those programmed by us in our stored procedure code, to manage issues of concurrency.

SQL Server

To get a reasonable understanding of how SQL Server manages concurrency, we need to cover:

  • Locks

    These are holds placed by SQL Server on objects to keep other processes from interfering with their use.

  • Isolation Levels

    These are settings used to control the length of time for which SQL Server holds on to the locks.

Locks

Locks are tokens laid down by SQL Server processes to "stake their claim" to the different resources available, so as to prevent another process from using them and causing inconsistencies. They are like the ‘Diver Down’ markers that deep-sea divers place on the surface of the water when working below it, to alert other divers, pleasure boaters, fishermen, and others that they are underwater. Every SQL Server process applies a lock to almost everything it does, to ensure that no other user can affect the operation that it is doing.

A lock has two parts – the type of lock and the mode of the lock. In the following tables, we have listed the different types and modes of locks, with a brief explanation of each.

Much is made of the fact that SQL Server uses row-based locks, but there are six different types of locks that SQL Server can use to lock varying portions of the database, with the row being the finest and a full database lock being the coarsest. They are used to protect our processes in various ways. The granularities of locks are:

Type of Lock – Granularity
Row – A single row in a table
Key or Key-Range – A single value or range of values
Page – An eight kilobyte index or data page
Extent – A group of eight 8K pages (64K); this is only used when allocating new space to the database
Table – An entire table, including all rows and indexes
Database – The entire database

At the point of request, SQL Server determines approximately how much of the database will be needed to satisfy the request. This is calculated on the basis of several factors (the details of which are undocumented), some of which include the cost of acquiring the lock, the amount of resources needed, and how long the locks will be held, which we will be discussing in the next section. It is also possible for the query processor to upgrade the lock if it is taking up unexpectedly large quantities of resources.

For example, if large parts of a table are locked with row locks, it might switch to a table lock to finish out the process. Or, if you are adding large numbers of rows into a clustered table in sequential order, it might use a page lock on the new pages that are being added.

The granularity of the lock defines only how much of the database we will lock. The second part of the lock is the mode of the lock, which refers to how strict the lock is when dealing with other locks. The available modes are:

Mode – Description
Shared – This is used generally when you want to look at the data. Multiple processes can have a shared lock on the same resource. However, it prevents other processes from modifying the locked data.
Exclusive – As the name implies, this gives exclusive access to a resource. Only one process may have an active exclusive lock on a resource.
Update – This is used to inform other processes that we are planning to modify the data, but are not quite ready to do so. While we are still preparing to do the modification, other connections may also issue shared locks, but not update or exclusive locks. Update locks are used to prevent deadlocks (we cover them later in this section) by marking rows that will possibly be updated by a statement, rather than upgrading from shared directly to an exclusive lock.
Intent – This communicates to other processes that we are planning to take one of the previously listed lock modes. You might see intent shared, intent exclusive, or shared with intent exclusive.
Schema – This is used to lock the structure of an object while it is in use, so that a table cannot be altered while a user is reading data from it.

Each of these modes, coupled with the granularity, describes a locking situation. For example, an exclusive table lock means that no other user can access any data in the table. An update row lock means that other users can look at the data in the row, but any statement that might modify that row will have to wait until this process has completed.

As far as concurrency is concerned, locks are not desirable. Whenever a resource is locked in a way that prevents another process from using it to complete its processing, concurrency is lowered, as that process must wait for the first to complete before it can continue. This is generally referred to as blocking. Of course, we must have locks, and we must have blocking, to maintain our data integrity. In the next section, we will discuss the isolation levels that determine how long locks are held. Executing the stored procedure sp_who2 will give you a list of all processes, with a column BlkBy that will tell you if any users are blocked, and which user is doing the blocking. Enterprise Manager also contains a node for seeing current activity, in the <serverName>\Management\Current Activity\Process Info node.
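For example, a quick way to look for blocking from Query Analyzer is sketched below; the sysprocesses columns shown are the ones documented for SQL Server 2000:

     EXEC sp_who2   --look at the BlkBy column for the blocking spid

     --or query the system process table directly for blocked processes
     SELECT spid, blocked, waittime, lastwaittype
     FROM   master..sysprocesses
     WHERE  blocked <> 0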

There is also a bulk update lock mode that is used to lock the table while inserting data in bulk into it, when the TABLOCK hint (covered in the next paragraph) is applied. For concurrency purposes, it is analogous to an exclusive table lock.

We can force SQL Server to use a different type of lock from what it might ordinarily choose by using locking hints on our queries. For individual tables in a FROM clause, we can set the type of lock to be used like this:

     FROM table1 WITH (<hintList>)
          JOIN table2 WITH (<hintList>)

Note that these hints will work on all query types. In the case of locking, we have five different hints we can use:

  • PAGLOCK

    Forces the optimizer to choose page locks for the given table

  • NOLOCK

    Leaves no locks, and honors no locks for the given table

  • ROWLOCK

    Forces row-level locks to be used for the table

  • TABLOCK

    Goes directly to table locks, rather than row locks or page locks, and can speed some operations, but seriously lowers concurrency

  • TABLOCKX

    Same as TABLOCK, but uses exclusive locks, rather than the more common shared locks, and forces single threaded use of the table, rather than NOLOCK access

Note that SQL Server can override our hints if necessary, for example in the case where we choose NOLOCK but then modify the table in the query. Exclusive locks will still be held on the rows that are modified, though not on rows that are simply read.
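For example, a reporting query might combine these hints as follows; this is only a sketch, borrowing the invoice and invoiceLineItem tables that are defined later in the chapter:

     SELECT invoice.number, invoiceLineItem.itemCount, invoiceLineItem.cost
     FROM   invoice WITH (NOLOCK)
            JOIN invoiceLineItem WITH (NOLOCK)
              ON invoice.invoiceId = invoiceLineItem.invoiceId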

Deadlocks

One term that is frequently bandied about is deadlocks. A deadlock is a circumstance where two processes are trying to use the same objects, but neither will ever be able to complete because it is blocked by the other connection. For example, consider that we have two processes (processes 1 and 2), and two resources (resources A and B). Then the following steps will lead to a deadlock:

  • Process 1 takes a lock on Resource A, and at the same time, Process 2 takes a lock on Resource B.

  • Then Process 1 tries to get access to Resource B. As it is locked by Process 2, Process 1 will go into a wait state.

  • Then Process 2 tries to get access to Resource A, but since it is locked by Process 1, Process 2 will go into a wait state.

At this point, there is no way to resolve this issue without ending one of the processes. Therefore, SQL Server will arbitrarily kill one of them (unless one of the processes has voluntarily raised the likelihood of being the killed process by using SET DEADLOCK_PRIORITY LOW, and the other has not). SQL Server raises the following error to the client to tell it that its process was stopped:

     Server: Msg 1205, Level 13, State 1, Line 4
     Transaction (Process ID 55) was deadlocked on lock resources with another
     process and has been chosen as the deadlock victim. Rerun the transaction.

At this point, we should be able to resubmit the very same request, assuming that we wrapped our multiple data modifications in a transaction and therefore know exactly what was rolled back.
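If one of the two processes is a better candidate to be sacrificed (a background batch process, for example), it can volunteer before starting its work. A minimal sketch:

     SET DEADLOCK_PRIORITY LOW
     BEGIN TRANSACTION
     --modifications that might deadlock with interactive users go here
     COMMIT TRANSACTION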

Isolation Levels

In the previous section on locks, we mentioned that every SQL Server process applies a lock to almost anything it does, to ensure that no other users can affect the operation that it is performing. Locks are placed to make sure that, while SQL Server is actually using the resource, the resource is protected. However, exactly how long SQL Server holds locks, or whether locks prevent other processes from looking at the data that we have locked, is controlled by what level of isolation we have set up in a given connection or process.

Depending on how safe we need our transactions to be, and how concurrent we need our processes to be, we can use several different models of isolation. To illustrate this point, consider that the safest method will be to put an exclusive lock on the entire database, do our operations, and release the lock. While this was common in early file-based systems, it is not really a viable idea when you need to support a large number of concurrent users, no matter how beefy your hardware platform may be.

From inside a transaction, locks can be held for a variable amount of time to protect the data we are working with. For example, consider the following hypothetical code snippet:

     BEGIN TRANSACTION
     SAVE TRANSACTION savePoint

     IF EXISTS ( SELECT * FROM tableA WHERE tableAId = 'value' )
     BEGIN
       UPDATE tableB
       SET status = 'UPDATED'
       WHERE tableAId = 'value'

       IF @@error <> 0
       BEGIN
         RAISERROR 50001 'Error updating tableB'
         ROLLBACK TRANSACTION savePoint
       END
     END
     COMMIT TRANSACTION

First, we check to see if a key-value exists in tableA, and if it does, we update a value in tableB. On first glance, this seems very secure – if a record exists when we check in tableA, it will exist when we update tableB. However, the security of this transaction is based solely on how long the locks are held on the SELECT from tableA. While the row may exist when we run the IF EXISTS block, what happens if an exclusive table lock exists on tableB when we execute the update on tableB, and we are blocked waiting for the lock to be cleared? During this period, when we are waiting for the table lock on tableB to be cleared, the row we have been told existed could be deleted from tableA, if we do not maintain our lock on the row until our transaction is complete.

What is interesting is the fact that under the default isolation level that SQL Server connections operate in, no lock will have been kept on tableA, leaving a hole in our data integrity.
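One way to close this hole is to raise the isolation level (covered next) before starting the transaction, so that the shared lock taken by the IF EXISTS check is held until the COMMIT. A sketch, reusing the hypothetical tableA and tableB from above:

     SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

     BEGIN TRANSACTION
     IF EXISTS ( SELECT * FROM tableA WHERE tableAId = 'value' )
     BEGIN
        --the row in tableA now cannot be deleted until we commit
        UPDATE tableB
        SET    status = 'UPDATED'
        WHERE  tableAId = 'value'
     END
     COMMIT TRANSACTION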

Before we define the isolation levels, there are two concepts that we need to mention briefly. These are repeatable reads and phantom rows. Suppose that we execute a statement such as:

     SELECT * FROM table 

and the rows returned are:

     ColumnName
     ----------
     row1
     row2

For this SELECT statement to be considered as a repeatable read, we must be able to execute the statement multiple times and get back at least the same results. However, we may get back an extra row, like:

     ColumnName
     ----------
     row1
     row2
     row3

This still passes the test for repeatable read, which can seem confusing since the results of the read were not really repeatable, but that is how it is defined. The value row3 is a phantom row.

Let's look at the four isolation levels to see what is allowed:

Isolation Level – Description
READUNCOMMITTED – Ignore all locks, and do not issue locks. We can see any data that has been saved to the table, regardless of whether or not it is part of a transaction that has not been committed (hence the name). However, it will leave exclusive locks if you do modify data, to keep other users from changing data that you have not committed. Primarily valuable for reporting, when you can accept the margin of error that this allows.
READCOMMITTED – The default isolation level; does not allow us to see uncommitted data. All shared and update locks are released as soon as we have finished using the resource. Exclusive locks are held until the end of the transaction. Here, we do not have protection for repeatable reads or phantoms.
REPEATABLEREAD – This includes protection from data being deleted from under our operation. Shared locks are now held during the entire transaction, to prevent other users from modifying the data we have touched.
SERIALIZABLE – This takes everything from repeatable read, and adds in phantom protection, by taking not only locks on existing data that we have read, but also key-range locks on any ranges of data that could match any SQL statement executed.

The syntax for setting the isolation level is:

     SET TRANSACTION ISOLATION LEVEL <level> 

Here, <level> can be any of the four settings in the table above; in this statement they are written with spaces, for example READ COMMITTED. The default isolation level is READ COMMITTED, and it is generally the best balance between concurrency and integrity.

Note that it is not always the proper setting. If we refer to our previous example code block, where we check if a value exists in one table, and then modify another, it is important to remember that the isolation level we use will depend on the types of tables that tableA and tableB represent. For example, if we are implementing a medical sales system that allows the user to ship a critical product to a customer, we will perform the following checks (in pseudocode):

     BEGIN TRANSACTION
     IF (product in inventory) > (amount requested)
     BEGIN
       Decrement inventory
       PRINT Bill Of Lading
     END
     COMMIT TRANSACTION

We will probably want to use the REPEATABLEREAD isolation level, since we do not want to accidentally print two bills of lading for the same product for two users, regardless of how small the probability is. In READ COMMITTED, multiple users may pass the inventory check and then queue up on the step that decrements inventory, because the shared locks taken by the check are released immediately.
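A sketch of what this might look like in T-SQL; the productInventory table, its columns, and the variables are hypothetical, and the UPDLOCK hint is added as one way to keep two connections from both passing the check with only shared locks and then deadlocking on the update:

     SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

     BEGIN TRANSACTION
     IF ( SELECT inventoryCount
          FROM   productInventory WITH (UPDLOCK)
          WHERE  productId = @productId ) >= @amountRequested
     BEGIN
        UPDATE productInventory
        SET    inventoryCount = inventoryCount - @amountRequested
        WHERE  productId = @productId

        --print the bill of lading here
     END
     COMMIT TRANSACTION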

Note

Note that locks are held not just for operations that you directly execute, but also for any constraints that fire to check existence in other tables, and any code executed in trigger code. The isolation level in effect controls the duration for which these locks are held.

It is extremely important that, while considering solutions, we keep in mind the locking and isolation levels. As we build more and more critical solutions with SQL Server, it is imperative that we make sure that we protect data to a level that is commensurate with the value of the data. If we are building procedures to support a system on a space shuttle, or a life support system, this becomes more important than in the case of a sales system, or a pediatrician's schedule. In the next section, we will look at coding schemes aimed at improving the concurrency of our stored procedures.

We can also apply an isolation level to a single table within a query, using a table hint, rather than setting it for the entire connection. These hints are READUNCOMMITTED, READCOMMITTED, REPEATABLEREAD, and SERIALIZABLE, and they behave as their corresponding isolation levels do. Note again that these hints may be overridden in cases where data is modified, especially in the case of READUNCOMMITTED.
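For example, reusing the hypothetical tables from earlier, we can read tableA with REPEATABLEREAD semantics while tableB stays at the connection's default level:

     SELECT tableB.status
     FROM   tableA WITH (REPEATABLEREAD)
            JOIN tableB
              ON tableB.tableAId = tableA.tableAId
     WHERE  tableA.tableAId = 'value'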

Coding for Integrity and Concurrency

As we build database systems, we must consider the fact that multiple users will attempt to modify our data at the same time. So far in this chapter, we have talked at length about the different mechanisms (transactions, varying isolation levels, and so on) for protecting our data. Now, we need to discuss the different mechanisms that keep our users from stepping on one another's toes.

The general progression of events in most applications is the same. First, we fetch some data for a user or a process to look at, operate on it, make changes to it, or make some decision based on it. After the users have performed their operations, they will either commit the changes to the database, or possibly save data to a different table based on their decision. While one user has the data cached on their client, what happens if a different user wants the data, or wants to make a change to the same data? To handle this, we use one of the following schemes while coding our database application:

  • Pessimistic

    We assume it is likely that users will try to modify the same data, so we make the access to tables and data single-threaded.

  • Optimistic

    Instead of blocking access to tables, we assume that it is unlikely that users will try to modify the exact same row at the same time, so we will only verify that the data is the same as the data when we fetched it.

Using one or both of these schemes, it is usually possible to protect data in our multi-user system to an acceptable level of integrity and concurrency.

Pessimistic Locking

A pessimistic locking scheme is very restrictive. Generally, the idea is straightforward – begin a transaction, most likely a serializable one, fetch the data, manipulate the data, modify the data, and finally commit the transaction. The goal is to serialize or single-thread all access to data we are interested in, to make sure that no other users can touch the data we are working on.
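A minimal sketch of this progression, using the person table that appears later in the chapter; the SELECT's locks are held until the COMMIT because of the SERIALIZABLE level:

     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

     BEGIN TRANSACTION
     SELECT firstName, lastName
     FROM   person
     WHERE  personId = 1

     --the client examines and edits the data here, while everyone else waits

     UPDATE person
     SET    lastName = 'Jones'
     WHERE  personId = 1
     COMMIT TRANSACTION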

The only real problem here is that we will block all access to the row that we are working with. This sounds good, but the main issue is that any query to the tables where the data was locked might have to wait for the user to complete the access. Even if the row will not actually be involved in the answer to a query, there is a possibility that the query will be held up. For example, if one user has a single row locked in a table, and the next user executes a different query that requires a table scan on the same table, even if the results of this query do not need the same row in use, they will be blocked, as they might need that row. Soon, a chain of users will be waiting on one particular user.

All this might even be reasonable, but what if the user decides to take a break? All other users will have to wait until this user finishes accessing the data. And if the user has modified one piece of data that fires complex triggers, and still has more to go, access to much of our data may be blocked simply because one user was forgetful and did not close the application.

We can relieve some of the long-term stress on our system by reducing the time we hold locks for, like releasing the lock when we are not actually modifying the data (refetch the data when the user makes a change to a field), and by setting time limits on how long a user can keep the data before rolling back the transaction. However, either way, we can block access to large quantities of data for a long period, considering that we will need to lock any domain tables that the user will rely on to choose values for the table, so no related data that we might use is changed. This is not optimum, so this type of locking is seldom, if ever, used.

Optimistic Locking

The opposite of pessimistic locking is optimistic locking. Here, we simply assume that the likelihood of users stepping on one another is very limited. Therefore, we choose to take locks only during actual data modification activities, since most of the time users just look around, and even if the data that we are looking at is slightly out of date, it will not really hurt anything. This is true of almost all applications, including banking ones. Consider the ATM machine, where you go to withdraw money. It will only allow you to withdraw as much money as it believes you have, but if your spouse has withdrawn money somewhere else simultaneously, the result is that more money is taken out than you have.

Optimistic locking schemes can be implemented in a few different ways. In all cases, we only lock the data at the point where the user modifies the data. We protect the data in the server using constraints, triggers, and so on. We will choose the best isolation level depending upon how important perfection is, since, as seen before in the Isolation Levels subsection, the default of READ COMMITTED is flawed because, for some milliseconds, it leaves open the possibility that one user can change data on which our transaction depends. For the most part, it is considered appropriate to use the default, as it greatly enhances concurrency, and the probability of someone modifying data during your transaction is comparable to the chances of being hit by lightning ten sunny days in a row. It could happen, but it is very unlikely.

If we think back to the progression of events – user fetches data, modifies data, and finally commits data to the database – there can be a long interval between data fetch and actually committing the changes to the database. In fact, it is also possible that other users could have also fetched and modified the data during that period.

Instead of locking the data by using physical SQL Server locks, we will employ one of the following schemes:

  • Unchecked

    This is the most commonly employed scheme. If two users modify the same row in the database, then the last user wins. It is not really the best idea, as the first user may have had something important to say, and this method rejects their changes. We will not cover this any further because it is straightforward.

  • Row-based scheme

    We protect our data at the row level, by checking to see if the row we are modifying is the same as in the physical table. If not, we will refresh the data from the table, showing the user what was changed.

  • Logical unit of work

    A logical unit of work is used to group a parent record with all of its children data to allow a single optimistic lock to cover multiple tables, for example, an invoice, and the line items for that invoice. We treat modifications to the line items in the same manner as a modification to the invoice for locking purposes.

An important point to note about optimistic locking is that it is not enforced by the database, and must be coded into every procedure that is written against the tables. If we have an invalid optimistic lock value, the tables themselves will not deny access to the data; it is up to the programmers to follow the rules that the database architect lays down.

Row-Based Scheme

In a row-based scheme, we check on a row-by-row basis whether or not the data that the user retrieved is still the same as when it was fetched. Therefore, the order of events is now: fetch data, modify data, check to see that the rows of data are still the same as they were, and then commit the changes.

There are three common methods to implement row-based optimistic locking:

  • Check all fields in the table

    If you cannot modify the table structure, which the next two methods require, you can check that all of the data you fetched is still the same, and then modify the data. This method is the most difficult, since any procedure you write must contain parameters for the previous values of the data, which is not a good idea if you can help it. It is useful when building data-grid type applications, where we execute direct updates to the table. Since this book deals only with stored procedures, we will not cover this method.

  • Add a date-time column to the table

    We set this value when the row is inserted, and again whenever it is updated. Every procedure we write for modifying or deleting data from the table will need a parameter for the previous value of this column, and every update to the table will be required to update the date-time column. Generally, it is best to use a trigger for keeping the date-time column up to date, and often we will include a column to record which user modified the data last. Later in this section, we will write a simple INSTEAD OF trigger to support this function.

  • Use a timestamp column

    In the previous method, we used a manually controlled value to manage the optimistic lock value. In this method, we will use the built-in control, using a column with a timestamp data type. The timestamp data type automatically gets a new value for every command used to modify a given row in a table.

As an example, let's look at a simple table, person, with the following structure:

     CREATE TABLE person
     (
          personId INT IDENTITY(1,1),
          firstName VARCHAR(60) NOT NULL,
          middleName VARCHAR(60) NOT NULL,
          lastName VARCHAR(60) NOT NULL,
          dateOfBirth DATETIME NOT NULL,
          rowLastModifyDate DATETIME NOT NULL DEFAULT getdate(),
          rowModifiedByUserIdentifier NVARCHAR(128) NOT NULL
             DEFAULT suser_name(),
          CONSTRAINT XPKperson PRIMARY KEY (personId)
     )

Note the two fields for our optimistic lock, named rowLastModifyDate and rowModifiedByUserIdentifier. We will use these to hold the date and time of the last modification, and SQL Server's user identifier for the modifying user. To maintain these values, we will use INSTEAD OF INSERT and UPDATE triggers; here is the INSTEAD OF UPDATE trigger:

     CREATE TRIGGER person$insteadOfUpdate ON person
     INSTEAD OF UPDATE
     AS
     DECLARE @numRows INTEGER
     SET @numRows = @@rowcount
     IF @numRows = 0 --no need to go into this trigger if no rows modified
        RETURN
     SET NOCOUNT ON --must come after the @@rowcount setting
     DECLARE @msg VARCHAR(8000) --holding for output message

     UPDATE person
     SET firstName = inserted.firstName,
         middleName = inserted.middleName,
         lastName = inserted.lastName,
         dateOfBirth = inserted.dateOfBirth,
         rowLastModifyDate = getdate(),
         rowModifiedByUserIdentifier = suser_name()
     FROM inserted
          JOIN person
               ON person.personId = inserted.personId

     IF @@error <> 0
     BEGIN
          SET @msg = 'There was a problem in the instead of trigger '
                   + 'for the update of person record(s).'
          RAISERROR 50000 @msg
          ROLLBACK TRANSACTION
          RETURN
     END

The INSTEAD OF INSERT trigger is included in the code download. Since we have this in an INSTEAD OF trigger, the user or programmer cannot overwrite the values, even if they include it in the field list of an INSERT.

Next, we will include the checking code in our stored procedure. The declaration for this procedure will include the key, all of the logically ‘modifiable’ fields, and the optimistic locking field:

     CREATE PROCEDURE person$upd
     (
         --primary key
         @r_personId INT,
         --updateable fields
         @firstName VARCHAR(60),
         @middleName VARCHAR(60),
         @lastName VARCHAR(60),
         @dateOfBirth DATETIME,
         --optimistic lock
         @rowModifiedByUserIdentifier NVARCHAR(128) = NULL
     )
     AS

Then, we set up some of our basic variables and savepoint names, and finally start a transaction and set our savepoint:

     -- Turns off the message returned at the end of each statement
     -- that states how many rows were affected
     SET NOCOUNT ON

     BEGIN
       DECLARE @rowcount INT,          --checks the rowcount returned
               @error INT,             --used to hold the error code after a call
               @msg VARCHAR(255),      --used to preformat error messages
               @retval INT,            --general purpose var for return values
               @savepoint VARCHAR(30)  --holds the transaction name

       SET @savepoint = CAST(object_name(@@procid) AS varchar(27))
                      + CAST(@@nestlevel AS varchar(3))

       BEGIN TRANSACTION
       SAVE TRANSACTION @savepoint

Then, we update the table, based on the key value passed in, and the optimistic locking value. This will not harm performance because it will do a primary key lookup:

     UPDATE person
     SET    firstName = @firstName,
            middleName = @middleName,
            lastName = @lastName,
            dateOfBirth = @dateOfBirth
     WHERE  personId = @r_personId
       AND  rowModifiedByUserIdentifier = @rowModifiedByUserIdentifier

Next, we have our error-handling block. We check to see if an error has occurred:

     --get the rowcount and error level for the error handling code
     SELECT @rowcount = @@rowcount, @error = @@error

     IF @error != 0 --an error occurred outside of this procedure
     BEGIN
        SELECT @msg = 'Problem occurred modifying the person record.'
        RAISERROR 50001 @msg
        ROLLBACK TRANSACTION @savepoint
        COMMIT TRANSACTION
        RETURN -100
     END

Then, check to see if the rowcount is 0, since it will mean that either the key value does not exist, or the row has been modified.

     ELSE IF (@rowcount = 0)
     BEGIN
        --check existence
        IF EXISTS ( SELECT *
                    FROM   person
                    WHERE  personId = @r_personId )
          BEGIN
             SELECT @msg = 'The person record you tried to '
                         + 'modify has been modified by another user.'
          END
        ELSE
          BEGIN
             SELECT @msg = 'The person record you tried to '
                         + 'modify does not exist.'
          END

        RAISERROR 50001 @msg
        ROLLBACK TRANSACTION @savepoint
        COMMIT TRANSACTION
        RETURN -100
     END

Finally, we close the transaction and return a 0.

         COMMIT TRANSACTION
         RETURN 0
     END
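A hypothetical call to the finished procedure might look like the following; the values are made up, and the last parameter is the rowModifiedByUserIdentifier value that was read when the row was originally fetched:

     EXEC person$upd @r_personId = 1,
                     @firstName = 'Fred',
                     @middleName = 'J',
                     @lastName = 'Smith',
                     @dateOfBirth = '19600101',
                     @rowModifiedByUserIdentifier = N'DOMAIN\fred'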

If however, we were to define our table as:

     CREATE TABLE person
     (
          personId INT IDENTITY(1,1),
          firstName VARCHAR(60) NOT NULL,
          middleName VARCHAR(60) NOT NULL,
          lastName VARCHAR(60) NOT NULL,
          dateOfBirth DATETIME NOT NULL,
          autoTimestamp TIMESTAMP NOT NULL,
          CONSTRAINT XPKperson PRIMARY KEY (personId)
     )

With a timestamp datatype column, very little changes in our procedures, and we no longer need the INSTEAD OF triggers. The following blocks of code show everything that changes. In the parameters:

     CREATE PROCEDURE person$upd
     (
         @r_personId INT,
         @firstName VARCHAR(60),
         @middleName VARCHAR(60),
         @lastName VARCHAR(60),
         @dateOfBirth DATETIME,
         @autoTimestamp TIMESTAMP
     )
     AS

and in the UPDATE statement:

     UPDATE person
     SET    firstName = @firstName,
            middleName = @middleName,
            lastName = @lastName,
            dateOfBirth = @dateOfBirth
     WHERE  personId = @r_personId
       AND  @autoTimestamp = autoTimestamp

It is much easier to implement, but lacks the documentation of who made the last change to the row, and when. The timestamp column is an eight-byte binary value that is guaranteed to be unique within a database.
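A quick way to see this behavior, as a sketch against the person table just defined (the personId value is made up):

     SELECT personId, autoTimestamp
     FROM   person
     WHERE  personId = 1

     UPDATE person
     SET    firstName = firstName    --no real change is needed
     WHERE  personId = 1

     SELECT personId, autoTimestamp  --autoTimestamp now has a new value
     FROM   person
     WHERE  personId = 1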

Logical Unit of Work

While row-based optimistic locks are very helpful, they do have a drawback. In many cases, several tables together actually make up one ‘object’. A good example is an invoice and its line items. The idea behind a logical unit of work is that, instead of having a row-based lock on the invoice and on each of the line items, we might have one only on the invoice, and use the same value for the line items. Therefore, we expect that the user will always fetch not only the invoice, but also its line items, into their cache when dealing with the invoice. Assuming we are using a timestamp column, we will use the same kind of logic as we have used previously on the invoice table. However, when the user wants to insert, update, or delete line items for the invoice, the procedure will require the @autoTimestamp parameter and will check the value against the invoice before making the modification. Consider that we have two tables, minimally defined as:

     --leaving off who the invoice is for
     CREATE TABLE invoice
     (
          invoiceId INT IDENTITY(1,1),
          number VARCHAR(20) NOT NULL,
          autoTimestamp TIMESTAMP NOT NULL,
          CONSTRAINT XPKinvoice PRIMARY KEY (invoiceId)
     )

     --also leaving off what product the line item is for
     CREATE TABLE invoiceLineItem
     (
          invoiceLineItemId INT NOT NULL,
          invoiceId INT NULL,
          itemCount INT NOT NULL,
          cost INT NOT NULL,
          CONSTRAINT XPKinvoiceLineItem PRIMARY KEY (invoiceLineItemId)
     )

For our delete procedure for invoiceLineItem, for the parameters we will have:

     CREATE PROCEDURE invoiceLineItem$del
     (
         @r_invoiceLineItemId INT, --just need the primary key
         --this will be the timestamp of the invoice table
         @autoTimestamp TIMESTAMP
     )

Next, we use the following logic in the DELETE statement:

     DELETE invoiceLineItem
     FROM   invoiceLineItem
            JOIN invoice
              ON invoice.invoiceId = invoiceLineItem.invoiceId
     WHERE  invoiceLineItem.invoiceLineItemId = @r_invoiceLineItemId
       AND  @autoTimestamp = invoice.autoTimestamp

Instead of checking the timestamp on an invoiceLineItem row, we are now checking the timestamp on the invoice table. Additionally, whenever we modify a line item, we will need to touch the invoice row so that its timestamp value changes; a timestamp column cannot be set directly, but updating any other column in the row causes it to advance.
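A sketch of what that ‘touch’ might look like, assuming we have captured the line item's invoiceId in a hypothetical @invoiceId variable before making the change:

     UPDATE invoice
     SET    number = number     --no real change, but the timestamp still advances
     WHERE  invoiceId = @invoiceId
       AND  @autoTimestamp = autoTimestamp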

The rest of the code will be similar to the code we used in the update procedure. Again, all code for the examples will be included in the download for the book.

Best Practices

The most important goal of any SQL Server application is to maintain the integrity of the data that is stored. Whenever we work to increase the number of concurrent users that our system can handle, we must make sure that the integrity of our data is still maintained.

The following is a brief list of suggestions to help maximize concurrency:

  • Recognize the difference between hardware limitations and SQL Server concurrency issues

    If the hardware usage is at its maximum (with excessive disk queuing, 90% CPU utilization, and so on), consider adding more hardware. However, if calls are being single-threaded through your database due to locking issues, you could add twenty processors and a terabyte of RAM and still see little improvement.

  • Keep transactions as short as possible

    The smaller the transaction, the less chance there is of it holding locks. Try not to declare variables, create temporary tables, etc. inside a transaction, unless it is necessary. Make sure that all table access within transactions is executed as atomic operations.

  • Fetch all rows from a query as fast as possible

    Depending on the isolation level and editability of the rows being returned, there can be locks held that would interfere with the other users' ability to modify or even read rows.

  • Minimize use of cursors

    Cursors require more resources (CPU if nothing else) than typical SQL statements. Use them when and if they are needed, and no more.

  • Make sure that all queries use good plans

    The better queries are, the faster they execute, and it follows that more code can be executed within the same resource framework.

  • Use some form of optimistic locking mechanism

    Preferably use a timestamp column, as it requires the smallest amount of coding, and is managed entirely by SQL Server. The only code that will be required when programming will be to validate the value in the timestamp column.



