Using stored procedures is just an extension of what we have talked about so far in this chapter. One difference between a stored procedure and a cached SQL statement is that the stored procedure is created beforehand and that clients call it by name rather than sending the SQL text to the database. Another difference is that a stored procedure is a database object, whereas obviously SQL text sent from the client to the database is not. Because it's a database object, you can use it for security. You—or the database administrator—can give users permission to run a specific stored procedure, even if they're not allowed to do the job the procedure performs. For example, you can allow a user to call a procedure that fetches all the horses trained by a specified trainer, even though the user might not have SELECT privileges for the Horses table. This feature helps you guarantee that the user always accesses the database through well-tested stored procedures and never directly accesses database tables.
In many ways, this practice is similar to creating data access classes in the business tier of a Microsoft Windows DNA application, giving these classes permission to access database objects and never allowing users direct access to tables or any other database objects.
It's easy to create and call a parameterized stored procedure for a SELECT operation. Say, for example, that you want to create a stored procedure that fetches ID and name values for horses with names that correspond to a certain pattern. The command you use in SQL Server to create such a procedure is, not surprisingly, CREATE PROCEDURE. The following SQL command creates the GetHorseListForNamePattern stored procedure:
CREATE PROCEDURE GetHorseListForNamePattern @prmNamePattern varchar(20) AS SELECT HorseId, HorseName FROM Horses WHERE HorseName LIKE @prmNamePattern ORDER BY HorseName |
The most interesting parts of the preceding code appear in boldface type. Concentrating on that part of the code, you can see that this stored procedure takes one parameter, which is the name pattern you need to use for the query. You can also see that the @prmNamePattern variable is represented in the procedure's SQL text. Notice that in SQL Server the names of all parameters and arguments are prefixed by the @ sign.
How do you call such a procedure from ADO? As a matter of fact, you call such a procedure using ADO almost the same way you would if you were sending the corresponding parameterized SQL statement from the client to the database. The following code is an example. We show the code specific for calling a stored procedure in boldface type.
Public Function GetListForNamePattern _ (ByVal strNamePattern As String) As Recordset Dim cmd As Command, prm As Parameter, rs As Recordset, _ strSQL As String On Error GoTo GetListForNamePatternErr Set cmd = CreateObject("ADODB.Command") cmd.ActiveConnection = strConn cmd.CommandType = adCmdStoredProc cmd.CommandText = "sp_GetHorseListForNamePattern" Set prm = CreateObject("ADODB.Parameter") prm.Type = adVarChar prm.Size = 20 prm.Direction = adParamInput prm.Value = strNamePattern cmd.Parameters.Append prm Set rs = CreateObject("ADODB.Recordset") Set rs.Source = cmd rs.LockType = adLockReadOnly rs.CursorLocation = adUseClient rs.CursorType = adOpenStatic rs.Open Set rs.ActiveConnection = Nothing Set GetListForNamePattern = rs Set cmd.ActiveConnection = Nothing Set cmd = Nothing Exit Function GetListForNamePatternErr: Err.Raise Number:=Err.Number, Source:=Err.Source, _ Description:=Err.Description End Function |
As you can see, calling a parameterized stored procedure is not much different from sending a parameterized SQL statement through an ADO Command object. So why should you even bother to create, let alone call, such procedures? We'll give you just the following list of reasons:
Those are some of the reasons we offer, hoping to encourage you to increase your use of stored procedures. Let's see if we can find any reasons not to use them. (Such reasons do exist.)
Let's take a narrow table—one with only a few columns—such as the RaceTracks table. It's narrow enough, having only four columns. Here's the SQL script required to create that table:
CREATE TABLE RaceTracks ( TrackId char (3) NOT NULL , TrackName varchar (30) NOT NULL , City varchar (20) NOT NULL , Country char (3) NOT NULL ) |
A very simple stored procedure designed to update this table would look like the following example:
CREATE PROCEDURE UpdateRaceTracks @TrackId char(3), @TrackName varchar(30), @City varchar(20), @CountryCode char(3) AS UPDATE RaceTracks SET TrackName = @TrackName, City = @City, Country = @CountryCode WHERE TrackId = @TrackId |
There's at least one problem with this procedure: it doesn't have any concurrency control at all. Concurrency control protects users from having their modifications written over by other users. Say, for example, that user A and user B almost simultaneously select all the columns from the RaceTracks table. Because they select their data almost at the same time, they both get the same data back. User A then modifies the city of the track and calls the preceding procedure. Only seconds later, user B calls the same procedure after modifying the track's country. Unfortunately, because B doesn't know that the city value is now different from the value B received, B gives the old city value as the @City parameter, thus writing over and destroying the new city value user A entered.
In some cases, it might be perfectly all right for a modification to destroy other users' modifications, but in most cases it's not. Normally you should use some kind of concurrency control. If you use ADO recordsets when modifying data in SQL Server, concurrency control comes very easy. All you have to do is set the LockType property of the recordset to adLockOptimistic. With this LockType value, ADO makes sure that you don't overwrite other users' modifications when you update the database. As an experiment, let's see what kind of call to SQL Server such an ADO recordset will make when only the TrackName column is modified. The following call to sp_executesql is a typical example:
sp_executesql N'UPDATE MSPress_ADBRacing..RaceTracks SET TrackName = @P1 WHERE TrackId = @P2 AND TrackName = @P3', N'@P1 varchar(30), @P2 char(3), @P3 varchar(30)', 'Täbyholm Galopp', 'Täb', 'Täby Galopp' |
The preceding statement tries to update the name of the racetrack identified by the 'Täb' racetrack ID from 'Täby Galopp'—the old value—to 'Täbyholm Galopp'—the new value. Notice that ADO uses parameters and qualifies table names to make it as easy as possible for SQL Server to reuse any similar execution plan already established, executed, and cached. The performance of this call will be nearly as good as the performance of a stored procedure doing exactly the same work. There will be a very small difference in performance because it's easier for SQL Server to find an existing plan for a stored procedure by name than to search for a cached execution plan that matches a certain query.
Going back to the concurrency control problem, another user might have changed the track name of Täby Galopp in between—after we fetched it but before we tried to update it. If so, SQL Server won't find the row to update. The AND part of the WHERE clause, which specifies the old value, makes sure of that. This query doesn't just ask for a track with TrackId = "Täb"; it asks for a track with TrackId = "Täb" and TrackName = "Täby Galopp". So if another user really modified the name of this track, SQL Server can't find the row we want to update and our update won't be executed. You get this type of concurrency checking for free when you use ADO recordsets to do your updates for you.
Let's consider another example, updating the city as well as the name of the track. If American and Canadian hockey teams can move from one city to another, why shouldn't racetracks be able to do the same? The code ADO would generate for this kind of update is shown below.
sp_executesql N'UPDATE MSPress_ADBRacing..RaceTracks SET TrackName = @P1, City = @P2, WHERE TrackId = @P3 AND TrackName = @P4 AND City = @P5', N'@P1 varchar(30), @P2 varchar(20), @P3 char(3) @P4 varchar(30), @P5 varchar(20)', 'Täbyholm Galopp', 'Gothenburg', 'Täb', 'Täby Galopp', 'Stockholm' |
This statement, automatically generated by ADO for this update, is somewhat more complex than the first one. But just the same, it's a similar statement. It furnishes sp_executesql with as many arguments as it takes to protect users from having their modifications overwritten by other users. What's more, ADO does this without constraining other users more than necessary. If, for a specific track, user A modifies the country and user B modifies both the track name and the city, their modifications don't conflict. And both sets of changes will be made. But if two users modify the same column of the same track, doing it from the same starting point, SQL Server performs only one of the updates; thanks to the augmented WHERE clause, the other one will be denied.
The following table lists all possible column combinations in the racetracks table. We exclude the TrackId column because normally you should avoid modifying key columns.
TrackName TrackName City TrackName City Country TrackName Country City City Country Country |
Even for such a small table with only four columns (including the missing TrackId column), you still need seven different UPDATE statements to cover all possible combinations of column updates. This is true at least if you need concurrency control when the table is modified; if you don't, you might get away with only one UPDATE statement. For a table with 10 nonkey columns, you need many different UPDATE statements. Knowing this, it's obvious that you can't have one UPDATE statement for each update option, at least not if you must write one stored procedure for each update option. You need a simpler model. Let us give you two options:
We created the following simple stored procedure as an example:
CREATE PROCEDURE UpdHorse @SetClause varchar(500), @WhereClause varchar(500) AS DECLARE @Stmt varchar(500) SELECT @Stmt = 'UPDATE Horses ' + @SetClause + ' ' + @WhereClause EXECUTE (@Stmt) |
The preceding procedure takes a SET clause and a WHERE clause as arguments. Then it builds the complete UPDATE statement, furnishing the UPDATE part itself, and then it goes on to save the completed statement in the @Stmt variable. Finally it uses the EXECUTE keyword to execute the string. Without any problems at all, we used the following EXECUTE statement to call this procedure:
EXECUTE UpdHorse 'SET HorseName = "Vildkatten"', 'WHERE HorseId = 1610' |
It worked just fine, and the name Wildcat was changed to Vildkatten, which by the way is the Swedish word for wildcat.
While this call to the procedure worked, there was a snag. We made the update without controlling concurrency. Our update might conflict with somebody else's; somebody else's update might conflict with ours. So when we changed back to the real name of the horse, we enhanced the statement that calls the procedure, making the WHERE clause check for the old horse name as well as for the ID of the horse. We knew that this time the call wouldn't result in the database being updated because it shouldn't, but we made the call anyway just to prove our point. Here's the enhanced EXECUTE statement:
EXECUTE UpdHorse 'SET HorseName = "Wildcat"', 'WHERE HorseId = 1610 AND HorseName = "Wildcat"' |
Now we've copied ADO's way of updating under concurrency control. We used the wrong horse name in the AND part of the WHERE clause, but that was on purpose. The previous update had named the horse Vildkatten, and we asked for a record of a horse having 1610 as his horse ID and Wildcat as his name. Of course, such a record didn't exist anymore when we issued the statement, so SQL Server didn't find it.
To prove that the record would be updated if we did everything right, we modified the EXECUTE statement again to reflect the present state of things. This time it worked, and we've given back to the horse its correct name. Here's the statement that fixed it:
EXECUTE UpdHorse 'SET HorseName = "Wildcat"', 'WHERE HorseId = 1610 AND HorseName = "Vildkatten"' |
The method we've used so far to achieve concurrency control is called the optimistic way. It's said to be optimistic because we assume that nobody else will update our data as long as we hold it. Since we're optimistic, we don't bother to lock the record to prevent other users from even seeing it. Implicitly we say something like this: "Now I'm fetching this data from the database. I'm probably not going to change it, so I don't lock the record containing it. If I should happen to change it anyway, probably no other user will come up with the same idea, so there's no real need to lock the record." But even though we're optimistic, we don't just haphazardly change any data without checking for possible concurrency problems. We make sure that at least the fields we're changing still hold the same values they did when we fetched them. This is the reason this strategy is called not only optimistic but optimistic concurrency control. There's optimism involved, but also concurrency control.
You can use another strategy, less optimistic, to make sure when updating that no user unintentionally destroys any updates made by other users. As you would imagine, the other way is called the pessimistic way. Using it, you say something like this: "Now I'm fetching this data from the database. I'm probably going to change it, so I prevent other users from modifying it before me by locking it. When I lock the data I read, I'm guaranteed that nobody else gets access to this data while I have it." It's easy to see why this way of doing things is called the pessimistic way, or pessimistic concurrency control.
It might not be as easy to see how this process of locking most of the records users read reduces concurrency, but it does. Pessimistic locks tend to be long lasting. A pessimistic user gets the lock when retrieving data from the database. Then there's think time and local work time before the update is sent back to the database. During all this time, the affected data is still locked in the database with the express intention of keeping other users away. In an installation with many users and lots of updates, pessimistic concurrency control might become very painful. This is the reason for the popularity of optimistic schemes, one of which we have just described. We recommend that you use pessimistic concurrency control only when you're absolutely sure you need it; in all other situations, use the optimistic way!
We still owe you an explanation of how you can implement pessimistic concurrency control. Using it, you must stay connected to the database because you can't otherwise keep the lock. You must perform the update in the same transaction that acquired the lock. Therefore, you must issue a BEGIN TRANSACTION command before you acquire the lock. Within that transaction, you can acquire your lock on a specific table row by updating one of its values with the value it already has. In other words, you update the table row without changing it. The only effect of this action is to give you an exclusive lock on the row, thus preventing all other users from even seeing the row. This lock remains in effect just as long as your transaction is still alive. As a demonstration, we issued the following command batch of two commands—one to start a transaction, the other to update a row in the Horses table:
BEGIN TRANSACTION UPDATE Horses SET HorseName = HorseName WHERE HorseId = 365 |
Notice that we didn't change anything in the table. We just updated the name of the horse identified by the HorseId value of 365 with its own name. What we achieved was an exclusive lock on that table row. Because we started the batch with a BEGIN TRANSACTION command, SQL Server holds that lock for us until we resolve the transaction by issuing a COMMIT or ROLLBACK TRANSACTION command.3
Using SQL Server Enterprise Manager, we opened the Management folder for the server that we used. In that folder, we first opened the Current Activity folder and then the Locks/Object folder. Clicking on the database we were working in, we received the list of locks displayed in Figure 17-1.
Figure 17-1. We now hold one exclusive row level lock and two intent exclusive locks in our table.
In the Mode column of the table displayed in Figure 17-1, you can see that we now have two IX locks and one X lock. IX means intent exclusive and X means exclusive. Every lock shown in the figure is given to us. The meaning of each lock shown in the figure is as follows:
SQL Server holds the exclusive row lock just mentioned for the lifetime of the user's transaction. As long as the transaction lives, the lock stays. There's no way another user can modify, or even read,4 any data on this row as long as someone owns an exclusive lock on it. Keeping this lock, the owner holding the lock (but nobody else) can fetch the values of the row columns in order to modify them. Just as long as this user stays connected and avoids closing the transaction, the owner has an absolute guarantee that nobody else is allowed to make in-between changes to this record. It doesn't matter whether it takes seconds, minutes, hours, or even days for the user to come back from the races, say. The lock stays, and the data locked conceptually belongs to the owner of the lock.
This, of course, is also the downside of pessimistic concurrency control. Concurrency is reduced, sometimes severely reduced. Normally, if you use pessimistic concurrency control, you should make sure that it's virtually impossible for a user to acquire a lock on some data and then go take his or her coffee break.
Stored procedures are great, but they require more work than ordinary updatable ADO recordsets. It has been said that stored procedures double the scalability of SQL Server. This is because stored procedures reuse existing execution plans and therefore also use fewer server resources for a shorter time. It should also be said that it's not all that difficult for a developer to write and access stored procedures in such a way that the advantage of using them is entirely dissipated, perhaps even turning the use of stored procedures into a disadvantage when you look at the system as a whole.
If you decide to use stored procedures, we definitely won't argue with you; just make sure you enforce the concurrency control you need. Normally, the concurrency control you use should be optimistic; only if it's clearly inappropriate to use optimistic concurrency control should you go the pessimistic way.
Whatever you choose, SQL from the client or stored procedures, optimistic or pessimistic or no concurrency control, we wish you the best of luck!