One of the powerful features of SQL Server is the capability to write scripts and batches to manipulate data. T-SQL includes features such as conditional execution constructs (IF...ELSE), looping constructs (WHILE), and cursor functionality. These features combine to make T-SQL a limited yet fairly powerful tool. T-SQL programs are technically called scripts. A script is usually contained within a text file on disk, which is then loaded into some tool (the Query Analyzer, or the command-line equivalent called OSQL) and can be executed. A script is made of one or more batches. Each batch is made up of zero, one, or more transactions. To separate one batch from another, put the word GO on a line by itself between the batches, like this: SELECT * FROM sysobjects WHERE type = 'u' go SELECT COUNT(*) FROM sysobjects This script contains two batches, one from the beginning of the file to the word GO, and another from the word GO to the end of the file. Knowing how batches work is important for several reasons. Batches determine variable scope. This subject is covered again later, but you should always remember that a variable can be used only within the batch where it is declared. SQL Server compiles and runs scripts batch by batch. If you have a script with several batches in it, and one of the batches contains a syntax error, the rest of the batches do execute; but the statement in the batch that had an error does not execute. If one of the statements would cause a constraint violation, that statement doesn't execute, but all the other statements in the batch do execute. Other runtime errors, such as arithmetic overflow errors, cause the batch to stop executing at that point, with all the preceding commands executed and none of the following commands executed. When you use tools such as Query Analyzer or the command-line equivalent, OSQL, the tools themselves send the statements to SQL Server in batches, one batch at a time. SQL Server then compiles the single batch, processes it, and returns for the next batch as necessary. The keyword GO, then, isn't used by SQL Server; it is actually used by the various tools to determine when batches start and stop. The following illustrates a few rules you should know about batches:
Variables can cause other issues within scripts. Local variables declared within the script fall out of scope after a batch is ended with GO. You can use global variables, but then you open up issues around consuming resources and visibility to outsiders. Coding with VariablesA variable is a temporary place to put pieces of data that you're working with. This can include dates, values, stringsessentially, anything you need to store during the execution of the script. A variable is very similar to a column. Differentiating a variable from a column is accommodated through the use of a @ or @@ prefix on the name. All variables must be declared before they can be utilized: DECLARE @id int, @total int, @fee varchar(30) You can create variables of most SQL data types, the exceptions to this being TEXT, NTEXT, and IMAGE. Instead of the large data types, you must accept the limitations of a standard string variable that holds up to 8,000 bytes, so you can build a varchar(8000) or an nvarchar(4000) in a variable. (Remember that the nchar and nvarchar types are double-wide characters, so they take up twice as much room as a varchar.) There is also a special variable type called TABLE that can be used to store a recordset. For example, you can declare a table variable and then load it with data something like this: DECLARE @tmp TABLE (Id int, TableName varchar(50)) INSERT INTO @tmp SELECT Id, Name FROM sysobjects WHERE Type = 'u' That creates a table similar to a temporary table that is available only within the current batch. This is faster and requires fewer resources than a temp table, but with a more limited scope. Be aware, however, that this consumes SQL Server memory, so don't put exceptionally large tables into these structures. You cannot use a variable of type table as the target of a SELECT..INTO statement, either. You can populate the table using only INSERT..SELECT, INSERT..VALUES, and UPDATE. All variables cease to exist at the end of their scope. To keep things simple, SQL Server has only one scope for a variable, which is the local scope. Yes, there are "global variables" but they are reserved as functions for the server itself. The developer can not create global variables. That means that when your script has a GO in it, all variables are deallocated. If you intend to use the values contained within them again later in the script you will need to stow them away into a table. Variables need to be redeclared and reinitialized after the GO. Setting and Using VariablesThere are four ways to put a value into a variable. If you need to put data into a variable that comes out of a SELECT statement, you can do something like this: SELECT @id = id FROM sysobjects WHERE name = 'syscolumns' After executing this, and assuming that the variables are all declared properly, you'll end up with some number in the @id variable. That's one way to put a value into a variable. You can also set a variable using SET: SET @id = 42 or SET @today=getdate() Another way to put a value into a variable is with the EXEC statement, utilizing the return value from a stored procedure execution: DECLARE @ReturnCode int EXEC @ReturnCode = sp_who Return codes used in this manner do not return data. Return codes provide status information such as success or failure of the procedure execution. Return codes are limited to data type int, so you can also use return codes to return integer data. The final way to set a variable is also to use an EXEC, but with an output parameter: EXEC @ReturnCode = SampleProcedure @ID = 9, @Qty = @QtyVar OUTPUT Output parameters (and input parameters, for that matter) are used in many circumstances when you must pass information from one procedure to another. This topic is expanded on later in the chapter, in the section "Results from Procedure Execution." Local variables are equally useful within a procedure. Global variables are set up by the server and, postSQL Server Version 7, are more appropriately called functions. Server Provided Global ValuesGlobal variables are not variables. They are actually system functions that return various pieces of information about the current user environment for SQL Server. A global variable looks like a variable with a double @@ prefix. You cannot declare global variables. You can put as many @ signs as you want in front of a variable declaration and you will still have a local variable. Each of the following in essence declares the same type of local variable: DECLARE @id int DECLARE @@id int DECLARE @@@id int Global variables are functions; however, you cannot directly change them with a SET or SELECT statement. So if you use one of these special functions somewhere, realize that although it looks like a variable and can act like one, you can't assign any data to it; you can only read data from it. Table 6.5 lists some of the most common global variables in SQL Server 2000. There are many other functions available, but because they are less frequently used and you are unlikely to see them on the exam, they have been omitted from the list. To see the complete list, refer to Global Variables in SQL Server Books Online.
To access the content of global variables, you can use a SELECT command or you can assign the values through use of a SET. You can also display the value using a PRINT statement. As we are now well into T-SQL coding, before we get much further, you should be documenting your work. The use of comments is essential in producing readable code that is usable by members of a development team. Document As You GoComments have two entirely different purposes within a T-SQL batch or stored procedure. First, they can be used to document code, to make it easier for folks who have to maintain software in the future. Second, they can be used to temporarily disable lines of code within your batch when you're trying to get it working. Using comments is the most reliable way of ensuring that you or anyone else can figure out what your code does. SQL Server has two methods for putting comments in your code. The first commenting method is to start the comment with a double dash (--). The double dash can appear anywhere on the line, and anything between the double dash and the end of the line is a comment and is not executed. For example: --this is a comment on the whole line SET @i = 42 --this is a comment, but the preceding code will execute --Nothing on this line executes SET @i = 21 The other style of comment, which is not seen as often anymore, is the slash-star comment: /* ************** Note that this is a multiple-line comment. This type of comment can begin and end anywhere. It can start or end in the middle of a line, though for readability the markers on either end are usually segregated from the comments. **************** */ One thing to watch for is that the string GO within a comment on a line by itself causes an error. The more common convention by far is to use the double-dash style of comment. The new Query Analyzer for SQL Server 2000 provides you with a tool to create multiline comments quickly and easily. Just highlight the lines you want to comment and press Ctrl+Shift+C. This adds a double dash to the beginning of each highlighted line. To uncomment the text, just use Ctrl+Shift+R. This is a quick, easy, and painless way to comment out large chunks of code for testing and put them back later. There aren't any restrictions on any special words in the double-dash comment. Statement Blocks with BEGIN...ENDThe BEGIN and END keywords work jointly to group statements together. They are used in later constructs for loops and conditional statements. BEGIN and END are used to create a statement block, which is a group of statements that can be used anywhere one statement can be used. For example, you could write this: BEGIN UPDATE mytable SET emptype = 'manager' WHERE name ='fred' UPDATE mytable SET name = 'george' WHERE id = 42 END Note that traditionally the indentation is the preferred style, although it is not required. BEGIN and END must occur as a pair. That's why they are indented as they are: The indentation makes it easy to spot if one of them is missing, and it makes it easy to tell where the statement block ends. The keyword RETURN exits out of a statement block without executing any further commands: BEGIN UPDATE mytable SET emptype = 'manager' WHERE name =_ 'fred' RETURN UPDATE mytable SET name = 'george' WHERE id = 42 END In the preceding example the second UPDATE never runs. It's not very useful now, but when used in combination with a conditional operation, such as IF...ELSE, it becomes a very useful command. Conditional Statements with IF...ELSEIn many instances you want things to be performed only if certain conditions are met. Although there are several conditional constructs, by far the most common and easiest to use is the IF...ELSE construct. In T-SQL, an IF statement looks like this: IF expression statement ELSE statement When multiple statements are to be executed based on the condition, you must use a statement block similar to the following: IF expression BEGIN statement statement statement END ELSE BEGIN statement statement statement END The expression has to be an expression that evaluates to a true or false condition, unlike in some languages that use zero and nonzero. To evaluate something to true or false, you need to use the comparison operators. Multiple Conditions with CASEA CASE expression works like an IF statement, but it can be used in locations where an IF statement cannot. Specifically, a CASE expression returns one of a specific set of values based on the outcome of one or more expressions. Here's an example: Select CASE datepart(weekday, getdate()) WHEN 1 then 'Sunday' WHEN 2 then 'Monday' WHEN 3 then 'Tuesday' WHEN 4 then 'Wednesday' WHEN 5 then 'Thursday' WHEN 6 then 'Friday' WHEN 7 then 'Saturday' ELSE 'Unknown' END This example gets the day of week for today and turns it into a string that represents the text for the day of week. If, for some reason, the day of the week returned by the datepart() function is invalid, it returns the string Unknown. The result is placed into the variable @Result. This is the proper syntax to use when the comparison you want to use is equalityin this situation, datepart(weekday, getdate()) = 1. Notice that the expression starts with the keyword CASE and ends with the keyword END. This is the only time you can use an END without a BEGIN. This is called a "simple" CASE statement, in contrast with the "searched" CASE statement, discussed later in this section. CASE statements are a flexible mechanism for adding logic within a query anywhere an expression is permitted. This means that you can also use a CASE expression in the WHERE clause, in an ORDER BY clause, or anywhere else an expression is allowed, similar to the following: SELECT Name FROM sysobjects WHERE CASE id % 2 WHEN 1 THEN 1 ELSE 0 END = 1 The percent sign in this example is the modulo operator: It returns the remainder of the first number divided by the second number. Basically, what this SELECT statement does is return the names of all the odd-numbered (divisible by 2 with a remainder of 1) objects in the current database. When the ID modulo 2 returns a value of 1, then it's an odd number; the CASE statement returns 1, which the WHERE clause then compares to the number 1, and the row is included in the resultset. Otherwise, the CASE statement returns 0, which does not equal 1, so the row is not included in the resultset. The keen of wit will note that a better way to write this would be the following: SELECT name FROM sysobjects WHERE id % 2 = 1 That, however, would not have demonstrated the point of using CASE statements in a WHERE clause, nor would it be nearly as convoluted. It would, however, be readable and efficient. A statement that is a shortcut for a CASE statement is called COALESCE. It takes a series of values and returns the first one that's not null. To condense many conditions into a single statement, use the following: SELECT COALESCE(sid, 0) FROM sysusers It is important to recognize that in many situations the first syntax you think of may not be the best way to write code based on ease of execution and efficiency. Always be willing to reexamine what you have written to see whether it can be done better. This is particularly important when examining complex queries and looping structures. Loops in T-SQLWhereas most languages provide many different looping constructs, T-SQL offers essentially only one: WHILE. A WHILE loop is similar to an IF statement, but after executing any conditional statements it retests the condition and returns to the top to start over again. A WHILE loop continues to execute until the conditional expression controlling the loop becomes false. If the expression never becomes false, you have a problem known as an infinite loop. A classic looping structure has three steps. The initialization step sets up the variables and populates them to initial values. The test step evaluates the expression and determines whether the loop should be repeated. The augmentation step performs useful work, usually changes the expression somehow, and returns to the test step. Remember that the WHILE loop is the only construct provided for executing a counting loop, so a simple loop that counts to 100 would look like this: DECLARE @i int SET @i = 1 --initialization WHILE @i <= 100 --test BEGIN PRINT @i SET @i = @i + 1 --augmentation END Two special keywords can be used to control the execution of a WHILE loop. The CONTINUE keyword short-circuits the statement being executed and immediately goes back up to the loop test, ignoring the rest of the statement block. The BREAK keyword exits the WHILE loop and starts executing the statement after the end of the statement block. Here's an example of the BREAK keyword: DECLARE @i int SET @i = 1 --initialization WHILE @i <= 100 --test BEGIN PRINT @i SET @i = @i + 1 --incrementation IF @i = 42 BREAK END This causes the loop to stop counting when it reaches the number 42, but after it has printed 41. The number 42 will not be printed. Here's an example of using a CONTINUE keyword: DECLARE @i int SET @i = 1 --initialization WHILE @i <= 100 --test BEGIN PRINT @i SET @i = @i + 1 --augmentation IF @i = 42 BEGIN SET @i = @i + 1 continue END END This skips printing the number 42 and goes straight on to printing 43 and up to 100. Why does it increment the variable before using CONTINUE? If it didn't, the statement would actually print the number 42 and continue along, just as the first WHILE loop did. Although loops are great, how often do you need to count things in T-SQL? It would be great if you could use this structure to work on one row from a table and then loop to perform the same operation with data from the next row. It is in working with cursors that looping operations become of primary importance. Cursors allow you to exercise a looping structure against a dataset and thus process a recordset from beginning to end. Traditional Data ProcessingCursors are used to take the results of a SELECT statement and assign the output from the recordset to a set of variables, one row at a time. This enables you to walk through the recordset one record at a time and process the information. Creating a cursor involves five steps:
Here's a short example: DECLARE @Name sysname DECLARE SysObj cursor for SELECT name FROM sysobjects OPEN SysObj FETCH NEXT FROM SysObj INTO @Name WHILE @@FETCH_STATUS = 0 BEGIN PRINT @Name FETCH NEXT FROM SysObj INTO @Name END CLOSE SysObj DEALLOCATE sysobj The first line declares a variable called @Name of type sysname. The sysname data type is a special nvarchar data type that is used to hold the names of different system objects. If you're putting system names into a variable, it's the correct type to use because if the length of names changes from this version of SQL Server to the next, your code will still work. The DECLARE CURSOR line declares what the cursor is going to do. In this case, the cursor is going to return the Name column in sysobjects. It is possible to return multiple fields, and data can be filtered with a WHERE clause. You can do anything in the SELECT statement that you can do in any other SELECT statement, including joins. The OPEN actually makes the cursor usable by allocating resources for it. The FETCH NEXT fetches the next row from the cursor. Because you haven't fetched any rows from the cursor yet, it fetches the first one. It takes the value returned and places it into the @Name variable. Note that the returned data and the variable have to be the same type, or if they are two different types, they have to convert implicitly. FETCH NEXT automatically sets the global variable @@FETCH_STATUS to 0 if the fetch was successful, and to other values (refer to Table 6.6) for other results. The WHILE loop will continue execution as long as there are records within the dataset to process, in other words, @@FETCH_STATUS = 0.
After the sample code executes the printing, there's another FETCH. This operation iterates the loop, advancing the cursor to the next record in the dataset and changing the @@FETCH_STATUS when there is no data left to execute. After the FETCH is the end of the loop, which then returns up to the test. Although @@FETCH_STATUS doesn't change every time, it changes when the end of the cursor is reached, so the loop doesn't go on forever. The CLOSE and DEALLOCATE are what you use to tell SQL Server that you're finished with the cursor. These close the cursor, which releases any locks you have, and deallocate the cursor, freeing the memory resources used by the cursor. Now, this was just a basic example. There are far more useful illustrations of using a cursor to follow within this segment.
A few options are available in the DECLARE CURSOR statement. The cursor used in the previous example was the default cursor type. It is important to recognize the default behavior of a cursor if no other behavior is specified. It puts a shared lock on rows in the cursor for the duration of the cursor, so they cannot be modified while the cursor is reading them. To avoid the locking problem, tell SQL Server to make a copy of the data and run the cursor from the copy by using the INSENSITIVE keyword. The cursor takes longer to open because SQL Server actually copies all the data to run the cursor into a temporary table. Another thing you can do with cursors is scroll back and forth through them. This is done with the SCROLL keyword. Those are the options for scrolling and fetching. Now see what things look like for inserts and updates through a cursor. By default, a cursor is updatable. To prevent updates to a cursor, use the FOR READ ONLY clause, which is placed after the SELECT statement in the DECLARE cursor: DECLARE Flintstone SCROLL CURSOR FOR SELECT Id, Value FROM Bedrock ORDER BY 1 FOR READ ONLY To update through a cursor, you need to tell SQL Server that you're going to update the cursor using the FOR UPDATE clause, which goes in the same place as the FOR READ ONLY in the preceding code example. To actually update the data, you perform a positioned update by using a special form of the UPDATE statement, UPDATE WHERE CURRENT OF: DECLARE @ID int, @Value varchar(30) DECLARE Flintstone cursor FOR SELECT Id, Value FROM Bedrock FOR UPDATE OPEN Flintstone FETCH NEXT FROM Flintstone INTO @ID, @Value UPDATE Bedrock SET Value = 'Fredrick' WHERE CURRENT OF Flintstone CLOSE Flintstone DEALLOCATE Flintstone Several rules have to be followed to update through a cursor. First, the cursor cannot be read-only. That's fairly obvious, but it implies that the cursor does not have the INSENSITIVE or SCROLL options turned on, in addition to not having the READ ONLY option turned on. Many other options (which are discussed later) cause a cursor to be read-only. The FOR UPDATE in the cursor declaration is optional, but suggested. A cursor defaults to an updatable state, but if you explicitly state that the cursor is going to be updated, your code will be easier to read. It would be even better if the update specified FOR UPDATE OF columnname, because that's the only column that is updated. Everything that has been discussed so far about cursors is part of the ANSI SQL-92 standard, so the code is fairly generic and should be portable to any other database management system that is SQL-92 compliant. There are many T-SQLspecific extensions to the cursor syntax that enable you to make performance enhancements for your cursor operations. Some of these extensions are described in the following list:
A few notes about the preceding list. First, the default LOCAL or GLOBAL status of a cursor can be changed by changing the server-wide Default to Local Cursor configuration setting. Next, if you specify FORWARD_ONLY and don't specify STATIC or KEYSET, the cursor behaves as a DYNAMIC cursor. In other words, the cursor sees any records inserted by other connections while the cursor is open. In addition, if you don't use the SCROLL, STATIC, KEYSET, or DYNAMIC options to specify that a cursor should scroll, the cursor will be FORWARD_ONLY. Also, you cannot use FORWARD_ONLY and FAST_FORWARD together. Cursors are flexible mechanisms that can be used to solve problems when no other solution exists. Keep in mind, however, that there are many approaches to solving most database issues that don't require cursors. Don't get caught in a cursor trap; they carry significant overhead and are required only for specialty tasks. Most day-to-day database transactions won't require their use. They are, however, a useful coding mechanism to add to the scripting toolbox. The two most similar of the coding implementations are stored procedures and user-defined functions. Stored procedures and user-defined functions can be used to manipulate and store data by encapsulating SELECT, INSERT, UPDATE, and DELETE functionality. |