Batches | Inside Microsoft SQL Server 2000

A batch is one or more SQL Server commands that are dispatched and executed together. Because every batch sent from the client to the server requires handshaking between the two, sending a batch instead of sending separate commands can be more efficient. Even a batch that doesn't return a result set (for example, a single INSERT statement) requires at least an acknowledgment that the command was processed and a status code for its level of success. At the server, the batch must be received, queued for execution, and so on.

Although commands are grouped and dispatched together for execution, each command is distinct from the others. Let's look at an example. Suppose you need to execute 150 INSERT statements. Executing all 150 statements in one batch requires the processing overhead to be incurred once rather than 150 times. In one real-world situation, an application that took 5 to 6 seconds to complete 150 individual INSERT statements was changed so that all 150 statements were sent as one batch. The processing time decreased to well under 0.5 second, more than a tenfold improvement. And this was on a LAN, not on a slow network like a WAN or the Internet, where the improvement would have been even more pronounced. Try running an application with 150 batches to insert 150 rows over the Internet, and you'll be really sorry!

Using a batch is a huge win. By using the Transact-SQL constructs presented in Chapter 10, such as conditional logic and looping, you can often perform sophisticated operations within a single batch and eliminate the need for an extensive conversation between the client and the server. Those operations can also be saved on the server as a stored procedure, which allows them to execute even more efficiently. If your operations are returning values rather than modifying them, you can also save the code as a user-defined function. Using batches and stored modules to minimize client/server conversations is crucial for achieving high performance. And now that more applications are being deployed on slower networks—such as WANs, the Internet, and dial-up systems—instead of on LANs only, using batches and stored modules is crucial.

Every SELECT statement (except those used for assigning a value to a variable) generates a result set. Even a SELECT statement that finds zero rows returns a result set that describes the columns selected. Every time the server sends a result set back to the client application, it must send metadata as well as the actual data. The metadata describes the result set to the client. You can think of metadata in this way: "Here's a result set with eight columns. The first column is named last_name and is of type char(30). The second column is…"

Obviously, then, executing a single SELECT statement with a WHERE clause formulated to find, in one fell swoop, all 247 rows that meet your criteria is much more efficient than separately executing 247 SELECT statements that each return one row of data. In the former case, one result set is returned. In the latter case, 247 result sets are returned. The performance difference is striking.

Although batches seem like an obvious necessity, many programmers still write applications that perform poorly because they don't use batches. This problem is especially common among developers who have worked on ISAM or similar sequential files that do row-at-a-time processing. Unlike ISAM, SQL Server works best with sets of data, not individual rows of data, so that you can minimize conversations between the server and the client application.

There's actually a middle ground between using one SELECT statement that returns all your rows and using an individual batch to return each row. Batches can contain multiple statements, each one returning its own metadata, so the ideal solution is to minimize your total number of statements. Multiple SELECTs within a single batch are still better than having each statement in its own batch because each batch has the overhead of handshaking between the client and server.

Here is a simple batch, issued from SQL Query Analyzer. Even though three unrelated operations are being performed, we can package them in a single batch to conserve bandwidth:

 INSERT authors VALUES (etc.) SELECT * FROM authors UPDATE publishers SET pub_id= (etc.) <click the Execute Query button>

All the statements within a single batch are parsed as a unit. This means that if you have a syntax error in any of the statements, none of the statements will execute. For example, consider this batch of two statements, which you can try to execute from SQL Query Analyzer:

 SELECT * FROM sales SELECT * FOM titleauthor <click the Execute Query button>

When you execute this batch, you get the following error:

 Server: Msg 170, Level 15, State 1, Line 2 Line 2: Incorrect syntax near 'FOM'.

Even though the first SELECT statement is perfectly legal, no data is returned because of the error in the second statement. If we separated this into two different batches, the first one would return a result set and the second one would return an error.

Now suppose that instead of mistyping the keyword FROM as FOM, we had mistyped the name of the table:

 SELECT * FROM sales SELECT * FROM titleautor <click the Execute Query button>

SQL Server 2000 provides delayed name resolution, in which object names aren't resolved until execution time. In this example, SQL Server returns the data from the first SELECT, followed by this error message:

 Server: Msg 208, Level 16, State 1, Line 1 Invalid object name 'titleautor'.

However, if we had mistyped the object name in the first statement, execution of the batch would stop when the error was encountered. No data would be returned by any subsequent valid queries.

Note that delayed object resolution does not apply to column names. If you changed the second query to one that used a valid table name but specified an invalid column in that table, the error would be detected at compile time and none of the statements in the batch would be executed. Here's an example:

 SELECT * FROM sales SELECT lastname FROM titleauthor <click the Execute Query button>

Normally, everything in SQL Query Analyzer's query window is considered a single batch and is sent to SQL Server for parsing, compiling, and execution when you click the green EXECUTE button. However, SQL Query Analyzer provides two ways around this. First, you can highlight a section of code in the query window so that when you click the EXECUTE button, only the highlighted text is sent as a batch to SQL Server. Alternatively, you can include the word GO between your queries.

If you use either of the text-based query tools (OSQL or ISQL), there's no green Execute Query button to click. When you use these tools, you must type GO on a line by itself to indicate that everything you've typed up to that point is a batch. The tool will then send that batch to SQL Server for processing.

GO isn't an SQL command or keyword. It's the end-of-batch signal understood only by certain client tools. The client interprets it to mean that everything since the last GO should be sent to the server for execution. SQL Server never sees the GO command and has no idea what it means. With a custom application, a batch is executed with a single SQLExecute from ODBC (or dbsqlexec from DB-Library).

If you include the GO command in the query window, SQL Query Analyzer breaks up your statement into the indicated batches behind the scenes. Each batch (as marked by the GO command) is sent individually to SQL Server.

A collection of batches that are frequently executed together is sometimes called a script. Most of the client tools provide a mechanism for loading a script that you've saved to a text file and for executing it. In SQL Query Analyzer, you can use the File/Open command to load a script. From the command-line OSQL or ISQL programs, we can specify the /i flag followed by a filename to indicate that the SQL Server batches to execute should come from the specified file. Alternatively, if we're using OSQL or ISQL interactively, we can read in a file containing one or more SQL statement by typing :r followed by the filename. (See SQL Server Books Online for details about using OSQL and ISQL.)

The fact that the client tool and not SQL Server processes GO can lead to some unexpected behavior. Suppose you have a script containing several batches. During testing, you want to comment out a couple of the batches to ignore them for the time being. Your commented script might look something like this:

 SELECT * FROM authors /* GO SELECT * FROM sales GO SELECT * FROM publishers GO */ SELECT * FROM titles GO

The intention here was to comment out the SELECT from the sales and publishers tables and to run the SELECT from authors and titles as a single batch. However, if you run this script from SQL Query Analyzer, you'll get exactly the opposite behavior! That is, you'll see the data from the sales and publishers tables but not from authors or titles. If you look at this script from the perspective of the client tool, the behavior makes sense. The tool doesn't try to interpret any of your SQL statements; it just breaks the statements into batches to be sent to SQL Server. A batch is marked by a GO command at the beginning of a line.

So the above script contains four batches. The first batch (everything before the first GO) is:

 SELECT * FROM authors /*

SQL Server generates an error message because there's an open comment with no corresponding close comment.

The second and third batches are:

 SELECT * FROM sales

and

 SELECT * FROM publishers

Both of these batches are perfectly legal, and SQL Server can process them and return results.

The fourth batch is:

 */ SELECT * FROM titles

SQL Server also generates an error for this last one because it has a close comment without an open comment marker, and no data is returned.

If you want to comment out statements within a script that can contain the end-of-batch GO command, you should use the alternative comment marker—the double dash—in front of every GO. Alternatively, you can just use the double dash in front of every line you want to comment out. Your script would then look like this:

 SELECT * FROM authors -- GO -- SELECT * FROM sales -- GO -- SELECT * FROM publishers -- GO SELECT * FROM titles GO

SQL Query Analyzer makes it easy to comment out a group of lines as in the code above. You can highlight all the lines to be commented out, and from the Edit menu, choose Advanced/Comment Out. The keystroke combination to perform the same action is Ctrl-Shift-C. With this revised script, the client tool won't recognize the GO as the end-of-batch marker because it's not the first thing on a line. The client will consider this script to be one single batch and send it to SQL Server as such.