To finish this discussion of query tuning, we'll briefly look at some of the environmental concerns that you need to be aware of in your programming—for example, we'll discuss case sensitivity, which can greatly affect your applications. We'll also look at nullability issues and ANSI compatibility.
Various options and settings affect the semantics of your Transact-SQL statements. You must be sure that your Transact-SQL code can work regardless of the setting, or you must control the environment so that you know what the setting is.
Case sensitivity is by far the most common environmental problem, and it is simple to avoid. You should seriously consider doing most of your development in a case-sensitive environment, even if you will deploy your application mostly in a case-insensitive environment. The reason is simple: nearly all operations that work in a case-sensitive environment will also work in a case-insensitive environment, but the converse is not true.
For example, if we write the statement select * from authors in the pubs database of a case-sensitive environment, it will work equally well in a case-insensitive environment. On the other hand, the statement SELECT * FROM AUTHORS will work fine in a case-insensitive environment but will fail in a case-sensitive environment. The table in pubs is actually named authors, which is lowercase. The only statement that would work in a case-sensitive environment but fail in a case-insensitive environment is the declaration of an object name, a column name, or a variable name. For example, with the statement declare @myvar int, another statement declaring @MYVAR int would work fine in a case-sensitive environment because the two names are distinct, but it would fail in a case-insensitive environment because the names would be considered duplicates.
The easiest way to determine whether your environment is case sensitive is to perform a SELECT statement with a WHERE clause that compares a lowercase letter with its uppercase counterpart—you don't need to access a table to do this. The following simple SELECT statement returns 1 if the environment is case sensitive and 0 if the environment is not case sensitive:
SELECT CASE WHEN ('A'='a') THEN 0 ELSE 1 END
I use the term environment here because in SQL Server 2000 you can control the case sensitivity at many different levels. Recall from Chapter 4 that your SQL Server instance will have a default collation, which determines both the character set and the sort order used. Each database can specify the collation it will use in the CREATE DATABASE command, or you can change the collation using the ALTER DATABSE command. If you don't specify a collation for a database, it will use the server's default. Case sensitivity is just one of the issues surrounding the character set used by SQL Server. The character set choice affects both the rows selected and their ordering in the result set for a query such as this:
SELECT au_lname, au_fname FROM authors WHERE au_lname='José' ORDER BY au_lname, au_fname
If you never use characters that are not in the standard ASCII character set, case sensitivity is really your primary issue. But if your data has special characters such as the é in the example above, be sure that you understand all the issues surrounding different collations. For more information, see Chapter 4.
To be recognized as ANSI compliant, various options had to be enabled in SQL Server because of subtle differences in semantics between the traditional SQL Server behavior and what is mandated by ANSI. I already covered the majority of these issues in earlier chapters. To preserve backward compatibility, the prior behavior couldn't simply be changed. So options were added (or in a few cases, previous options were toggled on) to change the semantics to comply with the ANSI SQL requirements. These options are summarized in Table 16-3, along with the statement used to change the behavior.
All of the options listed in the table can be set individually, but you might want to avoid doing that because there are 128 (27) permutations to consider. You might want to set just a few of the options individually, such as SET ANSI_WARNINGS. But by and large, you should either leave them all at their default settings or change them as a group to the ANSI SQL-92 behavior. You can enable these options as a group by setting SET ANSI_DEFAULTS ON. A good reason to use all the ANSI-compliant options is that several of the new features of SQL Server 2000, such as Indexed Views, require that almost all of these options be set for compliance.
Table 16-3. ANSI SQL requirements satisfied by SQL Server options.
ANSI SQL Requirement | SQL Server Statement |
---|---|
Disable SQL Server's = NULL extension | SET ANSI_NULLS ON |
Automatically display a warning if a truncation would occur because the target column is too small to accept a value. By default, SQL Server truncates without any warning. | SET ANSI_WARNINGS ON |
Always right-pad char columns, and don't trim trailing blanks that were entered in varchar columns, as SQL Server would do by default. | SET ANSI_PADDING ON |
Make all statements implicitly part of a transaction, requiring a subsequent COMMIT or ROLLBACK. | SET IMPLICIT_TRANSACTIONS ON |
Close any open cursors upon COMMIT of a transaction. By default, SQL Server keeps the cursor open so that it can be reused without incurring the overhead of reopening it. | SET CURSOR_CLOSE_ON_COMMIT ON |
Allow identifier names to include SQL Server keywords if the identifier is included in double quotation marks, which by default is not allowed. This causes single and double quotation marks to be treated differently. | SET QUOTED_IDENTIFIER ON |
By default, create as NULL a column in a CREATE TABLE statement that is not specified as NULL or NOT NULL. The statement toggles this so that the column can be created with NULL. (I recommend that you always specify NULL or NOT NULL so that this setting option is irrelevant.) | SET ANSI_NULL_DFLT_ON ON |
The ability to set these options on a per-connection basis makes life interesting for you as a SQL Server application programmer. Your challenge is to recognize and deal with the fact that these settings will change the behavior of your code. Basically, that means you need to adopt some form of the following four strategies:
The approach you take is your choice, but recognize that if you don't think about the issue at all, you have basically settled for the Optimistic Approach. This approach is certainly adequate for many applications for which it's pretty clear that the user community would have neither the desire nor the ability to make environmental changes. But if you're deploying an application and the machine running SQL Server will be accessed by applications that you don't control, it's probably an overly simplistic approach. Philosophically, the Flexible Approach is nice, but it probably isn't realistic unless the application is quite simple.
You can change the SQL Server default values for the server as a whole by using sp_configure 'user options'. A specific user connection can then further refine the environment by issuing one of the specific SET statements just discussed. The system function @@OPTIONS can then be examined by any connection to see the current settings for that connection. The @@OPTIONS function and the value to be set using sp_configure 'user options' are a bit mask with the values depicted in Table 16-4.
By default, none of these options is enabled for SQL Server itself. So in a brand-new installation, the run value for sp_configure 'user options' will be 0. A system administrator can set this so that all connections have the same initial default settings. If you query the value of @@OPTIONS from an application that has not modified the environment, the value will also be 0. However, be aware that many applications, or even the SQL Server ODBC or OLE DB drivers that the application uses, might have changed the environment. Note that this includes SQL Query Analyzer, which uses ODBC. To change the default behavior, simply set the corresponding bit by doing a bitwise OR with the previous value. For example, suppose your run value is 512, which indicates that NOCOUNT is the only option turned on. You want to leave NOCOUNT enabled, but you also want to enable option value 32, which controls how NULLs are handled when using equality comparisons. You'd simply pass the decimal value 544 (or 0x220) to sp_configure 'user options', which is the result of doing a bitwise OR between the two options (for example, SELECT 32|512).
Table 16-4. Bitmask values for SQL Server options.
Decimal Value | Hexadecimal Value | Option | Description |
---|---|---|---|
2 | 0x0002 | IMPLICIT_TRANSACTIONS | Controls whether a transaction is started implicitly when a statement is executed. |
4 | 0x0004 | CURSOR_CLOSE_ON_COMMIT | Controls behavior of cursors once a commit has been performed. |
8 | 0x0008 | ANSI_WARNINGS | Controls truncation and NULL in aggregate warnings. |
16 | 0x0010 | ANSI_PADDING | Controls padding of variables. |
32 | 0x0020 | ANSI_NULLS | Controls NULL handling by using equality operators. |
64 | 0x0040 | ARITHABORT | Terminates a query when an overflow or divide-by-zero error occurs during query execution. |
128 | 0x0080 | ARITHIGNORE | Returns NULL when an overflow or divide-by-zero error occurs during a query. |
256 | 0x0100 | QUOTED_IDENTIFIER | Differentiates between single and double quotation marks when evaluating an expression, allowing object names to include characters that would otherwise not conform to naming rules or would collide with a reserved word or a keyword. |
512 | 0x0200 | NOCOUNT | Turns off the message returned at the end of each statement that states how many rows were affected by the statement. |
1024 | 0x0400 | ANSI_NULL_DFLT_ON | Alters the session's behavior to use ANSI compatibility for nullability. New columns defined without explicit nullability will be defined to allow NULLs. |
2048 | 0x0800 | ANSI_NULL_DFLT_OFF | Alters the session's behavior to not use ANSI compatibility for nullability. New columns defined without explicit nullability will be defined to prohibit NULLs. |
You can use DBCC USEROPTIONS to examine current options that have been set. The output is similar to the following code:
Set Option Value ---------- ---------- textsize 64512 language us_english dateformat mdy datefirst 7 arithabort SET nocount SET
This DBCC command shows only options that have been set—it doesn't show all the current settings for sp_configure 'user options'. But you can also decode your current connection settings pretty easily from @@OPTIONS using something like this:
SELECT 'IMPLICIT TRANSACTIONS' AS 'OPTION', CASE WHEN (@@OPTIONS & 0x0002 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'CURSOR_CLOSE_ON_COMMIT', CASE WHEN (@@OPTIONS & 0x0004 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'ANSI_WARNINGS',CASE WHEN (@@OPTIONS & 0x0008 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'ANSI_PADDINGS', CASE WHEN (@@OPTIONS & 0x0010 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'ANSI_NULLS', CASE WHEN (@@OPTIONS & 0x0020 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'ARITHABORT', CASE WHEN (@@OPTIONS & 0x0040 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'ARITHIGNORE', CASE WHEN (@@OPTIONS & 0x0080 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'QUOTED_IDENTIFIER', CASE WHEN (@@OPTIONS & 0x0100 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'NOCOUNT', CASE WHEN (@@OPTIONS & 0x0200 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'ANSI_NULL_DFLT_ON', CASE WHEN (@@OPTIONS & 0x0400 > 0) THEN 'ON' ELSE 'OFF' END UNION SELECT 'ANSI_NULL_DFLT_OFF', CASE WHEN (@@OPTIONS & 0x0800 > 0) THEN 'ON' ELSE 'OFF' END ORDER BY 'OPTION'
Here's the result:
OPTION SETTING ---------------------- ------- ANSI_NULL_DFLT_OFF OFF ANSI_NULL_DFLT_ON ON ANSI_NULLS ON ANSI_PADDINGS ON ANSI_WARNINGS ON ARITHABORT ON ARITHIGNORE OFF CURSOR_CLOSE_ON_COMMIT OFF IMPLICIT TRANSACTIONS OFF NOCOUNT OFF QUOTED_IDENTIFIER ON
Unfortunately, running the previous SELECT statement will not give you a complete picture. As you saw in Chapter 5, many of the options, especially those involving NULL handling, can also be changed at the database level by using sp_dboption. To really get the full picture, you must examine @@OPTIONS for your current session in addition to sp_dboption for your current database.
Beware of the locale-specific SET options. SET DATEFORMAT and SET DATEFIRST change the recognized default date format. If DATEFORMAT is changed to dmy instead of the (U.S.) default mdy, a date such as 12/10/98 will be interpreted as October 12, 1998. A good strategy for dates is to always use the ISO format yyyymmdd, which is recognized no matter what the setting of DATEFORMAT is.
DATEFIRST affects what is considered the first day of the week. By default (in the United States), it has the value 7 (Sunday). Date functions that work with the day-of-week as a value between 1 and 7 will be affected by this setting. These day-of-week values can be confusing because their numbering depends on the DATEFIRST value, but the values for DATEFIRST don't change. For example, as far as DATEFIRST is concerned, Sunday's value is always 7. But if you've designated Sunday (7) as DATEFIRST, if you execute the statement SELECT DATEPART(dw,GETDATE()) and your date falls on a Sunday, the statement will return 1. You just defined Sunday to be the first day of the week, so 1 is correct.