The two primary components of the SQL Server database engine are the relational engine and the storage engine. The relational engine parses and processes the query, optimizes the execution plan, talks with the storage engine to obtain the results, and formats the results to be sent to the client. The other component, the storage engine, is responsible for physical I/O and concurrency control; it manages the database files and physical data pages. SQL Server 2005 contains significant changes to these two engine components to support new functionality and to improve on existing functionality.
The New xml Data Type
SQL Server 2000 first introduced support for XML by allowing a relational rowset to be turned into XML by using the FOR XML clause with the SELECT statement and by allowing XML to be converted into a relational rowset by using the OPENXML function. The support for XML in SQL Server 2000 was further extended with the SQLXML Web releases. However, if you needed to store XML data inside a database table, the only option with SQL Server 2000 was to use the char/nchar/varchar/nvarchar or text/image data types. In addition, there was no built-in support for working with XML text stored in the database. For instance, if you stored XML data in a varchar column, there was no way to execute XPath or XQuery queries to locate data inside the XML or to efficiently update the XML data.
SQL Server 2005 fills this gap by introducing a new data type called xml. In SQL Server 2005 you can have columns, variables, and parameters of xml data type. There are methods available to query and update XML data. You can store up to 2GB of XML data in a column. A typed XML column is one that has an XML Schema (XSD) associated with it; otherwise, it is called an untyped XML column.
Here is some sample T-SQL code that illustrates the xml data type:
USE TempDB;
GO
CREATE TABLE dbo.tblTest(
   id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
   logtext xml NOT NULL);
GO
INSERT INTO dbo.tblTest (logtext) VALUES ('<log result="1" />');
GO
DECLARE @varTest xml;
SET @varTest = '<log result="0" />';
INSERT INTO dbo.tblTest (logtext) VALUES (@varTest);
GO
SELECT * FROM dbo.tblTest;
GO
SELECT * FROM dbo.tblTest
   WHERE logtext.value('(/log/@result)[1]', 'int') = 1;
GO
DROP TABLE dbo.tblTest;
GO
This T-SQL script first creates a table that contains a column of xml data type. It then inserts some sample data, first by directly providing the XML text string and then by using an xml type variable. The final statement illustrates querying an xml type column. If you run this code in the Management Studio Query Editor and view the results in Grid mode, you can see that the xml type column is shown as a hyperlink, and if you click it, you see formatted XML data in a new window.
To optimize access to XML data, you can create indexes on xml type columns. Without an index, when a query is run on an xml type column, all the XML data is shredded at runtime to evaluate the query, which can be a significant performance hit. Indexes on an xml type column are a little different from those on other columns. First, you have to create a primary XML index on the column, which is essentially a shredded representation of the XML data, and then, if you like, you can create secondary XML indexes to optimize XPath and XQuery queries on the column.
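As a sketch of this two-step process, the following script creates a primary XML index and then a secondary index on an illustrative table (the table, column, and index names here are hypothetical, not from the examples above):

```sql
CREATE TABLE dbo.tblXmlLog(
   id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
   logtext xml NOT NULL);
GO
--The primary XML index must exist first; it stores the shredded
--representation of the XML data.
CREATE PRIMARY XML INDEX idxLogtext ON dbo.tblXmlLog(logtext);
GO
--A secondary XML index (FOR PATH, VALUE, or PROPERTY) is built on
--top of the primary index to speed up specific query patterns.
CREATE XML INDEX idxLogtextPath ON dbo.tblXmlLog(logtext)
   USING XML INDEX idxLogtext FOR PATH;
GO
```

A PATH secondary index, as shown here, typically helps XPath-style path expressions; VALUE and PROPERTY indexes target value-based and property-bag lookups, respectively.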
In addition to using standard XML querying mechanisms using XPath and XQuery, you can also define full-text search on xml type columns and run full-text queries.
In addition to the XML standard, SQL Server 2005 supports various other W3C standards, such as XPath, XQuery, XML namespaces, and XML Schema (XSD).
The xml data type and XQuery are discussed in complete detail in Chapter 10, "XML and Web Services Support in SQL Server 2005."
Large Value Data Types
With SQL Server 2000, if you stored more than 8,000 bytes of text or binary data in a column, the only option was to use the text or image data type. However, these two data types are not easy to work with because they require different programming techniques than the other character or binary types, such as varchar and varbinary, and they have some serious limitations (for example, many string functions do not work on text/image columns, and you cannot define a variable of type text/image). To fix this, SQL Server 2005 introduces a new max specifier. The max specifier can be used with the varchar, nvarchar, and varbinary types, and it allows you to store up to 2GB of data in a column, variable, or parameter, without any special handling requirements.
Here is an example of T-SQL code that illustrates the use of the max specifier:
USE TempDB;
GO
CREATE TABLE dbo.tblTest(
   id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
   logtext varchar(max) NOT NULL);
GO
DECLARE @char varchar(max);
DECLARE @varTest varchar(max);
SET @char = '-';
SET @varTest = REPLICATE(@char, 100000);
INSERT INTO dbo.tblTest (logtext) VALUES (@varTest);
GO
SELECT * FROM dbo.tblTest;
SELECT LEFT(logtext, 10) FROM dbo.tblTest;
SELECT SUBSTRING(logtext, 99990, 1) FROM dbo.tblTest;
SELECT LEN(logtext) FROM dbo.tblTest;
GO
DROP TABLE dbo.tblTest;
GO
This T-SQL batch first creates a table containing a column of type varchar(max). It then inserts 100,000 characters into this column by using a varchar(max) variable, and it uses some string functions to illustrate that varchar(max) is no different from other character data types. In addition, SQL Server 2005 relaxes the 8,060-bytes-per-row restriction. The following script does not work in SQL Server 2000, but it runs successfully in SQL Server 2005:
USE tempdb;
CREATE TABLE dbo.tblTest(
   c1 varchar(4000),
   c2 varchar(4000),
   c3 varchar(4000));
GO
INSERT INTO dbo.tblTest
   VALUES (REPLICATE('*', 4000), REPLICATE('*', 4000), REPLICATE('*', 4000));
GO
.NET Framework Integration
One of the biggest features introduced in SQL Server 2005 is the .NET Framework integration, better known as SQLCLR. The SQL Server 2005 database engine hosts the .NET Framework CLR, and hence it allows .NET code to run from within the T-SQL script. SQLCLR integration enables developers to extend the T-SQL programming language and type system. The SQLCLR integration allows developers to write stored procedures, triggers, and user-defined functions, and to create new types and aggregates by using .NET.
.NET Framework Overview
The Microsoft .NET Framework, introduced in July 2000, is a development platform that allows the building of Windows, web, and mobile applications. To rapidly build and deploy .NET applications, developers can use Visual Studio .NET tools and a programming language of their choice, such as Visual Basic .NET, C#, C++, or J#, to make use of thousands of classes provided by the .NET Framework Base Class Library.
The .NET programming model is simplified by an execution engine, known as the common language runtime (CLR), which handles a lot of tasks that developers had to do previously. These tasks include allocating and managing memory, thread management, lifetime management or garbage collection, security, and enforcement of type safety.
The code written in .NET languages and executed by the CLR is called managed code. Developers can still use legacy COM objects or call a Win32 API from within .NET code, but because the COM objects and Win32 functions are not controlled by the CLR, they are referred to as unmanaged code. The .NET Framework code is compiled into intermediate language (IL) and saved into DLLs or EXEs known as assemblies.
In SQL Server 2000, the only way to extend T-SQL was to write extended stored procedures, which required C++ programming knowledge and understanding of the Open Data Services (ODS) API. In addition, a poorly written extended stored procedure could potentially cause a number of unpredictable problems with SQL Server.
The SQLCLR integration in SQL Server 2005 allows developers to write stored procedures, triggers, and functions and to extend the type system and aggregates by using any of the .NET languages and the familiar ADO.NET programming model. The .NET security framework, the CLR, and the SQLCLR hosting layer security make sure that the managed code running inside SQL Server does not compromise the integrity of the SQL Server process.
SQLCLR easily outperforms T-SQL in scenarios that involve a lot of procedural coding (looping, branching, and so on), string manipulation, or computational tasks. However, for data access, T-SQL is still the best approach. The SQLCLR assembly that is imported into the database contains compiled .NET code in IL format. When this code is executed, it is just-in-time (JIT) compiled into machine language. T-SQL, on the other hand, is an interpreted language, and compiled code generally executes faster than interpreted code.
Developers can use Visual Studio .NET and any of the .NET Framework languages to build assemblies, which are compiled .NET code that can be imported into a SQL Server database by using the CREATE ASSEMBLY T-SQL statement. Depending on what the assembly implements, a T-SQL stored procedure, function, trigger, user-defined type, or aggregate can be mapped to .NET code in the assembly.
Let's say a developer gives you a .NET assembly called Payroll.dll that contains a managed stored procedure that accepts employee ID and year as the parameters, does some complex calculation, and returns a rowset that contains complete tax details. Here is how you would import the assembly, map a T-SQL stored procedure to the .NET method, and then eventually call the stored procedure:
CREATE ASSEMBLY [CLR_Payroll]
   FROM 'F:\FromDev\Payroll.dll'
   WITH PERMISSION_SET = SAFE;
GO
CREATE PROCEDURE dbo.usp_GetTaxDetails
   (@EmployeeID INT, @Year INT)
AS EXTERNAL NAME [CLR_Payroll].[StoredProcedures].[GetTaxDetails];
GO
EXEC dbo.usp_GetTaxDetails 10, 2004;
GO
The important point to note in this sample script is the WITH PERMISSION_SET = SAFE clause. You as a DBA can control what an imported assembly can or cannot do by using one of the three permission buckets that SQL Server 2005 provides: SAFE, EXTERNAL_ACCESS, and UNSAFE.
SQLCLR Permission Buckets: SAFE, EXTERNAL_ACCESS, and UNSAFE
The assemblies that are imported with the SAFE permission set can work with data and use some of the CLR utility classes, but they cannot access external resources (such as the file system or the network). The code must be verifiably type safe by the CLR, and features such as COM interop, P/Invoke (Win32 API access), and multithreading are not allowed. SAFE is the default permission set.
The assemblies imported with EXTERNAL_ACCESS are the same as those imported with SAFE, but they also allow access to external sources such as the file system, using the .NET Framework class library.
The assemblies imported with UNSAFE can do virtually anything. They can even call unmanaged code such as COM objects and Win32 functions. It is recommended that UNSAFE permissions be granted only to highly trusted assemblies by experienced developers or administrators. Only members of the sysadmin fixed server role can create UNSAFE assemblies.
The SQLCLR is the subject of Chapter 11, "SQL Server 2005 and .NET Integration." Refer to that chapter for more details and examples of using the .NET integration features.
The New Catalog Views
With SQL Server 2000, the two ways to access the SQL catalog, or metadata, were system tables and SQL-99 INFORMATION_SCHEMA views. SQL Server 2000 even allowed updating of system tables, which was possible but not recommended; in general, accessing system tables directly is discouraged. In addition, SQL Server 2000 implemented no real security on metadata access. This meant that even though a database user did not have permission to see the data, the user could still see which objects existed in the database (sysobjects) or view the structure of objects (sp_help).
With SQL Server 2005, the system tables are hidden, and direct access to them is not allowed. The recommended way to access SQL metadata is to use the new catalog views. For backward compatibility, SQL Server provides compatibility views that have the same names and return the same data as the system tables in previous releases. In other words, the system tables from previous releases are now compatibility views.
To summarize, SQL Server 2005 completely hides the system tables. There are three ways to access SQL metadata: through catalog views, compatibility views, and SQL-99 INFORMATION_SCHEMA views. Of these three features, only catalog views are available for all the features introduced in SQL Server 2005. For example, there is no INFORMATION_SCHEMA or compatibility view available to explore Service Broker objects. In addition, using catalog views is the most efficient way to access system metadata. Because SQL Server 2005 restricts access to base system tables, the allow updates sp_configure option in SQL Server 2005 is meaningless.
SQL Server 2005 implements a security layer on top of metadata views that is used for permission-based metadata access. For example, if a user executes a SELECT statement on the sysobjects compatibility view, the sys.objects catalog view, or the INFORMATION_SCHEMA.TABLES view, SQL Server returns only the objects on which the user has permission to select or modify data, unless the VIEW DEFINITION permission has been granted on the object.
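The three metadata access mechanisms can be compared side by side; all of the following queries list user tables, and each returns only the objects the caller has permission to see:

```sql
--Catalog view (recommended; 'U' = user table):
SELECT name, type_desc, create_date
   FROM sys.objects
   WHERE type = 'U';

--Compatibility view (same shape as the SQL Server 2000 system table):
SELECT name, xtype
   FROM sysobjects
   WHERE xtype = 'U';

--SQL-99 standard view:
SELECT TABLE_SCHEMA, TABLE_NAME
   FROM INFORMATION_SCHEMA.TABLES
   WHERE TABLE_TYPE = 'BASE TABLE';
```

Of these, only the catalog views cover all the new SQL Server 2005 features, which is one reason they are the recommended mechanism.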
The Resource Database
SQL Server 2005 also introduces the Resource database. It is an actual .mdf and .ldf file-based SQL Server 2005 database (mssqlsystemresource.mdf and mssqlsystemresource.ldf), but it is not visible in Management Studio or through metadata views. In previous SQL Server releases, the metadata (schema definitions, system stored procedure text, and so on) for system objects was stored in the master database, and user database-specific system objects were stored in the user database. This posed major issues during upgrade and patching of SQL Server metadata. To expedite and simplify the upgrade process, Microsoft decided to store the system objects metadata inside a hidden Resource database. With this approach, instead of running multiple scripts, the upgrade and rollback process is as simple as copying new or old Resource database files.
The Resource database contains metadata for all system objects, but the system objects logically appear in each database under the sys schema. The master database still holds the actual instance-level data; the user databases still hold the database-specific information, and so on. However, the system objects' schema/metadata and system stored procedure text are stored in the Resource database.
The new metadata architecture is discussed in Chapters 2, "SQL Server 2005 Database Engine Architectural Changes," and 7, "SQL Server 2005 Security."
Scalability, Performance, and Reliability Enhancements
Scalability refers to the ability of a system to maintain and possibly improve performance as the load increases in a predictable manner. SQL Server 2005 introduces several enhancements to the database engine and other subsystems to make sure that the platform scales and performs well even with terabytes of data and thousands of users. In the following sections, you will see some of these new features, and you'll learn more details about them in Chapters 8, "Reliability and High Availability in the Database Engine," and 9, "Performance Analysis Tuning."
Row Versioning and the Snapshot Isolation Level
SQL Server 2000 supports all four isolation levels defined by the SQL-92 standard: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE. The default isolation level in SQL Server 2000 and SQL Server 2005 is READ COMMITTED, which ensures that statements cannot read modified data that is not yet committed and hence prevents dirty reads.
READ UNCOMMITTED is the lowest isolation level. It allows statements to read data that has been modified by other transactions but not yet committed. SERIALIZABLE is the highest isolation level, and it completely isolates two transactions, avoiding all the concurrency issues such as dirty reads, nonrepeatable reads, and phantom reads. In between these two are the READ COMMITTED and REPEATABLE READ levels. Like the READ COMMITTED isolation level, the REPEATABLE READ isolation level avoids dirty reads by disallowing reading of data that is updated but not committed. In addition, it makes sure that no transaction can modify data that has been read by the current transaction, and hence readers can block writers. The REPEATABLE READ isolation level avoids dirty reads and nonrepeatable reads, but it can still result in phantom reads.
SQL Server 2005 introduces a new isolation level called snapshot isolation that is like REPEATABLE READ without the issue of readers blocking the writers. In other words, while the snapshot isolation level prevents dirty reads and nonrepeatable reads, other transactions can continue to update the data that is read by the current transaction. How does SQL Server achieve this? The answer is row versioning.
Row Versioning in Detail
Row versioning is an alternative technique to locking for ensuring the integrity of transactions and maintaining the consistency of data when multiple users access the data at the same time. With row versioning, SQL Server uses tempdb to maintain a version of each row that is modified. Whenever a transaction modifies a row, SQL Server copies the original row to the tempdb database. This technique enables REPEATABLE READ behavior without the issue of readers blocking writers, as readers can now read the original row from the tempdb database. By avoiding blocking, snapshot isolation enables applications to scale and perform better. Applications with more read activity than write activity benefit the most from the snapshot isolation level.
To see snapshot isolation in action, you can start two query window sessions in Management Studio and execute the following two T-SQL batches, which create the blocking scenario by using the REPEATABLE READ and READ COMMITTED isolation levels:
USE [AdventureWorks];
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT * FROM Sales.Store;
USE [AdventureWorks];
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
UPDATE Sales.Store
   SET SalesPersonID = 280
   WHERE SalesPersonID = 281;
Notice that the UPDATE statement is blocked because some other transaction in the REPEATABLE READ isolation level has already read the data.
Run the ROLLBACK TRANSACTION statement in the first connection and then in the second Query Editor window.
Let's now see how using the snapshot isolation level instead of repeatable read fixes the blocking problem. You need to change the T-SQL in the connection 1 window to use the snapshot isolation level:
ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
USE [AdventureWorks];
GO
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM Sales.Store;
Next, you should run this script in connection 1 and run the same UPDATE statement inside a transaction in connection 2. Note that this time the UPDATE statement is not blocked. If you switch back to connection 1 and run just the SELECT statement again, you should notice that it still shows the original values, which are similar to the REPEATABLE READ behavior. You should roll back the transactions in both the connections and close the query windows.
You can use a T-SQL statement similar to the following in order to find out whether snapshot isolation is enabled on the database:
SELECT snapshot_isolation_state, snapshot_isolation_state_desc
   FROM sys.databases
   WHERE name = 'AdventureWorks';
Note that there is a cost involved with using snapshot isolation and with row versioning in general. Because versions of rows are stored in tempdb, you have to make sure that sufficient space is available in the tempdb database, and you also need to consider the performance impact of the tempdb I/O caused by row versioning. Updates and deletes can be a little slower because they generate row versions in tempdb, and when reading data, SQL Server might have to traverse the version chain to determine which version of the row to read.
In addition to the snapshot isolation level, SQL Server 2005 also introduces a new flavor of the READ COMMITTED isolation level that makes use of row versioning to avoid blocking. This is achieved by turning on the READ_COMMITTED_SNAPSHOT option on the database, as discussed in detail in Chapter 8.
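As a minimal sketch, turning this option on (which requires exclusive access to the database while the option is changed) and verifying it looks like this:

```sql
--No other connections may be using the database when this runs.
ALTER DATABASE AdventureWorks SET READ_COMMITTED_SNAPSHOT ON;
GO
--Verify the setting through the catalog view:
SELECT is_read_committed_snapshot_on
   FROM sys.databases
   WHERE name = 'AdventureWorks';
```

With this option on, existing applications running at the default READ COMMITTED level get versioned reads without any code changes.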
Service Broker
Service Broker is a new scalability technology introduced in SQL Server 2005 that allows you to build reliable, asynchronous, queued database applications. All you have to worry about is sending and receiving messages by using T-SQL, while the rest of the messaging framework, including routing and security, is built into the database engine.
One of the important aspects of building scalable applications is performing parts of the operations asynchronously. For example, when an order is submitted, you want to process inventory, shipping, and accounting. If these three tasks are performed synchronously when an order record is inserted, the order submission process takes longer, and the application will not scale. However, if these three tasks are done asynchronously, the order process finishes very quickly, and the application can scale. When an order record is inserted, it just posts a message and returns; this message is later retrieved asynchronously and then the inventory, shipping, and accounting tables are updated appropriately.
This asynchronous messaging breaks the tight coupling between components and allows applications to perform better, scale out, distribute the load, and defer work for batch processing. However, the challenge with asynchronous messaging is that something must guarantee that each message is delivered, that it is delivered once and only once, and that messages in a batch are delivered in the order in which they were sent.
On the application and middle tiers, these services are provided by specialized technologies such as Microsoft Message Queuing (MSMQ), IBM WebSphere MQ (formerly MQSeries), and so on. The SQL Server Service Broker brings these services into the database engine so that you can do asynchronous messaging at the database level.
For example, you can use Service Broker to implement asynchronous triggers. In other words, when a trigger is fired, it posts a message into a queue and returns immediately. This message is later received and processed by a Service Broker application, which can be a T-SQL or SQLCLR stored procedure or an external application.
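A minimal sketch of the objects involved might look like the following; every object name here is illustrative, not part of any shipped sample:

```sql
--A message type, a contract, a queue, and a service:
CREATE MESSAGE TYPE [//Sample/OrderMsg]
   VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT [//Sample/OrderContract]
   ([//Sample/OrderMsg] SENT BY INITIATOR);
CREATE QUEUE dbo.OrderQueue;
CREATE SERVICE [//Sample/OrderService]
   ON QUEUE dbo.OrderQueue ([//Sample/OrderContract]);
GO
--Posting a message (for example, from inside a trigger):
DECLARE @h UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @h
   FROM SERVICE [//Sample/OrderService]
   TO SERVICE '//Sample/OrderService'
   ON CONTRACT [//Sample/OrderContract];
SEND ON CONVERSATION @h
   MESSAGE TYPE [//Sample/OrderMsg] ('<order id="1" />');
```

The trigger returns as soon as SEND completes; a reader later issues RECEIVE against dbo.OrderQueue to process the message.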
More details on Service Broker can be found in Chapter 14.
Nonclustered Indexes with Included Columns
Designing and using proper indexes is one of the keys to maximizing query performance. One of the recommendations for designing indexes is to keep the index key small, which tends to be efficient. On the other hand, having all the data required by a query available in the index itself (a covering index), without touching the clustered index or table data pages, results in less I/O and better query throughput.
You might wonder whether you should include more columns in the index to cover more queries or keep the index key small and efficient. With SQL Server 2000, you had to choose between these two options, based on your performance benchmark results. SQL Server 2005 introduces the concept of including non-key columns in an index; this is designed to provide the best of both worlds: a smaller key size and, at the same time, a greater chance of the index covering a query.
You can use the INCLUDE keyword, along with a list of one or more columns, with the CREATE INDEX statement to indicate that you want one or more non-key columns to be saved in the leaf level of an index. This way, there is a better chance that the query will find everything it needs in the index pages, without any further lookups. However, you should carefully determine when and which columns should be included. Including non-key columns in indexes increases disk space requirements, means fewer index rows fit on each index page, and causes the data to be stored and updated twice (once in the base table and again at the leaf level of the index). More details on this can be found in Chapter 9.
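The syntax can be sketched as follows; the table and columns here are hypothetical:

```sql
--OrderDate is the only key column, so the index key stays small;
--CustomerID and TotalDue are stored only at the leaf level, so a
--query that selects those columns filtered by OrderDate is covered.
CREATE NONCLUSTERED INDEX idxOrdersByDate
   ON dbo.Orders(OrderDate)
   INCLUDE (CustomerID, TotalDue);
```

A query such as `SELECT CustomerID, TotalDue FROM dbo.Orders WHERE OrderDate = '20050101'` could then be answered entirely from the index pages.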
Persisted Computed Columns
A computed column's value is calculated from an expression by using other columns in the same table. With SQL Server 2000, computed columns are always virtual columns, not physically stored in the table. Their value is determined at runtime, based on the columns in the expression and the expression itself.
SQL Server 2005 introduces the ability to persist the computed column values. The PERSISTED keyword can be specified with the computed column definition in the CREATE TABLE or ALTER TABLE statement to indicate that a computed column's value must be physically stored and updated if any column in the expression is updated. If a computed column is persisted in the data pages, it can speed retrieval for computing-intensive columns. In addition, persisted computed columns allow defining indexes in cases where virtual computed columns prohibit creation of indexes. For instance, if an expression is of type float or real, an index cannot be created on such virtual computed columns, but an index can be created on persisted computed columns. If computed columns are persisted, you can create NOT NULL, FOREIGN KEY, and CHECK constraints in addition to UNIQUE and PRIMARY KEY constraints.
The persisted computed columns feature is not available in SQL Server 2005 Mobile Edition.
Here is a sample T-SQL script that illustrates persisted computed columns:
USE TempDB;
GO
CREATE TABLE dbo.tblTest(
   id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
   price float NOT NULL,
   qty float NOT NULL,
   cost AS price * qty);
GO
INSERT INTO dbo.tblTest (price, qty) VALUES (100, 100);
INSERT INTO dbo.tblTest (price, qty) VALUES (200, 200);
GO
SELECT * FROM dbo.tblTest WHERE cost < 40000;
GO
--next stmt will fail; index can't be created on imprecise computed column
CREATE INDEX idTest ON dbo.tblTest(cost);
GO
DROP TABLE dbo.tblTest;
GO
CREATE TABLE dbo.tblTest(
   id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
   price float NOT NULL,
   qty float NOT NULL,
   cost AS price * qty PERSISTED NOT NULL CHECK (cost > 0));
GO
INSERT INTO dbo.tblTest (price, qty) VALUES (100, 100);
INSERT INTO dbo.tblTest (price, qty) VALUES (200, 200);
GO
--next stmt will fail; CHECK constraint violation
INSERT INTO dbo.tblTest (price, qty) VALUES (200, -1);
GO
SELECT * FROM dbo.tblTest WHERE cost < 40000;
GO
--next stmt will succeed; index can be created on imprecise
--but persisted computed column
CREATE INDEX idTest ON dbo.tblTest(cost);
GO
SELECT * FROM dbo.tblTest WHERE cost < 40000;
GO
DROP TABLE dbo.tblTest;
GO
This script begins by creating a table containing a virtual computed column of float data type. SQL Server 2000 and 2005 do not allow creating an index on such a column because the computed column is not precise. But if the computed column is PERSISTED, SQL Server 2005 allows creating an index on that column, as well as defining other constraints such as a CHECK constraint, as illustrated in this script.
Partitioned Tables and Indexes
Partitioning of data and indexes is not a new concept. Especially for very large databases (VLDBs), partitioning of data and indexes has been one of the design strategies. Table and index partitioning is a performance, scalability, and manageability feature. Data in a partitioned table is horizontally split into physical units, which may be spread across more than one filegroup in a database.
What's new in SQL Server 2005 is the simplicity and ease with which you can implement and manage partitioned tables and indexes. You can provide a function and scheme for a table, and SQL Server will make sure to route rows automatically to appropriate partitions, based on the partition key. SQL Server 2005 also supports range partitions, where the partition key is the customized ranges of data. This subject is discussed in more detail, with examples, in Chapter 8.
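The function/scheme pair can be sketched as follows; the names, date boundaries, and filegroup choice are illustrative:

```sql
--The partition function maps OrderDate values into four ranges;
--RANGE RIGHT means each boundary value belongs to the partition on its right.
CREATE PARTITION FUNCTION pfOrderDate (datetime)
   AS RANGE RIGHT FOR VALUES ('20030101', '20040101', '20050101');
GO
--The partition scheme maps partitions to filegroups
--(here, all partitions go to PRIMARY for simplicity).
CREATE PARTITION SCHEME psOrderDate
   AS PARTITION pfOrderDate ALL TO ([PRIMARY]);
GO
--The table is created on the scheme; SQL Server routes each row
--to the appropriate partition based on the OrderDate key.
CREATE TABLE dbo.Orders(
   OrderID INT NOT NULL,
   OrderDate datetime NOT NULL)
   ON psOrderDate(OrderDate);
```

In production, each partition would typically map to its own filegroup so that older ranges can live on cheaper storage and be backed up independently.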
Miscellaneous Performance Improvement Techniques
To wrap up this section, the following are some other performance-related changes introduced in SQL Server 2005.
SQL Server 2005 introduces a new database option, DATE_CORRELATION_OPTIMIZATION, which can improve the performance of queries that perform an equi-join between two tables whose datetime columns are correlated and that specify a date restriction in the query predicate. This option is turned off by default. When you turn on this option, SQL Server maintains correlation statistics between any two tables in the database that have datetime columns and are linked by a foreign key constraint. These statistics are later used during query optimization to improve performance. You can use an ALTER DATABASE statement to turn on or off this option and the is_date_correlation_on field in the sys.databases catalog view to find out whether this option is turned off or on.
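Turning the option on and checking its state can be sketched as:

```sql
ALTER DATABASE AdventureWorks SET DATE_CORRELATION_OPTIMIZATION ON;
GO
SELECT is_date_correlation_on
   FROM sys.databases
   WHERE name = 'AdventureWorks';
```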
With SQL Server 2000, if the query optimizer determined that statistics were stale, it started a statistics update and used the updated statistics to generate the best query plan. However, whenever this happened, the query ran more slowly than usual, and client requests sometimes experienced timeouts. SQL Server 2005 introduces a database-level option that provides more predictable query response time. When the AUTO_UPDATE_STATISTICS_ASYNC option is turned on and the query optimizer determines that statistics are outdated, SQL Server uses a background worker thread to update the statistics asynchronously. The current query continues to use the existing statistics, which might not produce the most efficient query plan, but the query can execute quickly because it does not have to wait for the statistics update to finish. Subsequent queries can then make use of the updated statistics after the worker thread finishes. You can use the ALTER DATABASE statement to turn on this option and the is_auto_update_stats_async_on field in the sys.databases catalog view to find out whether the option is on or off.
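The corresponding statements follow the same pattern as the other database options:

```sql
ALTER DATABASE AdventureWorks SET AUTO_UPDATE_STATISTICS_ASYNC ON;
GO
SELECT is_auto_update_stats_async_on
   FROM sys.databases
   WHERE name = 'AdventureWorks';
```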
RECOMPILE and OPTIMIZE FOR Hints
SQL Server supports specifying the WITH RECOMPILE option when creating stored procedures. This option tells SQL Server to discard the stored procedure plan and recompile the stored procedure every time it is invoked. With SQL Server 2000, it was all or nothing; with SQL Server 2005, you can use a RECOMPILE hint on an individual query inside a stored procedure so that only that query is recompiled each time the stored procedure is called, instead of the entire stored procedure.
If you know that a parameter will often have a certain value, you can give this hint to SQL Server by using an OPTIMIZE FOR hint so that SQL Server will generate the most efficient plan, using the value specified.
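As a sketch, both hints appear in the OPTION clause of a query; the statements below are assumed to live inside a stored procedure that declares a @CustomerID parameter, and the table is from the AdventureWorks sample database:

```sql
--Recompile only this statement on every execution:
SELECT * FROM Sales.SalesOrderHeader
   WHERE CustomerID = @CustomerID
   OPTION (RECOMPILE);

--Build the cached plan as if @CustomerID were 100, the value
--the parameter is expected to have most often (100 is illustrative):
SELECT * FROM Sales.SalesOrderHeader
   WHERE CustomerID = @CustomerID
   OPTION (OPTIMIZE FOR (@CustomerID = 100));
```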
NUMA, Hyperthreading, 64-Bit Support, and Indexed Views
To maximize performance on high-end systems (those with eight or more processors), the SQL Server 2005 engine has been architected to leverage non-uniform memory access (NUMA) and hyperthreading. With symmetric multiprocessing (SMP), all memory accesses are posted to the same shared memory bus. This works fine for a relatively small number of CPUs, but the shared bus becomes a bottleneck when a larger number of CPUs compete for access to it. NUMA alleviates these bottlenecks by limiting the number of CPUs on any one memory bus and connecting the various nodes by means of a high-speed interconnect. SQL Server 2005 understands this architecture and does all it can to avoid cross-node memory access over the interconnect in order to reduce contention. NUMA support is discussed in detail in Chapter 2.
SQL Server 2000 64-bit is supported only on Itanium 2 processors. SQL Server 2005 supports both the Itanium 2 and x64 processor platforms.
SQL Server 2005 also includes many improvements for indexed views, extending the list of constructs that can be used in them. Indexed views and other performance improvements are discussed in Chapter 8.
Mirrored Backup Media Sets and Enhanced Error Detection
SQL Server 2005 can back up simultaneously to as many as four mirrored media sets. This increases the reliability of a system by providing extra copies of a backup in case one media set goes bad. In addition, SQL Server 2005 supports database and backup checksums to ensure the reliability of the media set. Finally, unlike in SQL Server 2000, RESTORE VERIFYONLY in SQL Server 2005 performs thorough checks to ensure that the backup is usable and the data is correct.
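A mirrored backup and its verification can be sketched as follows; the file paths are illustrative, and MIRROR TO requires WITH FORMAT when a new media set is created:

```sql
--Write one media set and a mirror of it in a single operation:
BACKUP DATABASE AdventureWorks
   TO DISK = 'C:\Backup\AW_a.bak'
   MIRROR TO DISK = 'D:\Backup\AW_b.bak'
   WITH FORMAT, CHECKSUM;
GO
--Verify either copy, revalidating the checksums:
RESTORE VERIFYONLY
   FROM DISK = 'C:\Backup\AW_a.bak'
   WITH CHECKSUM;
```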
XML Web Services Support
As mentioned earlier, basic XML support was first introduced in SQL Server 2000, providing the capability to retrieve a relational rowset as XML or to turn an XML string into a relational rowset. The XML support was further extended in SQLXML web releases. One of the features in SQLXML web releases is the ability to quickly expose a stored procedure or function as a web service method. If you are not familiar with XML web services, think of them as an integration API that can work over the Internet. The XML web services support in SQL Server 2005 is functionally similar to that provided by SQLXML web services, but it is more efficient, native, and secure, and it does not have any dependency on a web server such as IIS.
SQL Server 2005 allows you to expose stored procedures and functions as web service methods. The web services support in SQL Server 2005 does not require IIS; instead, it uses the kernel-mode HTTP listener (http.sys) that is available on the Windows XP Service Pack 2 and Windows Server 2003 platforms. Hence, native web services in SQL Server 2005 work only on Windows XP Service Pack 2 and Windows Server 2003.
The CREATE ENDPOINT T-SQL statement can be used to define a web service and web methods. After you define a service and web methods that map to stored procedures or functions, you can invoke those methods from any platform or any language over the Internet by just using HTTP and SOAP.
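As a sketch, the following endpoint exposes the usp_GetTaxDetails procedure from the earlier SQLCLR example as a SOAP web method; the endpoint name, path, and site are illustrative:

```sql
CREATE ENDPOINT sql_payroll
   STATE = STARTED
AS HTTP (
   PATH = '/sql/payroll',
   AUTHENTICATION = (INTEGRATED),
   PORTS = (CLEAR),
   SITE = 'localhost')
FOR SOAP (
   WEBMETHOD 'GetTaxDetails'
      (NAME = 'AdventureWorks.dbo.usp_GetTaxDetails'),
   WSDL = DEFAULT,
   DATABASE = 'AdventureWorks');
```

With WSDL = DEFAULT, a client can retrieve the generated WSDL (for example, from http://localhost/sql/payroll?wsdl) and call GetTaxDetails by using any SOAP toolkit.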
Native web services support in SQL Server 2005 is discussed in the second half of Chapter 10.
SQL Server 2005 Express and Mobile Editions
SQL Server 2005 Express Edition is the successor of MSDE, and SQL Server Mobile Edition is the successor of SQL Server CE Edition.
Like MSDE, SQL Server 2005 Express Edition is free, and it is intended to be used for personal or simple data-driven applications. The Express Edition contains several enhancements over MSDE, including graphical tools (such as SQL Server Configuration Manager and Express Manager), XCopy deployment, a 4GB database size limit (up from 2GB), and removal of the concurrent workload governor that restricted scalability under heavy load.
SQL Server Mobile Edition supports more platforms than SQL Server CE, hence the name change. It supports all the mobile devices, including Pocket PCs, Tablet PCs, Embedded CE devices, and smart phones. There are several significant improvements to this edition. Some of the new features introduced in Mobile Edition include integration with Management Studio and Visual Studio .NET 2005; synchronization changes, including configurable compression levels, progress status, column-level sync tracking, and support for multiple subscriptions; multiuser support and SQLCEResultSet for efficient data access; and SSIS support.