The xml Data Type | Microsoft SQL Server 2005: Changing the Paradigm (SQL Server 2005 Public Beta Edition)

The `xml` Data Type

SQL Server 2005 introduces a new data type, xml, to store XML data inside a relational database. The xml data type can be used for a column in a table, as a variable, as a parameter to a stored procedure or function, and as a function return value. The SQL Server 2005 xml data type implements the ISO SQL-2003 standard xml data type. You can store an entire XML document or XML fragments (XML data without single top-level element) in xml data type columns and variables. By default, SQL Server does not enforce the storing of well-formed or valid XML documents, and it allows the storing of XML fragments in xml data type columns and variables.

As with other data types, you can assign defaults, define column-level and table-level constraints, create XML indexes, and define full-text indexes on xml data type columns. However, there are few restrictions:

SQL Server internally stores XML data as a binary large object (BLOB), and hence the internal representation of XML in a column cannot exceed 2GB. The depth of the XML hierarchy is limited to 128 levels in SQL Server 2005.
The xml data type column cannot be defined as a primary key or foreign key; in addition, the UNIQUE, COLLATE, and RULE constraints cannot be used on xml data type columns.
The xml data type cannot be compared or sorted, except with the IS NULL operator. Hence, you cannot use an xml data type column with an ORDER BY or GROUP BY clause.
An xml data type column cannot be part of a clustered or a nonclustered index. You can define XML indexes on xml type columns, as discussed later in this chapter.
The XML declaration processing instruction (<?xml ... ?>) is not preserved when the instance is stored in the database. XML data is always stored and retrieved using UTF-16 encoding. If an XML document has other processing instructions besides the XML declaration line, they all are preserved.
The order of attributes and insignificant whitespace is not preserved. Also, the single or double quotes around attributes are not preserved; SQL Server always returns attributes enclosed in double quotes.
You cannot cast an xml data type column to either text or ntext. You can cast or convert an xml data type to varchar or nvarchar, including to varchar(max) and nvarchar(max), and then use that to implicitly or explicitly cast to text or ntext.
Only ISNULL, COALESCE, and DATALENGTH scalar functions are supported with xml data type columns.
xml data type columns cannot be used in distributed partitioned views.

Caution

The xml data type is not supported on SQL Server 2005 Mobile Edition. If xml data type columns are synced to the SQL Server Mobile database, they will be converted to the ntext type.

Here is an example of using the xml data type in a table:

USE [AdventureWorks]; GO --User-defined functions used later while creating a table IF OBJECT_ID('dbo.IsXMLFragment') IS NOT NULL    DROP FUNCTION dbo.IsXMLFragment; GO CREATE FUNCTION dbo.IsXMLFragment(@xmlData xml) returns bit BEGIN    RETURN       CASE @xmlData.value('count(/*)', 'bigint')          WHEN 1 THEN (@xmlData.exist('/text()'))          ELSE 1       END END; GO IF OBJECT_ID('dbo.EmpIDPresent') IS NOT NULL    DROP FUNCTION dbo.EmpIDPresent; GO CREATE FUNCTION dbo.EmpIDPresent(@xmlData xml) returns bit BEGIN    RETURN      @xmlData.exist('/Employee/@ID') END; GO IF OBJECT_ID('dbo.tblXMLTest') IS NOT NULL    DROP TABLE dbo.tblXMLTest; GO CREATE TABLE dbo.tblXMLTest    (id INT IDENTITY(1,1) PRIMARY KEY,     col1 XML,     col2 XML NOT NULL,     col3 XML DEFAULT N'<Person />',     col4 XML NOT NULL DEFAULT N'<Address />',     col5 XML CHECK(dbo.IsXMLFragment(col5) = 0),     col6 XML NULL CHECK(dbo.EmpIDPresent(col6) = 1) ); GO

This script is available in the file XMLDataTypeDemo.sql in the code download for this book. This script creates a table with six columns of xml data type. The second column, col2, specifies the NOT NULL constraint; the third column, col3, specifies the column default; the fifth column, col5, specifies a check constraint and uses a user-defined function to ensure that only well-formed XML documents are inserted into the column; and the sixth column, col6, also specifies a check condition and uses a user-defined function to ensure that the XML document or fragment has a top-level element called Employee and an attribute called ID. The exist() and value() methods used inside the user-defined functions, as well as other methods available with the xml data type, are discussed later in this chapter.

In the following script, the first three INSERT statements succeed, and the other four INSERTs fail:

INSERT INTO dbo.tblXMLTest(col2, col5) VALUES  (N'<col2Data />', N'<col5Data />'); INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES  (N'<col2Data />', N'<col5Data />', N'<Employee  />'); INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES  ('<col2Data />', '<col5Data />', '<Employee /><Employee />'); GO SELECT * FROM dbo.tblXMLTest; GO INSERT INTO dbo.tblXMLTest(col5, col6) VALUES  (N'<col5Data />', N'<Employee  />'); INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES  (N'<col2Data />', N'<col5Data /><col5Data />', N'<Employee  />'); INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES  (N'<col2Data />', N'<col5Data />', N'<Employee/>'); INSERT INTO dbo.tblXMLTest(col2, col4, col5) VALUES  (N'<col2Data />', NULL, N'<col5Data />'); GO

Of the four failed INSERTs toward the end, the first one fails because col2 does not allow NULLs, the second INSERT fails because col5 does not allow XML fragments, the third one fails because col6 does not have XML with the Employee top-level element having an ID attribute, and the final one fails because col4 is non-NULLable.

The following script shows how to declare a variable of the xml data type:

DECLARE @var1 XML SET @var1 = '<TestVariables />' INSERT INTO dbo.tblXMLTest(col2, col5) VALUES  (@var1, @var1); SELECT * FROM dbo.tblXMLTest; GO

The following script block illustrates how to pass an xml data type as a function parameter to implement a technique called "property promotion," where some value from XML is copied to a separate table column, so that you don't have to necessarily access and parse XML to access that value:

CREATE FUNCTION dbo.fn_GetID (@empData XML) RETURNS INT AS BEGIN    DECLARE @retVal INT    -- do some more things here on the input XML    SELECT @retVal = @empData.value('(/Employee/@ID)[1]', 'INT')    RETURN @retVal END; GO SELECT dbo.fn_GetID(col6) FROM dbo.tblXMLTest; ALTER TABLE dbo.tblXMLTest ADD empNo AS dbo.fn_GetID(col6); INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES  ('<col2Data />', '<col5Data />', '<Employee />'); GO SELECT * FROM dbo.tblXMLTest; GO IF OBJECT_ID('dbo.tblXMLTest') IS NOT NULL    DROP TABLE dbo.tblXMLTest; GO IF OBJECT_ID('dbo.fn_GetID') IS NOT NULL    DROP FUNCTION dbo.fn_GetID; GO IF OBJECT_ID('dbo.IsXMLFragment') IS NOT NULL    DROP FUNCTION dbo.IsXMLFragment; GO IF OBJECT_ID('dbo.EmpIDPresent') IS NOT NULL    DROP FUNCTION dbo.EmpIDPresent; GO

This script begins with a user-defined function that accepts an xml data type parameter, uses the value function on the input XML parameter, and returns an integer value. Next, the ALTER TABLE statement adds a new computed column whose value is the integer returned by the fn_GetID function, passing the col6 column to it.

Validating XML Data by Using Typed XML

So far you have seen examples of storing untyped XML documents and XML fragments into xml data type columns, variables, and parameters. SQL Server 2005 supports associating XSD schemas with the xml data type so that the XML value for the column, variable, or parameter adheres to the structure and data types defined by the associated XSD schema. Having SQL Server validate the xml data type column, variable, or parameter by associating it with an XML schema collection is known as typed XML.

With typed XML, you can associate an XSD schema collection with the xml data type column, and the engine validates against the associated schema and generates an error if the validation fails. The typed XML has two main benefits. First, it ensures that the XML data in the column, variable, or parameter is according to the schema you desire. Second, it helps the engine to optimize storage and query processing. Also, when declaring the typed XML, you can specify the DOCUMENT clause, which ensures that the XML has only one top-level element and hence XML fragments are disallowed. The default is CONTENT, which allows XML fragments.

Tip

For performance reasons, it is recommended that you use typed XML. With untyped XML, the node values are stored as strings, and hence the engine has to do the data conversion when you extract the XML values or use node values in the predicate (for example, /person/age < 50); on the other hand, with typed XML, no data conversion takes place because the XML data values are internally stored based on types declared in the XSD schema. Typed XML makes parsing more efficient and avoids any runtime conversions.

Here's an example of creating typed XML:

1.	Make sure you have an XSD schema in a XML schema collection. SQL Server 2005 provides the `CREATE`, `ALTER`, and `DROP XML SCHEMA COLLECTION` DDL statements, which can be used to manage the XML schema collections in a database. When you have your XSD schema in the database as part of an XML schema collection, it can be associated with any number of `xml` data type columns or variables to create typed XML. The following T-SQL statements create a schema collection called `SampleXSDs` that contains an XSD schema which defines the purchase order: USE AdventureWorks; GO IF EXISTS (SELECT schema_id FROM sys.xml_schema_collections WHERE name = 'SampleXSDs') BEGIN RAISERROR('Schema collection named SampleXSDs already exists.', 16, 1); RETURN END; CREATE XML SCHEMA COLLECTION SampleXSDs AS N'<?xml version="1.0" encoding="UTF-16"?> <xsd:schema xmlns:xsd = "http://www.w3.org/2001/XMLSchema" targetNamespace = "http://schemas.sams.com/PO" elementFormDefault="qualified" attributeFormDefault="unqualified" xmlns = "http://schemas.sams.com/PO"> <xsd:complexType name="USAddress"> <xsd:sequence> <xsd:element name="name" type="xsd:string"/> <xsd:element name="street" type="xsd:string"/> <xsd:element name="city" type="xsd:string"/> <xsd:element name="state" type="xsd:string"/> <xsd:element name="zip" type="xsd:decimal"/> </xsd:sequence> <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/> </xsd:complexType> <xsd:complexType name="Items"> <xsd:sequence> <xsd:element name="item" minOccurs="0" maxOccurs="unbounded"> <xsd:complexType> <xsd:sequence> <xsd:element name="productName" type="xsd:string"/> <xsd:element name="quantity"> <xsd:simpleType> <xsd:restriction base="xsd:positiveInteger"> <xsd:maxExclusive value="100"/> </xsd:restriction> </xsd:simpleType> </xsd:element> <xsd:element name="USPrice" type="xsd:decimal"/> <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> <xsd:complexType name="PurchaseOrderType"> <xsd:sequence> <xsd:element name="shipTo" type="USAddress"/> <xsd:element name="billTo" type="USAddress"/> <xsd:element name="items" type="Items"/> </xsd:sequence> <xsd:attribute name="orderDate" type="xsd:date"/> </xsd:complexType> <xsd:element name="purchaseOrder" type="PurchaseOrderType"/> </xsd:schema>' The XSD schema first defines a type called `USAddress`, and then it defines a type called `Items`, and finally it defines a type called `PurchaseOrderType`, which makes use of the `USAddress` and `Items` types. The XSD schema defines a single root element called `purchaseOrder` that is of type `PurchaseOrderType`. Before creating a new schema collection, the script uses the `sys.xml_schema_collections` catalog view to check whether a schema collection with that name already exists. In this example, it raises an error and returns; alternatively, you can run `ALTER XML SCHEMA COLLECTION` and add an XSD schema to this existing schema collection.
2.	After you register an XSD schema in a schema collection, SQL Server 2005 stores the schema metadata in various system tables. You can use the XML catalog views to view the XSD schema collection details. Execute the following T-SQL batch to see the details of an XSD schema in the XML schema collection created in step 1: DECLARE @collection_id INT SELECT @collection_id = xml_collection_id FROM sys.xml_schema_collections WHERE name = 'SampleXSDs'; SELECT * FROM sys.xml_schema_namespaces WHERE xml_collection_id = @collection_id; SELECT * FROM sys.xml_schema_elements WHERE xml_collection_id = @collection_id; SELECT * FROM sys.xml_ schema_attributes WHERE xml_collection_id = @collection_id; SELECT * FROM sys.xml_ schema_types WHERE xml_collection_id = @collection_id; SELECT XML_SCHEMA_NAMESPACE('dbo', 'SampleXSDs'); For each XML schema collection, there is an entry in the `sys.xml_schema_ collections` catalog view. One schema collection can contain multiple XSD schemas that have different target namespaces. For each such schema, there is an entry in the `sys.xml_schema_namespaces` catalog view. The types, elements, attributes, and so on are available via the catalog views `sys.xml_schema_types`, `sys.xml_schema_elements`, `sys.xml_schema_attributes`, and so on. The last statement shows how to view the registered XSD schema by using the `XML_SCHEMA_ NAMESPACE` function. Other schema collection catalog views include `sys.xml_schema_components`, `sys.xml_schema_facets`, `sys.xml_schema_model_groups`, `sys.xml_schema_wildcards`, and `sys.xml_schema_wildcard_namespaces`.
3.	When an XSD schema is available in an XML schema collection, associate the schema with the XML column to yield typed XML: IF OBJECT_ID('dbo.tblTypedXML') IS NOT NULL DROP TABLE dbo.tblTypedXML; GO CREATE TABLE dbo.tblTypedXML (id int IDENTITY(1,1) PRIMARY KEY, col1 XML (dbo.SampleXSDs)); GO Note XML schema collections are database scoped and cannot be referenced across databases. Also, unlike temporary tables, there is no notion of temporary XML schema collections. Therefore, even if you precede a schema collection name with # or ##, it is still created in the current database and not in the `tempdb` database. If you are creating a temporary table that contains a typed XML column, the schema collection must first be created in the `tempdb` database. The first INSERT statement in the following T-SQL batch succeeds because it validates with the XSD schema specified for the typed XML column, but the second `INSERT` fails because it is missing the `billTo` element: INSERT INTO dbo.tblTypedXML (col1) VALUES (N'<purchaseOrder orderDate="2005-03-03Z" xmlns="http://schemas.sams.com/PO"> <shipTo country="US"> <name>Alice Smith</name> <street>123 Maple Street</street> <city>Mill Valley</city> <state>CA</state> <zip>90952</zip> </shipTo> <billTo country="US"> <name>Robert Smith</name> <street>8 Oak Avenue</street> <city>Old Town</city> <state>PA</state> <zip>95819</zip> </billTo> <items> <item> <productName>Lawnmower</productName> <quantity>1</quantity> <USPrice>148.95</USPrice> </item> </items> </purchaseOrder> '); PRINT 'Following statement will fail:' INSERT INTO dbo.tblTypedXML (col1) VALUES (N'<purchaseOrder orderDate="2005-03-03Z" xmlns="http://schemas.sams.com/PO"> <shipTo country="US"> <name>Alice Smith</name> <street>123 Maple Street</street> <city>Mill Valley</city> <state>CA</state> <zip>90952</zip> </shipTo> <items> <item> <productName>Lawnmower</productName> <quantity>1</quantity> <USPrice>148.95</USPrice> </item> </items> </purchaseOrder> '); GO
4.	Finally, clean up: IF OBJECT_ID('dbo.tblTypedXML') IS NOT NULL DROP TABLE dbo.tblTypedXML; GO IF EXISTS (SELECT schema_id FROM sys.xml_schema_collections WHERE name = 'SampleXSDs') BEGIN DROP XML SCHEMA COLLECTION SampleXSDs END;

Bulk Loading XML Data

If you have XML data in a disk file and would like to load it into an xml data type column, you can use the BULK rowset provider with the OPENROWSET function and specify the SINGLE_CLOB option to read the entire file as a single-row, single-varchar(max) column value.

Let's assume that you have the following XML text saved into a disk file called "c:\PO.xml" on the SQL Server machine:

<purchaseOrder orderDate="1999-10-20" xmlns="http://schemas.sams.com/PO">    <shipTo country="US">       <name>Alice Smith</name>       <street>123 Maple Street</street>       <city>Mill Valley</city>       <state>CA</state>       <zip>90952</zip>    </shipTo>    <billTo country="US">       <name>Robert Smith</name>       <street>8 Oak Avenue</street>       <city>Old Town</city>       <state>PA</state>       <zip>95819</zip>    </billTo>    <items>       <item>          <productName>Lawnmower</productName>          <quantity>1</quantity>          <USPrice>148.95</USPrice>       </item>    </items> </purchaseOrder>

The following T-SQL statements illustrate creating a table with an untyped xml data type column and bulk loading the preceding XML file into this column:

USE AdventureWorks; GO IF OBJECT_ID('dbo.tblBulkLoadXML') IS NOT NULL    DROP TABLE dbo.tblBulkLoadXML; GO CREATE TABLE dbo.tblBulkLoadXML  (id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,   POData XML NOT NULL); GO INSERT INTO dbo.tblBulkLoadXML (POData)    SELECT * FROM    OPENROWSET(BULK 'c:\PO.xml', SINGLE_CLOB) AS POData; GO SELECT * FROM dbo.tblBulkLoadXML;

Note

You can use the technique discussed here for using the OPENROWSET function to read XML from a file. However, SQL Server 2005 does not provide any way to write the XML data from a XML column to a file. You can write a SQLCLR managed procedure to write XML into the file. Chapter 11, "SQL Server 2005 and .NET Integration," contains the C# code for this.

Before we look at other functions to query and modify the xml data type columns and variables, let's first take a look at a quick overview of the XQuery specification.

Introduction to XQuery

If SQL is used to query relational data, XQuery is a language that is used to query data that is either physically stored as XML or virtualized as XML. In addition, XQuery also allows general processing of XML (such as creating nodes). XQuery borrows a lot of features from XPath. XQuery 1.0 is said to be an extension to XPath 2.0, adding support for better iteration, sorting of results, and construction (that is, the ability to construct the shape of the desired XML). In summary, XQuery is an expression-based declarative query language that is used to efficiently extract data from XML documents. In SQL Server 2005, the data stored in xml data type columns and variables can be queried using XQuery. SQL Server 2005 partially implements the W3C XQuery specification (see www.w3.org/TR/xquery) and is aligned with the July 2004 working draft (see www.w3.org/TR/2004/WD-xquery-20040723/).

An XQuery query consists of two parts: the prolog and the query body. The optional prolog section is used to declare the namespaces used in the query, and the required query body consists of an expression that is evaluated by the SQL Server 2005 engine to produce the desired output. Each XQuery prolog entry must end with a semicolon (;).

These are the three most common expression types used in XQuery queries:

Path expressions Exactly as in XPath, the path expressions are used to locate nodes within an XML tree. The path expression (for example, /purchaseOrder/items/item[5]/text()) may consists of steps (separated by / or //), filter steps, axis steps, predicates, a node test, and a name test.
FLWOR expressions FLWOR, which stands for "for-let-where-order by-return," is pronounced "flower," and it is the core XQuery expression that allows looping, variable binding, sorting, filtering, and returning of results. SQL Server 2005 does not support the LET construct.
Constructors As mentioned earlier, in addition to querying XML data, XQuery allows the creating of XML nodes. There are two approaches to creating XML nodes in XQuery. The "direct" approach involves directly writing XML text or using expressions that produce XML inside curly braces ({}). The other approach, called "computed," involves the use of keywords such as element, attribute, document, text, processing-instruction, comment, or namespace to create the respective nodes. SQL Server 2005 supports both approaches.

SQL Server 2005 primarily provides three xml data type methods that can be used to run XQuery queries. These methods are query(), value(), and exist(). The other two xml type methods are nodes(), which is used to shred xml type instance into relational data, much like OPENXML, and the modify() method, which is used to modify the content of an xml type column or a variable.

XML Type Methods

The xml data type supports five methods that you can use to manipulate XML instances; you can call these methods by using xmltype.method() syntax:

The query() method allows users to run an XQuery expression to obtain a list of XML nodes as an instance of untyped XML.
The value() method is useful for extracting scalar values from XML documents.
The exist() method is useful for evaluating XQuery expressions to determine whether the expression results in an empty or nonempty set. This method returns 1 for a nonempty result, 0 for an empty result, and NULL if the XML instance itself is NULL.
The modify() method is useful for modifying the content of an XML document.
The nodes() method is useful for decomposing an XML document into relational data.

Let's begin with an example of the query() method:

USE AdventureWorks; GO SELECT Name, Demographics.query('       declare namespace d=   "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";     d:StoreSurvey/d:AnnualSales')     AS AnnualSales FROM Sales.Store; GO

This T-SQL statement illustrates the query() method. The parameter to the query() method is an XQuery query that begins with a namespace declaration in the prolog section and a path expression as the query body. This SELECT statement queries the Demographics xml data type column to find out each store's annual sales. Note that the query() method returns the untyped XML as the result.

You can also declare namespaces by using the WITH XMLNAMESPACES clause, as shown here:

WITH XMLNAMESPACES (   'http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey'    AS "d") SELECT Name, Demographics.query('d:StoreSurvey/d:AnnualSales') AS AnnualSales FROM Sales.Store; GO

You can use the value() method to get the node data as a scalar value:

SELECT Name, Demographics.value('     declare namespace d=    (d:StoreSurvey/d:AnnualSales)[1]', 'decimal(18,2)')    AS AnnualSales FROM Sales.Store; GO

The value() method takes two parameters: an XQuery expression string and a string that indicates the type of returned scalar value (decimal(18,2), in this example). The second parameter cannot be specified as an xml data type, a CLR user-defined type, or an image, text, ntext, timestamp, or sql_variant data type. The XQuery expression must return a singleton or an empty sequence.

The exist() method takes just one parameterthe XQuery expressionand returns either 1 (indicating trUE), 0 (indicating FALSE), or NULL (indicating that the xml data type instance was NULL). This method is generally used in the WHERE clause. Let's say you want to get a list of stores that have T3 internet lines. One of the ways in which you can do this is by running the following query:

SELECT * FROM Sales.Store WHERE Demographics.value(' declare namespace d= "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"; ""  (/d:StoreSurvey/d:Internet)[1]', 'varchar(20)') = 'T3'; GO

However, this is not an efficient approach because SQL Server has to retrieve the node value, do the conversion (Unicode to ANSI, in this case), and then do the comparison. An alternate and better approach is to use the exist() method as shown here:

SELECT * FROM Sales.Store WHERE Demographics.exist(' declare namespace d=   "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"; ""  /d:StoreSurvey[d:Internet = "T3"]') = 1; GO

The exist() method exploits the PATH and other XML indexes (discussed later in this chapter) more effectively and can yield better performance than the value() method.

Before concluding this section, here are some more examples that show the capabilities of XQuery. The first two queries illustrate the FLWOR expressions, and the next two queries illustrate constructing XML using XQuery:

[View full width]
--FLWOR Example 1 SELECT EmployeeID, Resume.query('    declare namespace     r="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";""    for $emp in /r:Resume/r:Employment    order by $emp/r:Emp.OrgName    return concat(string($emp/r:Emp.OrgName), "~") ') AS PrevEmployers FROM HumanResources.JobCandidate WHERE EmployeeID IS NOT NULL --FLWOR Example 2 SELECT name,    Instructions.query('    declare namespace i="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works /ProductModelManuInstructions";    for $step in //i:step    where count($step/i:tool) > 0 and count($step/i:material) > 0    return $step   ') AS StepWithToolAndMaterialReq FROM Production.ProductModel WHERE Instructions IS NOT NULL --Constructing XML  direct approach SELECT FirstName, LastName, AdditionalContactInfo.query(' declare namespace a= "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactTypes";"" <PhoneNumbers> {   for $p in //a:number   return <Number>{string($p)}</Number> } </PhoneNumbers> ') AS AdditionalPhoneNumbers FROM Person.Contact WHERE AdditionalContactInfo IS NOT NULL; --Constructing XML  computed approach SELECT FirstName, LastName, AdditionalContactInfo.query(' declare namespace a= "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactTypes";"" element PhoneNumbers {   for $p in //a:number   return    element Number {string($p)}  } ') AS AdditionalPhoneNumbers FROM Person.Contact WHERE AdditionalContactInfo IS NOT NULL;

The first query in this batch illustrates the FLWOR expression. The $emp looping variable is bound to the Employment elements under the Resume top-level elements, and for each such element, the XQuery expression sorts and returns the organization names, separated by tilde characters (~). The second FLWOR example, which illustrates the where clause, and returns the steps that have both material and tool nodes. The final two examples show constructing XML directly and by using computed approach.

XQuery in SQL Server 2005 supports various types of functions and operators, including arithmetic, comparison, and logical operators and the string manipulation, data accessors, and aggregate functions. XQuery expressions also support the if-then-else construct that you can use to perform operations based on the value of a conditional expression. The sql:column() and sql:variable() functions can be used inside XQuery expressions to access a non-XML relational column and an external variable.

SQL Profiler includes a trace event called XQuery Static Type under the TSQL event category that you can use to trace XQuery activity. The event provides a method name, including the column on which the method was executed and the static type of the XQuery expression. To use it, you run Profiler, select the XQuery Static Type event, and run the preceding queries. If you do not see the events in Profiler, you can run DBCC FREEPROCCACHE and then run the XQuery queries again.

In summary, XQuery can be used for querying and reshaping the data stored in xml data type columns and variables. Processing XML at the server by using XQuery can result in reduced network traffic, better maintainability, reduced risk, and increased performance.

Indexes on XML Type Columns

Properly designed indexes are the key to improving query performance. At the same time, you need to consider the cost of maintaining indexes in measuring the overall benefits of the indexes created. As mentioned earlier, SQL Server 2005 stores XML data as a BLOB, and every time an XML column is queried, SQL Server parses and shreds the XML BLOB at runtime to evaluate the query. This can be quite an expensive operation, especially if you have large XML data stored in the column or if you have a large number of rows. To improve the performance of queries on xml data type columns, SQL Server 2005 provides two new types of indexes: primary XML indexes and secondary XML indexes. You can create XML indexes on typed or untyped XML columns.

You can create primary XML indexes by using the CREATE PRIMARY XML INDEX DDL statement. This type of index is essentially a shredded and persisted representation of XML BLOB data. For each XML BLOB in the column, the primary index creates several rows of data in an internal table. This results in improved performance during query execution time because there is no shredding and parsing involved at runtime. The primary XML index requires a clustered index on the primary key of the base table in which the XML column is defined. If the base table is partitioned, the primary XML index is also partitioned the same way, using the same partitioning function and partitioning scheme. After a primary XML index is created on a table, you cannot change the primary key for that table unless you drop all the XML indexes on that table.

All the primary XML index does is avoid the runtime shredding. After you analyze your workload, you can create secondary XML indexes to further improve the performance of an XML query.

A secondary XML index cannot be created unless you have a primary XML index. There are three types of secondary XML indexesPATH, VALUE, and PROPERTYeach designed for improving the response time for the respective type of query. You can create secondary XML indexes by using the CREATE XML INDEX DDL statement.

If your XML queries make use of path expressions, such as /Production/Product/Material, the PATH secondary XML index can improve the performance of such queries. In most cases, the exist() method on XML columns in a WHERE clause benefits the most from the PATH indexes. The PATH index builds a B+ tree on the path/value pair of each XML node in the document order across all XML instances in the column.

If your XML queries are based on node values and do not necessarily specify the full path or use wildcards in the path (for example, //Sales[@amount > 10000] or //Catalog[@* = "No"]), the VALUE secondary XML index can improve the performance of such queries. The VALUE index creates a B+ tree on the value/path pair of each node in the document order across all XML instances in the XML column.

If your XML queries retrieve multiple values from individual XML instances, the PROPERTY XML index might benefit from using such queries because it groups the properties for each XML together.

Tip

You can include the word Primary, Path, Property, or Value in the name of a primary or secondary XML index that you create to quickly identify the type of XML index.

The following T-SQL batch illustrates creating primary and secondary XML indexes and then using the catalog views and functions to view the XML index details:

USE AdventureWorks; GO IF OBJECT_ID('dbo.tblIndexDemo') IS NOT NULL    DROP TABLE dbo.tblIndexDemo; GO CREATE TABLE dbo.tblIndexDemo    (ID INT IDENTITY(1,1) PRIMARY KEY NOT NULL,     col1 XML NOT NULL); GO CREATE PRIMARY XML INDEX PrimaryXMLIdx_col1 ON dbo.tblIndexDemo(col1); GO CREATE XML INDEX PathXMLIdx_col1 ON dbo.tblIndexDemo(col1) USING XML INDEX PrimaryXMLIdx_col1 FOR PATH; GO CREATE XML INDEX PropertyXMLIdx_col1 ON dbo.tblIndexDemo(col1) USING XML INDEX PrimaryXMLIdx_col1 FOR PROPERTY; GO CREATE XML INDEX ValueXMLIdx_col1 ON dbo.tblIndexDemo(col1) USING XML INDEX PrimaryXMLIdx_col1 FOR VALUE; GO --Get Index Information SELECT * FROM sys.xml_indexes    WHERE [object_id] = OBJECT_ID('dbo.tblIndexDemo'); --Cleanup IF OBJECT_ID('dbo.tblIndexDemo') IS NOT NULL    DROP TABLE dbo.tblIndexDemo; GO

This script creates a table that has a column of xml data type. It then creates a primary XML index on this column, followed by all three types of secondary indexes on the same column. You can use the sys.xml_indexes catalog view to view the details on XML indexes. This catalog view indicates whether an XML index is a primary or secondary index; if it is secondary, the view indicates what primary index it is based on and what type (PATH, PROPERTY, or VALUE) of secondary index it is.

Tip

SQL Server 2005 allows you to create full-text indexes on xml data type columns. You can combine a full-text search with XML index usage in some scenarios to first use fulltext indexes to filter the rows and then use XML indexes on those filtered rows, in order to improve the query response time. However, note that attribute values are not full-text indexed as they are considered part of the markup, which is not full-text indexed.

Modifying XML Data

SQL Server 2005 provides the modify() method, which you can use to change parts of XML content stored in an xml data type column or variable. When you change the xml type table column, the modify() method can only be called within a SET clause in an UPDATE statement; when you change the xml type variable, the modify() method can only be called by using the SET T-SQL statement. The modify() function can be used to insert one or more nodes, to delete nodes, or to update the value of a node. This function takes an XML Data Modification Language (XML DML) expression, which is an extension to the XQuery specification. XQuery 1.0 does not support the update functionality. Hence, SQL Server 2005 introduces an extension to XQuery 1.0 by including three new casesensitive keywords"insert", "delete", and "replace value of"that you can use inside an XQuery query to change parts of the XML data.

The following T-SQL script shows an example of the modify() method and XML DML:

DECLARE @xmlVar xml SET @xmlVar = N'   <Person>     <Phone type="h">111-111-1111</Phone>     <Phone type="c">222-222-2222</Phone>   </Person> '; SELECT @xmlVar; --insert SET @xmlVar.modify(   'insert <Phone type="w">333-333-3333</Phone>    into (/Person)[1]'); --delete SET @xmlVar.modify(   'delete /Person/Phone[@type="c"]'); --change node value SET @xmlVar.modify(   'replace value of (/Person/Phone[@type="h"]/text())[1]   with "444-444-4444"'); SELECT @xmlVar; GO

At the end, the @xmlVar XML variable has the following value:

<Person>   <Phone type="h">444-444-4444</Phone>   <Phone type="w">333-333-3333</Phone> </Person>