Getting Started with the xml Data Type


The xml Data Type

SQL Server 2005 defines a new data type called xml. This data type can be the data type for a column when you create a table. It can also be the data type for a parameter that is passed to a stored procedure or a function. It works in much the same way as the other built-in data types in SQL Server. An XML instance is the value assigned to a variable, parameter, or column value that has an xml data type.

When XML data comes into SQL Server, SQL Server parses it to ensure that it is XML data, based on the SQL ANSI 2003 standard of an XML data type. This means the XML can be an XML document or a fragment containing more than one top-level element, and it can even have top-level text.

After parsing, the data is placed into the xml data type, which provides a logical view of the XML data that has been parsed into binary XML representation of the XQuery data model. The binary XML is used by SQL Server to provide efficient processing of queries.

Using the Schema Collection to Implement "Typed" xml Columns

You can also place a set of XML schemas into the SQL Server Schema Collection. The Schema Collection is a metadata object that is used to manage schema validation information, which can be associated with an xml data type to provide automatic validation of the type. An xml column that has no schema assigned is called an "untyped" xml column; an xml column that has a schema assigned is called a "typed" xml column, as shown in Figure 15-1.

image from book
Figure 15-1: Untyped vs. typed XML data storage to the xml data type

The Schema Collection contains XML schemas that might be related using <xs:import> or might be unrelated. Each typed XML instance specifies the target namespace from the Schema Collection it conforms to. The database engine validates the instance according to its schema during data assignment and modification. To provide efficient processing of typed XML data, the schema information is also used in storage and query optimizations.

Retrieving and Modifying XML Data

You can easily query for the complete XML data instance in the xml column, but the xml data type has five methods that are used to retrieve and modify XML data:

  • query Extracts parts of an XML instance. An XQuery expression is provided that evaluates to a list of XML nodes. The complete subtree root of these nodes is returned in document order. The result type is untyped xml.

  • value Extracts a scalar value from an XML instance. An XQuery expression is provided that evaluates to a node value. This value is cast to the T-SQL type specified as the second argument of the value method.

  • exist Provides an existential check on an XML instance. An XQuery expression is provided that returns 1 if the expression evaluates to a non-null node list; otherwise, it returns 0.

  • nodes Yields instances of a special xml data type, each with its context set to a different node than the XQuery expression evaluates to. The special xml data type supports query, value, nodes, and exist methods and can be used in count(*) aggregations and NULL checks. All other uses result in an error.

  • modify Permits modifying parts of an XML instance, such as adding or deleting subtrees or replacing scalar values (such as replacing the inventory count of a widget from 5 to 4).

Indexing the xml Column

Executing a query processes each XML instance at runtime which can be expensive if the XML instance values are large or the query needs to be evaluated on many rows in a table. In these situations, you can increase performance by implementing an index.

You can create a primary XML index on an xml column, which creates an index on all tags, values, and paths of the XML instances in the xml column. The primary XML index provides efficient evaluation of queries on XML data, and it can efficiently reassemble an XML result from the index while preserving document order and document structure.

The following secondary XML indexes, shown also in Figure 15-2, can also be created on an xml column to improve performance on different types of queries.

image from book
Figure 15-2: The primary index must be created before the secondary indexes are created, and these indexes can increase performance when you work with large XML instances.


Programming Microsoft ADO. NET 2.0 Applications. Advanced Topics
Linux Application Development (2nd Edition)
ISBN: 735621411
EAN: 2147483647
Year: 2004
Pages: 85
Authors: Michael K. Johnson, Erik W. Troan

Similar book on Amazon
Beginning Linux Programming
Beginning Linux Programming
Advanced Programming in the UNIX(R) Environment (Addison-Wesley Professional Computing Series)
Advanced Programming in the UNIX(R) Environment (Addison-Wesley Professional Computing Series)
Tom Swan's GNU C++ for Linux (Professional Dev. Guide)
Tom Swan's GNU C++ for Linux (Professional Dev. Guide) © 2008-2017.
If you may any questions please contact us: