XML Data Type


One of the major new features of SQL Server 2005 is the XML data type. With older versions of SQL Server, XML data has been stored inside a string or a blob. Now XML is a supported data type that allows you to combine SQL queries with XQuery expressions to search within XML data. An XML data type can be used as a variable, a parameter, a column, or a return value from a UDF.

With Office 2003, it is possible to store Word and Excel documents as XML. Word and Excel also support using custom XML schemas, where only the content (and not the presentation) is stored with XML. The output of Office applications can be stored directly in SQL Server 2005, where it is possible to search within this data. Of course, custom XML data can also be stored in SQL Server.

Important 

Don’t use XML types for relational data. If you do a search for some of the elements and if the schema is clearly defined for the data, storing these elements in a relational fashion allows the data to be accessed faster. If the data is hierarchical and some elements are optional and may change over time, storing XML data has many advantages.

Tables with XML Data

Creating tables with XML data is as simple as selecting the Xml data type with a column. The following CREATE TABLE SQL command creates the Exams table with a column ID that is also the primary key and the column Exam, which is of type xml:

  CREATE TABLE [dbo].[Exams](    [Id] [int] IDENTITY(1,1) NOT NULL,    [Number] [nchar] (10) NOT NULL,    [Info] [xml] NOT NULL,    CONSTRAINT [PK_Exams] PRIMARY KEY CLUSTERED    (       [Id] ASC    ) ON [PRIMARY] ) ON [PRIMARY] 

For a simple test, the table is filled with this data:

  INSERT INTO Exams values('70-536',   '<Exam Number="70-536">     <Title>TS: Microsoft .NET Framework 2.0 – Application Development Foundation     </Title>     <Certification Name="MCTS Windows Applications" Status="Core" />     <Certification Name="MCTS Web Applications" Status="Core" />     <Certification Name="MCTS Distributed Applications" Status="Core" />     <Topic>Developing applications that use system types and collections</Topic>     <Topic>Implementing service processes, threading, and application domains     </Topic>     <Topic>Embedding configuration, diagnostics, management, and installation features     </Topic>     <Topic>Implementing serialization and input/output functionality</Topic>     <Topic>Improving the security</Topic>     <Topic>Implementing interoperability, reflection, and mailing functionality     </Topic>     <Topic>Implementing globalization, drawing, and text manipulation functionality     </Topic>   </Exam>') INSERT INTO Exams values('70-528',   '<Exam Number="70-528">    <Title>TS: Microsoft .NET Framework – Web-Based Client Development</Title>    <Certification Name="MCTS Web Applications" Status="Core" />    <Course>2541</Course>    <Course>2542</Course>    <Course>2543</Course>    <Course>2544</Course>    <Topic>Creating and Programming a Web Application</Topic>    <Topic>Integrating Data in a Web Application by using ADO.NET, XML, and    Data-Bound Controls</Topic>    <Topic>Creating Custom Web Controls</Topic>    <Topic>Tracing, Configuring, and Deploying Applications</Topic>    <Topic>Customizing and Personalizing a Web Application</Topic>    <Topic>Implementing Authentication and Authorization</Topic>    <Topic>Creating ASP.NET Mobile Web Applications</Topic>   </Exam>') INSERT INTO Exams values('70-526',   '<Exam Number="70-526">     <Title>TS: Microsoft .NET Framework 2.0 – Windows-Based Client Development     </Title>     <Certification Name="MCTS Windows Applications" Status="Core" />     <Course>2541</Course>     <Course>2542</Course>     <Course>2546</Course>     <Course>2547</Course>     <Topic>Creating a UI for a Windows Forms Application by Using Standard Controls     </Topic>     <Topic>Integrating Data in a Windows Forms Application</Topic>     <Topic>Implementing Printing and Reporting Functionality</Topic>     <Topic>Enhancing Usability</Topic>     <Topic>Implementing Asynchronous Programming Techniques to Improve the User     Experience</Topic>     <Topic>Developing Windows Forms Controls</Topic>     <Topic>Configuring and Deploying Applications</Topic>   </Exam>') 

You can read the XML data with ADO.NET using a SqlDataReader object. The SqlDataReader method GetSqlXml() returns a SqlXml object. The SqlXml class has a property Value that returns the complete XML representation and a CreateReader() method that returns a XmlReader object.

Tip 

Using the XmlReader class is discussed in Chapter 26, “Manipulating XML.”

  using System; using System.Data; using System.Data.SqlClient; using System.Data.SqlTypes; using System.Xml;    class Program    {       static void Main()       {          string connectionString =             @"server=(local);database=ProCSharp;trusted_connection=true";          SqlConnection connection = new SqlConnection(connectionString);          SqlCommand command = connection.CreateCommand();          command.CommandText = "SELECT Id, Number, Info FROM Exams";          connection.Open();          SqlDataReader reader = command.ExecuteReader(                CommandBehavior.CloseConnection);          while (reader.Read())          {             SqlXml xml = reader.GetSqlXml(2);             Console.WriteLine(xml.Value);             XmlReader xmlReader = xml.CreateReader();             // use the reader          }          reader.Close();       }    } 

Query of Data

Up to now, you haven’t seen the really great features of the XML data type. SQL SELECT statements can be combined with XML XQuery.

A SELECT statement combined with an XQuery expression to read into the XML value is shown here:

  SELECT [Id], [Number], [Info].query('/Exam/Course') AS Course FROM [Exams] 

The XQuery expression /Exam/Course accesses the Course elements that are children of the Exam element. The result of this query returns the IDs, exam numbers, and courses. Exam 70-536 does not return any courses, as there are no courses listed for this exam.

 1 70-536 2 70-528 <Course>2541</Course><Course>2542</Course><Course>2543</Course>             <Course>2544</Course> 3 70-526 <Course>2541</Course><Course>2542</Course><Course>2546</Course>             <Course>2547</Course>

With an XQuery expression, you can create more complex statements to query data within the XML content of a cell. The next example converts the XML from the exam information to XML that lists information about courses:

  SELECT [Info].query('    for $course in /Exam/Course    return <Course>    <Exam>{ data(/Exam[1]/@Number) }</Exam>    <Number>{ data($course) }</Number> </Course>') AS Course FROM [Exams] WHERE Id=2 

Here, just a single row is selected with SELECT [Info]/... FROM Exams WHERE Id = 2. With the result of this SQL query, the for and return statements of an XQuery expression are used. for $course in /Exam/Course iterates through all Course elements. $course declares a variable that is set with every iteration (similar to a C# foreach statement). Following the return statement, the result of the query for every row is defined. The result for every course element is surrounded by the <Course> element. Embedded inside the <Course> element are <Exam> and <Number>. The text within the <Exam> element is defined with data(/Exam[1]/@Number). data() is an XQuery function that returns the value of the node specified with the argument. The node /Exam[1] is used to access the first <Exam> element; @Number specifies the XML attribute Number. The text within the element <Number> is defined from the variable $course.

Tip 

Contrary to C#, where the first element in a collection is accessed with an index of 0, with XPath the first element in a collection is accessed with an index of 1.

The result of this query is shown here:

 <Course>   <Exam>70-528</Exam>   <Number>2541</Number> </Course> <Course>   <Exam>70-528</Exam>   <Number>2542</Number> </Course> <Course   <Exam>70-528</Exam>   <Number>2543</Number> </Course> <Course>   <Exam>70-528</Exam>   <Number>2544</Number> </Course>

You can change the XQuery statement to also include a where clause for filtering XML elements. The following example only returns courses from the XML column if the course number has a value higher than 2542:

  SELECT [Info].query('    for $course in /Exam/Course    where ($course > 2542)    return <Course>    <Exam>{ data(/Exam[1]/@Number) }</Exam>    <Number>{ data($course) }</Number> </Course>') AS Course FROM [Exams] WHERE Id=2 

The result is reduced to just two course numbers:

 <Course   <Exam>70-528</Exam>   <Number>2543</Number> </Course> <Course>   <Exam>70-528</Exam>   <Number>2544</Number> </Course>

XQuery in SQL Server allows using several other XQuery functions for getting minimum, maximum, or summary values, working with strings, numbers, checking for positions within collections, and so on.

The next example shows the use of the count() function to get the number of /Exam/Course elements:

  SELECT [Id], [Number], [Info].query('    count(/Exam/Course)')    AS CourseCount FROM [Exams] 

The data returned displays the number of courses for the exams:

 Id   Number   CourseCount 1    70-536   0 2    70-528   4 3    70-526   4

XML Data Modification Language (XML DML)

XQuery as it is defined by the W3C (http://www.w3c.org) only allows querying of data. Because of this XQuery restriction, Microsoft defined an extension to XQuery that has the name XML Data Modification Language (XML DML). With XML DML it is possible to modify XML data. For this the following keywords extend XQuery: insert, delete, and replace value of.

This section looks at some examples to insert, delete, and modify XML contents within a cell.

You can use the insert keyword to insert some XML content within an XML column without replacing the complete XML cell. Here, <Course>2555</Course> is inserted as the last child element of the first Exam element:

  UPDATE [Exams] SET [Info].modify('    insert <Course>2555</Course> as last into Exam[1]') WHERE [Id]=3 

XML content can be deleted with the delete keyword. Within the first Exam element, the last Course element is deleted. The last element is selected by using the last() function.

  UPDATE [Exams] SET [Info].modify('    delete /Exam[1]/Course[last()]') FROM [Exams] WHERE [Id]=3 

It is also possible to change XML content. Here, the keyword replace value of is used. The expression /Exam/Course[text() = 2543] accesses only the child elements Course where the text content contains the string 2543. From these elements, only the text content is accessed for replacement with the text() function. If only a single element is returned from the query, it is still required that you specify just one element for replacement. This is why explicitly the first text element returned is specified with [1]. with 2599 specifies that the new course number is 2599:

  UPDATE [Exams] SET [Info].modify('    replace value of (/Exam/Course[text() = 2543]/text())[1] with 2599') FROM [Exams] 

XML Indexes

If some specific elements are often searched within the XML data, you can specify indexes within the XML data type. XML indexes must be distinguished as being a primary or a secondary XML index type. A primary XML index is created for the complete persisted representation of the XML value.

The following SQL command CREATE PRIMARY XML INDEX creates the index idx_exams on the Info column:

  CREATE PRIMARY XML INDEX idx_exams on Exams (Info) 

Primary indexes don’t help if the query contains an XPath expression to directly access XML elements of the XML type. For XPath and XQuery expressions, XML secondary indexes can be used. If an XML secondary index is created, the primary index must already exist. With secondary indexes, these index types must be distinguished:

  • PATH index

  • VALUE index

  • PROPERTY index

A PATH index is used if exists() or query() functions are used and XML elements are accessed with an XPath expression. Using the XPath expression /Exam/Course, it might be useful to do a PATH index:

  CREATE XML INDEX idx_examNumbers on [Exams] (Info)    USING XML INDEX idx_exams FOR PATH 

The PROPERTY index is used if properties are fetched from elements with the value() function. The FOR PROPERTY statement with the index creation defines a PROPERTY index:

  CREATE XML INDEX idx_examNumbers on [Exams] (Info)    USING XML INDEX idx_exams FOR PROPERTY 

If elements are searched through the tree with an XPath descendant-or-self axis expression, the best performance might be achieved with a VALUE index. The XPath expression //Certification searches all Certification elements with the descendant-or-self axis. The expression [@Name=”MCTS Web Applications”] returns only the elements where the attribute Name has the value MCTS Web Applications:

  SELECT [Info].query('/Exam/Title/text()') FROM [Exams]    WHERE [Info].exist('//Certification[@Name="MCTS Web Applications"]') = 1 

The result returned lists the titles of the exams that contain the requested certification:

 TS: Microsoft .NET Framework 2.0 – Application Development Foundation TS: Microsoft .NET Framework – Web-Based Client Development

The VALUE index is created with the FOR VALUE statement:

  CREATE XML INDEX idx_examNumbers on [Exams] (Info)    USING XML INDEX idx_exams FOR VALUE 

Strongly Typed XML

The XML data type in SQL Server can also be strongly typed with XML schemas. With a strongly typed XML column, it is verified if the data conforms to the schema when XML data is inserted.

A XML schema can be created with the CREATE XML SCHEMA COLLECTION statement. The statement shown here creates a simple XML schema CourseSchema. The schema defines the type CourseElt that contains a sequence of Number and Title, which are both of type string, and an element Any, which can be any type. Number and Title may occur only once. Because Any has the minOccurs attribute set to 0, and the maxOccurs attribute set to unbounded, this element is optional. This allows you to add any additional information to the CourseElt type in future versions, while the schema still remains valid. Finally, the element name Course is of type CourseElt.

  CREATE XML SCHEMA COLLECTION CourseSchema AS '<?xml version="1.0" encoding="UTF-8"?> <xs:schema  targetNamespace="http://thinktecture.com/Courses.xsd"   elementFormDefault="qualified" xmlns="http://thinktecture.com/Courses.xsd"   xmlns:mstns="http://thinktecture.com/Courses.xsd"   xmlns:xs="http://www.w3.org/2001/XMLSchema">     <xs:complexType name="CourseElt">       <xs:sequence>         <xs:element name="Number" type="xs:string" maxOccurs="1" minOccurs="1" />         <xs:element name="Title" type="xs:string" maxOccurs="1" minOccurs="1" />         <xs:element name="Any" type="xs:anyType"             maxOccurs="unbounded" minOccurs="0" />       </xs:sequence>     </xs:complexType>     <xs:element name="Course" type="CourseElt">     </xs:element> </xs:schema>' 

With this schema, a valid XML looks like this:

  <Course xmlns="http://thinktecture.com/Courses.xsd">   <Number>2549</Number>   <Title>Advanced Distributed Application Development with Visual Studio 2005</Title> </Course> 

With the Visual Studio 2005 Database project type, there’s no support to add a schema to the database. This feature is not available from the GUI by Visual Studio 2005 but must be created manually. To create an XML schema with Visual Studio 2005, you use an empty Visual Studio project type. You copy the XML syntax of the schema into the CREATE XML SCHEMA statement.

Besides using Visual Studio, you can copy the XML syntax into SQL Server Management Studio to create and view the XML schemas (see Figure 27-4). The Object Explorer lists the XML schemas under the Types entry.

image from book
Figure 27-4

The XML schema can be assigned to a column by setting it with the xml data type:

  CREATE TABLE [Courses] (  [Id] [int] IDENTITY(1,1) NOT NULL,  [Course] [xml]([dbo].[CourseSchema]) NOT NULL ) 

By creating the table with Visual Studio 2005 or with SQL Server Management Studio, the XML schema can be assigned to a column by setting the property XML schema namespace.

Now as you add data to the XML column, the schema is verified. If the XML does not satisfy the schema definition, a SqlException is thrown with an XML Validation error.




Professional C# 2005 with .NET 3.0
Professional C# 2005 with .NET 3.0
ISBN: 470124725
EAN: N/A
Year: 2007
Pages: 427

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net