Working with XML Data Types | Beginning XML Databases (Wrox Beginning Guides)

SQL Server XML data types allow storage of XML document data into tables in a SQL Server database. There are various methods of manipulating the content of XML data types.

Adding XML Documents to SQL Server

The easiest method of loading XML document data into a SQL Server database is to create a table containing an XML data type field, and then load that table using an external data import function called OPENROWSET .

Begin by creating a table:

   CREATE TABLE demXML (xml XML) GO

Now simply use an INSERT statement to add an XML document directly from an external source into the table containing the XML data type field:

   INSERT INTO demXML (xml) SELECT * FROM OPENROWSET (BULK 'C:\Manuscripts\Wiley\BeginningXMLDatabases\app\databa se\oracle\XMLDB\demographicsCLOB.xml', SINGLE_BLOB) AS TEMP GO

To check the content of the table, use a query something like this:

   SELECT * FROM demXML GO

The demographics database used for this book (see Appendix B) has an XML document that is almost 4GB in size . SQL Server has a limitation on XML data type storage of 2GB. So, even though the entire document might have loaded into the XML data type in the table demXML created previously, the management studio tool does not allow viewing of output of over 2GB.

Another method of loading external data is to use the SQLXML XML Bulk Loader utility. There is not really any point in demonstrating it as the method described so far is simplistic and more than adequate for the purposes of this book.

Retrieving and Modifying XML Data Types

The SQL Server XML data type has methods called query() , value() , exist() , nodes() , and modify() .

The case of the methods when executing each one is important in SQL Server. In other words, xml.Query() or xml.QUERY() returns an error. xml.query() will not return an error.

Using the various XML data type methods could not be easier. The query() method (often mistakenly called a function) allows use of an expression search string. The expression is a pattern match search (searching for a string pattern) through the hierarchical structure of an XML document. The query method can be used to return a set of nodes from an XML data type, which is stored in a SQL Server database.

The following example searches from the root node (demographics.xml) into the regions ; until it finds the region containing Australia. Then it looks for the city of Brisbane inside the country node of Australia:

   SELECT xml.query('/demographics/region/country[name="Australia"]/city[name="Brisbane"]') AS node FROM demXML GO

The query method returns the city node containing the city of Brisbane, in Australia:

   <city id="176">   <name>Brisbane</name>   <population>1775000</population> </city>

Using an attribute the same node can be found:

   SELECT xml.query('/demographics/region/country[name="Australia"]/city[@id="176"]') AS node FROM demXML GO

The result is the same as before but this time an attribute value, rather than an element name, was used to find the same city node. As you can see in the previous example, Brisbane in Australia has an integer identifier value of 176.

There are various methods of building expressions. Different character combinations are used as pattern matching or expression directives, persuading a search pattern to behave in a specific manner. All of these specific pattern matching character sequences are common to both XQuery and XPath, both of which are discussed later on in this book. Explaining the nitty-gritty details for each relational database is pointless for the purposes of this book. In general you should already know from previous chapters that expression rules, as shown in Table 6-1, apply when reading through an XML document.

Table 6-1: Standard XML Document Pattern Matching Expressions
Expression	Function	Example
element="value"	Search for a value.	name="Brisbane"
@attribute="value"	Search for an attribute value.	@id="176"
//	Search through child nodes.	/demographics//country [name="Brisbane"]
..	Search for parent nodes.	/demographics//city [name="Brisbane"]/..
.	Search the current node.	/name[.="Brisbane"]
text()	Find text value of an element.	country/name.text() finds a country like Australia

Expanding on Table 6-1, the following example will again find the node for the city of Brisbane. The difference when using the // character sequence is the amount of searching. The following query searches through all child nodes of the demographics node ignoring parent country and region. This example will execute slower than previous, more precise expressions:

   SELECT xml.query('/demographics//city[name="Brisbane"]') AS node FROM demXML GO

This next example finds the entire parent node of the country of Australia because Australia contains the city of Brisbane:

   SELECT xml.query('/demographics//city[name="Brisbane"]/..') AS node FROM demXML GO

This is a partial result:

   <country id="46" code="AS">   <name>Australia</name>   <population> ...   </population>   <area>7617931</area>   <currency fxcode="AUD" rate="1.30141">Dollars</currency>   <city id="175">     <name>Adelaide</name>     <population>1125000</population>   </city>   <city id="176">     <name>Brisbane</name>     <population>1775000</population>   </city> ... </country>

The . character sequence returns only the node searched for, and no child nodes:

   SELECT xml.query('/demographics//name[.="Brisbane"]') AS node FROM demXML GO

This is the result:

   <name>Brisbane</name>

This query finds all city <name> nodes in the country of Australia:

   SELECT xml.query('/demographics//country[name="Australia"]//name') AS node FROM demXML GO

This is the result:

   <name>Australia</name> <name>Adelaide</name> <name>Brisbane</name> <name>Canberra</name> <name>Darwin</name> <name>Hobart</name> <name>Melbourne</name> <name>Perth</name> <name>Sydney</name>

Now, if you use the text() function, all the values within the nodes are returned, as opposed to all the nodes:

   SELECT xml.query('/demographics//country[name="Australia"]//name/text()') AS node FROM demXML GO

This is the result:

   AustraliaAdelaideBrisbaneCanberraDarwinHobartMelbournePerthSydney

The query() method can be used to return individual element and attribute values. An easier method of returning specific value items is the value() method. The following query finds a single node, and the value of the node is supposed to be extracted. This query does not seem to function as it returns an error in the management studio. The problem is typecasting issues for XML data returned from an XML data type so it has to be coded. The following query, as shown before, returns an XML node:

   SELECT xml.query('/demographics//city[name="Brisbane"]') AS node FROM demXML GO

This is the node returned:

   <city id="176">   <name>Brisbane</name>   <population>1775000</population> </city>

Next, the value() method is used to extract a single value from the result of the query() method:

   DECLARE @xmldoc xml SET @xmldoc = (SELECT xml.query('/demographics//city[name="Brisbane"]') AS node FROM demXML) SELECT @xmldoc.value('/city[1]','nvarchar(64)') AS node SELECT @xmldoc GO

The result contains all the text contents (excluding attributes) of the node returned by the query() method:

   Brisbane 1775000

The exist() method simply returns a Boolean response for the existence of a node. This query returns a 1, indicating the node exists:

   SELECT xml.exist('/demographics//name[.="Brisbane"]') AS node FROM demXML GO

This query returns a 0, indicating the node does not exist:

   SELECT xml.exist('/demographics//name[.="Brisban"]') AS node FROM demXML GO

The modify() method allows for changes to be made to data in XML documents. More specifically , the modify() method allows insertions of new data, updates to existing data, and deletions of existing data. The following query inserts a new node within the node for the city of Brisbane:

   DECLARE @xmldoc xml SET @xmldoc = (SELECT xml.query('/demographics//city[name="Brisbane"]') FROM demXML) SET @xmldoc.modify('insert <area>100000</area> into (/city)[1]') SELECT @xmldoc GO

This is the result with the area element added as the last child of the city element:

   <city id="176">   <name>Brisbane</name>   <population>1775000</population>   <area>100000</area> </city>

The previous insertion example was executed by simply adding the new element into the specified point. Insertions can also be performed as first, as last, before, and after all relative to the current element.

Now lets try updating. This query changes the value of the newly added area element:

   DECLARE @xmldoc xml SET @xmldoc = (SELECT xml.query('/demographics//city[name="Brisbane"]') FROM demXML) SET @xmldoc.modify('insert <area>100000</area> into (/city)[1]') SET @xmldoc.modify('replace value of (/city/area/text())[1] with "200000"') SELECT @xmldoc GO

The result shows the area element value has been changed:

   <city id="176">   <name>Brisbane</name>   <population>1775000</population>   <area>200000</area> </city>

And now you can remove the newly added area element:

   DECLARE @xmldoc xml SET @xmldoc = (SELECT xml.query('/demographics//city[name="Brisbane"]') FROM demXML) SET @xmldoc.modify('insert <area>100000</area> into (/city)[1]') SET @xmldoc.modify('delete /city/area') SELECT @xmldoc GO

The result shows the area element now removed:

   <city id="176">   <name>Brisbane</name>   <population>1775000</population> </city>