In addition to being useful for inserting XML data into tables, the OpenXML function can be used to retrieve metadata from an XML document. This functionality allows you to write code that queries the actual XML structure of the document, and you could use it to build your own XML-processing application.
A number of metaproperties are defined in the Microsoft xml-metaprop namespace. These metaproperties can be retrieved by specifying the metaproperty name in a column pattern. For example, you could retrieve the unique node ID, the name, and the name of the parent node for each Item XML node in a document by executing the following query:
SELECT * FROM OpenXML(@iTree, ‘Order/Items/Item’, 1) WITH (NodeID INTEGER ‘@mp:id’, NodeName VARCHAR(20) ‘@mp:localname’, ParentNode VARCHAR(20) ‘@mp:parentlocalname’)
This code would return the following rowset:
NodeID | NodeName | ParentNode |
---|---|---|
6 | Item | Items |
10 | Item | Items |
14 | Item | Items |
This Transact-SQL code can be viewed in the metadata.sql script in the Demos\Chapter7 folder on the companion CD. The metaproperties can be used to examine the structure of an XML document and process it accordingly. is the full list of metaproperties defined in the xml-metaprop namespace.
Metaproperty | Description |
---|---|
@mp:id | Unique identifier for the specified node. |
@mp:localname | The name of the element. |
@mp:namespaceuri | The namespace for the specified node. The value of this property is NULL if no namespace is defined. |
@mp:prefix | The prefix used for the specified node. |
@mp:prev | The ID of the previous sibling node. |
@mp:xmltext | The textual representation of the node and its attributes and subelements. |
@mp:parentid | The ID of the parent node. |
@mp:parentlocalname | The name of the parent node. |
@mp:parentnamespaceuri | The namespace of the parent node. |
@mp:parentprefix | The prefix of the parent node. |
One of the most useful metaproperties is xmltext. This property returns the actual XML text from any part of the document. You can use the xmltext property to retrieve the whole XML document by specifying the root (/) as the row pattern, which of course means that you could insert the entire XML document into a single column of a table. However, it’s more likely that you’d want to insert some data from the document into individual columns and perhaps insert the remaining unconsumed XML into an overflow column.
For example, suppose we wanted to store the order header information (OrderID, CustomerID, EmployeeID, and OrderDate) in the corresponding columns of the Orders table, and store the XML data representing the relevant items in an overflow column named OrderDetailsXML. You can use the OpenXML function to retrieve the necessary data from the Order element, and you can use the xmltext metaproperty to retrieve the remaining XML data. However, by default the xmltext metaproperty retrieves the entire XML text for the element defined in the rowpattern parameter (in this case, Order), even if some of its attributes and subelements have already been selected. To retrieve only the unconsumed XML, you must add the value 8 to the flags parameter. This action instructs SQL Server to copy only the elements and attributes that haven’t already been explicitly selected to the overflow column.
In the case of the purchase order document, the following Transact-SQL statement could be used to retrieve the order header fields and the unconsumed XML data about order details:
SELECT * FROM OpenXML(@iTree, ‘Order’, 9) WITH (OrderID INTEGER, CustomerID nCHAR(6), EmployeeID INTEGER, OrderDate DATETIME, OrderDetailsXML VARCHAR(2000) ‘@mp:xmltext’)
Note that the flags parameter has been assigned the value 9. This assignment instructs SQL Server to search for attributes (1) and to retrieve unconsumed data in an overflow column (8). Here’s the resulting rowset from this query when used against the purchase order document I described earlier:
OrderID | CustomerID | EmployeeID | OrderDate | OrderDetailsXML |
---|---|---|---|---|
1001 | ALFKI | 2 | 01/01/2001 | <Order><Items> <Item Product Qty="1" UnitPrice="12.99"> <Discount>0</Discount> </Item> <Item Product Qty="2" UnitPrice="4.99"> <Discount>0.5</Discount> </Item> <Item Product Qty="1" UnitPrice="11.99"> <Discount>0</Discount> </Item> </Items></Order> |
This Transact-SQL code can be viewed in the overflow.sql script in the Demos\Chapter7 folder on the companion CD. As you can see, the OrderDetailsXML column contains an XML representation of the entire order document apart from the nodes that were retrieved in the other columns.
You can retrieve additional metadata by generating an edge table from an XML document. An edge table gets its name from the fact that every edge (or node) in the XML document produces a row in the rowset. To retrieve an edge table, simply omit the flags parameter and the WITH clause in the OpenXML function, as shown here:
SELECT * FROM
OPENXML(@iTree, ‘Order’)
This code retrieves a rowset with the following columns:
Column | Description |
---|---|
Id | Unique identifier for the node |
Parentid | Unique identifier of the parent node |
Nodetype | XML DOM node type |
Localname | Name of the node |
Prefix | Node prefix |
Namespaceuri | XML namespace for the node |
Datatype | The XML datatype of the node |
Prev | Unique identifier of the previous sibling node |
Text | Attribute value or element content |
This Transact-SQL code can be viewed in the EdgeTable.sql script in the Demos\Chapter7 folder on the companion CD. Notice that an edge table includes the datatype and text properties for each node. These properties aren’t available by specifying metaproperty column patterns.
An edge table can be useful for calculating summary information about the document. For example, a simple way to find out how many different items have been ordered is to count the Item elements this way:
SELECT COUNT(*) ItemCount FROM OpenXML(@iTree, ‘Order’) WHERE localname=‘Item’
This query returns the number of nodes with the name Item in the document.