Generating Rowsets with the OpenXML Function

The OpenXML function returns a rowset containing data from an XML document. The rows and columns in the rowset are determined by parameter values that you can specify in Transact-SQL. Here’s the full syntax of the OpenXML function:

 OPENXML(iDoc, rowpattern, flags) [WITH (rowsetschema [colpatterns] | tablename)] 

The use of some of these parameters is immediately obvious, but the use of others is a little more complex. Let’s examine the OpenXML function syntax in the sample code I showed you earlier more closely.

 OPENXML(@iTree, ‘Order’, 1) WITH (OrderID INTEGER,       CustomerID nCHAR(5),       EmployeeID INTEGER,       OrderDate DATETIME) 

You can view this Transact-SQL code in the SimpleOrder.sql script in the Demos\Chapter7 folder on the companion CD. First we use the iDoc parameter (named @iTree in our code sample) to specify the internal node tree representing the XML document. This parameter contains the value returned by the sp_xml_preparedocument stored procedure.

Next we use the rowpattern parameter to define the elements in the XML document that should be returned. The value itself, in this case Order, is an XPath expression determining the point in the XML tree at which the search should begin and can include a criteria expression to limit the rows returned. Conceptually, the rowpattern parameter is the equivalent of the FROM and WHERE clauses in a Transact-SQL query.

The flags parameter tells SQL Server to search for attributes, subelements, or both. This value can be 0 (the default, which is attributes), 1 (attributes), or 2 (elements). To search for both attributes and elements, you need to combine the values to produce 3. In the sample code, the OpenXML function returns rowsets containing the attributes of the Order elements in the XML document.

We use the WITH clause to define the structure of the rowset. Essentially, the WITH clause specifies the columns that will be returned. In this respect, the WITH clause is conceptually similar to the SELECT clause in a Transact-SQL query. You can take one of two approaches to defining the rowset structure. You can declare the columns in the rowset schema explicitly, the way you would do in a CREATE TABLE statement, or you can specify the name of an existing table that has the required structure. The WITH clause provides the necessary column mapping information to create a rowset from the XML data. The columns in the rowset will map to the XML attributes or subelements (depending on the flags value) of the elements specified in the rowpattern parameter, with matching names and compatible data types. If you choose to define the columns explicitly, you can return columns from other parts of the document, or with nonmatching names, by specifying a colpattern parameter. This parameter is an XPath expression, relative to each element returned by the rowpattern parameter, which is used to navigate the tree and retrieve a specified attribute or element value.

The rowset schema in the sample code is declared explicitly, with no colpattern parameters specified, so the OpenXML function returns the OrderID, CustomerID, EmployeeID, and OrderDate attributes from each Order element in the XML document. For a clearer understanding of the function’s behavior, suppose the following XML document is passed to our InsertOrder stored procedure:

 <?xml version="1.0"?> <Order Order Customer      OrderDate="01/01/2001" Employe e/> 

The OpenXML function in the procedure would return the following rowset:

OrderID CustomerID EmployeeID OrderDate

1001

ALFKI

2

2001-01-01 00:00:00.000

In the following sections, we’ll examine the details of the OpenXML function more closely.

Specifying the Row Pattern

Let’s suppose that we receive a more complex document. For example, a purchase order would probably contain a hierarchy of order information, as shown in this example:

 <?xml version="1.0"?> <Order Order Customer      Employee OrderDate="01/01/2001">     <Items>         <Item Product Qty="1" UnitPrice="12.99">             <Discount>0</Discount>         </Item>         <Item Product Qty="2" UnitPrice="4.99">             <Discount>0.5</Discount>         </Item>         <Item Product Qty="1" UnitPrice="11.99">             <Discount>0</Discount>         </Item>     </Items> </Order> 

This document contains multiple hierarchical levels from which rows could be returned.

When represented as a tree structure, this document would create the following nodes of elements and attributes. (Attributes are prefixed with the @ symbol.)

 
  • / (root)
    • Order (@OrderID, @CustomerID, @EmployeeID, @OrderDate)
      • Items
        • Item (@ProductID, @Qty, @UnitPrice)
          • Discount
        • Item (@ProductID, @Qty, @UnitPrice)
          • Discount
        • Item (@ProductID, @Qty, @UnitPrice)
          • Discount

You can use the rowpattern parameter in an OpenXML function to do two things: It can specify the level of the hierarchy you want to search, and it can limit the rows returned by specifying criteria based on the values of elements and attributes in the document.

The simplest XPath expression you can specify in the rowpattern parameter is the root of the document (/), which isn’t a particularly useful row pattern because it allows you to retrieve only the root of the document. The rowset generated by OpenXML will then have a single column based on the root element—in this case Order —containing the values of all elements in the document in the form of a space-delimited string. For example, the following Transact-SQL statement would return a single column named Order with the value 0 0.5 0 if you used it to retrieve data from the preceding XML sample code.

 SELECT * FROM OPENXML(@iTree, ‘/’, 2) WITH ([Order] VARCHAR(10)) 

To retrieve a more useful rowset, the rowpattern parameter should specify the path to the level in the document from which you require data. You define the path as an XPath expression in which the nodes in the tree, separated by / delimiters, are identified. For example, to retrieve data from the Order element, you should specify the XPath expression /Order; if you want to drill down to the Item level, you need to use the XPath expression /Order/Items/Item. The initial /, which represents the root, can be omitted, allowing you to specify simply Order as the XPath expression for the Order element. You could use the following sample code to retrieve the attributes of the Item elements in the preceding purchase order document:

 SELECT * FROM OPENXML(@iTree, ‘Order/Items/Item’, 1) WITH (ProductID INTEGER,  Qty INTEGER,  UnitPrice MONEY) 

This code returns the following rowset:

ProductID Qty UnitPrice

11

1

12.9900

17

2

4.9900

21

1

11.9900

This Transact-SQL code can be viewed in the ComplexOrder.sql script in the Demos\Chapter7 folder on the companion CD. The rowpattern parameter can limit the rows being returned by including an XPath expression that defines some selection criteria. For example, the following sample code could be used to return a rowset that contains all items of which the customer has ordered more than one unit:

 SELECT * FROM OPENXML(@iTree, ‘Order/Items/Item[@Qty>1]’, 1) WITH (ProductID INTEGER,  Qty INTEGER,  UnitPrice MONEY) 

This code returns the following rowset:

ProductID Qty UnitPrice

17

2

4.9900

Using Flags to Retrieve Attributes and Elements

The flags parameter determines whether the OpenXML function searches for attributes or elements by default. As I said earlier, you use the value 1 to denote an attribute-centric search, 2 for an element-centric search, and you can combine the values into 3, in which case both attributes and elements are searched.

The following code sample returns the ProductID, Qty, and UnitPrice attributes, as well as the Discount subelement value, from the Item elements in our sample document:

 SELECT * FROM OPENXML(@iTree, ‘Order/Items/Item’, 3) WITH (ProductID INTEGER,  Qty INTEGER,  UnitPrice MONEY,  Discount REAL) 

This code produces the following rowset:

ProductID Qty UnitPrice Discount

11

1

12.9900

0.0

17

2

4.9900

0.5

21

1

11.9900

0.0

You can view this Transact-SQL code in the ComplexOrderFlags.sql script in the Demos\Chapter7 folder on the companion CD.

Combining element and attribute flag values might impact application performance because the function needs to search both the attribute nodes and the element nodes for each element specified by the rowpattern parameter. Using a column pattern in the WITH clause can be a more efficient way to combine elements and attributes. I’ll talk about column patterns later in this chapter.

Defining the Rowset Schema

The columns in the rowset returned by the OpenXML function are determined by the WITH clause. So far we’ve seen sample code in which I declared the rowset schema explicitly by listing the columns and their data types in the WITH clause. When you use this approach, the column names must match the names of the attributes or elements being retrieved and the specified data type must be compatible with the values in the XML document.

Using Column Patterns

You can customize the rowset schema definition by specifying column patterns. A column pattern is an XPath expression that identifies a node in the document relative to the element defined by the rowpattern parameter. Using column patterns allows you to solve two common problems:

  • The name of the attribute doesn’t match the name you want to use for the column in the rowset.
  • The required data isn’t within the immediate scope defined by the rowpattern and flags parameters.

Let’s look at each of these problems in turn. First suppose you want to return the Qty attribute of the Item element as a column named Quantity. You could use the following Transact-SQL code:

 SELECT * FROM OPENXML(@iTree, ‘Order/Items/Item’, 1) WITH (ProductID INTEGER,  Quantity INTEGER ‘@Qty’) 

Note that the XPath expression identifying the Qty attribute is used to map to the Quantity column in the WITH clause. This code produces the following rowset:

ProductID Quantity

11

1

17

2

21

1

The second problem addressed by column patterns is a little more complex. To understand the nature of this problem, consider the schema of the following Order Details table definition:

 CREATE TABLE [Order Details] (OrderID INTEGER,  ProductID INTEGER,  Qty INTEGER,  UnitPrice MONEY,  Discount REAL) 

To retrieve a rowset from our purchase order document that could be inserted into this table, we need to be able to generate an OrderID column. However, the order ID in the XML document is represented by the OrderID attribute of the Order element, which is two levels above the Item level used to retrieve the rest of the columns in this table.

To solve this problem, you can use a column pattern specifying an XPath expression that navigates back up the tree hierarchy, as shown in this sample code:

 SELECT * FROM OPENXML(@iTree, ‘Order/Items/Item’, 1) WITH (OrderID INTEGER ‘../../@OrderID’,  ProductID INTEGER,  Qty INTEGER,  UnitPrice MONEY,  Discount REAL ‘Discount’) 

Note that the XPath expression . ./../@OrderID defines the OrderID attribute of the element two levels above the current Item element. This code retrieves the value of the OrderID attribute of the Order element. Notice also that a column pattern identifies the Discount element one level below the current one. This code allows us to retrieve the Discount subelement without combining element and attribute flag values. The following rowset is returned by this code:

OrderID ProductID Qty UnitPrice Discount

1001

11

1

12.9900

0.0

1001

17

2

4.9900

0.5

1001

21

1

11.9900

0.0

This Transact-SQL code can be viewed in the ComplexOrderColPatterns.sql script in the Demos\Chapter7 folder on the companion CD. The ability to retrieve data from anywhere in the XML hierarchy is extremely useful when, for example, you’re inserting a foreign key into a table because XML documents often nest data in such a way that the primary key of a parent element is not repeated as a foreign key in the child elements.

Using a Table Name to Define the Rowset Schema

Sometimes the data in an XML element matches the structure of an existing table. In this case, you can specify the name of the table instead of defining the columns explicitly. For example, suppose you created an Orders table using the following schema:

 CREATE TABLE Orders (OrderID INTEGER,  CustomerID nCHAR(5),  EmployeeID INTEGER,  OrderDate DATETIME) 

You could use the following Transact-SQL statement to retrieve data from the purchase order XML document:

 SELECT * FROM OPENXML(@iTree, ‘Order’, 1) WITH Orders 

This code would return the rowset shown here:

OrderID CustomerID EmployeeID OrderDate

1001

ALFKI

2

2001-01-01 00:00:00.000



Programming Microsoft SQL Server 2000 With Xml
Programming Microsoft SQL Server(TM) 2000 with XML (Pro-Developer)
ISBN: 0735613699
EAN: 2147483647
Year: 2005
Pages: 89

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net