OPENXML can be utilized whenever a rowset provider is necessary, such as a table or view.This includes SELECT, SELECT INTO, and OPENROWSET. A preliminary operation must be accomplished, however, before OPENXML can be used to write queries. sp_xml_preparedocument is a SQL Server internal stored procedure that takes an XML document and parses it into a memory representation of the XML document tree. I say that this is an internal stored procedure because you can't view it in the normal manner of viewing a stored procedure. If you try to view the procedure using SP_HELPTEXT, you'll see what is displayed in Figure 8.1. Figure 8.1. Attempting to view sp_xml_preparedocument .
sp_xml_preparedocument returns a file identifier ( handle ) after parsing it in memory. OPENXML takes this handle and, through a process known as shredding , provides a rowset representation of the data based on what is passed to it. After processing is complete, another internal stored procedure must be called to flush the parsed document from memory. This procedure is sp_xml_removedocument . The entire process is diagrammed in Figure 8.2. Figure 8.2. The OPENXML process.
A parsed document is stored in internal cache. The MSXML parser uses one- eighth the total memory available for SQL Server. This means that if you have a SQL Server with 320MB of memory available, the MSXML parser will use 40MB of this memory. To avoid overutilization of RAM, run sp_xml_removedocument to free up the memory as soon as possible. OPENXML SyntaxA sample OPENXML statement looks similar to this: FROM OPENXML (@idoc, '/ROOT/Employee', 1) WITH (EmployeeID varchar(4), LastName varchar(20), FirstName varchar(10)) We'll be examining this new keyword with detailed examples shortly, but for now, here is the syntax specification for OPENXML: OPENXML ( handle, 'rowpattern', [flags] ) [WITH SchemaDeclaration] The various arguments of this function are described in the following sections. handleThis is the handle returned by sp_xml_preparedocument that points to the parsed XML document in SQL Server memory. It serves the identical function as a file handle in other programming languages. This handle is valid for the duration of the connection to SQL Server, until the connection is reset, or until sp_xml_removedocument is executed, thereby dropping the handle and deleting the document from memory. rowpatternThis is the familiar XPath expression that we've been using for the majority of this book. For every node identified by the XPath expression there is a row generated by OPENXML in the rowset. Because XPath can identify any type of node (attribute, text, or otherwise ), if rowpattern ends in a text node, a row is generated for each text node identified by rowpattern . SchemaDeclarationOPENXML requires a custom schema created in the OPENXML statement to specify rowset structure. The WITH clause can be used to accomplish this, but there are a couple of ways of doing it. Here are our options:
Table 8.1. The Structure of the Edge Table
flagsThe flags argument specifies whether the mapping between the rowset columns and the XML document nodes will be element-centric (elements nested within other elements), attribute-centric (elements with data expressed as attribute values), or a mixture of both. Flags can have the following three values:
In addition to the flags argument, there is another way to define mappings. The ColPattern parameter, which is actually specified in the SchemaDeclaration through use of the WITH clause, also can be used. If it is used, it overrides the setting of the flags argument. ColPattern is used under the following two circumstances:
Both flags and ColPattern are optional parameters and, if missing, cause the default value of attribute-centric mapping to be used. OPENXML ExamplesThis section contains a lot of examples demonstrating the uses of OPENXML. We'll be utilizing stored procedures here almost exclusively. Let's start with a simple SELECT statement and OPENXML. Here we'll use the <Employee> , <Order> , and <OrderDetail> elements and return a rowset consisting of EmployeeID , LastName , and FirstName . We'll perform the following steps:
Listing 8.1 contains the stored procedure we'll use in this example. In this example, ColPattern need not be specified because the rowset column names correspond to the XML attribute names. Listing 8.1 Simple Select Stored ProcedureDECLARE @idoc int DECLARE @xmldoc varchar(1000) SET @xmldoc =' <ROOT> <Employee EmployeeID="5" LastName="Buchanan" FirstName="Steven" > <Order OrderID="10248" CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00"> <OrderDetail ProductID="42" Quantity="10"/> <OrderDetail ProductID="72" Quantity="5"/> </Order> </Employee> <Employee EmployeeID="7" LastName="King" FirstName="Robert" > <Order OrderID="10303" CustomerID="GODOS" EmployeeID="7" OrderDate="1996-09-11T00:00:00"> <OrderDetail ProductID="40" Quantity="40" /> </Order> </Employee> </ROOT>' -- Create an in memory representation of the document. EXEC sp_xml_preparedocument @idoc OUTPUT, @xmldoc SELECT * FROM OPENXML (@idoc, '/ROOT/Employee', 1) WITH (EmployeeID varchar(4), LastName varchar(20), FirstName varchar(10)) EXEC sp_xml_removedocument @idoc The following example shows the results of the simple select stored procedure in Listing 8.1. EmployeeID LastName FirstName ---------- -------------------- ---------- 5 Buchanan Steven 7 King Robert That was an easy one, but it represents the essence of this entire chapter. We've taken an XML document and turned it into a rowset data provider, which lets us use it in any situation that requires data (table or view, for example, remember?). Let's modify the procedure slightly by changing the flags parameter to 2 (element-centric) and seeing what happens.The line and value to change is as follows : FROM OPENXML (@idoc, '/ROOT/Employee', 2) Refer to the section "OPENXML Syntax" at the beginning of the chapter, if necessary, to understand the layout of the parameters. The following example shows the results of changing flags to element-centric. EmployeeID LastName FirstName ---------- -------------------- ---------- NULL NULL NULL NULL NULL NULL Not quite what you expected? EmployeeID , LastName , and FirstName are attributes, not child elements, so when you specified element-centric mapping, you were referring to nonexistent elements. That's why the NULLs came back. To get the element-centric mapping to work ( flags = 2 ), you'll need to define your elements similar to the XML document defined in the stored procedure in Listing 8.2. The results obtained from this procedure are identical to the results of the simple select stored procedure in Listing 8.1. Listing 8.2 Simple Select with Element OrientationDECLARE @idoc int DECLARE @xmldoc varchar(1000) SET @xmldoc =' <ROOT> <Employee> <EmployeeID>5</EmployeeID> <LastName>Buchanan</LastName> <FirstName>Steven</FirstName> <Order OrderID="10248" CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00"> <OrderDetail ProductID="42" Quantity="10"/> <OrderDetail ProductID="72" Quantity="5"/> </Order> </Employee> <Employee> <EmployeeID>7</EmployeeID> <LastName>King</LastName> <FirstName>Robert</FirstName> <Order OrderID="10303" CustomerID="GODOS" EmployeeID="7" OrderDate="1996-09-11T00:00:00"> <OrderDetail ProductID="40" Quantity="40"/> </Order> </Employee> </ROOT>' -- Create an in memory representation of the document. EXEC sp_xml_preparedocument @idoc OUTPUT, @xmldoc SELECT * FROM OPENXML (@idoc, '/ROOT/Employee', 2) WITH (EmployeeID varchar(4), LastName varchar(20), FirstName varchar(10)) EXEC sp_xml_removedocument @idoc Now let's see what the ColPattern specification in the WITH clause will do for us. Take a look at Listing 8.3. The results of this stored procedure are given in Listing 8.4. Listing 8.3 Stored Procedure Utilizing the ColPattern ParameterDECLARE @idoc int DECLARE @xmldoc varchar(1000) SET @xmldoc =' <ROOT> <Employee EmployeeID="5" LastName="Buchanan" FirstName="Steven"> <Order OrderID="10248" CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00"> <OrderDetail ProductID="42" Quantity="10"/> <OrderDetail ProductID="72" Quantity="5"/> </Order> </Employee> <Employee EmployeeID="7" LastName="King" FirstName="Robert"> <Order OrderID="10303" CustomerID="GODOS" EmployeeID="7" OrderDate="1996-09-11T00:00:00"> <OrderDetail ProductID="40" Quantity="40" /> </Order> </Employee> </ROOT>' -- Create an in memory representation of the document. EXEC sp_xml_preparedocument @idoc OUTPUT, @xmldoc SELECT * FROM OPENXML (@idoc, '/ROOT/Employee/Order/OrderDetail', 2) WITH (OrderID int '../@OrderID', CustomerID nvarchar(10) '../@CustomerID', Lastname nvarchar(10) '../../@LastName', FirstName nvarchar(10) '/Employee/@FirstName', ProductID int '@ProductID', Quant int '@Quantity') EXEC sp_xml_removedocument @idoc Listing 8.4 Results of the ColPattern Parameter in a Stored ProcedureOrderID CustomerID Lastname FirstName ProductID Quant ----------- ---------- ---------- ---------- ----------- ----------- 10248 VINET Buchanan NULL 42 10 10248 VINET Buchanan NULL 72 5 10303 GODOS King NULL 40 40 Let's look at some of the important points of this procedure.
Listing 8.5 A Stored Procedure with Both Types of MappingsDECLARE @idoc int DECLARE @xmldoc varchar(1000) SET @xmldoc =' <ROOT> <Employee EmployeeID="5"> <LastName>Buchanan</LastName> <FirstName>Steven</FirstName> <Order OrderID="10248" CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00"> <OrderDetail ProductID="42" Quantity="10"/> <OrderDetail ProductID="72" Quantity="5"/> </Order> </Employee> <Employee EmployeeID="7"> <LastName>King</LastName> <FirstName>Robert</FirstName> <Order OrderID="10303" CustomerID="GODOS" EmployeeID="7" OrderDate="1996-09-11T00:00:00"> <OrderDetail ProductID="40" Quantity="40"/> </Order> </Employee> </ROOT>' -- Create an in memory representation of the document. EXEC sp_xml_preparedocument @idoc OUTPUT, @xmldoc SELECT * FROM OPENXML (@idoc, '/ROOT/Employee', 3) WITH (EmployeeID int , LastName nvarchar(10), FirstName nvarchar(10)) EXEC sp_xml_removedocument @idoc The results of the stored procedure in Listing 8.5 are as follows: EmployeeID LastName FirstName ----------- ---------- ---------- 5 Buchanan Steven 7 King Robert Here, attribute-centric mapping is applied first for EmployeeID , and then element-centric mapping is applied because LastName and FirstName are left over. Retrieving a text node is not much different from the types of data retrievals we've done so far.We'll utilize the XPath function text( ) in the ColPattern specification. Look at the stored procedure in Listing 8.6.We specify a column named Remark that extracts the text node associated with the Order element from the XML document. Listing 8.6 Stored Procedure Containing Text NodesDECLARE @idoc int DECLARE @xmldoc varchar(1000) SET @xmldoc =' <ROOT> <Employee EmployeeID="5"> <LastName>Buchanan</LastName> <FirstName>Steven</FirstName> <Order OrderID="10248" CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00"> <OrderDetail ProductID="42" Quantity="10"/> <OrderDetail ProductID="72" Quantity="5"/>Customer returned merchandise </Order> </Employee> <Employee EmployeeID="7"> <LastName>King</LastName> <FirstName>Robert</FirstName> <Order OrderID="10303" CustomerID="GODOS" EmployeeID="7" OrderDate="1996-09-11T00:00:00" note="Almost out of inventory"> <Priority>First Class</Priority> <OrderDetail ProductID="40" Quantity="40"/>Customer needs this ASAP </Order> </Employee> </ROOT>' EXEC sp_xml_preparedocument @idoc OUTPUT, @xmldoc SELECT * FROM OPENXML (@idoc, '/ROOT/Employee/Order', 1) WITH (OrderID int , EmployeeID nvarchar(10), Remark ntext 'text( )') EXEC sp_xml_removedocument @idoc The following example shows the results of ColPattern and the text( ) function in Listing 8.6. OrderID EmployeeID Remark ----------- ---------- ------ 10248 5 Customer returned merchandise 10303 7 Customer needs this ASAP Notice that only the text nodes of the Order element are retrieved and placed in the results. All other text comments are ignored. In the section, "SchemaDeclaration" I said that an existing table could be specified in the OPENXML clause that would dictate the schema of the result. Let's see how that works. Listing 8.7 illustrates a stored procedure that does exactly what we're talking about. Because we have to have a preexisting table, the first line in the procedure will create one for us to utilize in the OPENXML clause. Listing 8.7 Specifying a Table Name in the OPENXML Clause--create a test table create table Table1(OrderID int, OrderDate datetime, EmployeeID int) DECLARE @idoc int DECLARE @xmldoc varchar(1000) SET @xmldoc =' <ROOT> <Employee EmployeeID="5"> <Order OrderID="10248" CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00"> <OrderDetail ProductID="42" Quantity="10"/> <OrderDetail ProductID="72" Quantity="5"/>Customer returned merchandise </Order> </Employee> <Employee EmployeeID="7"> <Order OrderID="10303" CustomerID="GODOS" EmployeeID="7" OrderDate="1996-09-11T00:00:00" note="Almost out of inventory"> <Priority>First Class</Priority> <OrderDetail ProductID="40" Quantity="40" />Customer needs this ASAP </Order> </Employee> </ROOT>' -- Create an in memory representation of the document. EXEC sp_xml_preparedocument @idoc OUTPUT, @xmldoc SELECT * FROM OPENXML (@idoc, '/ROOT/Employee/Order', 1) WITH Table1 EXEC sp_xml_removedocument @idoc The following example shows the results of using a table instead of a SchemaDescription in Listing 8.7. OrderID OrderDate EmployeeID ----------- ------------------------------------------------- ----------- 10248 1996-07-04 00:00:00.000 5 10303 1996-09-11 00:00:00.000 7 One thing to remember here is that you can't utilize the ColPattern parameter in this situation.You are limited to the structure of the table called by the OPENXML clause. Let's take a look at the edge table format we talked about earlier. Remember that an edge table format is produced when the WITH clause is left out of the OPENXML clause. This example will be a little different than the examples to this point. Listing 8.8 shows the procedure that illustrates the generation of an edge table. This procedure includes two edge table queries toward the bottom of the procedure. The results of the two queries are given in Listing 8.9. Listing 8.8 Generating an Edge TableDECLARE @idoc int DECLARE @xmldoc varchar(1000) SET @xmldoc =' <ROOT> <Employee EmployeeID="5"> <Order OrderID="10248" CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00"> <OrderDetail ProductID="42" Quantity="10"/> <OrderDetail ProductID="72" Quantity="5"/> </Order> </Employee> <Employee EmployeeID="7"> <Order OrderID="10303" CustomerID="GODOS" EmployeeID="7" OrderDate="1996-09-11T00:00:00"> <OrderDetail ProductID="40" Quantity="40" /> </Order> </Employee> </ROOT>' -- Create an in memory representation of the document. EXEC sp_xml_preparedocument @idoc OUTPUT, @xmldoc SELECT * FROM OPENXML (@idoc, '/ROOT/Employee') ---------- query the edge table ------------------------------------------------- SELECT count(*) FROM OPENXML(@idoc, '/') where localname = 'EmployeeID' SELECT DISTINCT localname FROM OPENXML(@idoc, '/') where nodetype = 2 -------------------------------------------------------------------------------- EXEC sp_xml_removedocument @idoc Listing 8.9 Results of Querying an Edge Table4 (1 row(s) affected) localname ---------------------------- CustomerID EmployeeID OrderDate OrderID ProductID Quantity (6 row(s) affected). Notice the different way of querying an edge table. The OPENXML expression is specified directly in the WHERE clause. Next up is a discussion of metaproperties, which enable us to obtain more information about a document than is normally obtainable from just the textual representation of the XML document. |