XML Updategrams and OpenXML: Data Manipulation Through XML | Microsoft SQL Server 2005: The Complete Reference: Full Coverage of all New and Improved Features

How can we insert, update, and delete data through XML? Microsoft has released the concept of updategrams that allow us to manipulate XML on the client and pass it back to the server so that it will update the SQL Server database, in a complete XML solution. We also have the ability to insert data with the OpenXML clause. While we have covered different methods of data retrieval, in reality, once we get data, we normally change it, and those changes must be stored in a database. Our updategrams and OpenXML give us the ability to do so with SQL Server 2005.

OpenXML

OpenXML is a T-SQL keyword that provides a result set, which we can equate to a table or a view, over an XML document that is retained in memory. OpenXML allows us access to XML data as if it were a relational recordset by providing a tabular, relational view of the internal representation of an XML document.

OpenXML can be used in SELECT and SELECT INTO statements wherever a table, a view, or OPENROWSET can appear as the source of the query. The syntax for OpenXML is as follows:

 OpenXML (idoc int [in], rowpattern nvarchar[in], [flags byte [in]]) [WITH (SchemaDeclaration | TableName)]

Here is a description of the syntax.

idoc

The idoc is the document handle of the internal representation of an XML document. Calling sp_xml_preparedocument creates this internal representation of the XML document.

rowpattern

The rowpattern refers to the XPath pattern used to identify the nodes to be processed as rows. These are the rows passed from the document handle received in the idoc parameter.

flags

The flags indicates the mapping that should be used between the XML data and the recordset, and how the spill-over column should be filled. Flags is an optional parameter.

SchemaDeclaration

The SchemaDeclaration is the schema definition in the form following this pattern:

 ColName ColType [ColPattern | MetaProperty] [, ColName ColType [ColPattern | MetaProperty] ...]

ColName Colname is the column name in the rowset.

ColType ColType represents the SQL data type of the column. If the column types differ from the underlying XML data type of the attribute, type coercion occurs. If the column is of type timestamp, the present value in the XML document is disregarded when selecting from an OpenXML recordset, and the autofill values are returned.

ColPattern The ColPattern is an optional XPath pattern that describes how the XML nodes should be mapped to columns. If the ColPattern is not specified, the default mapping (attribute-centric or element-centric mapping as specified by flags) takes place. The XPath pattern specified as ColPattern is used to specify the special nature of the mapping (in case of attribute-centric and element-centric mapping) that overwrites or enhances the default mapping indicated by flags. The general XPath pattern specified as ColPattern also supports the metaproperties.

MetaProperty MetaProperty is one of the metaproperties provided by OpenXML. If the metaproperty is specified, the column contains information provided by the metaproperty. The metaproperties allow you to extract information such as relative position and namespace information about XML nodes, which provides more information than is visible in the textual representation.

TableName

If a table with the desired schema already exists and no column patterns are required, the table name can be given instead of SchemaDeclaration.

To write queries against an XML document using OpenXML, you must first call sp_ xml_preparedocument, which parses the XML document and returns a handle to the parsed document that is ready for consumption. The parsed document is a tree representation of various nodes (elements, attributes, text, comment, and so on) in the XML document. The document handle is passed to OpenXML, which then provides a recordset view of the document based on the parameters passed to it.

The internal representation of an XML document must be removed from memory by calling the sp_xml_removedocument system stored procedure to free the memory.

So those are the gory details on the specifics of OpenXML. It all looks good on paper, but what does it all mean? Here is a brief summary, and then we will do a living, breathing example of OpenXML.

Basically, OpenXML allows us to leverage our existing relational model and use that with XML. I can send XML data to the server, execute a stored procedure that inserts or selects data, and then return XML back to the client.

To read BOL is a little weird because the samples have the XML embedded within the stored procedure. It would have been nice to see an example of an XML document being passed as a parameter. So that is exactly what we will do. Let’s create a page that adds new Knowledgebase entries in a support help desk application. We will do that with OpenXML. Here is the stored procedure that we need to create:

 CREATE PROC sp_ins_KB @kb ntext AS     DECLARE @hDoc int     EXEC sp_xml_preparedocument @hDoc OUTPUT, @kb     INSERT INTO KnowledgeBase       SELECT *       FROM OpenXML (@hDoc, '/KnowledgeBase')             WITH KnowledgeBase      EXEC sp_xml_removedocument @hDoc GO

This code is pretty simple. All we are doing is passing the XML as the @kb parameter; executing the sp_xml_preparedocument system stored procedure, which will give us a handle to the XML document; and then calling the INSERT INTO T-SQL against the XML document. So whatever we pass as an XML document will be inserted into the Knowledgebase table. The cool thing is that the XML document can contain multiple records to insert, so we can add many records with a single OpenXML call. An important note: Since we are creating an internal representation of the XML document, we need to run the system stored procedure sp_xml_removedocument to remove the document from memory. Just another example of good housekeeping.

Now we open up our handy Notepad and create a template called insertKB.xml in our XDesk virtual template directory that will accept this XML document as a parameter from an HTML page. Here is what the code should look like:

 <root xmlns:sql='urn:schemas-microsoft-com:xml-sql'> <sql:header> <sql :param name= "jason"><KnowledgeBase/x/sql :param> </sql:header> <sql:query>exec sp_ins_KB @jason </sql:query> </root>

We are accepting the “Jason” parameter from the HTML form. Notice the <sql:param> elements. We used this same technique earlier to pass parameters to our template for the stored procedure. So the template syntax for accepting parameters is exactly the same here. The next element, the <sql:query> element, is also the same as we described earlier in the chapter. We are simply executing a stored procedure and passing it the parameter that we took in from the <sql:param> element.

Last, we create an HTML page that accepts some user input and then actually calls the template file. Here is a sample HTML file that will accomplish just this:

 <html> <META name=VI60_defaultClientScript content=JavaScript> <body>   <form action="http://localhost/xDesk/template/InsertKB.xml"    method="post"> <input type="hidden"  name="jason"> <input type="hidden" name="contenttype" value="text/xml"> <br>         Q#:<input id=kbid value="Q19878" ><br>      Title: <input id=title value="Linux-Crash" ><br>      Description: <input id=details value="Reinstall" ><br>      <input type=submitonclick="UpdateKB(kb,kbid,title,details) "                 value= "Update Knowledgebase" ><br><br> <script>     function UpdateKB (kb, kbid, title, details)      {       kb.value = '<KnowledgeBase       KB Title=" ' + title. value +       ' " Details=" ' + details.value + ' "/>';      } </script> </form> </body> </html>

This HTML page has a few input boxes that allow the user to enter in an Article Number, a Title, and a Description for the Knowledgebase article. On the Submit action, we execute the UpdateKB script, which builds the XML document, and then the FORM action takes over and calls our InsertKB template, which inserts the XML document into the database.

Basically, we are building an XML document and passing the XML document to a template; the template in turn passes the XML document to the stored procedure, which uses OpenXML to read the XML and insert the data into the SQL Server table.

Keep in mind that, in our example, we have assumed that the schema of the table (field names) is the same as the elements that we have sent over in the XML document. This works fine in a perfect world. There may be times when you need to match elements from the XML elements to field names in the database. With OpenXML, this is no problem at all. Consider the following WITH clause in the OpenXML statement:

 SELECT * FROM OpenXML (@idoc, '//Customers')       WITH (CustomerID  varchar(10) 'CustomerID',             Zipcode   datetime      'Zip',             StateCode varchar(10)   '@STATE',             Address   varchar (30)  '@MailAddress')

In the preceding example, we have elements being passed by the XML document that do not match the field names in the database. In order to correctly match up the elements to the fields, we are using a schemadeclaration that will map the fields correctly according to the rowpattern specified in the OpenXML statement. So this is another example of the great flexibility that OpenXML gives us in tailoring our inserts to the database.

UpdateGrams

Updategrams consist of “blocks” of XML that describe which parts of our data we want to change. The XML data is enclosed in the < sync> element. Within this element, there are <before> and <after> elements indicating the blocks of XML that represent XML data before the changes and after the changes. SQL Server will receive this as a T-SQL update, insert, or delete query, depending on what XML the <before> and <after> blocks consist of.

For example, the XML contained within the <before> may contain a primary key element with the actual primary key data value. The <after> element will contain XML elements that contain the updated values for the matching element/field value pairs. OLE DB, running in the middle tier through xmlisapi.dll, then creates the correct T-SQL update statement that will update the database with the correct data values.

The syntax for the updategram is as follows:

 <ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">  <updg:sync>       <updg:before>             <TABLENAME [updg:] col="value" col="value" />       </updg:before>       <updg:after>              <TABLENAME [updg:] [updg:at-identity="value"]              col="value" col="value" />       </updg:after>  </updg: sync> </ROOT>

Let’s do a simple Update, Insert, and Delete updategram to demonstrate how we can use this feature in real life. We will use the Customers table from the Northwind database.

Example 1: Update

We will use Notepad to create a template file that we will execute from a URL.

 <ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram"> <updg:sync > <updg:before>       <Incidents Incident  /> </updg:before> <updg:after>       <Incidents DateResolved= "7-29-2000" /> </updg:after> </updg:sync> </ROOT>

Save this file as UpdateIncidents.xml in the XDesk template virtual root, and type the following in the browser to execute our updategram:

 http://localhost/XDesk/template/Updatelncidents.xml

What we have done is to pass the IncidentID field as the initial element value that we are searching for in the Incidents table in the <before> block. In the <after> block, we pass the element attribute value DateResolved to instruct the updategram to update this field in the database. Even though we are passing only a single field to update, we can pass as many attributes or elements as we need in the <after> block to update fields in the database.

Example 2: Insert

In the Update updategram, we passed data in both the <before> and <after> blocks. In order to insert new values into the database, we do not include any elements or attributes in the <before> block. Here is an example of inserting a new record into the Incidents table:

 <ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram"> <updg: sync> <updg:before> </updg:before> <updg:after>       <Incidents Customer OperatingSystem="Linux" /> </updg:after> </updg: sync> </ROOT>

How easy was that? All we do is pass attribute values and the table name, and the data is inserted into the table when we run this template from the URL. What is even more cool is that to insert multiple records, we simply add as many <Incidents> elements with attributes as we like. Here is a code snippet of an <after> block that will insert several records into the Incidents table:

 <updg:after> <Incidents Customer OperatingSystem="Linux" /> <Incidents Customer OperatingSystem="Linux" Description="Unstable" /> <Incidents Customer OperatingSystem="Unix" Keywords="Crash/Burn" /> </updg:after>

So now all of a sudden, I can insert multiple records without all the typing involved in the OpenXML statement. I think this is really cool. Another point that needs to be made is that I am using attributes instead of elements in many of the template examples. This just saves some typing on my part. Either attributes or elements will work fine in any sample.

Example 3: Delete

We have done an Insert and an Update, so Delete is next. I am so amazed at how easy this is so far. There is really nothing to it. I know that you probably can already imagine how we are going to create the Delete template, but we'll go ahead and type it in anyway:

 <ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram"> <updg:sync > <updg:before>       <Incidents Incident /> </updg:before> </updg: sync> </ROOT>

Notice there is no <after> block. Since we do want any after data, SQL Server looks at this as a delete. The actual T-SQL that is generated by the middle tier is

 DELETE FROM Incidents WHERE IncidentID = 10014

Remember, all template processing occurs in the middle tiers, so the actual XML processing occurs in the ISAPI DLLs, which send SQL Server the T-SQL to process.

I know by now you are probably thinking that you will most likely need to pass the update, insert, or delete values as parameters. Well, the good news is you can. Since the updategrams are run though templates, we know that templates can take parameters. If you recall the <header> element, which took the <param> value in the previous templates, we were accepting parameters. We use the same syntax for passing parameters to updategram templates. So to pass, for example, an IncidentID to delete from the Incidents table, we would need to create the following template (I will save this as DeleteGram.xml in the XDesk virtual template root):

 <ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram"> <updg:header>             <updg:param name="IncidentID"/> </updg:header>       <updg:sync >            <updg:before>                  <Incidents Incident />            </updg:before> </updg: sync> </ROOT>

Now from the URL, I will type the following:

 http://localhost/XDesk/template/DeleteGram.xml?IncidentID=1043

Finito. So we have actually created something useful that we can use in real life. How can we pass the value from an HTML page? Just copy the same code we used on the KnowledgeBase page. Modify the “Form Action=” tag to reference the DeleteGram.xml file, and change the OnClick code to pass the ID value that you are entering into the input box. It does not get much easier than this.

If you wish to pass the Update or Insert parameter, no problem. Simply change the parameters that the template accepts and you have a full-fledged data entry middle tier. In summation,

If you specify only <after> data, you are doing an Insert statement.
If you specify only <before> data, you are doing a Delete statement.
If you specify both <before> and <after>, you are doing an Update statement.