Working with XML Data


Like traditional data, XML documents can be added to a database table, altered, removed, and retrieved using SQL Data Manipulation Language statements (INSERT, UPDATE, DELETE, and SELECT statements). Typically, XML documents (as defined in the XML 1.0 specification) are manipulated by application programs; when performing DML operations from an application program, IBM recommends that XML data be manipulated through host variables, rather than literals, so DB2 can use the host variable data type to determine some of the encoding information needed for processing. And although you can manipulate XML data using XML, binary, or character types, IBM recommends that you use XML or binary types to avoid code page conversion issues.

XML data used in an application is often in a serialized string format; when this data is inserted into an XML column or when data in an XML column is updated, it must be converted to its XML hierarchical format. If the application data type used is an XML data type, DB2 performs this operation implicitly. However, if the application data type is a character or binary data type, the XMLPARSE() function must be used to convert the data explicitly from its serialized string format to the XML hierarchical format during insert and update operations. A simple INSERT statement that uses the XMLPARSE() function to insert a string value into an XML column named CUSTINFO in a table named CUSTOMERS might look something like this:

 INSERT INTO customers (custinfo) VALUES   (XMLPARSE(DOCUMENT '<name>John Doe</name>'   PRESERVE WHITESPACE)) 

When the Command Line Processor is used to manipulate XML documents stored in XML columns, string data can be directly assigned to XML columns without an explicit call to the XMLPARSE() function if insert, update, and delete operationsare performed. For example, let's say you want to add a record containing XML data to a table named CUSTOMER that has the following characteristics:

Open table as spreadsheet

Column Name

Data Type

CUSTID

INTEGER

INFO

XML

You could do so by executing an INSERT statement from the Command Line Processor that looks something like this:

 INSERT INTO customer VALUES (1000, '<customerinfo xmlns="http://custrecord.dat" cust>  <name>John Doe</name>  <addr country="United States">   <street>25 East Creek Drive</street>   <city>Raleigh</city>   <state-prov>North Carolina</state-prov>   <zip-pcode>27603</zip-pcode>  </addr>  <phone type="work">919-555-1212</phone>  <email>john.doe@xyz.com</email> </customerinfo>') 

And if you wanted to update the XML data portion of this record from the Command Line Processor, you could do so by executing an UPDATE statement that looks something like this:

 UPDATE customer SET custinfo = '<customerinfo xmlns="http://custrecord.dat" cust>  <name>Jane Doe</name>  <addr country="Canada">   <street>25 East Creek Drive</street>   <city>Raleigh</city>   <state-prov>North Carolina</state-prov>   <zip-pcode>27603</zip-pcode>  </addr>  <phone type="work">919-555-1212</phone>  <email>jane.doe@xyz.com</email> </customerinfo>' WHERE XMLEXISTS ('declare default element namespace"http://custrecord.dat"; $info/customerinfo[name/text()="John Doe"]' PASSING custinfo AS "info") 

Finally, if you wanted to delete the record from the CUSTOMER table, you could do so by executing a DELETE statement from the Command Line Processor that looks something like this:

 DELETE FROM customer WHERE XMLEXISTS ('declare default element namespace "http://custrecord.dat"; $info/customerinfo[name/text()=" John Doe"]' PASSING custinfo AS "info") 

So how do you retrieve XML data once it has been stored in a table? With DB2 9, XML data can be retrieved using an SQL query or one of the SQL/XML query functions available. When querying XML data using SQL, you can only retrieve data at the column level-in other words, an entire XML document must be retrieved. It is not possible to return fragments of a document using SQL; to query within XML documents, you need to use XQuery.

XQuery is a functional programming language that was designed by the World Wide Web Consortium (W3C) to meet specific requirements for querying XML data. Unlike relational data, which is predictable and has a regular structure, XML data is often unpredictable, highly variable, sparse, and self-describing. Because the structure of XML data is unpredictable, the queries that are performed on XML data often differ from typical relational queries. For example, you might need to create XML queries that perform the following operations:

  • Search XML data for objects that are at unknown levels of the hierarchy.

  • Perform structural transformations on the data (for example, you might want to invert a hierarchy).

  • Return results that have mixed types.

In XQuery, expressions are the main building blocks of a query. Expressions can be nested and form the body of a query. A query can also have a prolog that contains a series of declarations that define the processing environment for the query. Thus, if you wanted to retrieve customer names for all customers who reside in North Carolina from XML documents stored in the CUSTINFO column of a table named CUSTOMER (assuming this table has been populated with the INSERT statement we looked at earlier), you could do so by executing an XQuery expression that looks something like this:

 XQUERY declare default element namespace "http://custrecord.dat"; FOR $info IN db2-fn:xmlcolumn('CUSTOMER.CUSTINFO')/customerinfo WHERE $info/addr/state-prov=" North Carolina" RETURN $info/name 

When this XQuery expression is executed from the Command Line Processor, it should return information that looks like this (again, assuming the CUSTOMER table has been populated with the INSERT statement we looked at earlier):

 1 ---------------------------------------------------- <name xmlns="http://custrecord.dat">John Doe</name> 

If you wanted to remove the XML tags and just return the customer name, you could do so by executing an XQuery expression that looks like this instead:

 XQUERY declare default element namespace "http://custrecord.dat"; for $info in db2-fn:xmlcolumn('CUSTOMER.CUSTINFO')/customerinfo where $info/addr/state-prov=" North Carolina" return $info/name/text() 

Now, when the XQuery expression is executed from the Command Line Processor, it should return information that looks like this:

 1 -------- John Doe 

As mentioned earlier, XQuery expressions can be invoked from SQL using any of the following SQL/XML functions or predicates:

  • XMLQUERY(). XMLQUERY() is an SQL scalar function that enables you to execute an XQuery expression from within an SQL context. XMLQUERY() returns an XML value, which is an XML sequence. This sequence can be empty or it can contain one or more items. You can also pass variables to the XQuery expression specified in XMLQUERY().

  • XMLTABLE(). XMLTABLE() is an SQL table function that returns a table from the evaluation of XQuery expressions; XQuery expressions normally return values as a sequence, however, XMLTABLE() allows you to execute an XQuery expression and return values as a table instead. The table that is returned can contain columns of any SQL data type, including XML. The structure of the resulting table is defined by the COLUMNS clause of XMLTABLE().

  • XMLEXISTS. The XMLEXISTS predicate determines whether an XQuery expression returns a sequence of one or more items. If the XQuery expression specified in this predicate returns an empty sequence, XMLEXISTS returns FALSE; otherwise, TRUE is returned. The XMLEXISTS predicate can be used in the WHERE clauses of UPDATE, DELETE, and SELECT statements. This usage means that values from stored XML documents can be used to restrict the set of rows that a DML statement operates on.

By executing XQuery expressions from within the SQL context, you can:

  • Operate on parts of stored XML documents instead of entire XML documents (only XQuery can query within an XML document; SQL alone queries at the whole document level)

  • Enable XML data to participate in SQL queries

  • Operate on both relational and XML data

  • Apply further SQL processing to the returned XML values (for example, ordering results with the ORDER BY clause of a subselect)

Thus, suppose you wanted to retrieve customer IDs and customer names a table named CUSTOMER that has the following characteristics:

Open table as spreadsheet

Column Name

Data Type

CUSTID

INTEGER

INFO

XML

You could do so (assuming this table has been populated with the INSERT statement we looked at earlier) by executing a SELECT statement from the Command Line Processor that looks something like this:

 SELECT custid, XMLQUERY ('declare default element namespace "http:// custrecord.dat"; $d/customerinfo/name' PASSING custinfo AS "d") AS address FROM customer; 

And when this query is executed, it should return information that looks something like this:

 CUSTID    ADDRESS ------    ---------------------  1000     <name xmlns="http://custrecord.dat">John Doe</name> 




DB2 9 Fundamentals Certification Study Guide
DB2 9 Fundamentals: Certification Study Guide
ISBN: 1583470727
EAN: 2147483647
Year: 2007
Pages: 93

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net