Chapter 11: XML in the Data Tier


Although XML is frequently a part of the data tier-because it stores data-this chapter concentrates on XML and databases. Databases, both relational and native XML, are part of many applications. You must, therefore, often store XML in those databases or retrieve some of the data in XML format. While the conversion of data to XML could be done in an intermediate layer of your application, it is sometimes more efficient to do this work within the database itself. This chapter will look at how you can store and retrieve XML from such common databases as Microsoft SQL Server and Oracle 10g. In addition, this chapter will cover XML databases, and their possible role in your applications.

XML and Databases

Although XML is an excellent format for moving data between platforms, applications, or application tiers, it may not be the best format to use for storing data for your application. As the volume of data increases, so too does the time it takes to search for and manipulate that data. Databases, on the other hand, minimize the query times, even for large data sets. Because of this, you might often combine XML and databases in some applications. This can create mismatches: Data types stored in the database may not be in the same format as XML, and the structure of the two are different. XML tends to be more loosely structured or hierarchical, and relational databases (the most common forms today) are designed around tabular data. Moving data in and out of relational databases can be a code-intensive operation. However, some databases are embracing XML, either as a first-class data type or, at least, for enabling querying and indexing semistructured data.

Retrieving Data as XML

Databases are convenient stores for data in many applications, but you also frequently share the data with clients from other platforms or move the data between tiers of your application and computers. Most native database formats are not optimal for this type of data exchange; however, XML is ideally suited for these scenarios. You can convert data stored in a database into XML, providing an excellent cross platform format for data exchange. You can also easily write components to convert the rows and columns of relational data into XML. However, many databases are beginning to provide support for retrieving your queries as XML. This may be provided by proprietary extensions to SQL or via the upcoming SQL/XML standard. SQL/XML (or SQL-X) is a proposed extension to the SQL programming language (see Chapter 14 of that specification if you like reading specification documents). Note that this SQL/XML should not be confused with Microsoft's SQLXML. SQL/XML is a proposed extension to the standard SQL language for working with databases. SQLXML is a Microsoft-specific API for integrating SQL Server and XML. The SQL/XML standard defines a number of new SQL keywords:

  • q XML-a data type to hold XML data

  • q XMLAgg- used to group XML data in GROUP BY queries

  • q XMLAttributes- used to add attributes in XML elements

  • q XMLConcat- used to concatenate two or more XML values

  • q XMLElement-used to transform a relational value into an XML element

  • q XMLForest-used to generate a list of XML elements

  • q XMLNamespaces-used to declare namespaces in an XML element

  • q XMLSerialize-used to serialize an XML value as a character string

These extensions are used in two ways. First, the XML data type becomes a native data type for columns. This provides you with a method for storing XML as XML, rather than as text. In addition, when you work with XML in stored procedures, having an XML type ensures that validity and other rules apply to the data. The second use of the new SQL/XML extensions is to create a standard mechanism for querying data and returning it as XML. The XMLAttributes, XMLElement, XMLForest, and XMLNamespaces operators create the appropriate structures within a SELECT statement. See the Oracle section later in this chapter for samples of using these publishing functions.

In addition to allowing you to directly query using a SQL dialect, many databases are adding support for the relatively new standard XQuery for querying data (see Chapter 11 for more details on XQuery).

Storing XML

When storing XML in a relational database, you usually have three choices:

  • q Shred the XML to fit into the rows and columns of one or more relational tables.

  • q Store the XML in a Binary Large Object (BLOB) or Character Large Object (CLOB) field.

  • q Store the XML in a field specialized for storing and/or indexing XML.

Shredding the XML, or converting into rows and columns, is simple, but requires the most processing. In addition, it also means you must make more decisions: Do you process the XML within the database itself-using stored procedures-or in another component of your application? Using stored procedures might provide better performance, but only if the variant of SQL supported by your database provides XML-handling functions. This adds complexity, however, if the elements and attributes of the XML do not align directly with rows and columns. You may need to perform additional processing or conversion of the data when saving and loading. For example, the XML may hold the full name of a client, when you need to save the first and last names separately in the database. Therefore, you would need to separate the name before storing.

Storing XML in BLOB or CLOB fields (for example, text fields) or even a large character field is a simple solution that works with every database. However, the data is essentially meaningless at this point. It is difficult to query text fields in any meaningful way. You can use whatever full-text search is available from the database, but this is certainly less useful than a search that includes the tags. For example, if you had a BLOB field full of resumés in XML format, searching for candidates with experience in .NET using a full-text search would likely give results cluttered with URL values and possibly become a fishing experience. If the search were aware of the structure of the XML, you would have more luck narrowing the scope of the search.

Finally, you can use a dedicated XML field. Not all databases have a native way of storing XML, and the techniques used to store and index the XML vary. The sections that follow describe the features available for some databases.

XML databases add a fourth choice to storing XML: Store it as is. The XML is stored natively, and queries are usually carried out using XQuery or XPath. Although XML databases are not as prevalent as relational databases, this feature makes them quite attractive in some scenarios.




Professional XML
Professional XML (Programmer to Programmer)
ISBN: 0471777773
EAN: 2147483647
Year: 2004
Pages: 215

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net