XML to the Rescue

Extensible Markup Language (XML) has recently become one of the most talked-about technologies since the relational database. Indeed, if you believed everything you heard, you’d assume that XML solves just about every software development problem in the world. It doesn’t, of course, and we need to be wary of using it inappropriately, but it is a tremendously useful technology for building integration solutions.

The key to identifying solutions in which XML is useful is to understand what it does well. And the thing that XML does well is describe data. In the same way that relational databases provide an efficient way to store business data, XML provides a great way to communicate business data.

XML is well-suited to solving integration problems for a number of reasons. First, it’s completely neutral with respect to platform, operating system, programming language, and so on. XML documents are simply streams of text that can be sent and received by any application on any platform, in much the same way as documents written in HTML, XML’s close relative. Second, XML is an Internet standard, approved by the World Wide Web Consortium (W3C), so parsers for reading and processing XML documents are available on almost any platform you care to mention, including Microsoft Windows, UNIX, Linux, and Macintosh. Third, developers can make use of a number of related standards for defining, processing, and transforming XML documents, including Document Type Definitions (DTDs), the XPath query language, the Document Object Model (DOM), the Simple API for XML (SAX), and Extensible Stylesheet Language (XSL). Additionally, more XML-related standards are in the process of being approved, including XML schemas, which provide a way to define XML business documents.

Microsoft supports the XML-Data proposal for XML schemas that was submitted to the W3C, and many Microsoft products use XML-Data Reduced (XDR) schemas, which use a subset of the XML-Data definition. Microsoft has confirmed that when the final W3C standard for XML schemas is approved, Microsoft products will support it as well as the current XDR implementation.

Representing Business Entities with XML

So, XML and relational databases are both concerned with representing business entities. The key thing is to understand the differences in the way that they do it.

Relational databases represent entities using tables; XML does it using documents. An instance of an entity in a relational database is represented by a row in a table, while in XML an instance of an entity is represented by an element in a document. So far, so good. However, when we start to represent an entity’s attributes, the correlation between relational data and XML becomes a little more murky.

Mapping Table Columns to XML

In a relational database, as I’ve said, an entity’s characteristics are represented by columns in a table. In an XML document, the characteristics can be attributes, element values, or subelements. For example, examine the following relational table:

Customers

CustID   Name     Phone
1001     Graeme   555 111222
1002     Rose     555 222111

To represent this table in XML, we could create an XML document with a root element named Customers that contains two Customer elements. The columns could be represented in an attribute-centric fashion, in which the columns in a table are mapped to attributes in an XML document, as shown in the following example:

 <Customers>
     <Customer CustID='1001' Name='Graeme' Phone='555 111222'/>
     <Customer CustID='1002' Name='Rose' Phone='555 222111'/>
 </Customers>

Alternatively, we could use an element-centric mapping. In this mapping, all columns are returned as subelements of the element representing the table they belong to, as shown in the following example:

 <Customers>
     <Customer>
         <CustID>1001</CustID>
         <Name>Graeme</Name>
         <Phone>555 111222</Phone>
     </Customer>
     <Customer>
         <CustID>1002</CustID>
         <Name>Rose</Name>
         <Phone>555 222111</Phone>
     </Customer>
 </Customers>
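As a rough illustration of the two mappings, the following Python sketch builds both documents from the same rows using the standard library's xml.etree.ElementTree module. The COLUMNS and ROWS constants and the function names to_attribute_centric and to_element_centric are invented for this illustration; they are not part of any product API.

```python
import xml.etree.ElementTree as ET

# Rows from the Customers table; column order is CustID, Name, Phone.
COLUMNS = ("CustID", "Name", "Phone")
ROWS = [
    ("1001", "Graeme", "555 111222"),
    ("1002", "Rose", "555 222111"),
]


def to_attribute_centric(rows):
    """Map each row to a Customer element and each column to an attribute."""
    root = ET.Element("Customers")
    for row in rows:
        ET.SubElement(root, "Customer", dict(zip(COLUMNS, row)))
    return root


def to_element_centric(rows):
    """Map each row to a Customer element and each column to a subelement."""
    root = ET.Element("Customers")
    for row in rows:
        customer = ET.SubElement(root, "Customer")
        for column, value in zip(COLUMNS, row):
            ET.SubElement(customer, column).text = value
    return root


attribute_xml = ET.tostring(to_attribute_centric(ROWS), encoding="unicode")
element_xml = ET.tostring(to_element_centric(ROWS), encoding="unicode")
print(attribute_xml)
print(element_xml)
```

Both strings carry the same data; comparing len(attribute_xml) with len(element_xml) shows that the attribute-centric form is the shorter of the two for these rows.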

Of course, there’s no reason why a mixed approach couldn’t be taken, as shown in this example:

 <Customers>
     <Customer CustID='1001'>
         Graeme
         <Phone>555 111222</Phone>
     </Customer>
     <Customer CustID='1002'>
         Rose
         <Phone>555 222111</Phone>
     </Customer>
 </Customers>

This example is interesting because it uses all three ways of representing the characteristics of an entity. CustID is represented as an attribute, Name is represented as the value of the element representing the entity instance, and Phone is represented as a subelement.
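To make the three representations concrete, here is a minimal Python sketch (using the standard library's xml.etree.ElementTree, which is an assumption of this example rather than a tool the chapter relies on) that reads each characteristic back out of the mixed document. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

MIXED = """
<Customers>
    <Customer CustID='1001'>
        Graeme
        <Phone>555 111222</Phone>
    </Customer>
    <Customer CustID='1002'>
        Rose
        <Phone>555 222111</Phone>
    </Customer>
</Customers>
"""

customers = []
for customer in ET.fromstring(MIXED).findall("Customer"):
    customers.append({
        "CustID": customer.get("CustID"),        # stored as an attribute
        "Name": (customer.text or "").strip(),   # text before the first child
        "Phone": customer.findtext("Phone"),     # stored as a subelement
    })

for row in customers:
    print(row)
```

Note that the element value needs whitespace trimming because the text node includes the line breaks and indentation around the name, which is one practical cost of mixed content.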

The particular mapping of database columns to XML is largely a matter of style, although a couple of considerations are worth bearing in mind. Attribute-centric documents result in smaller XML streams, so for large amounts of data, they’re more efficient. However, each element (and therefore each entity) can have only one of each attribute, so subelements are the better choice for potentially multivalued characteristics. For example, in the preceding sample code, a customer can have only one ID or name but can have more than one telephone number, as shown here:

 <Customers>
     <Customer CustID='1001'>
         Graeme
         <Phone>555 111222</Phone>
         <Phone>555 111333</Phone>
     </Customer>
     <Customer CustID='1002'>
         Rose
         <Phone>555 222111</Phone>
     </Customer>
 </Customers>
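The one-attribute-per-name rule is enforced by the parser itself, which the following Python sketch tries to show. It assumes the standard library's xml.etree.ElementTree; the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# A repeated attribute is not well-formed XML, so the parser rejects it.
try:
    ET.fromstring(
        "<Customer CustID='1001' Phone='555 111222' Phone='555 111333'/>"
    )
    duplicate_attribute_ok = True
except ET.ParseError:
    duplicate_attribute_ok = False

# Repeated subelements are fine and come back as a list.
customer = ET.fromstring(
    "<Customer CustID='1001'>Graeme"
    "<Phone>555 111222</Phone>"
    "<Phone>555 111333</Phone>"
    "</Customer>"
)
phones = [phone.text for phone in customer.findall("Phone")]

print(duplicate_attribute_ok)
print(phones)
```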

So as you can see, you can choose from many ways to represent the same business data using XML. So far, we’ve seen how XML can be used to represent a single entity. Let’s now turn our attention to representing relationships among multiple entities.

Representing Relationships in XML

Relational databases, as their name suggests, are designed to enable you to represent relationships between entities. For example, an order entity can contain one or more item entities, as shown in the following tables:

Orders

OrderNo   Date         Customer
1235      01/01/2001   1001
1236      01/01/2001   1002

Items

ItemNo   OrderNo   ProductID   Price   Quantity
1        1235      1432        12.99   2
2        1235      1678        11.49   1
3        1236      1432        12.99   3

The most common approach to representing this data in XML is to use a nested XML document, as shown here:

 <Orders>
     <Order OrderNo='1235' Date='01/01/2001' Customer='1001'>
         <Item ProductID='1432' Price='12.99' Quantity='2'/>
         <Item ProductID='1678' Price='11.49' Quantity='1'/>
     </Order>
     <Order OrderNo='1236' Date='01/01/2001' Customer='1002'>
         <Item ProductID='1432' Price='12.99' Quantity='3'/>
     </Order>
 </Orders>
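In the nested form, containment takes the place of the OrderNo foreign key column: an Item belongs to whichever Order element encloses it. As a sketch of how a receiving application might recover relational rows from such a document, the following Python code (standard library only; the item_rows name and tuple layout are assumptions made for this example) walks the hierarchy and copies each parent's OrderNo onto its items.

```python
import xml.etree.ElementTree as ET

ORDERS = """
<Orders>
    <Order OrderNo='1235' Date='01/01/2001' Customer='1001'>
        <Item ProductID='1432' Price='12.99' Quantity='2'/>
        <Item ProductID='1678' Price='11.49' Quantity='1'/>
    </Order>
    <Order OrderNo='1236' Date='01/01/2001' Customer='1002'>
        <Item ProductID='1432' Price='12.99' Quantity='3'/>
    </Order>
</Orders>
"""

# Each tuple is (OrderNo, ProductID, Price, Quantity).
item_rows = []
for order in ET.fromstring(ORDERS).findall("Order"):
    for item in order.findall("Item"):
        item_rows.append((
            order.get("OrderNo"),          # taken from the enclosing Order
            item.get("ProductID"),
            item.get("Price"),
            int(item.get("Quantity")),
        ))

for row in item_rows:
    print(row)
```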

In most circumstances, XML documents such as this will be used to exchange data that involves relationships. However, for large data transfers where the elimination of duplication is important in order to keep the document size down, an alternative approach can be taken. XML-Data schemas support the use of the ID, IDREF, and IDREFS data types when you’re defining XML attributes, and you can use this strategy to create relationships between entities in XML documents.

For example, suppose a supplier needed to send a catalog document listing all products by category. You could use the following schema to define the XML catalog document elements:

 <Schema name='catalogschema'
     xmlns='urn:schemas-microsoft-com:xml-data'
     xmlns:dt='urn:schemas-microsoft-com:datatypes'>
     <ElementType name='Category' model='closed'>
         <AttributeType name='CategoryID' dt:type='id'/>
         <AttributeType name='CategoryName' dt:type='string'/>
         <attribute type='CategoryID'/>
         <attribute type='CategoryName'/>
     </ElementType>
     <ElementType name='Product' model='closed'>
         <AttributeType name='ProductID' dt:type='i4'/>
         <AttributeType name='ProductName' dt:type='string'/>
         <AttributeType name='Category' dt:type='idref'/>
         <attribute type='ProductID'/>
         <attribute type='ProductName'/>
         <attribute type='Category'/>
     </ElementType>
     <ElementType name='Catalog' content='eltOnly' model='closed'>
         <element type='Category' maxOccurs='*'/>
         <element type='Product' maxOccurs='*'/>
     </ElementType>
 </Schema>

This schema defines a Category element and a Product element. The Category element has two attributes (CategoryID and CategoryName), and the Product element has three attributes (ProductID, ProductName, and Category). Finally, this schema defines a Catalog element that can contain multiple Category and Product elements. Using this schema, the catalog data could then be represented by the following XML document:

 <Catalog xmlns='x-schema:catalogschema.xml'>
     <Category CategoryID='1' CategoryName='Games'/>
     <Category CategoryID='2' CategoryName='Educational'/>
     <Product ProductID='131' ProductName='TicTacToe' Category='1'/>
     <Product ProductID='1432' ProductName='Chess' Category='1'/>
     <Product ProductID='1678' ProductName='Spelling' Category='2'/>
 </Catalog>

You can see an example of an XML document based on a schema by viewing Catalog.xml in the Demos\Chapter1 folder on the companion CD. Because the schema defines the CategoryID attribute in the Category element as an ID field and the Category attribute in the Product element as an IDREF field, a link can be discerned between products and categories. When processing the document using the Microsoft implementation of the DOM, you can use the nodeFromID method of the XMLDOMDocument object to retrieve the element that has a given ID value. An IDREFS data type could have been used to allow products to belong to multiple categories, represented by a space-delimited list of IDs.
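The idea behind an ID lookup can be sketched without a schema-aware parser. The following Python example is an emulation only, not the Microsoft DOM API: ElementTree does not load XDR schemas and has no nodeFromID method, so the sketch builds its own index keyed on CategoryID. The x-schema namespace declaration is omitted from the sample data for that reason, and the names categories_by_id and resolve_categories are hypothetical.

```python
import xml.etree.ElementTree as ET

# Catalog data from the text, without the x-schema namespace declaration
# because ElementTree does not load XDR schemas.
CATALOG = """
<Catalog>
    <Category CategoryID='1' CategoryName='Games'/>
    <Category CategoryID='2' CategoryName='Educational'/>
    <Product ProductID='131' ProductName='TicTacToe' Category='1'/>
    <Product ProductID='1432' ProductName='Chess' Category='1'/>
    <Product ProductID='1678' ProductName='Spelling' Category='2'/>
</Catalog>
"""

catalog = ET.fromstring(CATALOG)

# Index every Category by its ID attribute. This hand-built dictionary
# stands in for the ID table a schema-aware parser would maintain.
categories_by_id = {
    category.get("CategoryID"): category
    for category in catalog.findall("Category")
}


def resolve_categories(product):
    """Return the names of the categories a product refers to.

    str.split() copes with a single IDREF value as well as a
    whitespace-separated, IDREFS-style list of values.
    """
    refs = product.get("Category", "").split()
    return [categories_by_id[ref].get("CategoryName") for ref in refs]


product_categories = {
    product.get("ProductName"): resolve_categories(product)
    for product in catalog.findall("Product")
}
print(product_categories)
```

Because each category appears once and products only refer to it, the category name is never repeated in the document, which is the size saving this approach is meant to deliver.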

Explicit references using ID, IDREF, and IDREFS datatypes rely on XML parsers that can use XML-Data schemas. Because the standard for schemas isn't yet officially defined, many parsers don’t support the XML-Data grammar and therefore can’t use this technique to represent relational data. As an alternative approach, XSL Transformation (XSLT) style sheets containing the <xsl:key> instruction are often used to define relationships in XML documents. For more information about XSLT, visit www.w3c.org/TR/xslt.



Programming Microsoft SQL Server(TM) 2000 with XML (Pro-Developer)
ISBN: 0735613699