Advanced Techniques for Processing Content


In this section I present a few other techniques for your tool kit. The first has to do with overcoming a side effect of one style of XSLT coding. The others have to do with providing some functionality that is often essential to electronic commerce or EAI applications.

Omitting Empty Elements and Attributes

If you look back to HierarchyToFlat.xml, the desired result file for one of the examples above, you'll notice that ShipToStreet2 and ShipToCountry are empty Elements. This may be okay for some applications and schemas, but it may not be for others. These were created because in our xsl:template content we had as a literal result the start and end tags for these Elements, and the select for the xsl:value-of didn't find a match in the source tree. We can tell if we might run into a situation like this by either knowing the business data or reviewing the schema for the source document. If there's a minOccurs of zero on an Element in a sequence, then you should code to avoid creating empty Elements if you can't handle them in your result document.

There is a fairly simple technique for making sure that we don't create such empty Elements. Let's take a simple case of converting a buyer's name and address from one format to another. There may or may not be a BuyerStreet2 Element. We want the result tree to contain a Street2 Element if there is a BuyerStreet2 Element in the source but not to create one if there isn't. The source tree fragment appears below.

Source (MissingElement.xml)
<?xml version="1.0" encoding="UTF-8"?>
<MissingElement>
  <BuyerName>My Name</BuyerName>
  <BuyerStreet1>My Street</BuyerStreet1>
  <BuyerCity>My City</BuyerCity>
  <BuyerState>TX</BuyerState>
  <BuyerZip>99999</BuyerZip>
</MissingElement>

We want the result to look like the following file, with no empty Street2 Element.

Result (NoEmpties.xml)
<?xml version="1.0" encoding="UTF-8"?>
<NoEmpties>
  <Name>My Name</Name>
  <Street1>My Street</Street1>
  <City>My City</City>
  <State>TX</State>
  <Zip>99999</Zip>
</NoEmpties>

Here's the stylesheet.

Stylesheet (NoEmpties.xsl)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes"/>
  <xsl:template match="/MissingElement">
    <NoEmpties>
      <Name>
        <xsl:value-of select="BuyerName"/>
      </Name>
      <Street1>
        <xsl:value-of select="BuyerStreet1"/>
      </Street1>
      <xsl:if test="BuyerStreet2">
        <Street2>
          <xsl:value-of select="BuyerStreet2"/>
        </Street2>
      </xsl:if>
      <City>
        <xsl:value-of select="BuyerCity"/>
      </City>
      <State>
        <xsl:value-of select="BuyerState"/>
      </State>
      <Zip>
        <xsl:value-of select="BuyerZip"/>
      </Zip>
    </NoEmpties>
  </xsl:template>
</xsl:stylesheet>

XSLT is so flexible that there are other ways to handle this, but this method is pretty simple and straightforward. All we need to do is to put the literals for the Street2 start and end tags and the associated xsl:value-of into an xsl:if. The expression in the test Attribute simply has the name of the source Element. That path selects a node set, and a node set converts to true if it contains at least one node. So the expression evaluates to true if the Element is present and false if it isn't.
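One caveat is that a test on the Element name alone is true whenever the Element is present, even if it has no content. A source document containing an empty BuyerStreet2 would still produce an empty Street2. Here is a minimal sketch of a stricter test that uses the XPath normalize-space function. It assumes that whitespace-only content should also count as empty, which is my assumption rather than a requirement of the example, and it trims the stylesheet to two Elements to keep things short.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes"/>
  <xsl:template match="/MissingElement">
    <NoEmpties>
      <Street1>
        <xsl:value-of select="BuyerStreet1"/>
      </Street1>
      <!-- normalize-space() returns an empty string, which tests
          as false, when BuyerStreet2 is absent, empty, or
          contains only whitespace -->
      <xsl:if test="normalize-space(BuyerStreet2)">
        <Street2>
          <xsl:value-of select="BuyerStreet2"/>
        </Street2>
      </xsl:if>
    </NoEmpties>
  </xsl:template>
</xsl:stylesheet>
```

Whether you want the plain test or the stricter one depends on whether your trading partner ever sends Elements that are present but empty.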

Converting Coded Values

In electronic commerce and EAI applications we frequently need to convert one coded value to another. A common example is when customers order goods using standard UPC codes or their own catalog numbers, and orders that you import into your order management system must use your internal item identifiers. Let's again use Big Daddy's Gourmet Cocoa as an example. The company needs to convert from the UPC numbers its customers are sending to the item identifiers used by the order management system. We'll just use a fragment of the line item as the source tree to demonstrate the conversion. Our basic problem is that we want to convert this:

Source (ItemUPCs.xml)
<?xml version="1.0" encoding="UTF-8"?>
<OrderedItems>
  <LineItem>
    <ItemUPC>35790000724</ItemUPC>
    <Qty>12</Qty>
    <UnitPrice>2.59</UnitPrice>
  </LineItem>
  <LineItem>
    <ItemUPC>35790000122</ItemUPC>
    <Qty>24</Qty>
    <UnitPrice>2.59</UnitPrice>
  </LineItem>
</OrderedItems>

to this:

Result (ItemIDs.xml)
<?xml version="1.0" encoding="UTF-8"?>
<LineItems>
  <Item>
    <ItemID>HCVAN</ItemID>
    <OrderedQty>12</OrderedQty>
    <UnitPrice>2.59</UnitPrice>
  </Item>
  <Item>
    <ItemID>HCMIN</ItemID>
    <OrderedQty>24</OrderedQty>
    <UnitPrice>2.59</UnitPrice>
  </Item>
</LineItems>

I hate to overuse the word "straightforward," but again that describes the transformation (at least, once you grasp the basic concepts). There are two XML documents involved. One, of course, is the stylesheet. The other is a catalog document that serves as a cross-reference table. Let's look at the catalog document first.

Lookup Table (ItemCatalog.xml)
<?xml version="1.0" encoding="UTF-8"?>
<!--This is an item catalog document for converting from UPC
    codes to our own item IDs-->
<Catalog>
  <Item>
    <ID>HCMIN</ID>
    <UPC>35790000122</UPC>
    <Description>
      Instant Hot Cocoa Mix - Mint flavor
    </Description>
  </Item>
  <Item>
    <ID>HCVAN</ID>
    <UPC>35790000724</UPC>
    <Description>
      Instant Hot Cocoa Mix - Vanilla flavor
    </Description>
  </Item>
  <Item>
    <ID>HCMOC</ID>
    <UPC>35790000999</UPC>
    <Description>
      Instant Hot Cocoa Mix - Mocha flavor
    </Description>
  </Item>
  <Item>
    <ID>HCDUC</ID>
    <UPC>35790000641</UPC>
    <Description>
      Instant Hot Cocoa Mix - Dutch Chocolate flavor
    </Description>
  </Item>
</Catalog>

(Okay, those of you who work in the grocery industry, just bite your tongues. I know those values aren't constructed as valid UPC numbers, but gimme a break!)

Here's the stylesheet.

Stylesheet (ItemUPCToIDLookup.xsl)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes"/>
  <!--Declare the catalog document and read it in -->
  <xsl:variable name="vCat"
      select="document('ItemCatalog.xml')"/>
  <!--Template for root Element of result tree -->
  <xsl:template match="/OrderedItems">
    <LineItems>
      <xsl:apply-templates select="LineItem"/>
    </LineItems>
  </xsl:template>
  <!--This is the template for a line item in our result
      tree -->
  <xsl:template match="LineItem">
    <!--Grab the UPC Number so that it is available within the
        context of the select from the catalog -->
    <xsl:variable name="vUPC" select="ItemUPC"/>
    <Item>
      <ItemID>
        <!-- Select the Item from the catalog-->
        <xsl:value-of
            select="$vCat/Catalog/Item/ID[../UPC = $vUPC]"/>
      </ItemID>
      <OrderedQty>
        <xsl:value-of select="Qty"/>
      </OrderedQty>
      <UnitPrice>
        <xsl:value-of select="UnitPrice"/>
      </UnitPrice>
    </Item>
  </xsl:template>
</xsl:stylesheet>

As I said, the whole thing is pretty simple, but you have to get your head around a few concepts first. We've already seen things similar to most of what's in this stylesheet. Only a few things are actually new.

The first new concept is loading the catalog document. The stylesheet declares a variable named vCat and, in its select expression, calls the XSLT document function. A variable in XSLT functions in some ways like a variable in conventional programming languages. However, I think the choice of this term was unfortunate because some differences in behavior and usage can lead to confusion. Once a value has been assigned to an XSLT variable it can't be changed. In addition, a variable can represent more than just a single atomic value. In our case it represents the contents of an entire document. So, you might think of it really as more of a named constant than a variable.

The document function is one of several functions defined by XSLT that augment the built-in XPath functions. Using it in an expression allows us to use documents beyond just our source document. There are various ways to call the document function, but in this example our first (and only) argument is the file specification (or URI) of the document to include. We could just use the document function in every select expression where we needed to retrieve a value from the catalog. However, we would then repeat the file name in every expression, and depending on the XSLT processor, each call might mean extra work to locate the document again. Loading it into a variable at the top of the stylesheet means that we name it and load it only once.

So, now we have the catalog in memory and can do something with it in the xsl:template for the LineItem Element. We save the value of the ItemUPC Element into another variable. It may look like we're continually assigning a new value to the vUPC variable, but actually we aren't. It has scope only within the template and gets created fresh each time the template is invoked. We save the UPC number from the source tree so that we can use it for matching against the UPC number in the catalog document. The reason we need to save it will become evident shortly.

We pull the value from the catalog into the ItemID Element in the result tree using our customary xsl:value-of. However, the expression in the select Attribute looks a bit different than what we've seen so far. Instead of having the familiar absolute or relative path reference to something in the source tree, it is a location path to something in the catalog document. We prefix vCat with a dollar sign ($) to indicate a variable reference. This sets the root of the catalog document's tree as the first step in our location path. The rest of the steps, Catalog/Item/ID, bring us to the Element we want. Because there are four Elements in our catalog that match that path, we have to filter down to the one we want with this predicate:

 [../UPC = $vUPC] 

This also requires a bit of explanation. We previously grabbed the UPC number from the source tree into the vUPC variable. We needed to do that because our location path expression in the select Attribute points us to an entirely different context node. It would be fairly convoluted to try to get back to the source tree's UPC number within this select expression, so we take the easy route and first save the value to the vUPC variable. The other part of the test in the predicate uses an abbreviated syntax for relative path expressions. The first part of the whole select expression sets the context node to the ID Element in the catalog, but we want to test against the value of its sibling Element, UPC. We do this with a two-step relative location path in the predicate. In the first step we go back up to the parent Item with the abbreviated syntax (..), and in the second step we use the default child axis to go to the parent's UPC child Element. This notation is again similar to a relative UNIX directory path.
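For comparison, saving the value in a variable isn't the only route back to the source tree. XSLT also defines a current function, which returns the node the template is currently processing no matter where a predicate has moved the XPath context. The sketch below is a variation on ItemUPCToIDLookup.xsl that uses it in place of vUPC. It assumes the catalog file is named ItemCatalog.xml, as in the listing above, and it leaves out the quantity and price Elements to stay short.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes"/>
  <xsl:variable name="vCat"
      select="document('ItemCatalog.xml')"/>
  <xsl:template match="/OrderedItems">
    <LineItems>
      <xsl:apply-templates select="LineItem"/>
    </LineItems>
  </xsl:template>
  <xsl:template match="LineItem">
    <Item>
      <ItemID>
        <!-- Inside the predicate the context node is a catalog
            ID Element, but current() still refers to the
            LineItem this template is processing -->
        <xsl:value-of
            select="$vCat/Catalog/Item/ID[../UPC = current()/ItemUPC]"/>
      </ItemID>
    </Item>
  </xsl:template>
</xsl:stylesheet>
```

Which form reads better is a matter of taste. The variable gives the value a name, while current keeps everything in one expression.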

This is the easy, dumb, and simple way to perform lookups. If you have more than a few items in your lookup table, XSLT provides a feature to declare keys and retrieve items using keys. Depending on the XSLT processor implementation and the size of your lookup table, using keys can be quite a bit more efficient than the method I describe here. However, showing you in detail how it works is more involved than I want to get in this tutorial. Refer to my Web site for a list of several good XSLT books if you want to learn more about it.
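For readers who want a taste of keys anyway, here is a minimal sketch of the same lookup using xsl:key and the key function. The key name kItemByUPC is my own invention, and the catalog file name is assumed to be ItemCatalog.xml. The one wrinkle is that in XSLT 1.0 the key function searches only the document that contains the context node, so the sketch uses xsl:for-each to move the context into the catalog before calling it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes"/>
  <xsl:variable name="vCat"
      select="document('ItemCatalog.xml')"/>
  <!-- Index every Item Element by the value of its UPC child.
      kItemByUPC is a hypothetical name chosen for this sketch -->
  <xsl:key name="kItemByUPC" match="Item" use="UPC"/>
  <xsl:template match="/OrderedItems">
    <LineItems>
      <xsl:apply-templates select="LineItem"/>
    </LineItems>
  </xsl:template>
  <xsl:template match="LineItem">
    <xsl:variable name="vUPC" select="ItemUPC"/>
    <Item>
      <ItemID>
        <!-- Switch the context node to the catalog document so
            that key() looks there instead of in the order -->
        <xsl:for-each select="$vCat">
          <xsl:value-of select="key('kItemByUPC', $vUPC)/ID"/>
        </xsl:for-each>
      </ItemID>
      <OrderedQty>
        <xsl:value-of select="Qty"/>
      </OrderedQty>
      <xsl:copy-of select="UnitPrice"/>
    </Item>
  </xsl:template>
</xsl:stylesheet>
```

Whether this is actually faster than the predicate depends on your processor and the size of the table, so measure before you commit to it.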

Creating Powerful Lookup Tables for Item Numbers

The lookup approach discussed in this section becomes more and more useful as you understand all that it can do. We can use the table in the example not only to look up our own item ID from a UPC number but also to do the reverse lookup. We merely have to change the source of our lookup variable and the Element names referenced in the select Attribute's location step and predicate. But it doesn't have to stop there. All of the various permutations on UPC numbers (2-5-5, 1-5-5, 1-1-5-5, and so on) can be included in the same table and cross-referenced not only against an internal item number but also against each other. You could even include relevant supplemental information such as units per case, weights, dimensions, and other measurements often needed for shipment notification and invoices. Going one step beyond this, you could add a stylesheet reference to this document that would display selected information in HTML as a Web page. What an aid that might be toward ensuring that your customers order by using valid item numbers! In addition to performing lookups for standard numbers like UPC or EAN (European Article Numbering) codes, you could also include Elements for customer-specific item numbers.
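As a rough sketch of the reverse lookup, the stylesheet below reads a document shaped like ItemIDs.xml and looks up the UPC for each internal item ID. The stylesheet itself is hypothetical, and it assumes the same ItemCatalog.xml lookup table. Compared with the forward lookup, only the Element names in the location step and predicate have traded places.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes"/>
  <xsl:variable name="vCat"
      select="document('ItemCatalog.xml')"/>
  <xsl:template match="/LineItems">
    <OrderedItems>
      <xsl:apply-templates select="Item"/>
    </OrderedItems>
  </xsl:template>
  <xsl:template match="Item">
    <!-- Save our internal item ID for use in the predicate -->
    <xsl:variable name="vID" select="ItemID"/>
    <LineItem>
      <ItemUPC>
        <!-- Select UPC where the sibling ID matches, the mirror
            image of the forward lookup -->
        <xsl:value-of
            select="$vCat/Catalog/Item/UPC[../ID = $vID]"/>
      </ItemUPC>
      <Qty>
        <xsl:value-of select="OrderedQty"/>
      </Qty>
      <xsl:copy-of select="UnitPrice"/>
    </LineItem>
  </xsl:template>
</xsl:stylesheet>
```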

As I write this I'm working with an EDI client who has just such a mess of item numbers and supplemental information to manage. My client's EDI system has a pretty good table lookup system. However, if it did all that a simple document like the example in this sidebar could do, my work with this client would be a lot easier.

Handling Calculations

We can use arithmetic support in XPath expressions to insert calculated values into the result tree based on values in the source tree. Let's take a hypothetical example in which our order management system is able to export shipment notices as XML documents. These documents include Elements for quantity ordered, quantity shipped, and quantity previously shipped. However, we have a big customer who insists that we send information about backordered quantity, too. Here's the source document.

Source (AddAndSubtract.xml)
<?xml version="1.0" encoding="UTF-8"?>
<AddAndSubtract>
  <QtyOrdered>1000</QtyOrdered>
  <QtyShipped>500</QtyShipped>
  <QtyPreviouslyShipped>0</QtyPreviouslyShipped>
</AddAndSubtract>

Here is the desired result.

Result (AddedAndSubtracted.xml)
<?xml version="1.0" encoding="UTF-8"?>
<AddedAndSubtracted>
  <QtyOrdered>1000</QtyOrdered>
  <QtyShipped>500</QtyShipped>
  <QtyPreviouslyShipped>0</QtyPreviouslyShipped>
  <QtyBackordered>500</QtyBackordered>
</AddedAndSubtracted>

And here's the stylesheet.

Stylesheet (AddAndSubtract.xsl)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes"/>
  <xsl:template match="AddAndSubtract">
    <AddedAndSubtracted>
      <xsl:copy-of select="QtyOrdered"/>
      <xsl:copy-of select="QtyShipped"/>
      <xsl:copy-of select="QtyPreviouslyShipped"/>
      <QtyBackordered>
        <xsl:value-of
            select="QtyOrdered - (QtyShipped + QtyPreviouslyShipped)"/>
      </QtyBackordered>
    </AddedAndSubtracted>
  </xsl:template>
</xsl:stylesheet>

Yes, folks, we can do math in an XPath expression! We're also using a new feature in this stylesheet. We use the xsl:copy-of Element to copy Elements from our source tree into the result tree when they have the same names.

Simple addition and subtraction of whole numbers are usually fairly easy and reliable. However, some authorities advise you to be cautious about decimal fractions, multiplication, and division. XPath 1.0 defines every number as a double-precision floating point value, and many decimal fractions can't be represented exactly in that form. That means a sum such as 0.1 + 0.2 can come out as something like 0.30000000000000004 rather than 0.3. However, I ran the following example with both msxsl and Xalan and got the same (correct) results. All I can say is to consult the documentation of your XSLT processor (or API library) and perform some due diligence testing. If things look okay, do math with confidence.

Source (Multiply.xml)
<?xml version="1.0" encoding="UTF-8"?>
<Multiply>
  <QtyOrdered>48</QtyOrdered>
  <UnitPrice>2.59</UnitPrice>
</Multiply>

Stylesheet (Multiply.xsl)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes"/>
  <xsl:template match="/Multiply">
    <Multiplied>
      <xsl:copy-of select="QtyOrdered"/>
      <xsl:copy-of select="UnitPrice"/>
      <ExtendedPrice>
        <xsl:value-of select="QtyOrdered * UnitPrice"/>
      </ExtendedPrice>
    </Multiplied>
  </xsl:template>
</xsl:stylesheet>

Result (Multiplied.xml)
<?xml version="1.0" encoding="UTF-8"?>
<Multiplied>
  <QtyOrdered>48</QtyOrdered>
  <UnitPrice>2.59</UnitPrice>
  <ExtendedPrice>124.32</ExtendedPrice>
</Multiplied>
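
If your due diligence testing does turn up stray digits, one defensive option is to state the precision you want explicitly. The sketch below is a variation on Multiply.xsl that wraps the calculation in the XSLT format-number function. The two-decimal pattern is my assumption about what a currency amount should look like, not something the example requires.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes"/>
  <xsl:template match="/Multiply">
    <Multiplied>
      <xsl:copy-of select="QtyOrdered"/>
      <xsl:copy-of select="UnitPrice"/>
      <ExtendedPrice>
        <!-- The pattern '0.00' rounds the product to two decimal
            places and always writes both of them -->
        <xsl:value-of
            select="format-number(QtyOrdered * UnitPrice, '0.00')"/>
      </ExtendedPrice>
    </Multiplied>
  </xsl:template>
</xsl:stylesheet>
```

Formatting only changes how the number is written to the result tree. If the business rules call for a particular rounding method, check that your processor's rounding matches it.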


Using XML with Legacy Business Applications
ISBN: 0321154940
Year: 2003
Pages: 181