At some point when building in XML support at the code level, you have to consider the actual code. That's what we'll do in this section.
Input and Import
If you have existing validation routines for import and data entry, reusing them for XML import will save you a lot of work. As I said earlier, there are limits to schema validation. If your application needs to validate order item numbers by looking them up in a database, you'll need to call this code when you import XML data. A well-designed system might use the same modules for validating an imported purchase order that it does for one entered by a human operator. If your system isn't coded this way, building in XML support might be a good opportunity for centralizing and cleaning up your validation logic.
The specific changes you need to make in your code depend on how you import data from the user's perspective. Do you provide a batch import from a command line, or is it driven by a user interface where the user selects files to import? If it's by batch, it may make sense to build a completely separate program to import XML. If it's by a user interface, the determining design factor will probably be the degree to which the logic for reading the source, validating it, and storing it is segregated and modular. As you could see from the code examples, the logic for loading an XML document and retrieving data from it can be fairly involved. If your code for reading, validating, and storing is already mixed together in one module, adding XML calls is only going to make the code more complicated and less maintainable. You might be better served to code a completely separate module for XML import. Or you can again take advantage of the opportunity to redesign your code and clean it up a bit.
If the functions are already nicely segregated and the code is modular, you can probably code a specific XML import routine to load the data into the same data structures used when reading from other sources. You can then pass these data structures off to existing validation and storage routines and be done with the job.
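As a rough illustration of that approach, here is a minimal Java sketch using the DOM API. The OrderLine structure, the Order/Line/ItemNumber/Quantity element names, and the OrderValidator class are all hypothetical stand-ins for whatever your application already has. The import routine only reads the XML and fills the shared structure; validation stays in a routine that doesn't care where the data came from.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical structure shared by every input path (keyed entry, flat file, XML).
final class OrderLine {
    final String itemNumber;
    final int quantity;

    OrderLine(String itemNumber, int quantity) {
        this.itemNumber = itemNumber;
        this.quantity = quantity;
    }
}

// Stand-in for an existing validation routine; it knows nothing about XML.
final class OrderValidator {
    static List<String> validate(List<OrderLine> lines, Set<String> knownItemNumbers) {
        List<String> errors = new ArrayList<>();
        for (OrderLine line : lines) {
            if (!knownItemNumbers.contains(line.itemNumber)) {
                errors.add("Unknown item number: " + line.itemNumber);
            }
            if (line.quantity <= 0) {
                errors.add("Quantity must be positive for item: " + line.itemNumber);
            }
        }
        return errors;
    }
}

final class XmlOrderImport {
    // Loads an assumed <Order><Line><ItemNumber/><Quantity/></Line>...</Order> layout
    // into the shared structure. Every Line is assumed to contain both child Elements.
    static List<OrderLine> load(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList lines = doc.getDocumentElement().getElementsByTagName("Line");
        List<OrderLine> result = new ArrayList<>();
        for (int i = 0; i < lines.getLength(); i++) {
            Element line = (Element) lines.item(i);
            String itemNumber = childText(line, "ItemNumber");
            int quantity = Integer.parseInt(childText(line, "Quantity"));
            result.add(new OrderLine(itemNumber, quantity));
        }
        return result;
    }

    private static String childText(Element parent, String name) {
        return parent.getElementsByTagName(name).item(0).getTextContent().trim();
    }
}
```

In a real system the set of known item numbers would come from the database lookup mentioned earlier; a plain Set is used here only to keep the sketch self-contained.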
Printed Output and Export
Not every application offers the ability to import data by means other than a user keying it. However, nearly every application has the ability to create printed documents, even if it doesn't provide features for exporting data files. Certain steps always have to be performed, such as selecting the data and formatting it for output, and these aren't much different whether you are printing, exporting flat files, or exporting XML documents.
The key thing to bear in mind when coding your XML export as opposed to a flat file export is that we generally export several logical documents to one physical flat file. For XML we must export each logical document to a separate physical file. A well-formed XML document can have only a single top-level Element, the document root Element. Writing two documents to the same file violates this constraint.
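You can see the constraint in action with a few lines of Java. This is only a sketch; the SingleRootCheck class and the Invoice element name are made up for illustration. A standard DOM parser accepts a single document but rejects two documents written back to back, because the second root Element is markup following the root.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class SingleRootCheck {
    // Returns true if the parser accepts the text as one well-formed document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports well-formedness violations as SAX exceptions.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling isWellFormed on a single Invoice document succeeds, while calling it on two Invoice documents concatenated into one string fails. The parser's default error handler may also print a message to standard error in the failing case.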
We can still model our XML export on an existing extract or print routine. The example below shows typical approaches at a high level. In this simplified example of an invoice export, we'll assume that we're extracting data from two relational database tables, one for the overall invoice information and the other for item details. We're going to write the invoices to disk files, though they might be handed off to communications modules as DOM Document objects. Here's the logic for a typical flat file extract routine.
Flat File Extract Routine
Arguments:
  Selection criteria, e.g., invoice numbers
  Output file name
Open output file
Build and open cursor for selecting invoice headers
Fetch first invoice header row
DO for each invoice header
  Format data for export
  Write record to output file
  Open secondary cursor for line item details
  Fetch first line item detail row
  DO for each line item detail row
    Format data for export
    Write record to output file
  ENDDO
  Close secondary cursor
ENDDO
Close main cursor
Close output file
Return
The XML export routine follows the same general logic for extracting the data from the database but differs in some key areas. We'll again assume a DOM API. Other models for processing XML will differ in a few details, but the overall flow will be identical.
XML Export Routine
Arguments:
  Selection criteria, e.g., invoice numbers
  Output directory specification
Set up API for creating XML document
Build and open cursor for selecting invoice headers
Fetch first invoice header row
DO for each invoice header
  Create new XML document
  Create root Element and attach to document
  Build output filename
  Create invoice header Element and attach to root
  DO for each column to be added to document
    Create Element and attach to invoice header
    Create Text node and attach to column Element
  ENDDO
  Open secondary cursor for line item details
  Fetch first line item detail row
  DO for each line item detail row
    Create line item Element and attach to root Element
    DO for each column in line item row to be added to document
      Create Element and attach to line item Element
      Create Text node and attach to column Element
    ENDDO
  ENDDO
  Close secondary cursor
  Save DOM document to file
ENDDO
Close main cursor
Return
There is an important difference, aside from the obvious details of formatting and writing a flat file record as opposed to creating Elements. Rather than opening the output file at the beginning of the routine and closing it at the end, for each iteration of the DO loop that processes an invoice header row we create and save a new XML document.
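The XML export logic might look something like the following in Java. Treat this as a sketch under several assumptions: the database cursors are replaced by in-memory maps of column names to values, the column names are assumed to be legal XML Element names, and the Invoice, Header, and LineItem Element names and the invoice-NNN.xml file naming are invented for the example. The point to notice is that a new document is created and saved inside the loop over invoice headers.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class InvoiceXmlExport {
    // Each header is one invoice header row; line item rows are looked up by invoice number.
    static List<File> exportAll(List<Map<String, String>> headers,
                                Map<String, List<Map<String, String>>> itemsByInvoice,
                                File outputDir)
            throws ParserConfigurationException, TransformerException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        List<File> files = new ArrayList<>();
        for (Map<String, String> header : headers) {            // DO for each invoice header
            Document doc = builder.newDocument();               // new document per invoice
            Element root = doc.createElement("Invoice");
            doc.appendChild(root);
            String number = header.get("InvoiceNumber");
            File out = new File(outputDir, "invoice-" + number + ".xml");
            appendRow(doc, root, "Header", header);
            List<Map<String, String>> items =
                    itemsByInvoice.getOrDefault(number, Collections.emptyList());
            for (Map<String, String> item : items) {             // DO for each line item row
                appendRow(doc, root, "LineItem", item);
            }
            save(doc, out);                                      // one file per invoice
            files.add(out);
        }
        return files;
    }

    // Creates one Element per column, each holding a Text node with the column value.
    private static void appendRow(Document doc, Element parent, String rowName,
                                  Map<String, String> columns) {
        Element row = doc.createElement(rowName);
        parent.appendChild(row);
        for (Map.Entry<String, String> column : columns.entrySet()) {
            Element element = doc.createElement(column.getKey());
            element.appendChild(doc.createTextNode(column.getValue()));
            row.appendChild(element);
        }
    }

    // Serializes the DOM with an identity transformation (no stylesheet).
    private static void save(Document doc, File out) throws TransformerException, IOException {
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        try (OutputStream stream = new FileOutputStream(out)) {
            serializer.transform(new DOMSource(doc), new StreamResult(stream));
        }
    }
}
```

If the documents were being handed off to a communications module instead of written to disk, the save step would simply be replaced by passing the Document object along.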
Of course, every application will be somewhat different, but this example illustrates some of the major issues to consider.
You will be doing your users a great favor if you accept the fact that they won't be dealing with just one XML format for each type of document. Assume that transformation is going to be the norm, and make it easy for users. One way you can do this from a functional perspective is to build in support for XSLT. This could be done by giving users the ability to associate a stylesheet with an XML export or import. It will probably be most useful if the association is done on a very specific basis. For example, for an order management and billing system, make the association at the customer level and for each specific type of document. In this fashion the appropriate XSLT transformation could be performed for each document from that customer before you import it and on each document for that customer after you export it.
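One simple way to model that association is a lookup keyed by customer, document type, and direction. The Java sketch below is hypothetical throughout: the StylesheetRegistry class, its method names, and the idea of storing a stylesheet path as a string are assumptions made for illustration, and a real application would probably keep the associations in its database. The key format also assumes that customer IDs and document types never contain the "|" character.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class StylesheetRegistry {
    enum Direction { IMPORT, EXPORT }

    private final Map<String, String> stylesheetByKey = new HashMap<>();

    private static String key(String customerId, String documentType, Direction direction) {
        return customerId + "|" + documentType + "|" + direction;
    }

    // Records which stylesheet applies to one customer, document type, and direction.
    void associate(String customerId, String documentType, Direction direction,
                   String stylesheetPath) {
        stylesheetByKey.put(key(customerId, documentType, direction), stylesheetPath);
    }

    // Returns the associated stylesheet, or empty if no transformation is needed.
    Optional<String> lookup(String customerId, String documentType, Direction direction) {
        return Optional.ofNullable(stylesheetByKey.get(key(customerId, documentType, direction)));
    }
}
```

An import routine would call lookup with Direction.IMPORT before processing a document from a customer and transform only when a stylesheet is present; the export routine would do the same with Direction.EXPORT after building the document.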
Unlike the DOM, there is no universal, language-independent standard API for XSLT processors. However, both Xalan and MSXML have support for invoking transformations on in-memory documents. Using this strategy you could load, parse, and validate a document; transform it if necessary into your expected format; validate again if desired; and then process it. For export you could create the output document, validate it if desired, transform it into the expected format, validate again if desired, and then serialize it to disk or hand it off as a DOM object to an API that will transmit it to the destination.
For programmers working in the Java world, Xalan provides code-level XSLT support primarily through some classes first defined in JAXP 1.1. The relevant Java package is javax.xml.transform. Within that package the most important classes are the Transformer class with its transform method and the TransformerFactory that creates Transformer instances. The model is somewhat similar to the DocumentBuilderFactory that builds DocumentBuilder instances. If using the DOM API, a DOMSource object is created from a DOM Document (or other Node) and passed to the transform method along with a DOMResult object. After the transformation completes, the resulting DOM Document can be retrieved from the DOMResult.
The model in MSXML is a bit simpler. XSLT support is built into the DOM (nonstandard, to be sure, but certainly convenient). The method we're probably most interested in is the transformNodeToObject method. For our primary interest in transforming complete documents, it takes two arguments: a DOM Node representing a stylesheet and a COM Variant output object that receives the result. One of the things this Variant can hold is a DOM Document. The online documentation provided with MSXML has a pretty good C++ example for how to call this method.
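Here is a minimal sketch of a DOM-to-DOM transformation with the javax.xml.transform classes. The DomTransform class and its method names are my own, and the stylesheet is passed as a string only to keep the example self-contained; in practice you would more likely load it from a file. The JAXP calls themselves (TransformerFactory.newTransformer, Transformer.transform, DOMSource, DOMResult.getNode) are the standard ones.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DomTransform {
    // Parses XML text into a namespace-aware DOM Document.
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Applies the stylesheet to an in-memory document and returns a new in-memory document.
    static Document transform(Document input, String stylesheet) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(stylesheet)));
        DOMResult result = new DOMResult();            // the transformer creates the Document
        transformer.transform(new DOMSource(input), result);
        return (Document) result.getNode();
    }
}
```

Because both the input and the output are DOM Documents, the result can be validated again or passed straight to the existing import logic without ever being written to disk.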
Another way you can make transformations easier for your users is to provide some sample XSLT stylesheets for converting to and from your selected formats and various other formats. These other source and target formats might include things such as the OASIS UBL standards, the Open Applications Group's OAGIS standards, standards from UN/CEFACT or X12 (when they become available), and various vendor formats. [*] For any particular usage scenario your users would probably have to modify the sample stylesheets a bit, but giving them something to start with would be a big help.
Making XML the Native Output Format
If you've taken the trouble to go this far and add an XML export function, one thing you could consider is just using XML as the output format. Provide XSL stylesheets with the formatting objects appropriate for various types of output and you have succeeded in the often-sought-after goal of separating presentation from content. Once you have created the XML document, your users can do with it what they like. And if they don't like your default printed formats, with a reasonable stylesheet graphical user interface they can create their own. This has to be much easier than building custom formatting routines into your application. Stylesheets for printing could be associated with specific documents and customers just as stylesheets for exporting are.
Making XML a Native Storage Format
Even with some "XML databases" popping up here and there, I doubt if XML will ever replace the relational databases used for most business applications. The relational model with SQL and ODBC is just too well known, proven by time, and well entrenched. If it isn't broken, why fix it? However, applications use several types of data (for example, configuration files) that aren't necessarily stored in relational databases even if the bulk of application data is. The utilities in this book use XML for such a purpose. Using XML in these situations can often be easier than writing code in native Java or C++. This is especially true if you want to make sure that the configuration information is valid before loading it.
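To give a flavor of what this looks like, here is a small Java sketch that validates a configuration document against a schema before loading its settings. It is not the code used by the utilities in this book; the XmlConfig class, the flat Config layout with one child Element per setting, and the choice of W3C XML Schema for validation are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class XmlConfig {
    // Validates the configuration text against the schema, then returns its settings
    // as a map of Element name to text value. Throws SAXException if it is invalid.
    static Map<String, String> load(String configXml, String schemaXsd)
            throws SAXException, IOException, ParserConfigurationException {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(schemaXsd)));
        schema.newValidator().validate(new StreamSource(new StringReader(configXml)));

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(configXml)));

        Map<String, String> settings = new LinkedHashMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {             // skip whitespace Text nodes
                settings.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return settings;
    }
}
```

With a schema that declares, say, a MaxRecords setting as a positive integer, a configuration file containing a non-numeric value is rejected by the validator before any application code tries to use it.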