Using XML, XSLT from Domino

In this section, we discuss how you can work with XML data in Domino applications. The importance of XML in today's application development environments and especially with Domino and WebSphere products is hard to overstate. From data exchange formats to rendering formats (XHTML) to configuration file formats, XML is becoming the pervasive syntax for structured text data. An understanding of how to work with XML in Domino will be essential when implementing Web services in Domino.

Any discussion of XML soon leads to the Extensible Stylesheet Language (XSL) and XSL-based XML transform processing. Just as a translator mediates between natural languages of a common syntax and structure (for example, English and French), XSL transform processing provides the translation function between XML-based data formats. XSL provides a powerful means for recasting one XML format into another.

Our discussion of XML processing in Domino will cover the following topics:

Parsing and building XML data within a Domino database. Use of the W3C Document Object Model (DOM) and Simple API for XML (SAX) parsers supplied with Domino.
The Domino XML (DXL) schema for design elements. Working with DXL to format Domino elements in XML.
XSL transform support in Domino. Using the built-in Domino XSL transform engine.

The Domino server product contains the libraries for parsing XML via the W3C DOM and SAX APIs. Agents or event actions can be written within a Domino database to parse and then process XML data. XML-formatted data itself can be accessed in various ways within a Domino application. It can be read from an external file or from a database document field. It can also be obtained as the result of the Domino HTTP Server rendering a form or page, or from request content sent to a Web agent. Both the DOM and SAX APIs are available to Java and LotusScript code within Domino Designer.

Most often we find ourselves needing to parse XML contained in document fields or in the request content directed to a Web agent (as when handling a Web service request). Let's examine how to use the parsing APIs, both DOM and SAX, within Domino agents in detail. The basic steps for parsing XML text are:

1. Set up access to the XML input.
2. Invoke the parser.
3. Process the parsed data.

For this discussion, we assume some familiarity with the W3C DOM and SAX APIs (see the Bibliography for references).

Briefly, the DOM API parses XML and creates a tree structure with nodes representing the parsed XML elements, text, and so on, whereas the SAX API parses XML and invokes callback functions for the different XML constructs. When using the SAX API, step three above is done while the data is being parsed. The choice between the DOM and SAX APIs usually depends on what the application needs to do with the parsed data. If the XML data must be modified or added to and then rewritten, the DOM API is often used. If the need is to search for certain data within the XML, this is done more efficiently with the SAX API. Note that the DOM API requires enough memory for the entire DOM parse tree, whereas the SAX API is a single-pass parser and can be ended as soon as the required data is parsed.
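The trade-off described above can be tried outside Domino with the standard JAXP parsers in the JDK. The following is a minimal sketch, not Domino-specific code: the class name DomVersusSax, its method names, and the FoundException sentinel are illustrative assumptions. The DOM method builds the whole tree before counting elements, while the SAX method stops the parse as soon as the first matching element has been read.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class contrasting the two parsing styles.
class DomVersusSax {

    // DOM: the whole document is parsed into a tree before it can be queried.
    static int countElementsWithDom(String xml, String tagName) throws Exception {
        Document tree = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return tree.getElementsByTagName(tagName).getLength();
    }

    // Sentinel exception used only to end the SAX parse early.
    static class FoundException extends SAXException {
        FoundException() { super("value found"); }
    }

    // SAX: callbacks fire while the text is read; stop at the first match.
    static String findFirstWithSax(String xml, final String tagName) throws Exception {
        final StringBuilder value = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inTarget = false;

            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes attrs) {
                if (qName.equals(tagName)) inTarget = true;
            }

            @Override
            public void characters(char[] chars, int start, int length) {
                if (inTarget) value.append(chars, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName)
                    throws SAXException {
                if (inTarget && qName.equals(tagName)) throw new FoundException();
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (FoundException stop) {
            return value.toString();   // parse ended early, value captured
        }
        return null;                   // no such element in the input
    }
}
```

Because the SAX search abandons the input after the first match, any malformed XML later in the stream goes undetected; that is the price of the early exit.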

The Domino Java and LotusScript APIs have been extended to interoperate with the DOM and SAX APIs. In the Java API, a parseXML() method, which parses XML into a DOM tree, has been added to the various Domino classes (interfaces) that can hold XML text, such as Item, RichTextItem, MIMEEntity, and EmbeddedObject. Another new Java method is getInputSource(), which returns an org.xml.sax.InputSource object for use by a SAX parser. In the LotusScript API, there are now NotesDOMParser and NotesSAXParser classes. Let's look at specific examples of Java agent code that perform both DOM and SAX parsing of XML text within a Domino database.

The following Java code illustrates how to parse XML text held within a field of a Domino database document using the DOM parser. It also shows how the parsed data can be extracted from the DOM tree built by the parser. In this example, we extract the top level XML elements, which contain only text, and build a new Domino document with fields (items) corresponding to these extracted elements. The root element name is used to set the form associated with the Domino document.

    // Parses given field (Item) contents as XML using DOM parser.
    // Returns root element name.
    //
    private String ParseXMLviaDOM(Item theField)
       throws java.io.IOException, NotesException, org.xml.sax.SAXException
    {
       // Parse field contents (false means no validation against DTD).
       org.w3c.dom.Document aDOMtree = theField.parseXML(false);

       // Create a new Notes document, set its form as root element name,
       // then build document from elements.
       Document aNewDoc =
          itsAgentContext.getCurrentDatabase().createDocument();
       // getDocumentElement() returns the root element even when a DOCTYPE
       // or comment precedes it (getFirstChild() would return that node).
       org.w3c.dom.Node aRoot = aDOMtree.getDocumentElement();
       aNewDoc.appendItemValue("Form", aRoot.getNodeName());
       XMLUtils.DOM2Doc(aRoot, aNewDoc);
       return(aRoot.getNodeName());
    }

    // Convert an XML DOM tree into a set of items in a document.
    // Each child of the given Node is turned into an item named
    // after the element. (Member of the XMLUtils class, which imports
    // org.w3c.dom.Node and org.w3c.dom.NodeList.)
    public static void DOM2Doc(Node theDOM, lotus.domino.Document theDoc)
       throws lotus.domino.NotesException
    {
       NodeList aNodes = theDOM.getChildNodes();
       for (int i = 0; i < aNodes.getLength(); i++)
       {
          Node aNode = aNodes.item(i);
          if (aNode.getNodeType() == Node.ELEMENT_NODE)
          {
             // Look for text in child node, then append to Notes document
             Node aChild = aNode.getFirstChild();
             if (aChild != null &&
                   (aChild.getNodeType() == Node.TEXT_NODE ||
                    aChild.getNodeType() == Node.CDATA_SECTION_NODE))
                theDoc.appendItemValue(aNode.getNodeName(),
                                       aChild.getNodeValue());
          }
       }
       if (aNodes.getLength() > 0) theDoc.save();
    }

(The DOM2Doc method can be made part of a Java script library, that is, a script library written in Java, within Designer so that it can be reused by other agents and databases.)
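To experiment with the same extraction logic without a Domino server, the element-to-item mapping can be sketched against the JDK's own DOM parser. In this sketch a java.util.Map stands in for the Notes document, and the class name DomToMap and method name topLevelFields are illustrative assumptions rather than part of any Domino API. The root element is located with getDocumentElement(), so a DOCTYPE or comment ahead of the root does not get in the way.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-alone counterpart of the DOM2Doc idea:
// a Map plays the role of the Notes document.
class DomToMap {

    static Map<String, String> topLevelFields(String xml) throws Exception {
        Document tree = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = tree.getDocumentElement();

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("Form", root.getNodeName());   // root name becomes the form

        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue;                         // skip whitespace, comments
            }
            // Keep only elements whose first child is text or CDATA.
            Node first = child.getFirstChild();
            if (first != null
                    && (first.getNodeType() == Node.TEXT_NODE
                        || first.getNodeType() == Node.CDATA_SECTION_NODE)) {
                fields.put(child.getNodeName(), first.getNodeValue());
            }
        }
        return fields;
    }
}
```

Calling topLevelFields on a small book document yields one entry per text-bearing child element plus the "Form" entry; empty elements are skipped, just as in the agent code.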

The next example illustrates how to parse XML text using a SAX parser. The Java code shown next performs the same function as the previous example, parsing the XML text from a Domino document field and creating a new Domino document with fields corresponding to the XML element names and values.

    // Parses given field (Item) contents as XML using SAX parser.
    // Returns root element name.
    //
    private String ParseXMLviaSAX(Item theField)
       throws java.io.IOException, NotesException,
          org.xml.sax.SAXException, ClassNotFoundException,
          IllegalAccessException, InstantiationException
    {
       // Get a SAX parser via its factory
       org.xml.sax.Parser aParser =
          org.xml.sax.helpers.ParserFactory.makeParser(
             "com.ibm.xml.parsers.SAXParser");

       // Set the XML document handler as our own SAXHandler.
       SAXHandler aHandler = new SAXHandler();
       aParser.setDocumentHandler(aHandler);

       // Create a new Notes document as the target for the parsed data
       // and tell our handler about it.
       Document aNewDoc =
          itsAgentContext.getCurrentDatabase().createDocument();
       aHandler.setNotesDoc(aNewDoc);

       // Parse the given field contents
       aParser.parse(theField.getInputSource());
       return(aHandler.getRootName());
    }

    // SAXHandler class: receives the parser callbacks.
    import org.xml.sax.*;

    public class SAXHandler extends HandlerBase
    {
       // Notes Document object used as target for parsed data.
       lotus.domino.Document itsNotesDoc;
       // Holds root element name, also set as form for itsNotesDoc.
       String itsRootName;
       // Holds current element, value being parsed.
       String itsCurElt;
       StringBuffer itsValue = new StringBuffer();

       public void setNotesDoc(lotus.domino.Document theDoc)
       {
          itsNotesDoc = theDoc;
       }

       public void startDocument()
       {
          System.out.println("Start SAX API parse of document.");
       }

       public void endDocument()
       {
          // Set root element name as Notes document Form, save it.
          if (itsNotesDoc != null && itsRootName != null)
          {
             try {
                itsNotesDoc.appendItemValue("Form", itsRootName);
                itsNotesDoc.save();
             } catch(lotus.domino.NotesException e) {
                System.out.println("Parse error: " + e);
             }
          }
          System.out.println("End SAX parse.");
       }

       public void startElement(String theName, AttributeList theAttrs)
          throws SAXException
       {
          // If root element name not set, assume this is it.
          if (itsRootName == null)
             itsRootName = theName;
          itsCurElt = theName;
          // Discard any whitespace collected for the enclosing element.
          itsValue.setLength(0);
       }

       public void endElement(String theName)
          throws SAXException
       {
          // If current element has a text value, add it to the Notes
          // document as an Item.
          if (itsNotesDoc != null && itsCurElt != null &&
               itsValue.length() > 0)
          {
             try {
                itsNotesDoc.appendItemValue(itsCurElt, itsValue.toString());
             } catch(lotus.domino.NotesException e) {
                System.out.println("Parse error: " + e);
             }
          }
          itsCurElt = null;
          itsValue.setLength(0);
       }

       public void characters(char theChars[],
          int theStart, int theLength) throws SAXException
       {
          if (itsCurElt != null)
             itsValue.append(theChars, theStart, theLength);
       }

       public String getRootName()
       {
          return itsRootName;
       }
    }

For the SAX parse, the SAXHandler class overrides the set of "callback" methods invoked by the parser (inherited from the HandlerBase class). Note that you must supply the class members and logic required to hold element data across the callback methods. In our example, we used the class members itsCurElt and itsValue to hold the element name and value across the startElement, characters, and endElement methods.
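The same state-holding pattern can be written against the newer SAX2 interfaces that ship with the JDK, which is a different API from the SAX1 HandlerBase used in the agent. The sketch below is an illustration only: the class name FieldCollector and its parse method are assumptions, and a Map again stands in for the Notes document. It keeps the current element name and accumulated text across the startElement, characters, and endElement callbacks.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX2 handler; a Map stands in for the Notes document.
class FieldCollector extends DefaultHandler {
    private final Map<String, String> fields = new LinkedHashMap<>();
    private String rootName;          // first element seen
    private String currentElement;    // element whose text is being collected
    private final StringBuilder value = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attrs) {
        if (rootName == null) rootName = qName;
        currentElement = qName;
        value.setLength(0);           // drop whitespace from the parent
    }

    @Override
    public void characters(char[] chars, int start, int length) {
        // May be called several times for one element, so append.
        if (currentElement != null) value.append(chars, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (currentElement != null && value.length() > 0) {
            fields.put(currentElement, value.toString());
        }
        currentElement = null;
        value.setLength(0);
    }

    // Parses the XML text and returns the root name (as "Form") plus fields.
    static Map<String, String> parse(String xml) throws Exception {
        FieldCollector handler = new FieldCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        Map<String, String> result = new LinkedHashMap<>();
        result.put("Form", handler.rootName);
        result.putAll(handler.fields);
        return result;
    }
}
```

Appending in characters() rather than assigning matters because a parser is free to deliver one element's text in several chunks.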

A couple of final notes about the XML parsing examples:

The code can be tested easily within Domino Designer or the Notes client. The System.out stream writes to the Java Debug Console window on the client and to the Notes log database when run on a server.
The Designer Reference tab shows the DOM and SAX Java API methods under the "Third-Party Java" topic. Only three of the SAX classes appear in the Reference, although code that uses the entire SAX API can be written and compiled.
The parseXML() method can invoke a validating DOM parser by setting its argument to true. In this case, the parser looks for any DTDs referenced in the XML as page elements within the Domino database having the DTD filename. At this time, it is not possible to perform XML Schema validation.
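To see what DTD validation reports, the behavior can be approximated with the JDK's DOM parser. This is a sketch under stated assumptions: it uses the standard DocumentBuilderFactory rather than parseXML(), the DTD is supplied as an internal subset in the XML text instead of as a Domino page element, and the class name DtdValidation is hypothetical. Validation problems are collected through an ErrorHandler rather than stopping the parse.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper: parses with DTD validation on and returns the
// validation error messages (empty list means the document is valid).
class DtdValidation {

    static List<String> validationErrors(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);   // validate against the DTD in the DOCTYPE
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // warnings are ignored in this sketch
            }
            public void error(SAXParseException e) {
                errors.add(e.getMessage());   // validity errors land here
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e;                      // not well-formed: give up
            }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }
}
```

A document that matches its DTD produces an empty list, while one with an undeclared element produces one or more messages; a document that is not well-formed still raises an exception.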

Next, we look at how to construct XML-formatted text from data within a Domino database. Of course, with programming, anything is possible, so one can take the "brute force" approach and generate the XML text character by character (or substring by substring) using Java or LotusScript code. For simple XML, this approach may be appropriate, but it quickly becomes time-consuming and inflexible for more complex XML. A more flexible approach is to use the application "programming" features of Domino, namely forms and views.
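For reference, the "brute force" approach amounts to something like the following sketch. It is not taken from a Domino API: the class name BruteForceXml and its methods are hypothetical, and a Map of field names to values stands in for the items of a Domino document. Even this small version has to deal with escaping reserved characters, which hints at why the approach does not scale to complex XML.

```java
import java.util.Map;

// Hypothetical helper that writes flat XML substring by substring.
class BruteForceXml {

    // Escape the characters that are reserved in XML element content.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Build one root element with a child element per field.
    // Assumes field names are already valid XML element names.
    static String toXml(String rootName, Map<String, String> fields) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\"?>\n");
        xml.append('<').append(rootName).append(">\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append("  <").append(field.getKey()).append('>')
               .append(escape(field.getValue()))
               .append("</").append(field.getKey()).append(">\n");
        }
        xml.append("</").append(rootName).append(">\n");
        return xml.toString();
    }
}
```

Any change to the output format, such as adding attributes or nesting, means changing and recompiling this code, whereas a form-based approach moves the format into a design element.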

Although we assume general familiarity with Domino application development, we'll take a couple of paragraphs here to review the features of Domino forms and views. Not only are forms and views useful as a means of displaying and listing database documents to a Notes client or Web browser, but they can be, and often are, used as programming functions. Forms can be used much as XSL stylesheets are used to format XML data, that is, to format the contents (items and their values) of a Domino database document in a certain way. A single document can be presented in different ways by associating it with different forms. Views can be thought of as forms with a query function: they can determine a collection of documents (via the view selection formula) and format the collection into a list using a subset of document fields.

Using Domino database forms and views, where the form text or view column text surrounding the field data is XML, we can "format" the document data into XML. Let's look at an example. Suppose we have a Domino database used to hold information about books, say for a publishing house application, and the documents in this database contain fields as shown by the "book" form in Figure 9-7. (Keep in mind that Domino database documents contain only field items and values; the formatting elements, such as the table and static text, are part of the form.) Also suppose that we want to generate an XML document for each of these Domino documents. We can simply create another form, say "bookXML," and specify the XML text as the form text, as shown in Figure 9-8.

Figure 9-7. Book form for display.

Figure 9-8. BookXML form for generating XML.

We can apply the "bookXML" form to database documents containing the fields originally created by the "book" form, and then we can write, send, or process the resulting content, which will be in XML format. Note that the fields in the "bookXML" form can be a subset of the original set of document fields or, using lookup formulas, could come from other documents or even other databases.

At this point it should be clear how the "bookXML" form can be used to display XML content for a "book" document within a browser or Notes client. Less clear, perhaps, is how to obtain this XML content within agent or event code. Let's look at one technique for obtaining the XML content via a form within agent code. Basically, we need to have the Domino server "apply" the XML form to a specific document and "return" the XML content to our agent code. Having Domino "apply" the form can be done by creating a view in the database with a "form formula" that specifies the form name. This view can be defined to select all documents for which we want to generate XML (in our case, documents originally created with the "book" form) and to have a sorted column with the document's unique Domino identifier as the column value. Having Domino "return" the XML content (that is, the document with the XML form applied) can then be done by simply requesting the document from this view via a Domino URL. Figure 9-9 shows a "BooksAsXML" view created as we just described.

Figure 9-9. "BooksAsXML" view.

The following Java agent code uses the "BooksAsXML" view to generate the "bookXML" content for a specific "book" document and write this XML content to a file.

    import lotus.domino.*;
    import java.net.*;
    import java.io.*;

    public class JavaAgent extends AgentBase
    {
       public void NotesMain()
       {
          try {
             Session session = getSession();
             AgentContext agentContext = session.getAgentContext();

             // Get selected document; it should be a 'book' document.
             DocumentCollection aDocs =
                agentContext.getUnprocessedDocuments();
             Document aDoc = aDocs.getFirstDocument();
             if (aDoc != null)
             {
                // Set URL to get document with XML form applied.
                URL aURL = new URL("http", "localhost", 80,
                   "/booklist.nsf/BooksAsXML/" + aDoc.getUniversalID());
                URLConnection aCnctn = aURL.openConnection();
                aCnctn.connect();

                // Copy XML output to file. Backslashes in the Windows
                // path must be escaped in a Java string literal.
                PrintWriter aWtr = new PrintWriter(
                   new FileWriter("c:\\temp\\document.xml"));
                BufferedReader aRdr = new BufferedReader(
                   new InputStreamReader(aCnctn.getInputStream()));
                String aLine = aRdr.readLine();
                while (aLine != null)
                {
                   aWtr.println(aLine);
                   aLine = aRdr.readLine();
                }
                aRdr.close();
                aWtr.close();
             }
          }
          catch(Exception e)
          {
             System.out.println("GenerateXML error: " + e);
          }
       }
    }

The URL used to obtain the document from our "BooksAsXML" view has the following format:

    http://<hostname>/<database-name>/<view-name>/<doc-name>[?OpenDocument]

where the <doc-name> value is used to look up the document in the first sorted column of the specified view <view-name>. In our example, we used the Domino universal document ID as this value. Some final points about this example:

The "XML" form must have the Content Type option set to "Other" with "XML" specified in the adjacent input field, as shown in Figure 9-10. (The Content Type setting is located in the Form properties dialog under the Advanced tab.) This setting tells the Domino HTTP Server to set the content type of the generated form text as XML; otherwise, it would be generated with HTML headers that would appear before the XML header.

Figure 9-10. Form properties for XML.
The agent (or event) code, because it invokes a URL, must be run against a Domino server or under a Notes client with the "Local Web Preview" process running. You can simply select the Preview In Web Browser action in the Notes client to start this process.
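The URL format given earlier can be assembled with java.net.URI, which percent-encodes characters such as spaces in a view name. This is a small sketch only: the class name DominoUrls and the method openDocumentUrl are hypothetical, and the host, database, view, and key values are whatever the caller supplies.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for building the view-lookup URL described above.
class DominoUrls {

    // Builds http://<hostname>/<database-name>/<view-name>/<doc-name>?OpenDocument
    static String openDocumentUrl(String host, String database,
                                  String view, String docKey)
            throws URISyntaxException {
        String path = "/" + database + "/" + view + "/" + docKey;
        // The multi-argument URI constructor quotes illegal path characters.
        URI uri = new URI("http", host, path, "OpenDocument", null);
        return uri.toASCIIString();
    }
}
```

For the example agent, the arguments would be "localhost", "booklist.nsf", "BooksAsXML", and the document's universal ID.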

If the XML content must contain data from multiple documents, for example, a "Category" XML document that contains multiple "Book" elements, it can be generated using a technique similar to the one described previously. We can use a page design element to specify the top-level XML (header and root element) and an embedded view to format a set of documents as child elements. In this case, the view would define column formulas to specify the XML tag data around the document fields (column values). See the IBM Redbook, "XML Powered by Domino," for details on how to use this technique.

A third approach to generating XML content from Domino data combines two things: the Domino programming functions that generate DXL from Domino database design elements, and Domino's built-in ability to transform XML via XSL. Recall that DXL is the XML format defined by the IBM/Lotus-supplied Domino DTD and is expressed in terms of Domino design elements such as documents, items, forms, and views. With the ability to perform XSL transformation, we can use DXL as the starting point from which to generate other XML formats or other formats entirely, such as HTML or plain text. Domino provides API functions to generate DXL and to perform XSL-based transforms using stylesheets stored in the database or external to it. If you are proficient with XSL, this approach may be the most flexible way to produce sophisticated XML content from Domino data.
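The shape of such a transform can be sketched with the standard javax.xml.transform API in the JDK rather than the Domino transform functions. Everything specific here is an assumption made for illustration: the class name DxlTransform, the sample input (a simplified, DXL-like document with no namespace and only text items, not real exporter output), and the stylesheet, which turns each item into an element named after it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example: recast a simplified DXL-like document as book XML.
class DxlTransform {

    // Assumed input shape; real DXL carries a namespace and more detail.
    static final String SAMPLE =
          "<document form='book'>"
        + "<item name='title'><text>Domino XML</text></item>"
        + "<item name='author'><text>Smith</text></item>"
        + "</document>";

    // Each <item name='x'> becomes <x>...</x> under a root named by @form.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/document'>"
        + "<xsl:element name='{@form}'>"
        + "<xsl:for-each select='item'>"
        + "<xsl:element name='{@name}'>"
        + "<xsl:value-of select='text'/>"
        + "</xsl:element>"
        + "</xsl:for-each>"
        + "</xsl:element>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply the stylesheet text to the XML text and return the result.
    static String transform(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }
}
```

Because the output format lives entirely in the stylesheet, producing HTML or a different XML vocabulary from the same input means swapping the stylesheet, not changing the Java code.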
