17.1 A Simple XML Compile

To learn how to use XMLC, we'll start with a simple substitution application. Example 17-1 shows a simple hello.html mock-up web page that greets a user by name and tells her how many messages she has waiting for her. The HTML 4.0-compliant <SPAN> tags, each marked with an ID attribute, surround the text to be replaced.

Example 17-1. A Simple XMLC Substitution

    <!-- hello.html -->
    <HTML>
    <HEAD><TITLE>Hello</TITLE></HEAD>
    <BODY>
    <H2><SPAN ID="Greeting">Hello, Kathlyn</SPAN></H2>
    You have <SPAN ID="Messages">103</SPAN> new messages.
    </BODY>
    </HTML>

To run XMLC on this file you must first install XMLC. Download the tool from http://xmlc.enhydra.org, unpack the archive, and follow the instructions in the README for running xmlc-config. Then you're ready to run the xmlc compiler:

    xmlc -class Hello -keep hello.html

This tells XMLC to compile the hello.html file into a class named Hello, keeping the generated Hello.java file for our examination.

Because standard HTML files don't have to be well-formed XML, the XMLC compiler uses the Tidy parser to handle the conversion from HTML to the DOM. Beware, though: Tidy can get confused by HTML files that stray too far from valid HTML, and when this occurs you'll see a long list of warnings and errors during the XMLC compile. To avoid the problem, you can use HTML design tools (like Dreamweaver) to create the HTML for XMLC processing. You can learn more about Tidy at http://www.w3.org/People/Raggett/tidy and about the Java port named JTidy at http://www3.sympatico.ca/ac.quick/.

XMLC accepts dozens of compile options. The most commonly used options are listed below.[2]
Here is the command-line syntax: xmlc [options] docfile
When you run XMLC with the -keep flag you can examine the generated source. The source for Hello.java is shown in Example 17-2.

Example 17-2. The Autogenerated Hello.java Source

    /*
     ************************************
     * XMLC GENERATED CODE, DO NOT EDIT *
     ************************************
     */
    import org.w3c.dom.*;
    import org.enhydra.xml.xmlc.XMLCUtil;
    import org.enhydra.xml.xmlc.XMLCError;
    import org.enhydra.xml.xmlc.dom.XMLCDomFactory;

    public class Hello extends org.enhydra.xml.xmlc.html.HTMLObjectImpl {
        /**
         * Field that is used to identify this as an XMLC
         * generated class. Contains an reference to the
         * class object.
         */
        public static final Class XMLC_GENERATED_CLASS = Hello.class;

        /**
         * Field containing CLASSPATH relative name of the source file
         * that this class was generated from.
         */
        public static final String XMLC_SOURCE_FILE = "hello.html";

        /**
         * Get the element with id <CODE>Greeting</CODE>.
         * @see org.w3c.dom.html.HTMLElement
         */
        public org.w3c.dom.html.HTMLElement getElementGreeting() {
            return $elementGreeting;
        }
        private org.w3c.dom.html.HTMLElement $elementGreeting;

        /**
         * Get the value of text child of element <CODE>Greeting</CODE>.
         * @see org.w3c.dom.Text
         */
        public void setTextGreeting(String text) {
            XMLCUtil.getFirstText($elementGreeting).setData(text);
        }

        /**
         * Get the element with id <CODE>Messages</CODE>.
         * @see org.w3c.dom.html.HTMLElement
         */
        public org.w3c.dom.html.HTMLElement getElementMessages() {
            return $elementMessages;
        }
        private org.w3c.dom.html.HTMLElement $elementMessages;

        /**
         * Get the value of text child of element <CODE>Messages</CODE>.
         * @see org.w3c.dom.Text
         */
        public void setTextMessages(String text) {
            XMLCUtil.getFirstText($elementMessages).setData(text);
        }

        /**
         * Create document object.
         */
        private static Document createDocument() {
            XMLCDomFactory domFactory =
                new org.enhydra.xml.xmlc.dom.DefaultHTMLDomFactory();
            Document document = domFactory.createDocument(null, null);
            return document;
        }

        /**
         * Create document as a DOM and initialize accessor method fields.
         */
        public void buildDocument() {
            Document document = createDocument();
            setDocument(document);

            Node $node0, $node1, $node2, $node3, $node4;
            Element $elem0, $elem1, $elem2, $elem3;
            Attr $attr0, $attr1, $attr2, $attr3;

            $elem0 = document.getDocumentElement();
            $elem1 = document.createElement("HEAD");;
            $elem0.appendChild($elem1);

            $elem2 = document.createElement("TITLE");;
            $elem1.appendChild($elem2);

            $node3 = document.createTextNode("Hello");;
            $elem2.appendChild($node3);

            $elem1 = document.createElement("BODY");;
            $elem0.appendChild($elem1);

            $elem2 = document.createElement("H2");;
            $elem1.appendChild($elem2);

            $elem3 = document.createElement("SPAN");;
            $elem2.appendChild($elem3);

            $elementGreeting = (org.w3c.dom.html.HTMLElement)$elem3;
            $attr3 = document.createAttribute("id");
            $attr3.setValue("Greeting");
            $elem3.setAttributeNode($attr3);

            $node4 = document.createTextNode("Hello, Kathlyn");;
            $elem3.appendChild($node4);

            $node2 = document.createTextNode("You have ");;
            $elem1.appendChild($node2);

            $elem2 = document.createElement("SPAN");;
            $elem1.appendChild($elem2);

            $elementMessages = (org.w3c.dom.html.HTMLElement)$elem2;
            $attr2 = document.createAttribute("id");
            $attr2.setValue("Messages");
            $elem2.setAttributeNode($attr2);

            $node3 = document.createTextNode("103");;
            $elem2.appendChild($node3);

            $node2 = document.createTextNode(" new messages.");;
            $elem1.appendChild($node2);

            $node1 = document.createComment(" hello.html ");;
            $elem0.appendChild($node1);
        }

        /**
         * Recursive function to do set access method fields from the DOM.
         * Missing ids have fields set to null.
         */
        protected void syncWithDocument(Node node) {
            if (node instanceof Element) {
                String id = ((Element)node).getAttribute("id");
                if (id.length() == 0) {
                } else if (id.equals("Greeting")) {
                    $elementGreeting = (org.w3c.dom.html.HTMLElement)node;
                } else if (id.equals("Messages")) {
                    $elementMessages = (org.w3c.dom.html.HTMLElement)node;
                }
            }
        }

        /**
         * Default constructor.
         */
        public Hello() {
            buildDocument();
        }

        /**
         * Constructor with optional building of the DOM.
         *
         * @param buildDOM If false, the DOM will not be built until
         * buildDocument() is called by the derived class. If true,
         * the DOM is built immediately.
         */
        public Hello(boolean buildDOM) {
            if (buildDOM) {
                buildDocument();
            }
        }

        /**
         * Copy constructor.
         * @param src The document to clone.
         */
        public Hello(Hello src) {
            setDocument((Document)src.getDocument().cloneNode(true));
            syncAccessMethods();
        }

        /**
         * Clone the document.
         * @param deep Must be true, only deep clone is supported.
         */
        public Node cloneNode(boolean deep) {
            cloneDeepCheck(deep);
            return new Hello(this);
        }
    }

This example contains a lot of code, but it's fairly simple. The class is named Hello, as we dictated with the -class option. It extends org.enhydra.xml.xmlc.html.HTMLObjectImpl, the XMLC superclass for all HTML objects, and through that superclass implements org.w3c.dom.html.HTMLDocument, the standard DOM interface representing an HTML document. The Hello constructor calls the buildDocument() method to populate the DOM tree with the content from the HTML file. Each element that carried an ID attribute in the HTML file is also stored in a private field, which is what the generated getElement and setText methods operate on.

DOM is just a set of interfaces, and there are many possible DOM implementations to choose from. We can see in the createDocument() method that this class uses the XMLC default implementation as returned by DefaultHTMLDomFactory, which in 1.2b1 is Apache Xerces (see http://xml.apache.org).
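To see the pattern of the generated class without installing XMLC, the following is a minimal sketch that imitates it with nothing but the DOM classes bundled with the JDK. It is not XMLC code: the class name HelloDomSketch is hypothetical, the JDK's DocumentBuilderFactory stands in for DefaultHTMLDomFactory, and the recursive walk over child nodes is an assumption about how an ID lookup could be done rather than a copy of what XMLC generates.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical stand-in for the generated Hello class, built on the JDK's DOM only.
public class HelloDomSketch {
    private final Document document = newDocument();
    private Element greeting;  // plays the role of $elementGreeting
    private Element messages;  // plays the role of $elementMessages

    public HelloDomSketch() {
        // Build the same tree shape as hello.html, as buildDocument() does.
        Element html = document.createElement("HTML");
        document.appendChild(html);
        Element body = document.createElement("BODY");
        html.appendChild(body);
        Element h2 = document.createElement("H2");
        body.appendChild(h2);
        h2.appendChild(span("Greeting", "Hello, Kathlyn"));
        body.appendChild(document.createTextNode("You have "));
        body.appendChild(span("Messages", "103"));
        body.appendChild(document.createTextNode(" new messages."));
        // Locate the marked elements by their id attribute.
        sync(html);
    }

    // Assumption: any available JAXP DOM implementation is good enough for the sketch.
    private static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    // Create <SPAN id="...">text</SPAN>.
    private Element span(String id, String text) {
        Element e = document.createElement("SPAN");
        e.setAttribute("id", id);
        e.appendChild(document.createTextNode(text));
        return e;
    }

    // Walk the tree and remember the elements whose id matches a known name.
    private void sync(Node node) {
        if (node instanceof Element) {
            String id = ((Element) node).getAttribute("id");
            if (id.equals("Greeting")) {
                greeting = (Element) node;
            } else if (id.equals("Messages")) {
                messages = (Element) node;
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            sync(child);
        }
    }

    public Element getElementGreeting() { return greeting; }
    public Element getElementMessages() { return messages; }

    // Replace the text child of each marked element, like the generated setters.
    public void setTextGreeting(String text) { greeting.getFirstChild().setNodeValue(text); }
    public void setTextMessages(String text) { messages.getFirstChild().setNodeValue(text); }

    public static void main(String[] args) {
        HelloDomSketch page = new HelloDomSketch();
        page.setTextGreeting("Hello, Ada");
        page.setTextMessages("7");
        System.out.println(page.getElementGreeting().getTextContent());
        System.out.println(page.getElementMessages().getTextContent());
    }
}
```

The point of the sketch is the division of labor: the page structure is built once as a DOM tree, and the only parts a caller touches afterward are the elements that were marked with an ID.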
If you run xmlc with the -methods option you can see a summary of the generated access methods in the source:

    % xmlc -class Hello -keep -methods hello.html
    public org.w3c.dom.html.HTMLElement getElementGreeting();
    public void setTextGreeting(String text);
    public org.w3c.dom.html.HTMLElement getElementMessages();
    public void setTextMessages(String text);

The getter methods return the HTMLElement DOM object representing the HTML element marked with the ID attribute. The setter methods set the text content of those elements. Notice that the value of the ID attribute is encoded into the method names. Values that cannot be converted to legal Java identifiers will not have access methods generated for them, so be careful to use only letters, numbers, and underscores. Also remember that ID values are supposed to be unique within the scope of the page.
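If page designers hand you templates, it can help to check ID values against this advice before running the compiler. The sketch below is one way to do that. The class AccessorNames and both of its methods are hypothetical helpers, not part of XMLC. The check applies only the conservative "letters, numbers, and underscores" rule stated above (restricted here to ASCII as an added assumption), and the name mapping simply appends the ID to getElement and setText as seen for Greeting and Messages. XMLC's exact conversion rules may differ.

```java
// Hypothetical helper: pre-checks ID values against the "letters, numbers,
// and underscores" advice. This is not how XMLC itself decides.
public class AccessorNames {

    // True if the id is non-empty and uses only ASCII letters, digits, and underscores.
    static boolean usesOnlySafeCharacters(String id) {
        if (id.isEmpty()) {
            return false;
        }
        for (int i = 0; i < id.length(); i++) {
            char c = id.charAt(i);
            boolean letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
            boolean digit = c >= '0' && c <= '9';
            if (!(letter || digit || c == '_')) {
                return false;
            }
        }
        return true;
    }

    // Accessor names following the pattern seen for Greeting and Messages.
    // Assumption: the id is appended unchanged.
    static String[] accessorNames(String id) {
        return new String[] { "getElement" + id, "setText" + id };
    }

    public static void main(String[] args) {
        String[] ids = { "Greeting", "Messages", "msg-count", "new messages" };
        for (String id : ids) {
            if (usesOnlySafeCharacters(id)) {
                String[] names = accessorNames(id);
                System.out.println(id + ": " + names[0] + "(), " + names[1] + "(String)");
            } else {
                System.out.println(id + ": risky ID, may get no access methods");
            }
        }
    }
}
```

Of the four sample values, only Greeting and Messages pass. The hyphen and the space in the other two are the kind of characters the advice above warns against.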