OutputFormat


The detailed behavior of a serializer is controlled by an OutputFormat object. This class can configure almost any aspect of serialization, including setting the maximum line length, changing the indentation, specifying which elements have their text escaped as CDATA sections, and more. A few options even have the potential to make your documents malformed. For example, if you add an element to the list of nonescaping elements, then any reserved characters like < and & that appear in its text content will be output as themselves rather than escaped as &lt; and &amp;.
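To see why unescaped text is dangerous, consider the following minimal sketch. It does not use OutputFormat at all. It simply hands a JAXP parser two hand-written strings, one of the kind a serializer normally writes and one of the kind a nonescaping element could produce. The class name UnescapedTextCheck, the method isWellFormed(), and the script element are my own illustrative choices, not part of Xerces.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class UnescapedTextCheck {

  // Returns true if the string parses as well-formed XML,
  // false if the parser reports a well-formedness error.
  static boolean isWellFormed(String xml) {
    try {
      DocumentBuilder builder
       = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      // DefaultHandler ignores warnings and rethrows fatal errors,
      // which keeps the parser from printing to System.err
      builder.setErrorHandler(new DefaultHandler());
      builder.parse(new InputSource(new StringReader(xml)));
      return true;
    }
    catch (SAXException e) {
      return false;
    }
    catch (Exception e) {
      // I/O or configuration trouble, unrelated to well-formedness
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    // What a serializer normally writes
    String escaped = "<script>if (a &lt; b &amp;&amp; c) run();</script>";
    // What a nonescaping element could write
    String raw = "<script>if (a < b && c) run();</script>";
    System.out.println("escaped: " + isWellFormed(escaped));
    System.out.println("raw:     " + isWellFormed(raw));
  }
}
```

The escaped string parses. The raw one does not, because the parser tries to read the < as the start of a tag.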

One of the most frequent requests for serializers is "pretty printing" data with extra line breaks and indentation. Within reasonable limits, the OutputFormat class can provide this. Simply pass true to setIndenting(), pass the number of spaces you want each level to be indented to setIndent(), and pass the maximum line length to setLineWidth(). Example 13.1 demonstrates.

Example 13.1 Using Xerces' OutputFormat Class to "Pretty Print" XML
import java.math.*;
import java.io.IOException;
import org.w3c.dom.*;
import javax.xml.parsers.*;
import org.apache.xml.serialize.*;


public class IndentedFibonacci {

  public static void main(String[] args) {

    try {

      // Find the implementation
      DocumentBuilderFactory factory
       = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      DocumentBuilder builder = factory.newDocumentBuilder();
      DOMImplementation impl = builder.getDOMImplementation();

      // Create the document
      Document doc = impl.createDocument(null,
       "Fibonacci_Numbers", null);

      // Fill the document
      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;

      Element root = doc.getDocumentElement();

      for (int i = 0; i < 10; i++) {
        Element number = doc.createElement("fibonacci");
        Text text = doc.createTextNode(low.toString());
        number.appendChild(text);
        root.appendChild(number);

        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }

      // Serialize the document
      OutputFormat format = new OutputFormat(doc);
      format.setLineWidth(65);
      format.setIndenting(true);
      format.setIndent(2);
      XMLSerializer serializer
       = new XMLSerializer(System.out, format);
      serializer.serialize(doc);

    }
    catch (FactoryConfigurationError e) {
      System.out.println("Could not locate a JAXP factory class");
    }
    catch (ParserConfigurationException e) {
      System.out.println(
       "Could not locate a JAXP DocumentBuilder class"
      );
    }
    catch (DOMException e) {
      System.err.println(e);
    }
    catch (IOException e) {
      System.err.println(e);
    }

  }

}

When run, this program produces the following output:

C:\XMLJAVA> java IndentedFibonacci
<?xml version="1.0" encoding="UTF-8"?>
<Fibonacci_Numbers>
  <fibonacci>1</fibonacci>
  <fibonacci>1</fibonacci>
  <fibonacci>2</fibonacci>
  <fibonacci>3</fibonacci>
  <fibonacci>5</fibonacci>
  <fibonacci>8</fibonacci>
  <fibonacci>13</fibonacci>
  <fibonacci>21</fibonacci>
  <fibonacci>34</fibonacci>
  <fibonacci>55</fibonacci>
</Fibonacci_Numbers>

I think you'll agree that this looks much more attractive than the smushed together output from the bare serialization without any extra white space. One warning, however: White space is significant in XML. Adding this white space has changed the document. This is not the same document as existed before it was "pretty printed." For this particular application, the extra white space is insignificant, but this is not true for all XML applications.
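The change is easy to measure. The following sketch parses a compact document and an indented one with a plain JAXP DocumentBuilder and counts the child nodes of each root element. The class name WhiteSpaceNodes and the method countRootChildren() are hypothetical names introduced for this illustration, and the two documents are shortened versions of the Fibonacci output.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class WhiteSpaceNodes {

  // Parses the string and returns the number of child nodes
  // (elements and text nodes alike) of the root element.
  static int countRootChildren(String xml) {
    try {
      DocumentBuilder builder
       = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document doc = builder.parse(
       new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getChildNodes().getLength();
    }
    catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    String compact = "<Fibonacci_Numbers>"
     + "<fibonacci>1</fibonacci><fibonacci>1</fibonacci>"
     + "</Fibonacci_Numbers>";
    String indented = "<Fibonacci_Numbers>\n"
     + "  <fibonacci>1</fibonacci>\n"
     + "  <fibonacci>1</fibonacci>\n"
     + "</Fibonacci_Numbers>";
    // Two element children, no text nodes
    System.out.println(countRootChildren(compact));
    // The same two elements plus three white space text nodes
    System.out.println(countRootChildren(indented));
  }
}
```

The compact document's root has two children. The indented one has five, because each run of line breaks and spaces between the tags becomes a text node of its own. Code that walks the children by position will behave differently on the two documents.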

White space is just the beginning of what the OutputFormat class can control. Other features include the MIME media type, the XML declaration, the system and public IDs for the document type, which elements' content should be escaped as CDATA sections, and more. Following is a list of the properties you can control by invoking various methods on OutputFormat. In some cases, the default is document dependent. When it's not, the default value is given in parentheses.

Method

The method is normally set to one of three values, xml, html, or text, indicating the type of output that is desired. The serializer uses this value to configure itself. The default value is determined by the type of the document being serialized.

public void setMethod(String method)
public String getMethod()
public static String whichMethod(Document doc)

Media Type (Null)

This is the MIME media type for the output, such as application/xml or application/xhtml+xml. Although not included in the document itself, this may be used as part of the stream's metadata if it's written into a file system or onto an HTTP connection or some such.

public void setMediaType(String mediaType)
public String getMediaType()
public static String whichMediaType(Document doc)

Version (1.0)

The version number used in the XML declaration should always be "1.0". Do not change this.

public void setVersion(String version)
public String getVersion()

Standalone (No)

The value of the standalone attribute in the XML declaration. Pass true for "yes" and false for "no".

public void setStandalone(boolean standalone)
public boolean getStandalone()

Encoding (UTF-8)

The encoding specified in the encoding attribute in the XML declaration and used to convert characters to bytes when serializing onto an OutputStream.

public void setEncoding(String encoding)
public String getEncoding()
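The choice of encoding changes the bytes that reach the stream, not just the label in the XML declaration. The following sketch uses only the standard java.nio.charset classes rather than the serializer itself. The class name EncodingBytes and the method byteLength() are names I've made up for the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingBytes {

  // Returns the number of bytes the text occupies in the given encoding
  static int byteLength(String text, Charset charset) {
    return text.getBytes(charset).length;
  }

  public static void main(String[] args) {
    // "café" is four characters; \u00E9 is e with an acute accent
    String text = "caf\u00E9";
    // UTF-8 needs two bytes for the accented letter
    System.out.println(byteLength(text, StandardCharsets.UTF_8));
    // ISO-8859-1 needs only one
    System.out.println(byteLength(text, StandardCharsets.ISO_8859_1));
  }
}
```

The same four characters take five bytes in UTF-8 and four in ISO-8859-1. A reader that assumes the wrong encoding will misread the accented letter, which is why the encoding declaration and the actual bytes need to agree.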

Omit XML Declaration (False)

If true, then no XML declaration is output. If false, then an XML declaration is written.

public void setOmitXMLDeclaration(boolean omitXMLDeclaration)
public boolean getOmitXMLDeclaration()

Document Type

This specifies the system and public IDs of the external DTD subset given in the document type declaration. These values are used only if the Document being serialized does not contain a DocumentType object of its own.

public void setDoctype(String publicID, String systemID)
public String getDoctypePublic()
public String getDoctypeSystem()
public static String whichDoctypePublic(Document doc)
public static String whichDoctypeSystem(Document doc)

Omit Document Type (False)

If true, then no document type declaration is output. If false, then a document type declaration is written. If the document does not have a document type declaration and none has been set with setDoctype(), then no document type declaration will be written, regardless of the value of this property.

public void setOmitDocumentType(boolean omitDocumentType)
public boolean getOmitDocumentType()

Nonescaping Elements

The elements whose text-node children should not be escaped using entity references.

public void setNonEscapingElements(String[] elementNames)
public String[] getNonEscapingElements()
public boolean isNonEscapingElement(String name)

CDATA Elements

The elements whose text content should be enclosed in a CDATA section.

public void setCDATAElements(String[] elementNames)
public String[] getCDATAElements()
public boolean isCDATAElement(String name)
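Unlike nonescaping elements, CDATA elements are safe, because a CDATA section is only a different spelling of the same text. The following sketch parses both spellings with a plain JAXP parser and compares what comes out. It does not exercise OutputFormat itself. The class name CDATAEquivalence, the method textOf(), and the element name s are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CDATAEquivalence {

  // Parses the string and returns the text content of the root element
  static String textOf(String xml) {
    try {
      DocumentBuilder builder
       = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document doc = builder.parse(
       new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getTextContent();
    }
    catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    String withCDATA   = "<s><![CDATA[a < b && c]]></s>";
    String withEscapes = "<s>a &lt; b &amp;&amp; c</s>";
    // Both spellings carry the same characters
    System.out.println(textOf(withCDATA).equals(textOf(withEscapes)));
  }
}
```

Both documents yield the text a < b && c, so choosing CDATA sections for an element changes how the output looks to a human reader but not what a parser reports.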

Omit Comments (False)

If true, then comments in the document are not written onto the output. If false, they are written.

public void setOmitComments(boolean omitComments)
public boolean getOmitComments()

Indenting (False)

If true, then the serializer will add indents at each level and wrap lines that exceed the maximum line width. If false, it won't. The number of spaces to indent is set by the indent property, and the column to wrap at is set by the line width property.

public void setIndenting(boolean indenting)
public boolean getIndenting()

Indent (4)

The number of spaces to indent each level if indenting is true.

public void setIndent(int indent)
public int getIndent()

Line Width (72)

The maximum number of characters in a line when indenting is true. Setting this to zero turns off line wrapping completely.

public void setLineWidth(int width)
public int getLineWidth()

Line Separator (\n)

The character or characters to use for a line break. Take care to set this property only to a carriage return, a linefeed, or a carriage return/linefeed pair.

public void setLineSeparator(String separator)
public String getLineSeparator()

Example 13.2 uses these methods to create a valid MathML document encoded in ISO-8859-1 with a document type declaration, an XML declaration, no comments, a 65-character maximum line width, a two-space indent, a standalone declaration with the value yes, and the MIME media type application/xml:

Example 13.2 Using Xerces' OutputFormat Class to "Pretty Print" MathML
import java.math.*;
import java.io.*;
import org.w3c.dom.*;
import javax.xml.parsers.*;
import org.apache.xml.serialize.*;


public class ValidFibonacciMathML {

  public static String MATHML_NS
   = "http://www.w3.org/1998/Math/MathML";

  public static void main(String[] args) {

    try {

      DocumentBuilderFactory factory
       = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      DocumentBuilder builder = factory.newDocumentBuilder();
      DOMImplementation impl = builder.getDOMImplementation();

      Document doc = impl.createDocument(MATHML_NS, "math", null);

      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;

      Element root = doc.getDocumentElement();
      root.setAttribute("xmlns", MATHML_NS);

      for (int i = 1; i <= 10; i++) {
        Element mrow = doc.createElementNS(MATHML_NS, "mrow");

        Element mi = doc.createElementNS(MATHML_NS, "mi");
        Text function = doc.createTextNode("f(" + i + ")");
        mi.appendChild(function);

        Element mo = doc.createElementNS(MATHML_NS, "mo");
        Text equals = doc.createTextNode("=");
        mo.appendChild(equals);

        Element mn = doc.createElementNS(MATHML_NS, "mn");
        Text value = doc.createTextNode(low.toString());
        mn.appendChild(value);

        mrow.appendChild(mi);
        mrow.appendChild(mo);
        mrow.appendChild(mn);
        root.appendChild(mrow);

        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }

      OutputFormat format = new OutputFormat(doc);
      format.setLineWidth(65);
      format.setIndenting(true);
      format.setIndent(2);
      format.setEncoding("ISO-8859-1");
      format.setDoctype("-//W3C//DTD MathML 2.0//EN",
       "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd");
      format.setMediaType("application/xml");
      format.setOmitComments(true);
      format.setOmitXMLDeclaration(false);
      format.setVersion("1.0");
      format.setStandalone(true);

      XMLSerializer serializer
       = new XMLSerializer(System.out, format);
      serializer.serialize(doc);

    }
    catch (FactoryConfigurationError e) {
      System.out.println("Could not locate a JAXP factory class");
    }
    catch (ParserConfigurationException e) {
      System.out.println(
       "Could not locate a JAXP DocumentBuilder class"
      );
    }
    catch (DOMException e) {
      System.err.println(e);
    }
    catch (IOException e) {
      System.err.println(e);
    }

  }

}

Following is the beginning of the output that this program produces:

C:\XMLJAVA> java ValidFibonacciMathML
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE math PUBLIC "-//W3C//DTD MathML 2.0//EN"
                  "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd">
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <mi>f(1)</mi>
    <mo>=</mo>
    <mn>1</mn>
  </mrow>
  <mrow>
    <mi>f(2)</mi>
    <mo>=</mo>
    <mn>1</mn>
  </mrow>
...

You can imagine other requests for the serializer. For example, you might want a line break after each </mrow> end-tag but no line breaks inside mrow elements. Although OutputFormat doesn't give you enough control to arrange serialization to this level of detail, you could write a custom subclass of XMLSerializer to accomplish this.
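If Xerces' serialization package isn't on your class path, several of the OutputFormat properties described in this section have close counterparts among the standard output properties of a JAXP identity transform. The following is a rough sketch of that alternative, not a drop-in replacement for Example 13.2: it swaps XMLSerializer for javax.xml.transform.Transformer, the class name TraxOutputProperties and the method serialize() are my own, and the exact white space in the result depends on the transformer implementation.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class TraxOutputProperties {

  static final String MATHML_NS
   = "http://www.w3.org/1998/Math/MathML";

  // Builds a one-row MathML document and serializes it to a string
  static String serialize() {
    try {
      DocumentBuilderFactory factory
       = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder()
       .getDOMImplementation()
       .createDocument(MATHML_NS, "math", null);
      Element mn = doc.createElementNS(MATHML_NS, "mn");
      mn.appendChild(doc.createTextNode("1"));
      doc.getDocumentElement().appendChild(mn);

      // A Transformer created without a stylesheet copies
      // its input to its output unchanged
      Transformer transformer
       = TransformerFactory.newInstance().newTransformer();
      // Rough counterparts of setMethod(), setEncoding(),
      // setOmitXMLDeclaration(), setIndenting(), and setDoctype()
      transformer.setOutputProperty(OutputKeys.METHOD, "xml");
      transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
      transformer.setOutputProperty(
       OutputKeys.OMIT_XML_DECLARATION, "no");
      transformer.setOutputProperty(OutputKeys.INDENT, "yes");
      transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,
       "-//W3C//DTD MathML 2.0//EN");
      transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
       "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd");

      StringWriter out = new StringWriter();
      transformer.transform(new DOMSource(doc), new StreamResult(out));
      return out.toString();
    }
    catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(serialize());
  }
}
```

The output properties offer no equivalent of the line width or nonescaping element settings, so this approach gives you less control than OutputFormat, not more.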



Processing XML with Java. A Guide to SAX, DOM, JDOM, JAXP, and TrAX
ISBN: 0201771861
Year: 2001
Pages: 191
