The DocType Class


The org.jdom.DocType class summarized in Example 15.19 represents a document type declaration. Note that this points to and/or contains the document type definition (DTD), but it is not the same thing. JDOM does not have any representation of the DTD.

Example 15.19 The JDOM DocType Class
 package org.jdom;

 public class DocType implements Serializable, Cloneable {

   protected String   elementName;
   protected String   publicID;
   protected String   systemID;
   protected Document document;
   protected String   internalSubset;

   protected DocType();
   public DocType(String elementName, String publicID,
    String systemID);
   public DocType(String elementName, String systemID);
   public DocType(String elementName);

   public String   getElementName();
   public DocType  setElementName(String elementName);
   public String   getPublicID();
   public DocType  setPublicID(String publicID);
   public String   getSystemID();
   public DocType  setSystemID(String systemID);
   public Document getDocument();
   public void     setInternalSubset(String newData);
   public String   getInternalSubset();

   public       String  toString();
   public final boolean equals(Object o);
   public final int     hashCode();
   public       Object  clone();

 }

Each DocType object has four String properties, of which the last three may be null:

  • Root element name

  • Internal DTD subset

  • System ID

  • Public ID

For example, consider this document type declaration:

 <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                          "docbook/docbookx.dtd">

It has the root element name chapter, the public ID -//OASIS//DTD DocBook XML V4.1.2//EN, and the system ID docbook/docbookx.dtd. It has no internal DTD subset, so that property is null. This code fragment constructs a DocType object representing this document type declaration and uses it to construct a new Document object:

 DocType doctype = new DocType("chapter",
   "-//OASIS//DTD DocBook XML V4.1.2//EN", "docbook/docbookx.dtd");
 Element chapter = new Element("chapter");
 Document doc = new Document(chapter, doctype);

Note that JDOM does not require validity, only well-formedness. This means that the root element may in fact be different from what the document type declaration specifies. For example, the following is perfectly legal:

 DocType doctype = new DocType("chapter",
   "-//OASIS//DTD DocBook XML V4.1.2//EN", "docbook/docbookx.dtd");
 Element book = new Element("book");
 Document doc = new Document(book, doctype);

This document type declaration has a root element name and an internal DTD subset, but no public ID or system ID:

 <!DOCTYPE Fibonacci_Numbers [
   <!ELEMENT Fibonacci_Numbers (fibonacci*)>
   <!ELEMENT fibonacci (#PCDATA)>
   <!ATTLIST fibonacci index CDATA #IMPLIED>
 ]>

To set this up, you need to store the internal subset in a String and pass that to the setInternalSubset() method after the DocType object has been constructed, like so:

 DocType doctype = new DocType("Fibonacci_Numbers");
 String dtd = "<!ELEMENT Fibonacci_Numbers (fibonacci*)>\n";
 dtd += "<!ELEMENT fibonacci (#PCDATA)>\n";
 dtd += "<!ATTLIST fibonacci index CDATA #IMPLIED>\n";
 doctype.setInternalSubset(dtd);
 Element root = new Element("Fibonacci_Numbers");
 Document doc = new Document(root, doctype);
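To make the relationship between the four properties and the declaration syntax concrete, here is a minimal sketch that renders a document type declaration from them. The class DoctypeSketch and its render() method are hypothetical names invented for this illustration; they are not part of JDOM and do not claim to match how JDOM serializes a DocType. The sketch also assumes that the IDs contain no double quote characters.

```java
// Hypothetical helper for illustration only; not part of JDOM.
final class DoctypeSketch {

    // Renders a document type declaration from the four properties.
    // Every argument except elementName may be null.
    // Assumption: publicID and systemID contain no double quotes.
    static String render(String elementName, String publicID,
                         String systemID, String internalSubset) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE ").append(elementName);
        if (publicID != null) {
            // XML requires a system literal after a public ID; if the
            // caller omits it, the result is not well-formed.
            sb.append(" PUBLIC \"").append(publicID).append('"');
            if (systemID != null) {
                sb.append(" \"").append(systemID).append('"');
            }
        } else if (systemID != null) {
            sb.append(" SYSTEM \"").append(systemID).append('"');
        }
        if (internalSubset != null) {
            sb.append(" [\n").append(internalSubset).append(']');
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        // External subset only, as in the DocBook example
        System.out.println(render("chapter",
            "-//OASIS//DTD DocBook XML V4.1.2//EN",
            "docbook/docbookx.dtd", null));
        // Internal subset only, as in the Fibonacci example
        System.out.println(render("Fibonacci_Numbers", null, null,
            "<!ELEMENT Fibonacci_Numbers (fibonacci*)>\n"));
    }
}
```

The branching mirrors the two forms of external ID that XML allows: PUBLIC followed by both identifiers, or SYSTEM followed by the system ID alone.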

Unlike with most node classes, JDOM doesn't fully check the data used in a DocType object for well-formedness. It does test that the root element name is a legal XML name, and it checks that the public and system IDs adhere to the minimum constraints for these items. However, it does not check that the public ID follows the standard conventions for public identifiers; it does not check that the system ID is a legal URL; and it does not even check the characters in the internal DTD subset, much less the syntax.
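For a sense of how little such a minimum constraint demands, consider public IDs. XML 1.0 restricts them to the characters of its PubidChar production: space, carriage return, line feed, ASCII letters and digits, and a fixed set of punctuation marks. The following sketch implements that production directly. It assumes this is the kind of check meant here; the class and method names are hypothetical, and this is not JDOM's own code.

```java
// Hypothetical sketch of the XML 1.0 PubidChar rule; not JDOM's code.
final class PublicIdSketch {

    // Punctuation permitted by PubidChar, in addition to space,
    // carriage return, line feed, ASCII letters, and ASCII digits.
    private static final String PUNCTUATION = "-'()+,./:=?;!*#@$_%";

    static boolean isPubidChar(char c) {
        return c == ' ' || c == '\r' || c == '\n'
            || (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || PUNCTUATION.indexOf(c) >= 0;
    }

    // A null public ID means "no public ID" and is accepted.
    static boolean isLegalPublicId(String publicID) {
        if (publicID == null) return true;
        for (int i = 0; i < publicID.length(); i++) {
            if (!isPubidChar(publicID.charAt(i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // A conventional formal public identifier passes
        System.out.println(
            isLegalPublicId("-//OASIS//DTD DocBook XML V4.1.2//EN"));
        // So does a string that ignores the conventions entirely
        System.out.println(isLegalPublicId("just some words"));
        // Characters outside the production are rejected
        System.out.println(isLegalPublicId("bad<id>"));
    }
}
```

A character-level test like this accepts any string drawn from the permitted characters, which is why a public ID can pass it without following the usual conventions for public identifiers.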

As an example of this class, let's look at a program that validates XHTML 1.0 documents. XHTML validity is a little stricter than HTML validity. In particular, according to the XHTML 1.0 specification, a valid XHTML document must satisfy these four conditions:

  • The document must be valid according to one of the three XHTML DTDs: strict, transitional, or frameset.

  • The root element of the document must be html.

  • This root html element of the document must specify the default namespace as http://www.w3.org/1999/xhtml using an xmlns attribute.

  • The document must contain a DOCTYPE declaration. The public identifier for the external DTD subset must reference one of the three XHTML DTDs, using one of these three public identifiers:

    • -//W3C//DTD XHTML 1.0 Strict//EN

    • -//W3C//DTD XHTML 1.0 Transitional//EN

    • -//W3C//DTD XHTML 1.0 Frameset//EN

There are a few other flaky rules scattered throughout the XHTML specification, mostly involving constraints that can't be reasonably specified in a DTD, such as that an a element cannot contain another a element, but these are the major ones that define strict XHTML conformance.
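The public identifier condition is easy to isolate from the rest. As a rough sketch, the three identifiers can be kept in a set and tested with a single lookup. The class name XhtmlPublicIds and the method isXhtml10() are made up for this illustration, and Set.of() assumes Java 9 or later.

```java
import java.util.Set;

// Hypothetical helper for illustration; the names are not from JDOM.
final class XhtmlPublicIds {

    // The three public identifiers listed in the XHTML 1.0 conditions
    static final Set<String> XHTML_1_0 = Set.of(
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "-//W3C//DTD XHTML 1.0 Transitional//EN",
        "-//W3C//DTD XHTML 1.0 Frameset//EN");

    // Returns true only for an exact match. The null guard is needed
    // because a document type declaration may have no public ID, and
    // sets created by Set.of() reject null lookups.
    static boolean isXhtml10(String publicID) {
        return publicID != null && XHTML_1_0.contains(publicID);
    }

    public static void main(String[] args) {
        System.out.println(isXhtml10("-//W3C//DTD XHTML 1.0 Strict//EN"));
        System.out.println(isXhtml10("-//W3C//DTD HTML 4.01//EN"));
        System.out.println(isXhtml10(null));
    }
}
```

The comparison is deliberately exact and case sensitive, so an identifier that differs only in case or spacing is treated as not matching.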

Example 15.20 is similar to the earlier JDOMValidator. That is, it reads a URL from the command line and validates the document found at that URL against its DTD. However, it also checks the four constraints just listed. Of particular interest in this discussion is that it checks that the document type declaration is pointing to one of the three legal DTDs. This is something that pure XML validation normally doesn't tell you.

Example 15.20 Validating XHTML with the DocType Class
 import java.io.IOException;
 import org.jdom.*;
 import org.jdom.input.SAXBuilder;

 public class XHTMLValidator {

   public static void main(String[] args) {
     for (int i = 0; i < args.length; i++) {
       validate(args[i]);
     }
   }

   private static SAXBuilder builder = new SAXBuilder(true);
                                /* turn on validation ^^^^ */

   // not thread safe
   public static void validate(String source) {

       Document document;
       try {
         document = builder.build(source);
       }
       catch (JDOMException e) {
         System.out.println(source
          + " is invalid XML, and thus not XHTML.");
         return;
       }
       catch (IOException e) {
         System.out.println("Could not read: " + source);
         return;
       }

       // If we get this far, then the document is valid XML.
       // Check to see whether the document is actually XHTML
       boolean valid = true;
       DocType doctype = document.getDocType();

       if (doctype == null) {
         System.out.println("No DOCTYPE");
         valid = false;
       }
       else {
         // verify the DOCTYPE
         String name     = doctype.getElementName();
         String systemID = doctype.getSystemID();
         String publicID = doctype.getPublicID();

         if (!name.equals("html")) {
           System.out.println(
            "Incorrect root element name " + name);
           valid = false;
         }

         // A missing public ID fails; otherwise it must match
         // one of the three XHTML 1.0 public identifiers.
         if (publicID == null
          || (!publicID.equals("-//W3C//DTD XHTML 1.0 Strict//EN")
            && !publicID.equals(
             "-//W3C//DTD XHTML 1.0 Transitional//EN")
            && !publicID.equals(
             "-//W3C//DTD XHTML 1.0 Frameset//EN"))) {
           valid = false;
           System.out.println(source
            + " does not seem to use an XHTML 1.0 DTD");
         }
       }

       // Check the namespace on the root element
       Element root = document.getRootElement();
       Namespace namespace = root.getNamespace();
       String prefix = namespace.getPrefix();
       String uri = namespace.getURI();
       if (!uri.equals("http://www.w3.org/1999/xhtml")) {
         valid = false;
         System.out.println(source
          + " does not properly declare the"
          + " http://www.w3.org/1999/xhtml namespace"
          + " on the root element");
       }
       if (!prefix.equals("")) {
         valid = false;
         System.out.println(source
          + " does not use the empty prefix for XHTML");
       }

       if (valid) System.out.println(source + " is valid XHTML.");

   }

 }

Following is the result of running this program on the XHTML 1.0 specification:

 D:\books\XMLJAVA> java XHTMLValidator http://www.w3.org/TR/xhtml1/
 http://www.w3.org/TR/xhtml1/ is valid XHTML.

As one would hope, it proves valid.



Processing XML with Java. A Guide to SAX, DOM, JDOM, JAXP, and TrAX
ISBN: 0201771861
Year: 2001
Pages: 191
