The Entity Interface


The Entity interface represents a parsed or unparsed general entity declared in a document's DTD. (DOM does not expose parameter entities.) A map of the entities declared in a document is available from the getEntities() method of the DocumentType interface. However, entities are not part of the tree structure, and the parent of an entity is always null.

An Entity object represents the actual storage unit. It does not represent the entity reference such as &Omega; or &copyright; that appears in the instance document, but rather the replacement text to which that reference points. For parsed entities that the XML parser has resolved, the descendants of the Entity object form a read-only tree containing the XML markup for which the entity reference stands. For unparsed entities and external entities that the XML parser has not read, the Entity object has no children.
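As a rough illustration of reading an entity's children, the following sketch gathers the character data found beneath a node. The class name EntityText, the helper replacementText(), and the sample document are all hypothetical names chosen for this illustration. Whether a given Entity object actually has children depends on the parser, so the sketch checks hasChildNodes() first and makes no promise about what a particular parser will report.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class EntityText {

  // Concatenate the text and CDATA found among the descendants of a node.
  // For an Entity node this approximates the replacement text; any markup
  // inside the entity is reduced to the character data it contains.
  static String replacementText(Node node) {
    StringBuilder result = new StringBuilder();
    collect(node, result);
    return result.toString();
  }

  private static void collect(Node node, StringBuilder result) {
    for (Node child = node.getFirstChild(); child != null;
         child = child.getNextSibling()) {
      int type = child.getNodeType();
      if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
        result.append(child.getNodeValue());
      }
      else {
        collect(child, result);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // Hypothetical sample document with one internal parsed entity
    String xml =
        "<!DOCTYPE doc [<!ENTITY copyright \"Copyright 2002 Example Corp\">]>"
      + "<doc>&copyright;</doc>";
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
    Node entity = doc.getDoctype().getEntities().getNamedItem("copyright");
    if (entity.hasChildNodes()) {
      System.out.println(replacementText(entity));
    }
    else {
      // Some parsers leave the Entity node without children
      System.out.println("This parser did not populate the entity's children");
    }
  }

}
```

Because the helper accepts any Node, it can also be applied to an EntityReference node in the document tree, though again the children found there vary from one parser to the next.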

Example 11.22 summarizes the Entity interface, which includes methods to get the public ID, system ID, and notation name for the entity. These methods all return null if the property is not applicable to this entity. To get the replacement text of an entity, use the methods Entity inherits from its Node superinterface, such as hasChildNodes() and getFirstChild().

Example 11.22 The Entity Interface
package org.w3c.dom;

public interface Entity extends Node {

  public String getPublicId();
  public String getSystemId();
  public String getNotationName();

}
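To make the three getter methods concrete, here is a minimal sketch that parses a small in-memory document and reports each property for every entity in the map returned by getEntities(). The class name EntityProperties, the SAMPLE document, and the example.com URL are assumptions made up for this illustration. An internal entity is expected to report null for all three properties, and the unparsed entity is expected to report a system ID and a notation name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Entity;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

class EntityProperties {

  // Hypothetical document declaring one internal parsed entity
  // and one unparsed entity
  static final String SAMPLE =
      "<!DOCTYPE doc [\n"
    + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
    + "  <!ENTITY copyright \"Copyright 2002\">\n"
    + "  <!ENTITY logo SYSTEM \"http://www.example.com/logo.gif\" NDATA gif>\n"
    + "]>\n"
    + "<doc/>";

  // Parse a document from a string and return its map of declared entities
  static NamedNodeMap entitiesOf(String xml) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
    return doc.getDoctype().getEntities();
  }

  // Properties that do not apply to an entity are reported as null
  static String describe(Entity entity) {
    return entity.getNodeName()
        + ": publicId=" + entity.getPublicId()
        + ", systemId=" + entity.getSystemId()
        + ", notationName=" + entity.getNotationName();
  }

  public static void main(String[] args) throws Exception {
    NamedNodeMap entities = entitiesOf(SAMPLE);
    for (int i = 0; i < entities.getLength(); i++) {
      System.out.println(describe((Entity) entities.item(i)));
    }
  }

}
```

A non-null notation name is a convenient test for an unparsed entity, because only unparsed entities are declared with NDATA.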

Example 11.23 is a program that walks the document looking for entity references. Every time it sees one, it prints that reference's name, public ID, and system ID. To do this, it looks up the entity reference's name in the entities map returned by the getEntities() method of the DocumentType interface. A java.util.Set keeps track of which entities have been printed so that no entity is printed more than once.

Example 11.23 Listing Parsed Entities Used in the Document
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
import java.io.IOException;
import java.util.*;

public class EntityLister {

  // Store the names of the entities that have already been printed
  private Set<String>  printed = new HashSet<String>();
  private NamedNodeMap entities;

  // Look up the entities map, then search the tree for references
  public void printEntities(Document doc) {
    DocumentType doctype = doc.getDoctype();
    // A document without a DOCTYPE declares no entities
    if (doctype == null) return;
    entities = doctype.getEntities();
    seekEntities(doc);
  }

  // Recursively descend the tree
  private void seekEntities(Node node) {
    int type = node.getNodeType();
    if (type == Node.ENTITY_REFERENCE_NODE) {
      EntityReference ref = (EntityReference) node;
      printEntityReference(ref);
    }
    if (node.hasChildNodes()) {
      NodeList children = node.getChildNodes();
      for (int i = 0; i < children.getLength(); i++) {
        seekEntities(children.item(i));
      }
    }
  }

  private void printEntityReference(EntityReference ref) {
    String name = ref.getNodeName();
    if (!printed.contains(name)) {
      Entity entity = (Entity) entities.getNamedItem(name);
      System.out.print(name + ": ");
      if (entity == null) {
        // The map has no declaration for this reference
        System.out.println("declaration not available");
        printed.add(name);
        return;
      }
      String publicID = entity.getPublicId();
      String systemID = entity.getSystemId();
      if (publicID != null) System.out.print(publicID + " ");
      if (systemID != null) System.out.print(systemID + " ");
      else { // Internal entities do not have system IDs
        System.out.print("internal entity");
      }
      System.out.println();
      printed.add(name);
    }
  }

  public static void main(String[] args) {

    if (args.length <= 0) {
      System.out.println("Usage: java EntityLister URL");
      return;
    }
    String url = args[0];

    try {
      DocumentBuilderFactory factory
       = DocumentBuilderFactory.newInstance();
      // By default JAXP does not include entity reference nodes
      // in the tree. You have to explicitly request them by
      // telling DocumentBuilderFactory not to expand entity
      // references.
      factory.setExpandEntityReferences(false);
      DocumentBuilder parser = factory.newDocumentBuilder();

      // Read the document
      Document document = parser.parse(url);

      // Print the entities
      EntityLister lister = new EntityLister();
      lister.printEntities(document);
    }
    catch (SAXException e) {
      System.out.println(url + " is not well-formed.");
    }
    catch (IOException e) {
      System.out.println(
       "Due to an IOException, the parser could not read " + url
      );
    }
    catch (FactoryConfigurationError e) {
      System.out.println("Could not locate a factory class");
    }
    catch (ParserConfigurationException e) {
      System.out.println("Could not locate a JAXP parser");
    }

  } // end main

}

Mostly this is fairly straightforward tree-walking code of the sort you've seen several times before. Note, however, that by default JAXP DocumentBuilder objects do not put any entity reference nodes in the trees they build. To get these, the expand-entity-references property must be explicitly set to false on the DocumentBuilderFactory that creates the DocumentBuilder.
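The effect of this property can be seen by parsing the same document twice and counting the entity reference nodes each time. The following is only a sketch. The class name EntityRefCounter, its helper methods, and the SAMPLE document are hypothetical, and the counts assume a parser that honors setExpandEntityReferences() as described above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class EntityRefCounter {

  // Hypothetical document that uses one internal entity twice
  static final String SAMPLE =
      "<!DOCTYPE doc [<!ENTITY mdash \"&#x2014;\">]>"
    + "<doc>one&mdash;two&mdash;three</doc>";

  // Parse a string with entity reference expansion turned on or off
  static Document parse(String xml, boolean expand) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setExpandEntityReferences(expand);
    return factory.newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
  }

  // Recursively count the entity reference nodes at or below a node
  static int countEntityRefs(Node node) {
    int count = (node.getNodeType() == Node.ENTITY_REFERENCE_NODE) ? 1 : 0;
    for (Node child = node.getFirstChild(); child != null;
         child = child.getNextSibling()) {
      count += countEntityRefs(child);
    }
    return count;
  }

  public static void main(String[] args) throws Exception {
    int expanded  = countEntityRefs(parse(SAMPLE, true).getDocumentElement());
    int preserved = countEntityRefs(parse(SAMPLE, false).getDocumentElement());
    System.out.println("Expanding references:  " + expanded);
    System.out.println("Preserving references: " + preserved);
  }

}
```

With expansion left on, the parser replaces each reference with its replacement text, so a tree walker like EntityLister would find nothing to report.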

Here is the output when I ran this across the DocBook source for this chapter. All of the entity references used here are internally defined references to single hard-to-type characters such as curly quotes and the em dash.

D:\books\XMLJAVA> java EntityLister file://D/books/XMLJava/ch11.xml
rsquo: internal entity
mdash: internal entity
hellip: internal entity


Processing XML with Java. A Guide to SAX, DOM, JDOM, JAXP, and TrAX
ISBN: 0201771861
Year: 2001
Pages: 191
