The EntityRef class shown in Example 15.21 represents a defined entity reference such as &copy; or &chapter1;. It is used only for entity references that the parser does not expand. Given a fully validating parser, or even just one that reads the external DTD subset, no EntityRef objects will normally be present in the tree.

Example 15.21 The JDOM EntityRef Class

    package org.jdom;

    public class EntityRef implements Serializable, Cloneable {

      protected String name;
      protected String publicID;
      protected String systemID;
      protected Object parent;

      protected EntityRef();
      public EntityRef(String name);
      public EntityRef(String name, String systemID);
      public EntityRef(String name, String publicID, String systemID);

      public EntityRef detach();
      public Document getDocument();
      public String getName();
      public EntityRef setName(String name);
      public Element getParent();
      public String getPublicID();
      public EntityRef setPublicID(String newPublicID);
      public String getSystemID();
      public EntityRef setSystemID(String newSystemID);

      public final boolean equals(Object ob);
      public final int hashCode();
      public String toString();
      public Object clone();

    }

Each EntityRef object has these four properties: name, public ID, system ID, and parent.
The public and system IDs will be null if the parser did not read the part of the DTD that defined the entity. The one thing you might expect that is not available is the entity's replacement text. Unlike the EntityReference interface in DOM, JDOM EntityRef objects do not have any children. If the builder knows the replacement text of the entity, then it will insert the corresponding nodes in the tree rather than including an EntityRef object.

You will rarely need to use this class directly. You can add an entity reference in place of the characters that you know are going to cause problems in your choice of encoding. On the other hand, you're probably better off just letting the XMLOutputter emit numeric character references instead. If you do choose to insert EntityRef objects into your JDOM tree, then be sure to use a DocType that either points to an external DTD subset or includes an internal DTD subset that defines your entities. JDOM will not do this for you automatically, so if you aren't careful you can produce a malformed document.

For an example, let's turn once again to XHTML. Browsers generally use nonvalidating parsers and tend not to read the external DTD subset by default. Thus they're likely to encounter skipped entity references. The XHTML specification states:
Here is a simple method that assists with this requirement by converting all EntityRef objects in a tree to Text objects of the form &name;.

    // Requires java.util.List, java.util.ListIterator, and org.jdom.*
    public static void entityRefToText(Element element) {

      List content = element.getContent();
      ListIterator iterator = content.listIterator();
      while (iterator.hasNext()) {
        Object o = iterator.next();
        if (o instanceof Element) {
          // Recurse into child elements
          Element child = (Element) o;
          entityRefToText(child);
        }
        else if (o instanceof EntityRef) {
          // Swap the unexpanded reference for the literal text &name;
          EntityRef ref = (EntityRef) o;
          Text fauxRef = new Text("&");
          fauxRef.append(ref.getName());
          fauxRef.append(";");
          iterator.set(fauxRef);
        }
      }

    }

There's one technique here you may not have seen before. Instead of a basic Iterator, I used a ListIterator. The reason is that ListIterator has an optional set() method (which JDOM does implement) that replaces the last object returned by next() with another object. That's how I replace the EntityRef with a Text.

Caution: Be sure you understand the difference here. A Text object always contains plain text, never an entity reference or a tag, even if it contains some characters such as & and < that might need to be escaped when the document is serialized. For example, invoking element.setText("&lt;") sets the content of element to the four characters &, l, t, and ; in that order. It does not set it to the single character <. When element is serialized, its content will be written as &amp;lt;.
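If the ListIterator.set() technique used in entityRefToText() is unfamiliar, it may help to see it in isolation. The following is a minimal sketch that uses only java.util, not JDOM: the Ref class is a hypothetical stand-in for EntityRef, the plain String it is replaced with stands in for a Text object, and the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorSetDemo {

    // Hypothetical stand-in for org.jdom.EntityRef; it only carries a name.
    static final class Ref {
        final String name;
        Ref(String name) { this.name = name; }
    }

    // Replaces every Ref in the list, in place, with the string "&name;".
    static void replaceRefs(List<Object> content) {
        ListIterator<Object> iterator = content.listIterator();
        while (iterator.hasNext()) {
            Object o = iterator.next();
            if (o instanceof Ref) {
                // set() overwrites the element most recently returned by next()
                // without changing the size of the list.
                iterator.set("&" + ((Ref) o).name + ";");
            }
        }
    }

    public static void main(String[] args) {
        // Assumed sample content: text, an unexpanded reference, more text.
        List<Object> content = new ArrayList<>(
            Arrays.asList("Copyright ", new Ref("copy"), " 2002"));
        replaceRefs(content);
        System.out.println(content);
    }
}
```

Because set() replaces an element rather than adding or removing one, the iteration can continue safely after the swap. A plain Iterator offers only remove(), so it could delete the reference but not put anything in its place.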