Receiving Skipped Entities

Validating parsers resolve all general entity references that occur in both element content and attribute values. However, nonvalidating parsers are allowed not to read the external DTD subset. Consider the simple XHTML document in Example 6.14.

Example 6.14 An XML Document Containing a Potentially Skipped Entity Reference

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <body>
        <h1>My resum&eacute;</h1>
      </body>
    </html>

If a parser does not read the DTD, then it has no way of knowing what the entity reference &eacute; stands for, or indeed whether that entity reference is even properly defined. However, such a nonvalidating parser will assume that the entity reference is defined in the external DTD subset it didn't read. But rather than reporting the replacement text for that entity, it reports a skipped entity using the skippedEntity() callback method:

    public void skippedEntity(String name) throws SAXException

For example, according to the XHTML 1.0 specification, if a User Agent such as a browser
In other words, rather than rendering &prescription_take; as the symbol ℞, the browser is supposed to draw it as simply &prescription_take;. If you were writing an XHTML browser that did not validate but did require full conformance to XHTML 1.0, you would probably implement the skippedEntity() method by passing an ampersand, the name of the entity reference, and a semicolon to the characters() method in the same content handler, like this:

    public void skippedEntity(String name) throws SAXException {
      // Rebuild the literal reference, for example &eacute;
      char[] text = ("&" + name + ";").toCharArray();
      // Report it as ordinary character data
      this.characters(text, 0, text.length);
    }

Skipped entities can also appear in attribute values. For example:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <body>
        <div purpose="resum&eacute;"> ... </div>
      </body>
    </html>

This is one of the few holes in SAX. The parser will not report such an entity to you; skippedEntity() is not called for references inside attribute values. Instead, the parser calculates the attribute value by simply deleting the entity reference. In this example, the value of the purpose attribute would be reported as "resum" if the parser does not read the DTD.
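Returning to skipped entities in element content, the following is a minimal, self-contained sketch of a content handler that applies the echo-the-reference approach described above and keeps the resulting text. The class name SkippedEntityEcho and its getText() method are hypothetical, introduced only for illustration. The sketch also assumes you want to ignore the two special names SAX2 can pass to skippedEntity(): a name beginning with % for a skipped parameter entity, and the string "[dtd]" for a skipped external DTD subset.

```java
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not from the text): collects element content and
// echoes skipped general entity references back as literal text.
final class SkippedEntityEcho extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void skippedEntity(String name) {
        // Assumption: parameter entities ("%name") and the external DTD
        // subset ("[dtd]") are not part of the document text, so skip them.
        if (name.startsWith("%") || name.equals("[dtd]")) {
            return;
        }
        // Rebuild the literal reference and report it as character data.
        char[] reference = ("&" + name + ";").toCharArray();
        characters(reference, 0, reference.length);
    }

    String getText() {
        return text.toString();
    }
}
```

If a parser that skipped the DTD reported the heading in Example 6.14 as characters "My resum" followed by skippedEntity("eacute"), this handler would accumulate the text My resum&eacute;. Whether a particular parser actually calls skippedEntity() depends on its configuration, so treat this as an illustration of the callback rather than a guarantee of parser behavior.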