Entities

In the previous chapter, we got an introduction to the idea of entities in XML documents. There are two kinds of entities: general entities and parameter entities. General entities are probably used by more XML authors because you use them in the content of your XML document. However, parameter entities, which you use in a document's DTD, are also available and very powerful.

So what exactly is an entity? An entity is simply XML's way of referring to a data item; entities are usually text, but they can also be binary data. You declare an entity in a DTD and then refer to it by reference in your document. General entity references start with & and end with ;. Parameter entity references start with % and end with ;. For text entities, the entity reference is replaced by the entity itself when parsed by an XML processor.

In other words, you declare an entity in the DTD and refer to it with an entity reference, either in the document's content for general entities or in the DTD for parameter entities.

Entities can be internal or external. An internal entity is defined completely inside the XML document that references it (and, in fact, the document itself is considered an entity in XML). External entities, on the other hand, derive their content from an external source, such as a file, and their declaration usually includes a uniform resource identifier (URI) at which that content can be found. Entities can also be parsed or unparsed. The content of parsed entities is well-formed XML text; unparsed entities hold data that you don't want parsed, such as simple text or binary data. We'll see how to deal with all kinds of entities here.
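To make these categories concrete, here's a minimal sketch of how the different kinds of general entity might be declared side by side. The entity names (PUBLISHER, FOOTER, LOGO), the notation name GIF, and the files footer.xml and logo.gif are all hypothetical; they're assumptions made up for this illustration, not part of the chapter's listings:

```xml
<?xml version="1.0"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (#PCDATA)>
<!-- Internal parsed entity: the replacement text is given right here -->
<!ENTITY PUBLISHER "Example Press">
<!-- External parsed entity: the content lives at the given URI (hypothetical file) -->
<!ENTITY FOOTER SYSTEM "footer.xml">
<!-- An unparsed entity needs a notation that names its data format -->
<!NOTATION GIF SYSTEM "image/gif">
<!-- External unparsed entity: NDATA marks the data as not to be parsed -->
<!ENTITY LOGO SYSTEM "logo.gif" NDATA GIF>
]>
<DOCUMENT>Published by &PUBLISHER;.</DOCUMENT>
```

Only the internal entity is actually referenced in the content here, so this sketch doesn't depend on the hypothetical external files existing.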

In fact, we've already seen the five predefined general entity references in XML: &lt;, &gt;, &amp;, &quot;, and &apos;. They stand for the characters <, >, &, ", and ', respectively. Because these entities are predefined in XML, you don't need to define them in a DTD; for example, here's a document that uses all five predefined entity references:

Listing ch04_01.xml
<?xml version = "1.0" standalone="yes"?>
<TEXT>
    This text about the &quot;S&amp;O Railroad&quot;
    is the &lt;TEXT&gt; element&apos;s content.
</TEXT>

Each of these entity references is replaced by the appropriate character when parsed by an XML processor. For example, you can see this document open in Internet Explorer in Figure 4-1. As you see in that figure, every entity reference has indeed been replaced.

Figure 4-1. Using the predefined entities in Internet Explorer.


The five predefined entity references are very useful when you want to use as text the specific characters that are interpreted as markup.
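As a small illustration of where this matters in practice, here's a sketch with a comparison expression in element content and a quoted phrase inside an attribute value. The element and attribute names (NOTE, CONDITION, ITEM, LABEL) are made up for this example:

```xml
<?xml version="1.0" standalone="yes"?>
<NOTE>
    <!-- A raw less-than sign or ampersand in content would be read as markup -->
    <CONDITION>price &lt; 1.00 &amp; quantity &gt; 5</CONDITION>
    <!-- &quot; puts a double quote inside a double-quoted attribute value -->
    <ITEM LABEL="the &quot;large&quot; size">Oranges</ITEM>
</NOTE>
```

After parsing, the content of the CONDITION element reads price < 1.00 & quantity > 5, and the LABEL attribute holds the text the "large" size.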

You can also define your own entities by declaring them in a DTD. To declare an entity, you use an <!ENTITY> declaration (just as you use an <!ELEMENT> declaration to declare an element). Declaring a general entity looks like this:

 <!ENTITY  NAME   DEFINITION  > 

Here, NAME is the entity's name and DEFINITION is its definition. The name of the entity is just the name you want to use to refer to the entity. The entity's definition can take several different forms, as we'll see in this chapter.

The simplest possible entity definition is just the text that you want a reference to that entity to be replaced with. Here's an example showing how that looks. In this case, I'm defining a general entity named TODAY to hold a date, October 15, 2003, in this DTD:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ENTITY TODAY "October 15, 2003">
]>
    .
    .
    .

And that's all it takes. Now when I put a reference to this entity, &TODAY;, into the document, it'll be replaced with the text October 15, 2003 by the XML processor:

Listing ch04_02.xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ENTITY TODAY "October 15, 2003">
]>
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>.98</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Jones</LAST_NAME>
            <FIRST_NAME>Polly</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Bread</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Apples</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Weber</LAST_NAME>
            <FIRST_NAME>Bill</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

You can see the results of this document in Internet Explorer in Figure 4-2. As you see there, the &TODAY; entity references have been replaced with the full text we've specified.

Figure 4-2. Using user-defined entities in Internet Explorer.


Besides general entities like the one in this example, we'll see parameter entities in this chapter, which are designed to be used in DTDs themselves. Declaring a parameter entity looks like this (note the %):

 <!ENTITY %  NAME   DEFINITION  > 
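Here's a minimal sketch of a parameter entity at work, assuming a hypothetical entity named DATE_DECL that holds one complete element declaration. In the internal DTD subset, a parameter entity reference can appear only between declarations, not inside one, which is why the entity in this sketch stores a whole declaration:

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!-- Hypothetical parameter entity holding a complete element declaration -->
<!ENTITY % DATE_DECL "<!ELEMENT DATE (#PCDATA)>">
<!ELEMENT DOCUMENT (DATE)*>
<!-- The reference below is replaced by the declaration stored above -->
%DATE_DECL;
]>
<DOCUMENT>
    <DATE>October 15, 2003</DATE>
</DOCUMENT>
```

The effect is the same as writing the DATE element declaration directly in the DTD; the reference %DATE_DECL; is simply swapped for the stored text when the DTD is processed.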

Besides setting up your own entities in XML, you can customize your document's elements by declaring attributes for those elements.



Real World XML
Real World XML (2nd Edition)
ISBN: 0735712867
EAN: 2147483647
Year: 2005
Pages: 440
Authors: Steve Holzner
