CDATA Sections

As you know, XML processors are very sensitive to characters such as < and &. So what if you had a large section of text that contained a great many < and & characters that you didn't want interpreted as markup? You can escape those characters as &lt; and &amp;, of course, but with many such characters, that's awkward and hard to read. Instead, you can use a CDATA section.

CDATA sections hold character data that the XML processor is supposed to leave unparsed. Without them, all the text in an XML document is parsed and searched for markup characters such as < and &. You use a CDATA section simply to tell the XML processor to treat the enclosed text as literal character data and pass it on unchanged to the underlying application.
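Here's a minimal sketch of that idea, assuming Python's standard xml.etree.ElementTree parser as the XML processor (the <note> element and its contents are made up for illustration). It parses the same text twice, once escaped with &lt; and &amp; and once wrapped in a CDATA section, and checks that the application receives identical character data either way:

```python
import xml.etree.ElementTree as ET

# The same text, written two ways (hypothetical <note> element).
escaped = "<note>if (a &lt; b &amp;&amp; b &lt; c) { swap(); }</note>"
cdata = "<note><![CDATA[if (a < b && b < c) { swap(); }]]></note>"

# The parser resolves the entity references in the first form and
# passes the CDATA content through untouched in the second.
text_from_escaped = ET.fromstring(escaped).text
text_from_cdata = ET.fromstring(cdata).text

print(text_from_cdata)                       # if (a < b && b < c) { swap(); }
print(text_from_escaped == text_from_cdata)  # True
```

The CDATA form is the easier one to read and edit by hand, which is the whole point of using it.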

You start a CDATA section with the markup <![CDATA[ and end it with ]]>. Note that this means CDATA sections are in fact scanned, but only for the ending text ]]>. Among other things, this means that you cannot include the text ]]> inside a CDATA section, and it also means that you cannot nest CDATA sections.
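If the text you need to embed does contain ]]>, a common workaround is to end the CDATA section in the middle of that sequence and immediately start a new one. Here's a sketch of that approach in Python; wrap_in_cdata is a hypothetical helper written for this illustration, not part of any library, and the sample text is made up:

```python
import xml.etree.ElementTree as ET

def wrap_in_cdata(text: str) -> str:
    """Hypothetical helper: wrap text in CDATA, splitting any ']]>' in two."""
    # ']]>' becomes ']]' + end-of-section + start-of-section + '>', so the
    # terminator never appears intact inside a single CDATA section.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

raw = "if (a[b[0]]> 1) { x = 1 < 2; }"   # contains the forbidden ']]>'

# Naive wrapping: the section ends early and the rest is parsed as markup.
try:
    ET.fromstring("<code><![CDATA[" + raw + "]]></code>")
    naive_failed = False
except ET.ParseError:
    naive_failed = True

# Split wrapping: two adjacent CDATA sections that the parser joins back up.
recovered = ET.fromstring("<code>" + wrap_in_cdata(raw) + "</code>").text

print(naive_failed)      # True
print(recovered == raw)  # True
```

The result is two adjacent CDATA sections rather than one nested inside the other, so the no-nesting rule is still respected.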

Here's an example. In this case, I've added an element named <MARKUP> to a document, and this element itself contains markup that I want to preserve as character data (so that it can be printed out, for example). To make sure that the markup inside this element is preserved as text, I enclose it in a CDATA section like this:

 <?xml version = "1.0" standalone="yes"?>
 <DOCUMENT>
     <MARKUP>
         <![CDATA[
             <CUSTOMER>
                 <NAME>
                     <LAST_NAME>Smith</LAST_NAME>
                     <FIRST_NAME>Sam</FIRST_NAME>
                 </NAME>
                 <DATE>October 15, 2003</DATE>
                 <ORDERS>
                     <ITEM>
                         <PRODUCT>Tomatoes</PRODUCT>
                         <NUMBER>8</NUMBER>
                         <PRICE>.25</PRICE>
                     </ITEM>
                     <ITEM>
                         <PRODUCT>Oranges</PRODUCT>
                         <NUMBER>24</NUMBER>
                         <PRICE>.98</PRICE>
                     </ITEM>
                 </ORDERS>
             </CUSTOMER>
         ]]>
     </MARKUP>
 </DOCUMENT>

As you can see, CDATA sections are powerful because they enable you to embed character data directly in XML documents without having it parsed (normally, character data in XML documents is parsed by the XML processor and becomes parsed character data).
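To check that the embedded markup really is treated as text, here's a short sketch assuming Python's standard xml.dom.minidom parser and a cut-down version of the document above (the shortened <CUSTOMER> content is my own abbreviation). It looks at what the <MARKUP> element actually contains after parsing:

```python
from xml.dom import minidom

# A shortened version of the <MARKUP> example.
doc = minidom.parseString(
    "<DOCUMENT><MARKUP>"
    "<![CDATA[<CUSTOMER><NAME>Sam</NAME></CUSTOMER>]]>"
    "</MARKUP></DOCUMENT>"
)

markup = doc.getElementsByTagName("MARKUP")[0]
node = markup.firstChild

# The only child of <MARKUP> is a CDATA section node, not an element.
is_cdata = node.nodeType == node.CDATA_SECTION_NODE
# No <CUSTOMER> element exists in the parsed tree; it is just characters.
customer_elements = doc.getElementsByTagName("CUSTOMER")

print(is_cdata)                # True
print(len(customer_elements))  # 0
print(node.data)               # <CUSTOMER><NAME>Sam</NAME></CUSTOMER>
```

As far as the parsed tree is concerned, <CUSTOMER> never existed as an element; the application simply receives the characters and can print or store them as it sees fit.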

Here's another example. In this case, I'm adapting the JavaScript example in the previous section to show how the W3C wants to handle script code in XHTML pages: by placing that code in a CDATA section.

 <?xml version="1.0"?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
     <head>
         <title>
             Using The if Statement In JavaScript
         </title>
     </head>
     <body>
         <script language="javascript">
             <![CDATA[
                 var budget
                 budget = 234.77
                 if (budget < 0) {
                     document.writeln("Uh oh.")
                 }
             ]]>
         </script>
         <center>
             <h1>
                 Using The if Statement In JavaScript
             </h1>
         </center>
     </body>
 </html>

Unfortunately, as mentioned in the previous topic, the idea of a CDATA section, especially one that starts with the expression <![CDATA[ and ends with the expression ]]>, confuses the major browsers. When those browsers are configured to handle XHTML, the situation will improve.



Real World XML (2nd Edition)
ISBN: 0735712867
Year: 2005
Pages: 440
Authors: Steve Holzner
