7.2. How It Works

XML has four basic components: XML documents, Document Type Definitions (DTDs), style sheets, and parsers.
Take a closer look at each.

7.2.1. XML Documents

XML documents may be used for a wide variety of content. A document might be text based (such as a magazine article), or it might contain only numerical data to be transferred from one database or application to another. An XML document might also contain an abstract structure, such as a particular vector graphic shape (as in SVG) or a mathematical equation (as in MathML).
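As a rough illustration of the data-transfer case, the following Python sketch reads values out of a small data-oriented XML document. The element and attribute names (`inventory`, `item`, `sku`, `quantity`) and the helper `read_quantities` are invented for this example and do not come from any real XML language.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-oriented document; all names here are made up for illustration.
inventory_xml = """<inventory>
  <item sku="A100"><name>Widget</name><quantity>12</quantity></item>
  <item sku="B200"><name>Gadget</name><quantity>7</quantity></item>
</inventory>"""


def read_quantities(xml_text: str) -> dict:
    """Map each item's sku attribute to its quantity, converted to an integer."""
    root = ET.fromstring(xml_text)
    return {
        item.get("sku"): int(item.findtext("quantity"))
        for item in root.findall("item")
    }


quantities = read_quantities(inventory_xml)
print(quantities)  # {'A100': 12, 'B200': 7}
```

The receiving application only needs to know the element names to pull the numbers back out, which is what makes this kind of document suitable for moving data between systems.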
It is important to note that an XML document is not limited to one physical file. It may be made up of content from multiple files that are integrated via special markup, or it may exist only as records in a database that are assembled on the fly. The end result is always marked-up text content.

7.2.2. Document Type Definition (DTD)

Some XML languages also use a Document Type Definition (DTD) that defines each element allowed in the document along with its attributes and rules for use. An XML-compliant application may check the document against its DTD to "decode" the markup and make sure that it follows its own rules. A document that conforms to its DTD is said to be valid. DTDs are discussed in detail later in this chapter.

An updated method for defining XML elements and document structure is XML Schemas. A particular instance of an XML Schema is called an XML Schema Definition (XSD). The difference between the two approaches is that XSDs are XML-based, while DTDs (an older form of schema) are created according to the rules of SGML. XSDs are more powerful in describing XML languages, but the price is that they also tend to be more complicated and difficult to read and write. XML Schemas are outside the scope of this introductory chapter, but you can find information on the W3C site at www.w3c.org/XML/Schema.

7.2.3. Style Sheets and XML

A markup language describes only the structure of a document; it is not concerned with how it looks. Like HTML documents, XML documents can use Cascading Style Sheets (CSS) for presentation. In fact, the CSS Level 2 Recommendation has been broadened for use with all XML applications, not just web documents. CSS is covered in Part III of this book.

Another style sheet language called the Extensible Stylesheet Language (XSL) exists for XML documents. XSL creates a large overhead in processing, whereas CSS is fast and simple, making it generally preferable. XSL is useful when the contents of the XML document need to be "transformed" before final display.
Transforming generally refers to the process of converting one XML language to another, such as turning a particular XML language into XHTML on the fly, but it can also be used for transformations as simple as replacing words with other words. An Extensible Stylesheet Language Transformations (XSLT, a subset of XSL) style sheet works as a translator in the transformation process. XSL is not covered in this chapter; for more information, see the XSL information on the W3C site at www.w3.org/Style/XSL/.

7.2.4. Parsers

Software that interprets the information in XML documents is called an XML parser or processor. Parsers are generally built into other XML-compliant applications (such as web browsers or database servers), although standalone, command-line XML parsers do exist. It's the parser's job to pass elements and their contents to the application piece by piece for display or execution.

One of the things the parser does is make sure that the XML document is well-formed, that is, that it follows all of the rules of XML markup syntax correctly. If a document is not well-formed, parsers are instructed not to process it (although some are more forgiving than others). Well-formedness is discussed in the following section. Some parsers are also validating parsers, meaning they check the document for validity against a DTD.
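The well-formedness check described above can be sketched with the parser in Python's standard library. This is only an illustration under stated assumptions: the helper name `is_well_formed` and the sample markup are invented here, and the parser used is just one of many available.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # The parser refuses to process markup that breaks XML syntax rules.
        return False
    return True


# Hypothetical sample documents, made up for this example.
good = "<article><title>XML Basics</title></article>"
bad = "<article><title>XML Basics</article>"  # <title> is never closed

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

Note that this parser checks only well-formedness. It is not a validating parser, so it does not compare the document against a DTD.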