Chapter 7. XML on the Web


XML began as an effort to bring the full power and structure of SGML to the Web in a form that was simple enough for nonexperts to use. Like most great inventions, XML turned out to have uses far beyond what its creators originally envisioned. Indeed, there's a lot more XML off the Web than on it. Nonetheless, XML is still a very attractive language in which to write and serve web pages. Since XML documents must be well-formed and parsers must reject malformed documents, XML pages are less likely to have annoying cross-browser incompatibilities. Since XML documents are highly structured, they're much easier for robots to parse. Since XML element and attribute names reflect the nature of the content they hold, search-engine robots can more easily determine the true meaning of a page.

XML on the Web comes in three flavors. The first is XHTML, an XMLized variant of HTML 4.0 that tightens up HTML to match XML's syntax. For instance, XHTML requires that every start-tag have a matching end-tag and that all attribute values be quoted. XHTML also adds a few bits of syntax to HTML, such as the XML declaration and empty-element tags that end with />. Most of XHTML can be displayed quite well in legacy browsers, with a few notable exceptions.
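
The rules just described can be seen in a small page. The following is only a minimal sketch, assuming the XHTML 1.0 Strict DTD; the title, text, and the image file logo.png are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>A Minimal XHTML Page</title>
  </head>
  <body>
    <!-- Every start-tag has a matching end-tag -->
    <p>First paragraph.</p>
    <!-- Attribute values are quoted; the empty img element ends with /> -->
    <!-- logo.png is a hypothetical file name -->
    <p><img src="logo.png" alt="Logo" /></p>
    <!-- Another empty-element tag -->
    <hr />
  </body>
</html>
```

The space before /> is not required by XML; it is a common convention that tends to help older HTML browsers cope with empty-element tags.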

The second flavor of XML on the Web is direct display in web browsers of XML documents that use arbitrary vocabularies. Generally, the formatting of the document is supplied either by a CSS stylesheet or by an XSLT stylesheet that transforms the document into HTML (perhaps XHTML). This flavor requires an XML-aware browser and is not supported by older web browsers such as Netscape 4.0.
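
As a rough illustration of this second flavor, a document in an invented vocabulary can point the browser at its stylesheet with an xml-stylesheet processing instruction. The recipe vocabulary and the file name recipe.css below are assumptions made up for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Associates a CSS stylesheet with the document; recipe.css is hypothetical -->
<?xml-stylesheet type="text/css" href="recipe.css"?>
<!-- The recipe vocabulary is invented for this example -->
<recipe>
  <title>Toast</title>
  <ingredient>Bread</ingredient>
  <step>Toast the bread until golden.</step>
</recipe>
```

To use an XSLT stylesheet instead, the href pseudo-attribute would point at the transform and the type pseudo-attribute would change accordingly; many browsers accept type="text/xsl" for this purpose.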

A third option is to mix raw XML vocabularies, such as MathML and SVG, with XHTML using Modular XHTML. Modular XHTML lets you embed RDF cataloging information, MathML equations, SVG pictures, and more inside your XHTML documents. Namespaces sort out which elements belong to which applications.
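
The role namespaces play here can be sketched with a small compound document. This example is illustrative only: it is well-formed, but it omits the document type declaration, so it makes no claim about validity against any particular Modular XHTML DTD. The content itself is made up.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The default namespace on html marks these elements as XHTML -->
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Mixed Vocabularies</title>
  </head>
  <body>
    <p>The area of a circle:</p>
    <!-- Redeclaring the default namespace switches this subtree to MathML -->
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mrow>
        <mi>A</mi>
        <mo>=</mo>
        <mi>&#x3C0;</mi>
        <msup><mi>r</mi><mn>2</mn></msup>
      </mrow>
    </math>
    <!-- Likewise, this subtree belongs to SVG -->
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
      <circle cx="50" cy="50" r="40" />
    </svg>
  </body>
</html>
```

Because each embedded vocabulary declares its own namespace URI on its root element, a namespace-aware processor can tell an XHTML element from a MathML or SVG one even if their local names happen to coincide.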

