What Is XML?


XML (and its parent SGML) is an open, international standard that has been under development for a number of years. Originally envisioned as a language for describing traditional documents and facilitating their conversion between different media (for example, a printed manual and an online help system), XML has proved to be invaluable for a huge variety of tasks where information must be described and shared between different applications. Today XML is used in applications that range from bank and stock transaction processing to Web pages and, of course, document processing. XML has reached a level of maturity that enables its use in "mainstream" businesses, and it is starting to assume a prominent role in the processing of business documents.

What distinguishes XML documents from other types of documents is that the document is marked up to reflect the structure and semantics of the content. The way the document is presented to the user is handled separately through the Extensible Stylesheet Language (XSL). (XSL Transformations, which are also called style sheets or transforms, are discussed in more detail later in this hour.)

To apply the XML structuring to a document, the author encloses discrete pieces of content (address, phone number, author, filename, part number, you name it) within tags that indicate what type of data each piece of content (or element) represents. (The tags and other XML-related information in an XML document are known collectively as XML markup. If you're familiar with HTML, you will note the similarity between XML tags and HTML tags.)
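To make the idea of tagged content concrete, here is a minimal sketch that uses Python's standard xml.etree.ElementTree module to read a small XML document. The element names (order, author, address, phone, partNumber) and the sample values are invented for illustration; they do not come from Word or from any particular schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names and values are invented for illustration.
SAMPLE = """\
<order>
  <author>Pat Lee</author>
  <address>12 Main Street</address>
  <phone>555-0100</phone>
  <partNumber>AX-1001</partNumber>
</order>
"""

root = ET.fromstring(SAMPLE)

# Each child element pairs a tag (the type of data) with its content.
fields = {child.tag: child.text for child in root}

for tag, text in fields.items():
    print(f"{tag}: {text}")
```

Because every piece of content carries a tag that names its type, a program can ask for "the phone number" directly instead of guessing where it sits on the page.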

The structure of an XML document and the elements it may contain are defined in an associated file called a schema. The schema enables Word and other programs to parse the document, that is, to validate its structure and identify its components. Programs can then take actions based on the content (elements) defined in the schema and tagged in the XML document (for instance, text in the document tagged as a part number could be automatically checked to ensure that the part number was valid). You can associate multiple schemas with an XML document, each of which defines different aspects of the XML document. For example, a typical XML document in Word will have the document's own schema, which describes the content (or data) of the document, plus Word's internal schema (called WordML), which describes the formatting applied to the document, document properties such as the author's name, protected regions, and tracked changes.
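The part-number check mentioned above might look something like the following sketch. It is not schema validation (Python's standard library does not include a schema validator); it only shows a program acting on content that has been tagged as a part number. The partNumber element name, the KNOWN_PARTS catalog, and the invalid_part_numbers function are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog of valid part numbers (an assumption for this sketch).
KNOWN_PARTS = {"AX-1001", "AX-1002", "BZ-2040"}


def invalid_part_numbers(xml_text, known_parts=KNOWN_PARTS):
    """Return the text of every <partNumber> element not found in known_parts."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nested partNumber elements are found too.
    return [
        (el.text or "").strip()
        for el in root.iter("partNumber")
        if (el.text or "").strip() not in known_parts
    ]


SAMPLE = """\
<order>
  <item><partNumber>AX-1001</partNumber></item>
  <item><partNumber>QQ-9999</partNumber></item>
</order>
"""

print(invalid_part_numbers(SAMPLE))
```

The check works only because the markup identifies which text is a part number; the same text in an unstructured document would be indistinguishable from any other string.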


In the world of XML, people often talk about a document's presentation rather than its formatting because in some applications the document may not be "formatted." For example, a document that is used to drive a text-to-speech application would not contain formatting in the usual sense.


The presentation of XML documents is defined by XSLT style sheets (transformations), which are essentially lists of instructions telling Word and other programs that process the XML documents how the various elements should look and what should be done with them. If a developer wants to output an XML document to multiple media, such as a Web page, online help, and a PDF file, he or she creates a separate style sheet for each type of output. For example, a style sheet that transforms an XML document into a Web page might include instructions that convert the e-mail address elements in the document into HTML mailto links, and the style sheet that outputs the same document to a PDF file might contain instructions for formatting the e-mail address elements in boldface.
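The one-source, many-outputs idea can be sketched in a few lines. Real style sheets are written in XSLT; the example below is a plain Python stand-in that mimics what two such style sheets might do with the same e-mail address element, so treat it as an illustration of the concept rather than as XSLT. The contact and email element names and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; the element names are invented for illustration.
SAMPLE = "<contact><name>Pat Lee</name><email>pat@example.com</email></contact>"


def email_for_web(xml_text):
    """Web output: render each <email> element as an HTML mailto link."""
    root = ET.fromstring(xml_text)
    links = []
    for el in root.iter("email"):
        address = escape((el.text or "").strip(), quote=True)
        links.append(f'<a href="mailto:{address}">{address}</a>')
    return links


def email_for_print(xml_text):
    """Print-style output: render each <email> element in boldface markup."""
    root = ET.fromstring(xml_text)
    return [f"<b>{escape((el.text or '').strip())}</b>" for el in root.iter("email")]


print(email_for_web(SAMPLE))
print(email_for_print(SAMPLE))
```

Notice that the source document never changes. Each output gets its own set of instructions, which is exactly the role a separate style sheet plays for each medium.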

XSL Transformations convert the source XML document into a customized XML version of that document that is geared toward a particular output (or other use of the information in the document). In many cases the actual conversion of the document to its final output media, say a PDF or an online help file, is handled by an application that reads the transformed XML file and generates the final output.

The complex range of possible transformations of a document can be quite overwhelming. In Word, Microsoft uses the concept of a solution to package the components involved in a transformation. Word solutions can also incorporate Smart Documents (described in the next section) to further simplify the process.

Standard Word formatting can, optionally, be saved in an XML document. Word uses XML elements defined in the WordML schema to store formatting information. In addition to formatting, WordML provides a way to store document metadata, such as tracked changes, protected regions, and document properties, while maintaining XML compatibility. The WordML schema definition is not stored in the Schema Library with your XML schemas, so you won't see it when browsing schemas. As you'll see later in this hour, when you save an XML document you can specify whether to include the WordML information in it. If you save an XML file without applying your own schema to it, Word saves the document in WordML (in other words, it applies the WordML schema to it). When you save an XML file as XML (by choosing XML Document in the Save As Type list in the Save As dialog box), the file is stored on disk in plain text. You can also save an XML file as a Word document or Word template. If you save the file in either of these formats, the file is stored on disk in Word's binary format and, although the XML markup in the file is preserved, the file will not be accessible to other applications that process XML files.



Sams Teach Yourself Microsoft Office Word 2003 in 24 Hours
ISBN: 067232556X
Year: 2003
Pages: 315
Authors: Heidi Steele
