Introduction to XML




The XML features in Word 2003 are so simple to use that you really don't need to know XML to enjoy its benefits. But knowing the basics of XML makes it easier to understand how it can be helpful in your own work. Toward that end, this section of the chapter gives you some XML fundamentals and definitions of some common terms you'll find used in connection with XML.

XML Glossary

Here are the XML terms you're sure to see in this chapter and other writings on XML:

  • DTD (Document Type Definition) A definition of the data elements allowed in the XML document.

  • Element A piece of data defined in an XML document, enclosed with start and end tags; for example, <TITLE>Microsoft Office Word 2003 Inside Out</TITLE>.

  • Cascading Style Sheets (CSS) A collection of formatting instructions that control the display of the document. Stylesheets can be in a separate file and linked to the document or can be embedded in the document itself.

  • XML data Also called an XML document, the .xml file is the raw XML data stored independently of the format for presenting it.

  • XML schema A definition of the data elements allowed in the XML document. An XML schema is more expressive than a DTD (for example, it can assign data types to elements) and is itself written in XML. You may also see the acronym XSD used to refer to an XML schema definition.

  • XSLT (XSL Transformations) XSLT is used to convert XML documents into various document formats, most commonly HTML.

  • XSL (Extensible Stylesheet Language) A language used to create stylesheets that can be used to transform XML into various document formats, using XSLT.

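  • Well-formed A well-formed XML document is one that follows the syntax rules of XML itself: it has a single root element, every start tag has a matching end tag, and elements are properly nested. A document that also adheres to the constraints defined by a DTD or an XML schema is called valid.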
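Well-formedness is a syntax check that any XML parser performs, so it can be tried without a DTD or schema. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree module; the helper name is_well_formed is an assumption made for illustration and is not part of Word 2003 or any XML tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as XML syntax."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# A complete element: start tag, content, matching end tag.
good = "<TITLE>Microsoft Office Word 2003 Inside Out</TITLE>"
# The same element with its end tag missing.
bad = "<TITLE>Microsoft Office Word 2003 Inside Out"

print(is_well_formed(good))
print(is_well_formed(bad))
```

This check says nothing about whether the element names are the ones a DTD or schema allows; that is the separate question of validity.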

XML Defined

Although it's hard to give XML a concise definition, the easiest and broadest approach says that XML is a highly flexible format for exchanging data and using it in applications. XML is an open standard for describing data in a format that is readable to humans, while also defining data elements within a document. Some people refer to XML as a markup language because, after all, that's what its name (Extensible Markup Language) says. But XML is more than a language of tags; XML actually lets you create a type of markup language that is specific to your data needs. With XML, you use specific rules to create your own tags and stylesheets; the individual tags describe the content and meaning of the data rather than its display format, which HTML controls.

In the big picture, the process goes like this:

  1. You create or open a document in Word 2003.

  2. You attach or create an XML schema or DTD.

  3. You save the document as XML data only or in full XML format.

  4. If you choose to transform the XML document, you can apply an XSLT to display the saved XML document in a specific view.

What Does XML Look Like?

XML is generally reader-friendly, meaning that people, not just machines, can easily read and follow the basic logic in the code. For example, the following simple XML document contains information about a series of workshops offered by a sporting goods company:

    <TRAINING>
        <CLASS>
            <TITLE>Mountain Biking</TITLE>
            <INSTRUCTOR>Lee</INSTRUCTOR>
            <DATE>August 8, 2003</DATE>
            <DURATION>6 weeks</DURATION>
            <COST>$240</COST>
        </CLASS>
        <CLASS>
            <TITLE>Rappelling</TITLE>
            <INSTRUCTOR>Jack</INSTRUCTOR>
            <DATE>June 24, 2003</DATE>
            <DURATION>4 weeks</DURATION>
            <COST>$160</COST>
        </CLASS>
        <CLASS>
            <TITLE>Kayaking</TITLE>
            <INSTRUCTOR>Jason</INSTRUCTOR>
            <DATE>July 10, 2003</DATE>
            <DURATION>6 weeks</DURATION>
            <COST>$240</COST>
        </CLASS>
    </TRAINING>

As you can see, each element has an opening tag and a closing tag (for example, the cost of a class is enclosed by a beginning tag, <COST>, and an ending tag, </COST>). The tagged elements are nested inside other tags; for example, each class record begins with a <CLASS> tag and ends with a </CLASS> tag; inside those tags are nested tags for each individual data item stored for that class.
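To see the nesting from a program's point of view, here is a minimal sketch in Python that reads the workshop data with the standard library's xml.etree.ElementTree module. The instructor tags are spelled consistently as INSTRUCTOR so that the document parses, and the names TRAINING_XML and class_records are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The workshop document, with INSTRUCTOR spelled the same way
# in every start and end tag.
TRAINING_XML = """
<TRAINING>
    <CLASS>
        <TITLE>Mountain Biking</TITLE>
        <INSTRUCTOR>Lee</INSTRUCTOR>
        <DATE>August 8, 2003</DATE>
        <DURATION>6 weeks</DURATION>
        <COST>$240</COST>
    </CLASS>
    <CLASS>
        <TITLE>Rappelling</TITLE>
        <INSTRUCTOR>Jack</INSTRUCTOR>
        <DATE>June 24, 2003</DATE>
        <DURATION>4 weeks</DURATION>
        <COST>$160</COST>
    </CLASS>
    <CLASS>
        <TITLE>Kayaking</TITLE>
        <INSTRUCTOR>Jason</INSTRUCTOR>
        <DATE>July 10, 2003</DATE>
        <DURATION>6 weeks</DURATION>
        <COST>$240</COST>
    </CLASS>
</TRAINING>
""".strip()


def class_records(xml_text: str) -> list[dict[str, str]]:
    """Turn each CLASS element into a dict keyed by its child tag names."""
    root = ET.fromstring(xml_text)  # root is the TRAINING element
    return [
        {child.tag: child.text for child in cls}
        for cls in root.findall("CLASS")
    ]


records = class_records(TRAINING_XML)
for record in records:
    print(record["TITLE"], "-", record["INSTRUCTOR"], "-", record["COST"])
```

Each CLASS element becomes one record, and each nested tag becomes a named field in that record, which mirrors the structure described above.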

Note 

XML tags are case-sensitive: <COST> and <cost> are different elements, and an end tag must match its start tag exactly. If you're creating an XML document that contains data to be used by various business systems, be sure to verify how the systems expect to operate with your tags before creating your document.
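Case sensitivity is easy to observe with a parser. The sketch below, in Python with the standard library's xml.etree.ElementTree module, shows a start tag and end tag that differ only in case being rejected, and a lookup by a differently cased name finding nothing. The helper name parses is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if the text is accepted by the XML parser."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


matching = "<COST>$240</COST>"
mismatched = "<Cost>$240</COST>"  # end tag differs from start tag in case

print(parses(matching))
print(parses(mismatched))

# Lookups are case-sensitive too: COST exists, cost does not.
cls = ET.fromstring("<CLASS><COST>$240</COST></CLASS>")
print(cls.find("COST") is not None)
print(cls.find("cost") is None)
```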

Because XML allows you to name the content of the data, you can use that same class information as easily in a database as you can in a spreadsheet, a word processing document, a report, or an e-mail. Using an XML schema, which defines the rules for naming the XML data, and an XSLT, which is the template that will provide the format for the final document, you can make XML data usable in many different forms in all sorts of contexts, from one end of your organization to another.
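As one illustration of moving the same data into another form, the sketch below turns class records into comma-separated text that a spreadsheet could open. It uses plain Python (the standard csv and xml.etree.ElementTree modules) as a stand-in for a transform; it is not XSLT and not what Word 2003 does internally. The names SAMPLE_XML, FIELDS, and training_to_csv are assumptions, and the sample is a shortened copy of the workshop data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened copy of the workshop data (two classes only).
SAMPLE_XML = """<TRAINING>
    <CLASS>
        <TITLE>Mountain Biking</TITLE>
        <INSTRUCTOR>Lee</INSTRUCTOR>
        <DATE>August 8, 2003</DATE>
        <DURATION>6 weeks</DURATION>
        <COST>$240</COST>
    </CLASS>
    <CLASS>
        <TITLE>Rappelling</TITLE>
        <INSTRUCTOR>Jack</INSTRUCTOR>
        <DATE>June 24, 2003</DATE>
        <DURATION>4 weeks</DURATION>
        <COST>$160</COST>
    </CLASS>
</TRAINING>"""

FIELDS = ["TITLE", "INSTRUCTOR", "DATE", "DURATION", "COST"]


def training_to_csv(xml_text: str) -> str:
    """Write one CSV row per CLASS element, using tag names as the header."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for cls in root.findall("CLASS"):
        # findtext returns the text of the named child, or "" if it is absent.
        writer.writerow({name: cls.findtext(name, default="") for name in FIELDS})
    return buffer.getvalue()


csv_text = training_to_csv(SAMPLE_XML)
print(csv_text)
```

Because the tags name the content, the column headings come straight from the XML, and no one has to guess which value is the cost and which is the duration.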

By comparison, HTML (Hypertext Markup Language), the primary markup language used on the Web, is a tagging system that controls the way information is formatted, not the content of the data itself. A heading, for example, might have an <H1> or <H2> tag to designate the level of the heading (and with it the default size); the <FONT> tag is used to specify the type family, size, and color of the text. But the HTML tags can't describe the content of the heading, and it's the content (the actual data itself) that can be used in other documents, such as databases, spreadsheets, reports, and so forth. The XML tags name the content of the data, not the formatting.

start sidebar
XML: A Public, International Standard

XML was developed by the World Wide Web Consortium (W3C) to create an easy-to-use, easy-to-read standard that would allow information exchange across platforms all over the world. The W3C is a public organization with the sole purpose of creating standards and new technologies for the Internet. You can find out more about the W3C and its various activities (including in-depth information on the development and application of XML for businesses and individuals) by going to http://www.w3.org.

end sidebar

Note 

What is an "open standard"? A technology based on an open standard is open for use and development by the public; there are no licensing fees or proprietary standards owned by a specific company or organization.

The Benefit of Reusable Data

One aspect of XML's flexibility is the way it separates form from function. Because you are creating names that reflect the content and not the format of your data, you can pull that content into other forms by applying what's known as a transform (or template).

Suppose you're writing a marketing flyer about a new product. When you apply XML tags to the product name, specifications, and cost, you are coding that content so that it can be pulled easily into other documents—perhaps a catalog, a Web page, or an advertising brochure. This enables you not only to save keystrokes but also to reduce the margin for error—instead of having five different people write five different types of marketing copy about one product, you can have someone do it once, and then allow everyone to use the finalized document. This creates a consistent message (everybody says the same thing about the product, which is a great help toward consistent communication) and frees time others would spend rewriting the same text.
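The write-once, reuse-everywhere idea can be sketched in a few lines of Python. The product data, the tag names (PRODUCT, NAME, SPECS, COST), and the two rendering functions below are all hypothetical, made up for illustration; in practice the rendering step would typically be an XSLT rather than hand-written code.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical product description, tagged once.
PRODUCT_XML = """<PRODUCT>
    <NAME>Trail Tent</NAME>
    <SPECS>2-person, 4 lb</SPECS>
    <COST>$199</COST>
</PRODUCT>"""


def as_web_page(product: ET.Element) -> str:
    """Render the product as a small HTML fragment."""
    name = html.escape(product.findtext("NAME", default=""))
    specs = html.escape(product.findtext("SPECS", default=""))
    cost = html.escape(product.findtext("COST", default=""))
    return f"<h1>{name}</h1><p>{specs}</p><p>{cost}</p>"


def as_catalog_line(product: ET.Element) -> str:
    """Render the same product as a one-line catalog entry."""
    name = product.findtext("NAME", default="")
    specs = product.findtext("SPECS", default="")
    cost = product.findtext("COST", default="")
    return f"{name} ({specs}): {cost}"


product = ET.fromstring(PRODUCT_XML)
print(as_web_page(product))
print(as_catalog_line(product))
```

Both outputs draw on the same tagged source, so a correction to the product name or cost is made once and shows up in every form.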

Make Your Own Rules

The XML support in Word 2003 gives you the option of attaching customized XSDs (XML Schema Definitions). This means that you can create your own rules for naming and storing your data—and you can also share schemas with others in your office, company, or industry. You can choose to save your documents in Word's default schema (WordML) and attach your own customized (also called arbitrary) schema, which describes the language and functionality you need in your business.
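To give a feel for what "your own rules" means, the sketch below checks workshop data against a small rule set written as a Python list. This is only an illustration of the idea of a schema: it is not XSD, it is not how Word 2003 validates documents, and Python's standard library does not include an XSD validator. The names REQUIRED_CLASS_FIELDS and check_training are assumptions.

```python
import xml.etree.ElementTree as ET

# Illustrative rule: every CLASS must contain exactly these child elements.
REQUIRED_CLASS_FIELDS = ["TITLE", "INSTRUCTOR", "DATE", "DURATION", "COST"]


def check_training(xml_text: str) -> list[str]:
    """Return a list of rule violations; an empty list means the data conforms."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "TRAINING":
        problems.append(f"root element is {root.tag}, expected TRAINING")
    for index, cls in enumerate(root.findall("CLASS"), start=1):
        present = [child.tag for child in cls]
        for field in REQUIRED_CLASS_FIELDS:
            if field not in present:
                problems.append(f"CLASS {index}: missing {field}")
        for tag in present:
            if tag not in REQUIRED_CLASS_FIELDS:
                problems.append(f"CLASS {index}: unexpected {tag}")
    return problems


conforming = (
    "<TRAINING><CLASS><TITLE>Kayaking</TITLE><INSTRUCTOR>Jason</INSTRUCTOR>"
    "<DATE>July 10, 2003</DATE><DURATION>6 weeks</DURATION>"
    "<COST>$240</COST></CLASS></TRAINING>"
)
# Well-formed XML, but it uses TEACHER where the rules expect INSTRUCTOR.
nonconforming = conforming.replace("INSTRUCTOR", "TEACHER")

print(check_training(conforming))
print(check_training(nonconforming))
```

The second document is perfectly well-formed XML; it simply breaks the naming rules. Catching that kind of difference is what a shared schema is for.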

Note 

Creating a schema is a fairly complicated task—certain rules govern the development of "well-formed" code—but it's not as daunting as you might think. If you'd like to know more about XML and learn to develop your own schemas for your customized XML use, see XML Step by Step, Second Edition (Microsoft Press, 2002), by Michael J. Young.






Microsoft Office Word 2003 Inside Out
ISBN: 0735615152
EAN: 2147483647
Year: 2005
Pages: 373
