Chapter 2. Creating Well-Formed XML Documents

In the previous chapter, we got our start in XML and got an overview of how XML lets you structure your own documents, what XML is all about, and some of the uses you can make of it. It's time to take a look at XML in more depth and to sharpen our XML understanding until it's crystal clear.

In HTML, about 100 elements are already defined. Browsers can check the HTML in a Web page and display that page as they see fit. In XML, you have more freedom, and so more responsibility. In XML, you define your own elements, and it's up to you to decide how they should be used. Despite their apparently free-form nature, XML documents are subject to a number of rules that allow them to be handled in a useful and reproducible way.
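To make the idea of defining your own elements concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The element names (order, customer, item) and the helper element_names are made up for illustration; nothing in XML predefines them.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: XML itself does not predefine any of these names.
DOCUMENT = """<?xml version="1.0"?>
<order>
    <customer>Example Customer</customer>
    <item quantity="2">Widget</item>
</order>"""


def element_names(xml_text):
    """Return the tag names found in the document, in document order."""
    root = ET.fromstring(xml_text)
    # iter() yields the root element first, then its descendants.
    return [element.tag for element in root.iter()]


names = element_names(DOCUMENT)
print(names)
```

The parser accepts these invented names without complaint because it checks only the document's structure, not what the elements mean. Deciding what an order or an item is for is left to the document's author.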

In fact, the rules that XML documents are subject to are significantly more stringent than the rules that HTML documents are subject to. As mentioned in the previous chapter, if an XML document cannot be successfully understood by an XML processor, the processor is not supposed to make any guesses about the structure of the document at all; it's just supposed to quit, possibly returning an error.
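This strictness is easy to see in practice. The sketch below assumes Python's standard xml.etree.ElementTree module as the XML processor; the helper name is_well_formed and the two sample documents are invented for this illustration. The parser is non-validating, so it reports only well-formedness errors.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it gives up."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # The parser does not try to repair the document; it just stops.
        return False
    return True


# Hypothetical sample documents.
GOOD = "<greeting><text>Hello</text></greeting>"
BAD = "<greeting><text>Hello</greeting>"  # <text> is never closed

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

An HTML browser would probably display something for the second document by guessing where the missing tag belongs. The XML parser refuses to guess and raises an error instead.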

As we also saw in the previous chapter, there are two specific constraints that XML documents are subject to: well-formedness and validity. As far as the World Wide Web Consortium (W3C) is concerned, well-formedness is the more basic constraint. In the XML 1.0 specification itself, which represents the foundation of this and the next chapter, the W3C says that you can't even call a data object an XML document unless it's well formed:

A data object is an XML document if it is well-formed, as defined in this specification. A well-formed XML document may in addition be valid if it meets certain further constraints.

Why is it so important that XML documents be well formed? Why does the W3C specify that XML processors should not attempt to fix documents that are not well formed?

The W3C makes this stipulation mainly to stop XML processors from doing the same thing that HTML browsers have done to HTML: By trying to fix things, the major browsers have introduced their own versions of HTML that authors now rely on. The result is that there are many "versions" of HTML current today, and the W3C wants to avoid this problem with XML.

In this chapter, we're going to see what makes an XML document well formed, which is the minimal requirement that a data object must satisfy to be an XML document. The second constraint that you can require of XML documents is that they be valid, which means that they must obey the document type definition (DTD) or schema that you use to specify the legal syntax of the document. This chapter is all about what makes XML documents well formed, and the next chapter is all about what makes them valid.

Now that we're taking a look at how to build XML documents in a formal way, I'm going to start from the beginning so the foundation we're building is complete and solid. And that means starting with the W3C itself.



Real World XML (2nd Edition)
ISBN: 0735712867
Year: 2005
Pages: 440
Authors: Steve Holzner
