XML is short for extensible markup language:
Extensible—That is, we are not stuck with a predefined set of tags or identifiers
Markup—A way to add information about structure or content from within a document or transaction
Language—A standard syntax and grammar for the markup
To put it another way, XML is a tagged markup language that allows anyone to define their own tags. On the surface that sounds like a prescription for anarchy, and there is some of that, but as we'll discuss in this chapter, it is this flexibility that lends itself to semantic expression.
Figure 11.1 is a tiny snippet of HTML, which is itself a tagged markup language. The <h1> and matching </h1> are "tags" around the name "John Smith." But in the case of HTML, all the tags tell us about John Smith is that the name is a heading.
<h1> John Smith </h1>
The XML fragment in Figure 11.2 is also tagged, but in this case we have some clue as to what or who "John Smith" is. "Author" is something other than a formatting hint, but at this level we can't tell any more about it. The difference is between format tags and content tags. Before we get into that difference, though, let's take a look at the motivation behind the creation of XML.
<Author> John Smith </Author>
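To make the contrast between a format tag and a content tag concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree parser. The describe helper and the variable names are illustrative assumptions, not part of any standard. Both fragments happen to be well-formed XML, so the same parser reads each one; what differs is how much the tag name tells a program about the text it wraps.

```python
import xml.etree.ElementTree as ET

# The two fragments from the figures. Both parse as well-formed XML.
html_fragment = "<h1> John Smith </h1>"
xml_fragment = "<Author> John Smith </Author>"


def describe(fragment):
    """Return (tag name, enclosed text) for a single-element fragment.

    Hypothetical helper for illustration only.
    """
    element = ET.fromstring(fragment)
    return element.tag, element.text.strip()


for fragment in (html_fragment, xml_fragment):
    tag, text = describe(fragment)
    # "h1" says only how to display the text; "Author" hints at what it is.
    print(f"{text!r} is tagged as {tag!r}")
```

To the parser the two fragments are structurally identical. Any extra meaning in "Author" comes from the vocabulary someone chose to define, which is the flexibility described above.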