Section 2.3.  Elements and the logical structure



Most documents (for example, books and magazines) can easily be broken down into components (chapters and articles). These can in turn be broken down into smaller components (titles, paragraphs, figures and so forth), and so on until we reach the textual data itself: words and sentences. At that point we would typically stop breaking the document into components, unless we were interested in linguistic research.

It turns out that every document can be viewed this way, though some fit the model more naturally than others. In fact, all information can be viewed this way, with the same caveat.

In XML, these components are called elements. Each element represents a logical component of a document. Elements can contain other elements and can also contain the words and sentences that you would usually think of as the text of the document. XML calls this text the document's character data. This hierarchical view of XML documents is demonstrated in Figure 2-4.

Figure 2-4. Hierarchical views of documents




Markup professionals call this the tree structure of the document. The element that contains all of the others (e.g. Book, Article or Memo) is known as the root element, because it is the only element that does not "hang" off some other element. It is also called the document element, because it holds the entire logical document within it. The two terms are interchangeable.
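The tree structure can be made concrete with a few lines of code. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the sample document and the outline helper are hypothetical illustrations, not taken from the book or from Figure 2-4.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
SAMPLE = """<Book>
  <Title>A Sample Book</Title>
  <Chapter>
    <Title>First Chapter</Title>
    <Paragraph>Some character data.</Paragraph>
  </Chapter>
</Book>"""


def outline(element, depth=0):
    """Return one indented line per element, visiting the tree depth-first."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its direct subelements
        lines.extend(outline(child, depth + 1))
    return lines


# fromstring returns the root (document) element, which contains all the others.
root = ET.fromstring(SAMPLE)
print("Root element:", root.tag)
print("\n".join(outline(root)))
```

Every element in the printed outline is indented under the element that contains it, and only the root element appears with no indentation at all.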

The elements that are contained in the root are called its subelements. They may contain subelements themselves. If they do, we will call them branches. If they do not, we will call them leaves.

Thus, the Chapter and Section elements are branches (because they have subelements), but the Paragraph and Title elements are leaves (because they contain only character data).
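The branch and leaf distinction is easy to check mechanically. The sketch below assumes a small hypothetical document that uses the element names mentioned above; the classify function is an illustrative helper, not part of any XML standard or library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document using the element names from the text.
SAMPLE = """<Book>
  <Chapter>
    <Title>First Chapter</Title>
    <Section>
      <Title>First Section</Title>
      <Paragraph>Some text.</Paragraph>
    </Section>
  </Chapter>
</Book>"""


def classify(root):
    """Split the element types below the root into branches and leaves."""
    branches, leaves = set(), set()
    for element in root.iter():      # iter() walks the whole tree, root included
        if element is root:
            continue                 # the root is neither a branch nor a leaf here
        if len(element) > 0:         # len() counts direct subelements
            branches.add(element.tag)
        else:
            leaves.add(element.tag)
    return sorted(branches), sorted(leaves)


branches, leaves = classify(ET.fromstring(SAMPLE))
print("Branches:", branches)
print("Leaves:", leaves)
```

Note that the classification applies to individual elements: in a different document, an element type that is a leaf here could appear with subelements and so count as a branch.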

Elements can also have extra information attached to them, called attributes. Attributes describe properties of elements. For instance, a CIA-record element might have a security attribute that gives the security rating for that element. A CIA database might release certain records only to certain people, depending on their security rating. It is something of a judgment call which aspects of a document should be represented with elements and which should be represented with attributes.
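The CIA-record example might look something like the sketch below. The CIA-database wrapper element, the two security ratings, and the releasable function are all assumptions made for illustration; the text specifies only the CIA-record element and its security attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: only CIA-record and its security attribute come from the text.
SAMPLE = """<CIA-database>
  <CIA-record security="public">Weather report</CIA-record>
  <CIA-record security="secret">Field notes</CIA-record>
  <CIA-record security="public">Press release</CIA-record>
</CIA-database>"""

# Assumed ordering of ratings, lowest to highest.
LEVELS = {"public": 0, "secret": 1}


def releasable(root, clearance):
    """Return the text of each record the given clearance is allowed to see."""
    allowed = LEVELS[clearance]
    return [
        record.text
        for record in root.iter("CIA-record")
        if LEVELS[record.get("security")] <= allowed  # get() reads an attribute
    ]


root = ET.fromstring(SAMPLE)
print(releasable(root, "public"))
print(releasable(root, "secret"))
```

Here the security rating is a property of the record rather than part of its content, which is the kind of information that tends to suit an attribute.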

Real-world documents do not always fit this tree model perfectly. They often have non-hierarchical features such as cross-references or hypertext links from one section of the tree to another. XML can represent these structures too.
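One common way to represent such a cross-reference is to give the target element an identifier and have the referring element name that identifier in an attribute. The sketch below assumes hypothetical id and target attributes and a hypothetical CrossRef element; it resolves the reference with a plain dictionary lookup rather than any XML linking standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the id and target attributes and the CrossRef
# element are illustrative names, not defined by the text.
SAMPLE = """<Book>
  <Chapter id="intro"><Title>Introduction</Title></Chapter>
  <Chapter id="details">
    <Title>Details</Title>
    <Paragraph>See <CrossRef target="intro"/> for background.</Paragraph>
  </Chapter>
</Book>"""

root = ET.fromstring(SAMPLE)

# Index every element that carries an id so references can be looked up.
by_id = {e.get("id"): e for e in root.iter() if e.get("id") is not None}


def resolve(ref):
    """Return the title of the element that a CrossRef points to."""
    target = by_id[ref.get("target")]
    return target.findtext("Title")


titles = [resolve(ref) for ref in root.iter("CrossRef")]
print(titles)
```

The reference lives inside the second chapter but points at the first, so it links two parts of the tree without either one containing the other.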

A WordML document also has an element structure. However, because the WordML document type is a complex rendition with over 400 element types, its structure is not usually as clear as the abstraction in Figure 2-4.



XML in Office 2003: Information Sharing with Desktop XML
ISBN: 013142193X
Year: 2003
Pages: 176
