Structured Standards

The structure of a document is often one of its most important features, both for processing and for human readability. Take this book as an example: it has top-level headings, secondary headings, and so on. This is accomplished through the use of various "styles" that many of you have seen in word-processing programs. Many of these styles imply a structure, and that structure produces the table of contents, which is the ordering of the information in the book.

When using XML, structure is important because it can show the parent-child relationship between elements. Take a hypothetical <name> element. If it's the child element of a <company> element, it most likely has a different meaning than if it's a child of a <person> element. Although XML does not require this kind of structure, using it removes the need for long element and attribute names.

We've included unstructured.xml (Listing A-1) and structured.xml (Listing A-2) as examples. Although they contain the same information, the more verbose structured.xml is much easier for a human or an application to digest. Both documents are well-formed: well-formedness requires a single root element and properly nested, closed tags, not a meaningful hierarchy, so unstructured.xml is a well-formed XML document too.

Listing A-1 unstructured.xml: Using long names to imply structure.

<?xml version = "1.0" encoding = "UTF-8"?>
<entry>
  <employeecompanyname>Some Company, Inc</employeecompanyname>
  <employeename>Allen Wyke</employeename>
</entry>

Listing A-2 structured.xml: Applying structure with a hierarchy of elements.

<?xml version = "1.0" encoding = "UTF-8"?>
<entry type="employee">
  <company>
    <name>Some Company, Inc</name>
  </company>
  <name>Allen Wyke</name>
</entry>
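To make the parent-child point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. It assumes the content of structured.xml from Listing A-2, inlined as a string (without the XML declaration) so the example is self-contained; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Content of structured.xml (Listing A-2), inlined for this sketch.
STRUCTURED = """<entry type="employee">
  <company>
    <name>Some Company, Inc</name>
  </company>
  <name>Allen Wyke</name>
</entry>"""

root = ET.fromstring(STRUCTURED)

# The same element name means different things depending on its parent.
company_name = root.findtext("company/name")  # <name> whose parent is <company>
person_name = root.findtext("name")           # <name> whose parent is <entry>
entry_type = root.get("type")                 # attribute on the root element

print("company:", company_name)
print("person:", person_name)
print("type:", entry_type)
```

The path expression carries the meaning that unstructured.xml had to pack into names such as employeecompanyname.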

DTDs are not the only method of applying structure to your XML documents and data. In this section we explore other structure-related specifications at your disposal: Namespaces in XML, XML Schema, and the Resource Description Framework.

Namespaces in XML

At its core, the purpose of XML is to be extensible. The ability of individuals, companies, and industries to define their own languages, markup, and dialects to describe their data is powerful. However, this internetworked world has no use for a reinvented wheel. Many languages have similar elements, attributes, and structure. Contact information, for instance, usually contains a name, address, phone number, email, and the like.

Instead of having each schema define its own language, XML was created to be flexible enough to embed multiple languages into instance documents. This includes elements, as well as attributes. Flexibility allows an XHTML document, for instance, to also contain MathML markup. Let's use this as an example.

Listing A-3, sample-ns.xml, is an XHTML document that defines XHTML as the default namespace. In the <body> of the XHTML we have embedded some MathML, defining its own namespace for the enclosed elements.

Listing A-3 sample-ns.xml: Shows how namespaces in XML are used.

<?xml version = "1.0" encoding = "UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <title>XHTML and MathML Namespace Example</title>
  </head>
  <body>
    <p>The following is marked in MathML</p>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mrow>
        <mi>x</mi>
        <mo>=</mo>
        <mfrac>
          <mrow>
            <msqrt>
              <mrow>
                <msup>
                  <mi>b</mi>
                  <mn>2</mn>
                </msup>
                <mo>-</mo>
                <mrow>
                  <mn>4</mn>
                  <mo>&InvisibleTimes;</mo>
                  <mi>a</mi>
                  <mo>&InvisibleTimes;</mo>
                  <mi>c</mi>
                </mrow>
              </mrow>
            </msqrt>
          </mrow>
          <mrow>
            <mn>2</mn>
            <mo>&InvisibleTimes;</mo>
            <mi>a</mi>
          </mrow>
        </mfrac>
      </mrow>
    </math>
  </body>
</html>

We included MathML within our XHTML. All we needed to do was define the namespace for the MathML elements, which is done with the xmlns attribute specifying the namespace URI so a parser will know how to interpret the content. Figure A-1 shows what this example looks like in the MathML-supporting version of the open source Mozilla browser.

Figure A-1 Rendering our sample-ns.xml document in the Mozilla browser.

If you want to see this work for yourself, download the Mozilla version that supports MathML. Also download some extra fonts to support the MathML characters. You can obtain more information about this at http://www.mozilla.org/projects/mathml.
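If you do not have a MathML-capable browser at hand, you can still see how a parser assigns namespaces. The following is a minimal sketch in Python with the standard xml.etree.ElementTree module. It assumes a shortened version of sample-ns.xml (the DOCTYPE and the &InvisibleTimes; entities are left out because this parser does not read the DTD that would define them), and the helper name namespace_of is hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"

# Shortened version of sample-ns.xml (Listing A-3), for illustration only.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head><title>XHTML and MathML Namespace Example</title></head>
  <body>
    <p>The following is marked in MathML</p>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mrow><mi>x</mi><mo>=</mo><msup><mi>b</mi><mn>2</mn></msup></mrow>
    </math>
  </body>
</html>"""


def namespace_of(element):
    """Return the namespace URI from ElementTree's '{uri}local' tag form."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


root = ET.fromstring(DOC)

# Group local element names by the namespace the parser assigned to them.
by_namespace = {}
for element in root.iter():
    local_name = element.tag.split("}", 1)[-1]
    by_namespace.setdefault(namespace_of(element), []).append(local_name)

for uri, names in by_namespace.items():
    print(uri, names)
```

Every element inside <math> inherits the MathML namespace from the xmlns attribute on <math>, while the elements outside it stay in the XHTML namespace declared on <html>.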

Associating a namespace with a block of elements (between the start and ending <math> elements in our example) is not the only way to declare namespaces. Declaring a namespace within the root element of the document is also possible, giving it a prefix and then prefacing all elements and attributes of that namespace with the same prefix. For instance, if we wanted to create a namespace in this manner for our previous example, we could have had the following:

 <html xmlns="http://www.w3.org/1999/xhtml" xmlns:mml="http://www.w3.org/1998/Math/MathML" lang="en" xml:lang="en"> 

By declaring the MathML namespace here, we can use any of the elements and attributes of this language in the body of our document by placing mml: in front of them. Listing A-4, sample2-ns.xml, demonstrates this approach. Notice how we are able to include a MathML element inside an XHTML element without having the root <math> element present; the namespace allows us to do this.

Listing A-4 sample2-ns.xml: Using Namespaces in XML to include MathML within an XHTML document.

<?xml version = "1.0" encoding = "UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:mml="http://www.w3.org/1998/Math/MathML"
      lang="en" xml:lang="en">
  <head>
    <title>XHTML and MathML Namespace Example</title>
  </head>
  <body>
    <p>The following is marked in MathML</p>
    <mml:mrow>
      <mml:mi>x</mml:mi>
      <mml:mo>=</mml:mo>
      <mml:mfrac>
        <mml:mrow>
          <mml:msqrt>
            <mml:mrow>
              <mml:msup>
                <mml:mi>b</mml:mi>
                <mml:mn>2</mml:mn>
              </mml:msup>
            </mml:mrow>
          </mml:msqrt>
        </mml:mrow>
        <mml:mrow>
          <mml:mn>2</mml:mn>
        </mml:mrow>
      </mml:mfrac>
    </mml:mrow>
    <p>Here is an example of including a MathML element right in the body of an XHTML element <mml:msqrt></mml:msqrt></p>
  </body>
</html>
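The two declaration styles are interchangeable as far as a namespace-aware parser is concerned. The sketch below, in Python with the standard xml.etree.ElementTree module, assumes two tiny hypothetical documents (not the full listings) that differ only in how the MathML namespace is declared, and checks that both produce the same expanded element names.

```python
import xml.etree.ElementTree as ET

MATHML = "http://www.w3.org/1998/Math/MathML"

# Hypothetical fragment using a default namespace on <math>.
DEFAULT_FORM = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <math xmlns="http://www.w3.org/1998/Math/MathML"><msqrt><mi>b</mi></msqrt></math>
  </body>
</html>"""

# The same fragment using the mml: prefix declared on the root element.
PREFIX_FORM = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:mml="http://www.w3.org/1998/Math/MathML">
  <body>
    <mml:math><mml:msqrt><mml:mi>b</mml:mi></mml:msqrt></mml:math>
  </body>
</html>"""


def expanded_names(text):
    """Return every element name in '{namespace-uri}local' form, in document order."""
    return [element.tag for element in ET.fromstring(text).iter()]


same_names = expanded_names(DEFAULT_FORM) == expanded_names(PREFIX_FORM)
print("identical expanded names:", same_names)

# The prefix used in a query is independent of the prefix used in the document.
msqrt = ET.fromstring(PREFIX_FORM).find(".//m:msqrt", {"m": MATHML})
print(msqrt.tag)
```

The prefix is only shorthand for the namespace URI; what the parser compares is the URI plus the local name.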

This is all we are going to cover on Namespaces in XML here. We used namespaces heavily throughout the book, so you should be familiar with them and the practice of using them.

If you would like more information on Namespaces in XML, please check out http://www.w3.org/TR/REC-xml-names.

XML Schema

XML Schema is, though controversial at times, considered to be the next generation of XML. It brings together the best of XML 1.0 with other related standards such as XML-Data, Document Content Description for XML (DCD), SOX, and Document Definition Markup Language (DDML), all of which are W3C Notes.

This effort, housed at the W3C, reached Recommendation status in May 2001. The Recommendation is divided into three parts: Primer, Structures, and Datatypes. The Primer, Part 0, is a document to get you familiar with the core XML Schema language: Structures and Datatypes. It provides some good reading and examples, so if you are not familiar with XML Schema, it's worth the read.

Part 1, Structures, provides the mechanisms to define structure and any constraints that your data might need. It covers not only the capabilities of the DTDs defined in XML 1.0, but also those made possible through the use of namespaces. This part of the specification also relies on Part 2, Datatypes.

The final part of XML Schema addresses datatypes. It provides means by which datatypes can be defined in XML Schema or other XML-based languages. The datatypes defined in XML Schema represent a superset of the datatyping capabilities in XML 1.0's DTDs.

As an example, let's first build an XML DTD (entry.dtd in Listing A-5) for the structured.xml document we used earlier, and then show its XML Schema representation. Before we show you the schema, let's create a copy of the structured.xml document, called structured-valid.xml, and include the following line after the XML declaration so a validating parser can check the document against our DTD.

<!DOCTYPE entry SYSTEM "entry.dtd">

Listing A-5 entry.dtd: DTD that corresponds to and describes the data in structured-valid.xml.

<?xml version='1.0' encoding='UTF-8' ?>
<!ELEMENT entry (company , name)>
<!ATTLIST entry type CDATA #IMPLIED >
<!ELEMENT company (name)>
<!ELEMENT name (#PCDATA)>

The XML Schema representation is shown in entry.xsd (Listing A-6). As you can see, XML Schema can be more verbose than a DTD. Keep in mind, though, that this example is a direct translation and does not use any of the additional benefits of XML Schema, such as datatyping.

Listing A-6 entry.xsd: XML Schema version of our DTD.

<?xml version = "1.0" encoding = "UTF-8"?>
<xsd:schema xmlns:xsd = "http://www.w3.org/2001/XMLSchema">
  <xsd:element name = "entry">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref = "company"/>
        <xsd:element ref = "name"/>
      </xsd:sequence>
      <xsd:attribute name = "type" type = "xsd:string"/>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name = "company">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref = "name"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name = "name" type = "xsd:string"/>
</xsd:schema>
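One practical difference from a DTD is that a schema is itself an XML document, so ordinary XML tools can read it. The following minimal sketch, in Python with the standard xml.etree.ElementTree module, assumes the content of entry.xsd inlined as a string and pulls out each top-level element declaration together with its content model. It only inspects the schema; it is not a validator, and validating an instance document would need a schema-aware processor.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used only for the queries in this sketch.
NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# Content of entry.xsd (Listing A-6), inlined for this sketch.
ENTRY_XSD = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="entry">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="company"/>
        <xsd:element ref="name"/>
      </xsd:sequence>
      <xsd:attribute name="type" type="xsd:string"/>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="company">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="name"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="name" type="xsd:string"/>
</xsd:schema>"""

schema = ET.fromstring(ENTRY_XSD)

# Map each top-level element name to its child sequence, or to its simple type.
content_models = {}
for declaration in schema.findall("xsd:element", NS):
    children = declaration.findall("xsd:complexType/xsd:sequence/xsd:element", NS)
    refs = [child.get("ref") for child in children]
    content_models[declaration.get("name")] = refs or declaration.get("type")

for name, model in content_models.items():
    print(name, "->", model)
```

The result mirrors the three ELEMENT declarations of the DTD: entry contains company then name, company contains name, and name holds string data.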

XML Schema delivers many advancements and enhancements, so it would be wise to familiarize yourself with XML Schema and potentially build some applications using it. It is an incredibly powerful standard that can offer a wealth of abilities to implementers.

If you want more information on XML Schema, check out http://www.w3.org/XML/Schema. More good information is at http://www.ascc.net/~ricko/XMLSchemaInContext.html.

Resource Description Framework

One of the first efforts that emerged to enforce structure on data was the Resource Description Framework (RDF). RDF represents a manner in which metadata can be created to describe other data, as well as the basic syntax for the encoding and transmission of this metadata. The table of contents of this book can be considered metadata because it is not the actual contents of the book, but represents a description of what the book contains.

Behind the scenes, RDF is expressed in XML and relies on namespaces. It not only allows for the interoperability of metadata, but also for a machine-understandable description of Web resources (anything with a URI). These resources are described through a collection of properties, each with a property type and a value, called an RDF Description.

As an example, let's say we had a URI (http://mspress.microsoft.com/xml/example) that we wanted to describe as being created by a person at a particular company. Let's also say that the entry.dtd we created earlier contained the structure we wanted to use to define the name of this person and company. Listing A-7 shows our example, sample.rdf.

Listing A-7 sample.rdf: Sample RDF document describing a URI with our entry.dtd data model.

<?xml version="1.0"?>
<RDF xmlns = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:EN = "http://mspress.microsoft.com/xml/entry.dtd">
  <Description about="http://mspress.microsoft.com/xml/example">
    <EN:entry>
      <EN:company>
        <EN:name>Some Company, Inc.</EN:name>
      </EN:company>
      <EN:name>R. Allen Wyke</EN:name>
    </EN:entry>
  </Description>
</RDF>

The # symbol at the end of the RDF namespace URI is important. RDF concatenates the namespace name with the local name to get the full URI of a property type, and the # keeps the two parts separate.
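The concatenation can be sketched in a few lines of Python. The two namespace names are taken from Listing A-7; the helper full_uri is a hypothetical name introduced only for this illustration.

```python
# Namespace names as declared in sample.rdf (Listing A-7).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EN_NS = "http://mspress.microsoft.com/xml/entry.dtd"


def full_uri(namespace_name: str, local_name: str) -> str:
    """Hypothetical helper: plain concatenation, with no separator added."""
    return namespace_name + local_name


rdf_type = full_uri(RDF_NS, "type")
en_name = full_uri(EN_NS, "name")

print(rdf_type)  # http://www.w3.org/1999/02/22-rdf-syntax-ns#type
print(en_name)   # http://mspress.microsoft.com/xml/entry.dtdname
```

With the trailing #, the local name becomes a fragment of the namespace URI. The EN namespace name in the listing has no trailing separator, so its local names run straight into entry.dtd; ending a namespace name with # or / avoids that.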

The first task we perform after declaring it as an XML document is to define the default namespace (RDF) and a second namespace with the prefix EN (short for Entry Name, pointing to entry.dtd). Using the RDF <Description> element, we specify that we want to describe the http://mspress.microsoft.com/xml/example URI. To describe it, we use the <name> element that is a child of <company> to hold the company name, and the top-level <name> element to identify the individual.
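The walkthrough above can be followed programmatically. This is a minimal sketch in Python with the standard xml.etree.ElementTree module, assuming the content of sample.rdf inlined as a string. It treats the file as plain namespaced XML rather than using an RDF library, and the rdf and en prefixes in the query map are arbitrary choices for this example.

```python
import xml.etree.ElementTree as ET

# Query prefixes chosen for this sketch; only the URIs must match the document.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "en": "http://mspress.microsoft.com/xml/entry.dtd",
}

# Content of sample.rdf (Listing A-7), inlined for this sketch.
SAMPLE_RDF = """<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:EN="http://mspress.microsoft.com/xml/entry.dtd">
  <Description about="http://mspress.microsoft.com/xml/example">
    <EN:entry>
      <EN:company>
        <EN:name>Some Company, Inc.</EN:name>
      </EN:company>
      <EN:name>R. Allen Wyke</EN:name>
    </EN:entry>
  </Description>
</RDF>"""

root = ET.fromstring(SAMPLE_RDF)

# <Description> is in the default (RDF) namespace; its "about" attribute is unprefixed.
description = root.find("rdf:Description", NS)
resource = description.get("about")

# The properties come from the EN namespace.
entry = description.find("en:entry", NS)
company = entry.findtext("en:company/en:name", namespaces=NS)
person = entry.findtext("en:name", namespaces=NS)

print(resource)
print("company:", company)
print("person:", person)
```

The query uses lowercase en while the document uses EN, which works because matching is done on the namespace URI, not on the prefix.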

This is a simplistic example, but it demonstrates the purpose of RDF: the ability to describe URIs (Web resources). In addition to the <Description> element, RDF has other elements, including ones that let you define Bags (<Bag>), Sequences (<Seq>), and Alternatives (<Alt>), all of which are types of collections. You also have several attributes at your disposal that do everything from identifying individual elements within a collection to specifying the type of a resource.

For more information on RDF check out http://www.w3.org/RDF.



XML Programming
XML Programming Bible
ISBN: 0764538292
EAN: 2147483647
Year: 2002
Pages: 134
