Introducing XML


XML is a markup language that lets you store data in a structured format in plain text files, which many applications can read. You can reuse the same data in different types of applications, such as database applications. XML can work in combination with HTML to separate data from its presentation. For example, in XML you can define a name as a combination of a first name and a last name, while HTML describes how that data should be displayed on a Web page. XML documents are stored with the extension .xml.

In HTML, you cannot use tags other than those that are already defined. Using XML, you can create your own tags.

Structure of XML Document

An XML document consists of user-defined tags. The document usually starts with the XML declaration, <?xml version="1.0"?>, which is written with the same <? and ?> delimiters as a processing instruction.

The tags that you create in the XML documents are called elements. All XML documents contain a root element, which is the outermost tag in the document. It is mandatory to close all tags that you create in an XML document. XML is case sensitive, and the closing tag must match the opening tag.
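The closing and case-sensitivity rules above can be checked with any XML parser. The following is a minimal sketch that uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are illustrative assumptions, not part of the original listings.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample strings used only for this illustration.
matched = "<Book><Name>Programming in VC++</Name></Book>"
wrong_case = "<Book><Name>Programming in VC++</name></Book>"  # </name> vs <Name>
unclosed = "<Book><Name>Programming in VC++</Book>"           # <Name> never closed

for sample in (matched, wrong_case, unclosed):
    print(is_well_formed(sample), sample)
```

Only the first string is accepted; the other two are rejected because a closing tag is missing or differs in case from its opening tag.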

Listing 1-12 shows the structure of an XML document:

Listing 1-12: Structure of an XML Document
start example
 <?xml version="1.0"?>
 <Books>
   <Book>
     <Name>Programming in VC++</Name>
     <Author>Stephen Miller</Author>
     <Category>Programming</Category>
     <Price>50</Price>
   </Book>
   <Book>
     <Name>Designing Websites in Dreamweaver MX</Name>
     <Author>Angela Jones</Author>
     <Category>Web Design</Category>
     <Price>20</Price>
   </Book>
 </Books>
end example
 

The above listing shows an XML document that stores information on several books. Save the above listing as Books.xml. The <Books> tag is the root element, so it opens and closes the document. Other user-defined tags in the listing are:

  • <Name> tag: Indicates the name of the book.

  • <Author> tag: Indicates the name of the author.

  • <Category> tag: Indicates the type of the book.

  • <Price> tag: Indicates the price of the book.
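To see how an application might read this structure, the following sketch parses the content of Listing 1-12 with Python's standard xml.etree.ElementTree module. The variable names BOOKS_XML and books are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Content of Listing 1-12, embedded as a string for a self-contained example.
BOOKS_XML = """<?xml version="1.0"?>
<Books>
  <Book>
    <Name>Programming in VC++</Name>
    <Author>Stephen Miller</Author>
    <Category>Programming</Category>
    <Price>50</Price>
  </Book>
  <Book>
    <Name>Designing Websites in Dreamweaver MX</Name>
    <Author>Angela Jones</Author>
    <Category>Web Design</Category>
    <Price>20</Price>
  </Book>
</Books>"""

root = ET.fromstring(BOOKS_XML)

# Collect one tuple per <Book> element, reading each child by its tag name.
books = [
    (
        book.findtext("Name"),
        book.findtext("Author"),
        book.findtext("Category"),
        int(book.findtext("Price")),
    )
    for book in root.findall("Book")
]

print(root.tag)  # the root element
for entry in books:
    print(entry)
```

Because tag names are case sensitive, the lookups must use Name, Author, Category, and Price exactly as they are written in the document.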

Figure 1-12 shows the output when you view Books.xml in the Mozilla Web browser:

This figure shows all the <Book> elements present in the <Books> tag.
Figure 1-12: Viewing Books.xml

You can also use attributes with XML elements. An attribute provides additional information about the element. For example, you can indicate that the price of the books is in dollars, using an attribute, currency. The following code shows an example of using an attribute in an element:

 <Price currency="dollar">20</Price> 

You can also define empty elements in an XML document. An empty element has no content between its start and end tags, although it can carry attributes. The following code shows an example of an empty element:

 <Currency type="dollar"/> 
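The following sketch shows how the two fragments above appear to a program, assuming Python's standard xml.etree.ElementTree module. The variable names price and empty are illustrative.

```python
import xml.etree.ElementTree as ET

# The attribute example and the empty-element example from the text.
price = ET.fromstring('<Price currency="dollar">20</Price>')
empty = ET.fromstring('<Currency type="dollar"/>')

# An attribute is read by name; the element text is available separately.
print(price.get("currency"), price.text)

# An empty element has no text and no children, but keeps its attributes.
print(empty.attrib, empty.text, len(empty))
```

The attribute value and the element text are kept apart, which is why an attribute suits additional information such as the currency of a price.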

You can also insert comments within XML documents. The XML parser does not parse the statements embedded in the comment tags. The following code shows how to insert comments in an XML document:

 <!--Text for comments--> 

The comment text is enclosed within the <!-- and --> symbols.
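As a small illustration of how a parser treats comments, the following sketch uses Python's standard xml.etree.ElementTree module, whose default parser skips comments rather than adding them to the element tree. The sample document string is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a comment placed before the <Name> element.
doc = "<Book><!--Text for comments--><Name>Programming in VC++</Name></Book>"

root = ET.fromstring(doc)

# Only real elements are reported as children; the comment is not among them.
child_tags = [child.tag for child in root]
print(child_tags)
```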

DTD

The rules that an XML document should follow can be defined using a Document Type Definition (DTD). An XML document is valid if it conforms to the rules defined in a DTD. The declarations in a DTD include:

  • Element declarations: Represent the rules for the tags in an XML document.

  • Attribute declarations: Represent the rules for the attributes in the tags of XML documents.

  • Content declarations: Represent the rules for text contained in the elements.

You can create two types of DTDs, which are:

  • Internal DTD: Defines the rules for validating the structure of an XML file within the XML file itself.

  • External DTD: Defines the rules for validating the structure of an XML file in a separate document, which is stored with the extension .dtd.

A DTD begins with the <!DOCTYPE> declaration, and the rules defining the structure of a document are present within it. The <!ELEMENT> declaration defines the rules for an element. An example of <!ELEMENT> is:

 <!ELEMENT Book (Name, Author, Category, Price)> 

In the above example, the <!ELEMENT> declaration indicates that the <Book> tag must contain four elements, Name, Author, Category, and Price, in the given order.
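The ordering rule can be mimicked by hand to make its meaning concrete. The following sketch is not DTD validation; the parsers in Python's standard library are non-validating, so the check is written out manually. The names EXPECTED_ORDER and follows_book_rule, and the two sample fragments, are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

# The child sequence required by the <!ELEMENT Book (...)> declaration.
EXPECTED_ORDER = ["Name", "Author", "Category", "Price"]


def follows_book_rule(book: ET.Element) -> bool:
    """Return True if the children of book appear exactly in the required order."""
    return [child.tag for child in book] == EXPECTED_ORDER


in_order = ET.fromstring(
    "<Book><Name>Programming in VC++</Name><Author>Stephen Miller</Author>"
    "<Category>Programming</Category><Price>50</Price></Book>"
)
out_of_order = ET.fromstring(
    "<Book><Author>Stephen Miller</Author><Name>Programming in VC++</Name>"
    "<Category>Programming</Category><Price>50</Price></Book>"
)

print(follows_book_rule(in_order))
print(follows_book_rule(out_of_order))
```

A validating parser performs this kind of comparison for every declared element, which is what makes a DTD more convenient than hand-written checks.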

You can also define the type of content within an element. XML distinguishes two types of data: Parsed Character Data (PCDATA) and Character Data (CDATA). The XML parser parses PCDATA but does not parse CDATA. In a DTD, you declare the text content of an element as #PCDATA, using the following syntax:

 <!ELEMENT Name (#PCDATA)> 

In the above example, the content within the Name element is of PCDATA type.

You declare an attribute using the <!ATTLIST> declaration, which has the following syntax:

 <!ATTLIST Price Currency CDATA #REQUIRED> 

In the above syntax:

  • Price is the name of the element for which you define the attribute rules.

  • Currency is the name of the attribute for the Price element.

  • The CDATA parameter specifies that the attribute value is of character data type.

  • The #REQUIRED parameter specifies that it is mandatory to use the Currency attribute with the Price element.

Listing 1-13 shows an internal DTD for an XML document:

Listing 1-13: Internal DTD
start example
 <?xml version="1.0"?>
 <!DOCTYPE Books [
   <!ELEMENT Books (Book)+>
   <!ELEMENT Book (Name, Author, Category, Price)>
   <!ELEMENT Name (#PCDATA)>
   <!ELEMENT Author (#PCDATA)>
   <!ELEMENT Category (#PCDATA)>
   <!ELEMENT Price (#PCDATA)>
   <!ATTLIST Price Currency CDATA #REQUIRED>
 ]>
 <Books>
   <Book>
     <Name>Web Designing in XHTML</Name>
     <Author>Angela Jones</Author>
     <Category>Web Designing</Category>
     <Price Currency="dollar">20</Price>
   </Book>
   <Book>
     <Name>Programming in VC++</Name>
     <Author>Stephen Miller</Author>
     <Category>Programming</Category>
     <Price Currency="dollar">50</Price>
   </Book>
 </Books>
end example
 

The above listing shows the internal DTD created in an XML file. In the above listing:

  • The XML document conforms to the rules defined in the internal DTD.

  • The root element is defined as Books, and contains one or more Book elements.

  • The Book element contains the Name, Author, Category, and Price elements that contain PCDATA.

  • The Price element must carry the Currency attribute, which the DTD declares as mandatory.

Save the above listing as InternalDTDBooks.xml.
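The following sketch checks two of the rules from Listing 1-13 by hand: the (Book)+ rule and the #REQUIRED Currency attribute. It assumes Python's standard xml.etree.ElementTree module, which reads a document that carries an internal DTD but does not validate against it. The function name check_books and its messages are hypothetical.

```python
import xml.etree.ElementTree as ET

# Content of Listing 1-13, including its internal DTD.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE Books [
  <!ELEMENT Books (Book)+>
  <!ELEMENT Book (Name, Author, Category, Price)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
  <!ELEMENT Category (#PCDATA)>
  <!ELEMENT Price (#PCDATA)>
  <!ATTLIST Price Currency CDATA #REQUIRED>
]>
<Books>
  <Book>
    <Name>Web Designing in XHTML</Name>
    <Author>Angela Jones</Author>
    <Category>Web Designing</Category>
    <Price Currency="dollar">20</Price>
  </Book>
  <Book>
    <Name>Programming in VC++</Name>
    <Author>Stephen Miller</Author>
    <Category>Programming</Category>
    <Price Currency="dollar">50</Price>
  </Book>
</Books>"""


def check_books(root: ET.Element) -> list:
    """Return a list of rule violations; an empty list means both rules hold."""
    problems = []
    books = root.findall("Book")
    # (Book)+ : the root must contain at least one Book element.
    if not books:
        problems.append("Books must contain at least one Book element")
    # #REQUIRED : every Price element must carry a Currency attribute.
    for index, book in enumerate(books, start=1):
        for price in book.findall("Price"):
            if "Currency" not in price.attrib:
                problems.append(f"Book {index}: Price is missing Currency")
    return problems


print(check_books(ET.fromstring(DOCUMENT)))
```

To have the complete DTD enforced, including element order and content types, you would use a validating parser instead of a manual check like this one.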

Figure 1-13 shows the output when InternalDTDBooks.xml is viewed in the Mozilla Web browser:

This figure shows that both the <Book> elements are nested in the <Books> element.
Figure 1-13: Viewing InternalDTDBooks.xml

You can also define an external DTD, where the rules to validate the structure of an XML document are specified outside the XML document. In an external DTD, the rules are defined in a .dtd file, and the reference to the file is provided in the XML file that conforms to the rules of the external DTD.

Listing 1-14 shows an external DTD:

Listing 1-14: The External DTD MyDTD.dtd
start example
 <!ELEMENT Book (Name, Author, Category, Price)>
 <!ELEMENT Name (#PCDATA)>
 <!ELEMENT Author (#PCDATA)>
 <!ELEMENT Category EMPTY>
 <!ELEMENT Price (#PCDATA)>
 <!ATTLIST Category Ctype CDATA #REQUIRED>
end example
 

The above listing shows an external DTD file. The Book element contains the Name, Author, Category, and Price elements. The Category element is empty and defines an attribute, Ctype, which is mandatory with the Category element.

The <!DOCTYPE> instruction provides a reference to the external DTD document in an XML file.

Listing 1-15 shows the XML file with a reference to the external DTD:

Listing 1-15: Referencing an External DTD
start example
 <?xml version="1.0"?>
 <!DOCTYPE Book SYSTEM "MyDTD.dtd">
 <Book>
   <Name>Programming in C++</Name>
   <Author>Stephen Wright</Author>
   <Category Ctype="Programming"/>
   <Price></Price>
 </Book>
end example
 

The above listing shows how to use the <!DOCTYPE> instruction with the SYSTEM keyword to provide a reference to the external DTD file, MyDTD.dtd. A validating parser checks the structure of the XML document against the rules defined in MyDTD.dtd.

Namespaces

Namespaces are a naming mechanism that you use to prevent conflicts between element names. A conflict occurs if you use the same name to identify two different elements. You can prevent this conflict by assigning a prefix to the names of the elements.

Listing 1-16 shows how to assign a prefix to an element:

Listing 1-16: Assigning Prefix
start example
 <doc>
   <keyframes>12</keyframes>
   <framerate>24</framerate>
   <my:frame>
     <my:keyframes>100</my:keyframes>
     <my:framerate>100</my:framerate>
   </my:frame>
 </doc>
end example
 

In the above code, you use the my prefix with the frame, keyframes, and framerate tags to avoid conflict with the existing <keyframes> and <framerate> tags in the document. The <my:keyframes> and <my:framerate> tags are enclosed in the <my:frame> tag.

The xmlns attribute is used with a user-defined element, such as <my:frame>, to assign the namespace URI to the element. The syntax of using the xmlns attribute is:

 <prefix:MyElement xmlns:prefix="NamespaceURI"> 

In the above syntax:

  • The prefix parameter represents the name of the prefix that is used with the user-defined element to avoid name conflicts.

  • The MyElement parameter represents the name of the user-defined XML element.

  • The xmlns:prefix attribute declares the namespace. In the xmlns:prefix attribute, the part after the colon is the prefix that is used to refer to the namespace.

  • The NamespaceURI parameter is the namespace URI.
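A prefix can be used only after it has been declared with the xmlns:prefix attribute. The following sketch illustrates this with Python's standard xml.etree.ElementTree module, which is namespace aware. The helper name parses and the two sample strings are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if a namespace-aware parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The my prefix is used but never declared.
undeclared = "<doc><my:frame><my:keyframes>100</my:keyframes></my:frame></doc>"

# The same markup with the prefix bound to a namespace URI on the root element.
declared = (
    '<doc xmlns:my="namespace2">'
    "<my:frame><my:keyframes>100</my:keyframes></my:frame>"
    "</doc>"
)

print(parses(undeclared))
print(parses(declared))
```

The first document is rejected because the parser cannot resolve the my prefix; the second is accepted because the declaration on <doc> applies to all the elements nested inside it.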

Listing 1-17 shows the use of namespaces in an XML file:

Listing 1-17: Namespaces in XML
start example
 <doc xmlns:document="namespace1" xmlns:my="namespace2">
   <document:keyframes>12</document:keyframes>
   <document:framerate>24</document:framerate>
   <my:frame>
     <my:animationno>1</my:animationno>
     <my:keyframes>100</my:keyframes>
     <my:framerate>20</my:framerate>
   </my:frame>
 </doc>
end example
 

The above listing shows how to create a file that contains user-defined XML elements with namespace prefixes applied on the elements.
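The following sketch reads the content of Listing 1-17 with Python's standard xml.etree.ElementTree module to show that the two keyframes elements stay distinct once their prefixes are resolved. The dictionary ns, which maps prefixes to namespace URIs for the lookups, is an assumption of this example.

```python
import xml.etree.ElementTree as ET

# Content of Listing 1-17.
DOC = """<doc xmlns:document="namespace1" xmlns:my="namespace2">
  <document:keyframes>12</document:keyframes>
  <document:framerate>24</document:framerate>
  <my:frame>
    <my:animationno>1</my:animationno>
    <my:keyframes>100</my:keyframes>
    <my:framerate>20</my:framerate>
  </my:frame>
</doc>"""

# Prefix-to-URI mapping used only for the find() calls below.
ns = {"document": "namespace1", "my": "namespace2"}

root = ET.fromstring(DOC)

doc_keyframes = root.find("document:keyframes", ns)
my_keyframes = root.find("my:frame/my:keyframes", ns)

# The parser stores each tag as {namespace URI}local-name.
print(doc_keyframes.tag, doc_keyframes.text)
print(my_keyframes.tag, my_keyframes.text)
```

The parser identifies each element by its namespace URI and local name, so the prefix is only a shorthand for the URI.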

XML Schemas

An XML schema defines the structure of an XML document and is an alternative to DTD. An XML Schema defines the following:

  • Elements that can be used in an XML document.

  • Attributes and text that an element can contain.

  • Child elements that a parent element can contain, and the order of the child elements in which they appear within a parent element.

The <schema> element represents the root of an XML schema. The syntax to use the <schema> element is:

 <?xml version="1.0"?>
 <xsi:schema xmlns:xsi="http://www.w3.org/2001/XMLSchema"
             xmlns="namespace1"
             elementFormDefault="qualified">
   ...
 </xsi:schema>

In the above syntax:

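  • The prefix, xsi, is bound to the XML Schema namespace by the xmlns:xsi attribute. The prefix name itself is arbitrary; xs and xsd are also common choices.

  • Schema elements and built-in data types, such as schema, element, string, and boolean, used in the XML schema, belong to the http://www.w3.org/2001/XMLSchema namespace.

  • The xmlns attribute specifies the default namespace, which is namespace1.

  • The elementFormDefault attribute, when set to qualified, specifies that the elements declared in the schema must be namespace qualified when they appear in an XML document that conforms to the schema.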

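XML schemas are of two types, external and internal. An external schema is saved in a separate file with the extension .xsd. An internal schema is defined within the same XML file that contains the XML elements; not every parser supports internal schemas. To refer to an external schema from an XML document, you use the schemaLocation attribute on an element of that document, usually the root element. The schemaLocation attribute belongs to the XML Schema instance namespace, http://www.w3.org/2001/XMLSchema-instance, so its prefix must be bound to that namespace, which is different from the http://www.w3.org/2001/XMLSchema namespace used for the schema elements. The attribute takes a pair of values. The first value represents the target namespace of the schema and the second value represents the location of the external schema. For example, if the external schema external.xsd declares the target namespace namespace1, then you can refer to it in your XML document using the following code:

 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="namespace1 external.xsd" 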

After beginning the <schema> element, you can define various elements that it can contain. The syntax to define a schema element is:

 <prefix:element name="xxx" type="yyy" default="value" fixed="value"/> 

The above syntax shows how to define a schema element. In the above syntax:

  • Prefix represents the namespace prefix, such as xsi.

  • The name attribute represents the name of the element, and the type attribute represents the data type of the element.

  • The default attribute specifies the default value for the element.

  • The fixed attribute indicates that the element cannot have any other value except the value specified by the fixed attribute.

Some XML schema data types are: xsi:string, xsi:decimal, xsi:integer, xsi:boolean, xsi:date, and xsi:time. The type names are case sensitive.

Listing 1-18 shows how to use the internal XML schema to define the structure of an XML document:

Listing 1-18: Using Internal XML Schema
start example
 <?xml version="1.0"?>
 <xsi:schema xmlns:xsi="http://www.w3.org/2001/XMLSchema"
             targetNamespace="namespace1"
             xmlns="namespace2"
             elementFormDefault="qualified">
   <xsi:element name="Employee">
     <xsi:complexType>
       <xsi:sequence>
         <xsi:element name="id" type="xsi:integer"/>
         <xsi:element name="name" type="xsi:string"/>
         <xsi:element name="address" type="xsi:string"/>
       </xsi:sequence>
     </xsi:complexType>
   </xsi:element>
   <Employee>
     <id>1</id>
     <name>John Smith</name>
     <address>Washington, D.C.</address>
   </Employee>
 </xsi:schema>
end example
 

The above listing defines an XML schema for the <Employee> element that contains three child elements: id, name, and address. In the above listing:

  • The <xsi:complexType> element specifies that the <Employee> element contains other XML elements, such as <id>, <name>, and <address>.

  • The <xsi:sequence> element defines various elements that an <Employee> element contains in the order in which they appear in the XML document.
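Because a schema is itself an XML document, its declarations can be read like any other XML data. The following sketch lists the element declarations from the schema part of Listing 1-18 using Python's standard xml.etree.ElementTree module. It only reads the schema; it does not validate a document against it, which the standard library cannot do. The names XS, SCHEMA, and declared are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI to which the schema elements belong.
XS = "http://www.w3.org/2001/XMLSchema"

# Schema declarations from Listing 1-18, without the embedded <Employee> data.
SCHEMA = """<?xml version="1.0"?>
<xsi:schema xmlns:xsi="http://www.w3.org/2001/XMLSchema"
            targetNamespace="namespace1"
            elementFormDefault="qualified">
  <xsi:element name="Employee">
    <xsi:complexType>
      <xsi:sequence>
        <xsi:element name="id" type="xsi:integer"/>
        <xsi:element name="name" type="xsi:string"/>
        <xsi:element name="address" type="xsi:string"/>
      </xsi:sequence>
    </xsi:complexType>
  </xsi:element>
</xsi:schema>"""

root = ET.fromstring(SCHEMA)

# Find every element declaration by namespace URI, whatever prefix is used.
declared = [
    (decl.get("name"), decl.get("type"))
    for decl in root.iter(f"{{{XS}}}element")
]

for name, type_name in declared:
    print(name, type_name)
```

The Employee declaration has no type attribute because its type is given by the nested <xsi:complexType> element, while the three child declarations each name a built-in type.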

Save the above listing as EmployeeSchema.xml.

Figure 1-14 shows the output when EmployeeSchema.xml is viewed in the Mozilla Web browser:

This figure shows that the namespace prefix, xsi, precedes each schema element. The <Employee> element is embedded in the <xsi:schema> element.
Figure 1-14: Viewing EmployeeSchema.xml



Integrating PHP and XML 2004
Year: 2004
Pages: 51