The Structure of XML


The structure of XML is simple. A document normally begins with a prologue, which opens with the XML declaration:

 

 <?xml version="1.0" encoding="iso-8859-1" standalone="yes" ?> 

This is optionally followed by a schema, which describes the structure of the data that follows. Although it's optional, we'll shortly see that including the schema is usually a good idea.

Schemas can come in several forms. The most basic is the DTD, which looks like the example in Listing 7.1.

Listing 7.1. A Document Type Definition
 <?xml version="1.0"?>
 <!DOCTYPE clients [
   <!ELEMENT clients   (client)+>
   <!ELEMENT client    (firstname, lastname)>
   <!ATTLIST client    id CDATA #REQUIRED>
   <!ELEMENT firstname (#PCDATA)>
   <!ELEMENT lastname  (#PCDATA)>
 ]>

The DTD plays the role of a schema, much like what you see in FoxPro if you type

 

 USE (filename)
 DISPLAY STRUCTURE

It lets you reconstruct the structure of the original table that was used to build the XML file.

The DTD is followed by elements, which is what XML calls the data together with the tags that bracket it. An element is like a field (a table column). It consists of a start tag, which is the name of the element between angle brackets (for example, <name>), followed by the data value, followed by an end tag, which has a forward slash before the element name (for example, </name>). For example

 

 <firstname>Les</firstname> 

would be the XML element representing my first name.

Every record in an XML table has a record descriptor element in the same tagged format, so that a record with two fields might look like this:

 

 <record>
   <firstname>Les</firstname>
   <lastname>Pinter</lastname>
 </record>
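To see the record-and-field idea from the reading side, here is a minimal Python sketch (Python is used only for illustration and is not part of the FoxPro or Visual Basic .NET toolset discussed here). It parses the two-field record with the standard library's ElementTree module and collects each child element as a field name and value. The variable names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The two-field record from the text.
xml_text = """
<record>
  <firstname>Les</firstname>
  <lastname>Pinter</lastname>
</record>
"""

record = ET.fromstring(xml_text)

# Each child element acts like a field: the tag is the field name,
# and the text between the tags is the field value.
fields = {child.tag: child.text for child in record}

for name, value in fields.items():
    print(name, "=", value)
```

The record element itself carries no data of its own; it only groups the field elements, just as a row groups column values.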

The table itself also has opening and closing tags. If an element is empty, a single tag with a forward slash at the end indicates that the element is empty; for example, <phone/> means that the phone field is empty. This is not required; you can simply put beginning and ending tags (for example, <phone></phone>) to describe an empty field. But the shorter format occupies less space.
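A quick way to convince yourself that the two empty-field spellings are interchangeable is to parse both and compare. The sketch below uses Python's standard ElementTree parser; the client wrapper element is an assumption added so that the phone field has a parent.

```python
import xml.etree.ElementTree as ET

# The same empty field written in the short and the long form.
short_form = ET.fromstring("<client><phone/></client>")
long_form = ET.fromstring("<client><phone></phone></client>")

for doc in (short_form, long_form):
    phone = doc.find("phone")
    # An element with no content has no text; ElementTree reports None.
    print(phone.tag, repr(phone.text))

# Once parsed, the two documents serialize identically.
same = ET.tostring(short_form) == ET.tostring(long_form)
print(same)
```

A receiving program therefore never needs to know which form the sender chose.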

An element can contain attributes, which are name="value" pairs written inside the element's opening tag. Attributes can represent a record in less space. The fragment shown in Listing 7.2 comes from the Currency.xml table that is widely distributed for use in virtually any software to display the proper symbol for any currency.

Listing 7.2. An XML File Making Extensive Use of Attributes
 <?xml version="1.0" encoding="UTF-8" ?>
 <!-- _lcid="1033" _version="" -->
 <!-- _LocalBinding -->
 <currencies>
   <currency name="" symbol="" display="Select..." />
   <!-- _locID@display="xdsln_cur_display" _locComment="{StringCategory=TXT}" -->
   <currency name="AED" symbol="?.?.?" display="AED (?.?.?)" />
   <currency name="ALL" symbol="Lek" display="ALL (Lek)" />
   <currency name="AMD" symbol="??." display="AMD (??.)" />
   <currency name="ARS" symbol="$" display="ARS ($)" />
   <currency name="AUD" symbol="$" display="AUD ($)" />
   <currency name="AZM" symbol="???." display="AZM (???.)" />
   <currency name="BGL" symbol="??" display="BGL (??)" />
   <currency name="BHD" symbol="?.?.?" display="BHD (?.?.?)" />
   <currency name="BND" symbol="$" display="BND ($)" />
   <currency name="BOB" symbol="$b" display="BOB ($b)" />
   <currency name="BRL" symbol="R$" display="BRL (R$)" />
   ...
 </currencies>
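Reading attribute-based records is just as easy as reading element-based ones. The following Python sketch, using the standard ElementTree module, loads three of the entries from the listing and builds a lookup from currency name to symbol. Only entries whose symbols are plain ASCII were copied; the variable names are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Three entries taken from the currency listing.
xml_text = """
<currencies>
  <currency name="ALL" symbol="Lek" display="ALL (Lek)" />
  <currency name="ARS" symbol="$" display="ARS ($)" />
  <currency name="BRL" symbol="R$" display="BRL (R$)" />
</currencies>
"""

root = ET.fromstring(xml_text)

# Each <currency> element is a whole record; its attributes are the fields.
symbols = {c.get("name"): c.get("symbol") for c in root.findall("currency")}

for name, symbol in symbols.items():
    print(name, symbol)
```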

XML can also contain XML, because it's just data. But to prevent the XML processor from trying to treat the embedded text as markup that needs to be parsed (for example, if the text contains the string "<abc>"), the strings can be enclosed in CDATA sections, which consist of the string "<![CDATA[", followed by the text, followed by "]]>" to close the CDATA section. This mechanism is commonly used in memo fields "just in case."
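The effect of a CDATA section can be checked with a short Python sketch using the standard ElementTree parser. The note and memo element names are assumptions chosen to mimic a memo field; the point is that the angle brackets and ampersand inside the section, which ends with "]]>", come back as literal text rather than as markup.

```python
import xml.etree.ElementTree as ET

# A memo field whose content looks like markup but is protected by CDATA.
xml_text = "<note><memo><![CDATA[Use <abc> & </abc> as literal text]]></memo></note>"

root = ET.fromstring(xml_text)
memo_element = root.find("memo")

# The parser hands back the raw characters and creates no child elements.
memo = memo_element.text
print(memo)
print(len(memo_element))
```

Without the CDATA wrapper the same content would not parse, because a bare ampersand is not well-formed XML.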

Encoding

The "encoding" notation in the XML prologue (the part of the <?xml line that says encoding="UTF-8") names the character encoding of the file; the UTF in UTF-8 stands for Unicode Transformation Format. The setting can be quite important, especially if you're using a language other than English. Because there are many alphabets, you use the encoding string to specify how the data is encoded.

UTF-8 is the most general. It uses one to four bytes to represent each character (the original design allowed up to six, but the current standard stops at four). If you ever get a chance to see a Chinese word processor, you can watch the operator move down a narrowing tree of choices until the final character appears, which gives a rough feel for a character being assembled from several bytes. But the browser just reads bytes until it completes a character, then displays it. How does it know when it's at the end of a character? Easy: The first byte says how long the character is. A byte with a zero in the high-order bit (a value below 128, or hex 80) is a complete one-byte character. Otherwise, the number of leading one bits in the first byte (two, three, or four) gives the total byte count, and every following byte of that character starts with the bits 10. UTF-16 is the other common Unicode encoding; it uses two bytes for most characters and four bytes for the rest.
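A few lines of Python make the byte counts concrete. This sketch encodes four sample characters and reads each character's length from its leading byte, which is how a UTF-8 decoder finds character boundaries. The helper name utf8_length and the sample characters are assumptions chosen for illustration.

```python
def utf8_length(first_byte: int) -> int:
    """Return the byte count of a UTF-8 character, judged by its first byte."""
    if first_byte < 0x80:            # 0xxxxxxx: single-byte character
        return 1
    if first_byte >> 5 == 0b110:     # 110xxxxx: two bytes
        return 2
    if first_byte >> 4 == 0b1110:    # 1110xxxx: three bytes
        return 3
    if first_byte >> 3 == 0b11110:   # 11110xxx: four bytes
        return 4
    raise ValueError("not a valid UTF-8 leading byte")

# Latin A, e with acute accent, a Chinese character, and an emoji.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")
    # Print code points instead of the characters to stay console-safe.
    print(f"U+{ord(ch):04X}", utf8.hex(), utf8_length(utf8[0]), len(utf16))
```

The last column shows that UTF-16 needs four bytes for the emoji, so it is not a fixed two bytes per character either.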

If a Unicode encoding is used, the first two or three bytes of the XML file can serve as a Byte Order Mark (BOM) to indicate which encoding was used to create the content. Documents encoded in UTF-8 may start with a BOM of 0xEF 0xBB 0xBF, and those encoded in UTF-16 begin with 0xFE 0xFF (big-endian) or 0xFF 0xFE (little-endian). The browser reads the mark and knows which encoding to apply.
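Checking for a Byte Order Mark takes only a few lines. The Python sketch below compares the start of a byte string against the BOM constants in the standard codecs module. The function name sniff_bom is an assumption, and the check is deliberately simplified: it does not handle UTF-32, whose little-endian mark begins with the same two bytes as UTF-16.

```python
import codecs
from typing import Optional

# Known marks, paired with the encoding each one announces.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]

def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding announced by a leading BOM, or None if absent."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

sample = codecs.BOM_UTF8 + b"<?xml version='1.0'?><clients/>"
print(sniff_bom(sample))
print(sniff_bom(b"<clients/>"))
```

A file without a mark returns None, in which case a parser falls back on the encoding named in the XML declaration.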

Namespaces

The namespace, if specified (using the xmlns identifier), serves as a prefix to distinguish different XML elements that may have the same element name. And if two different tables happen to have a Phone field, or if two different databases both have a Customers table, this could easily happen. We do something like this already in FoxPro when pulling tables from two different databases:

 

 SELECT * FROM Accounting!Invoices A, Sales!Invoices B where A.InvNum = B.InvNum 

The namespace can be specified before specifying the elements, in which case it can be applied to any of the elements that follow (see Listing 7.3).

Listing 7.3. XML File with Namespace Reference
 <?xml version="1.0" ?>
 <clients xmlns:pinter="http://www.pinter.com">
  <client>
   <pinter:name>José</pinter:name>
   <pinter:lastname>García</pinter:lastname>
  </client>
 </clients>

The only purpose of this is to permit you to qualify each entry with the namespace and a colon, for example, pinter:name instead of name. This allows you to build XML files in which the same element name may exist in several schemas, by prefixing them with the alias for the namespace. It doesn't have anything to do with your Web site. The only reason you usually use your Web site is that there's only one, thanks to Register.com and the DNS servers. I've never had a name conflict requiring the use of namespaces to resolve conflicts, but the solution is free, so you probably should use it.
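To watch a namespace keep two same-named elements apart, consider this Python sketch using the standard ElementTree module. The unqualified name element is a hypothetical addition, included only to show that it does not collide with pinter:name. The parser never contacts the URL; it simply uses the string as a unique label.

```python
import xml.etree.ElementTree as ET

# A qualified element next to a hypothetical unqualified one of the same name.
xml_text = """
<clients xmlns:pinter="http://www.pinter.com">
  <client>
    <pinter:name>Jos\u00e9</pinter:name>
    <name>unqualified</name>
  </client>
</clients>
"""

root = ET.fromstring(xml_text)
client = root.find("client")

# Map the prefix used in the search expressions to the namespace URI.
ns = {"pinter": "http://www.pinter.com"}

qualified = client.find("pinter:name", ns)
plain = client.find("name")

# ElementTree stores the expanded name: {namespace-uri}local-name.
print(qualified.tag)
print(plain.tag)
```

The prefix is only shorthand: what identifies the element is the combination of the URI and the local name.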

Data Models: XDR and XSD

Microsoft created a specification called XDR that was implemented with SQL Server, Office, and BizTalk to validate data structures. However, the World Wide Web Consortium (W3C) has come up with its own specification, the XSD, which does the same thing that the XDR does. Thus XDR is not the preferred type of schema to use; however, you should be able to read XML files containing an XDR schema if you encounter one based on this Microsoft legacy specification.

To implement an XDR schema, you create an XML file with the extension .xdr and include a <Schema></Schema> section that describes the elements by type (for example, integers and dates). Then you implement it by including an "x-schema" specification, thus:

 

 <clients xmlns="x-schema:mixdr.xdr"> 

After you've done this, any attempt to load an XML file that doesn't conform to the data definition included in the XDR file will fail. In addition to specifying data types, you can specify minimum and maximum values for certain fields and maximum string lengths, flag the absence of required fields or the presence of forbidden ones, and apply other types of validations.

Later in this chapter we'll see how to create an XSD file in FoxPro and use it to validate XML files. I've never used an XDR to enforce the structure of an XML file. I generally have control over both ends of the process, so I know for sure that the XML is what my program is expecting. I suppose that a browser environment would be the most likely one for using XDR validation. But it's another tool, and at least you'll know what it is when you see it.

Examples of XML

Suppose you have a DBF with the following structure:

 

 CLIENTS.DBF:
 ClientID   Numeric (6)
 Company    Char    (20)
 Phone      Char    (10)
 Balance    Numeric (8,2)

And it contains two records:

 

 104  Smith Electronics     311-4032     189.00
 107  Family Clinic         289-2904     225.12

One XML representation is shown in Listing 7.4.

Listing 7.4. XML Representation of Data in Listing 7.5
 <clients>
  <client>
   <clientid>104</clientid>
   <company>Smith Electronics</company>
   <phone>311-4032</phone>
   <balance>189.00</balance>
  </client>
  <client>
   <clientid>107</clientid>
   <company>Family Clinic</company>
   <phone>289-2904</phone>
   <balance>225.12</balance>
  </client>
 </clients>

I say one representation because there are several ways to represent this data as XML. For example, you can include a schema at the top of the file, and it could be an XDR or an XSD representation.

You can include or exclude the carriage returns that visually separate the elements; they make the file easier for humans to read but don't matter at all to a parsing program. And you can represent fields as elements or as attributes and thereby reduce the number of tags by half (because closing tags aren't required when using attributes), so long as the receiving program knows what to look for.
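The element-versus-attribute choice is mechanical enough to automate. The Python sketch below, using the standard ElementTree module, takes one client record in element form and rewrites it in attribute form. It assumes every field is a simple text value, which is true for this table but not for XML in general.

```python
import xml.etree.ElementTree as ET

# One client record with each field as a child element.
element_form = """
<clients>
  <client>
    <clientid>104</clientid>
    <company>Smith Electronics</company>
    <phone>311-4032</phone>
    <balance>189.00</balance>
  </client>
</clients>
"""

root = ET.fromstring(element_form)
compact = ET.Element("clients")

for client in root.findall("client"):
    # Turn each field element into an attribute on a single client tag.
    attrs = {field.tag: field.text or "" for field in client}
    ET.SubElement(compact, "client", attrs)

attribute_form = ET.tostring(compact, encoding="unicode")
print(attribute_form)
```

The attribute form carries the same four values in a single tag, which is where the space saving comes from.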

FoxPro can read this kind of XML and turn it into a cursor (a temporary DBF). Use the following syntax:

 

 XMLToCursor ("D:93SAMS\Chapter7Code\Simple.XML", "Test", 512)
 BROWSE
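For comparison outside FoxPro, here is a rough Python analogue of turning this XML into rows. It is only a sketch of the idea and makes no claim about how XMLToCursor works internally. The function name xml_to_rows and the type conversions are assumptions based on the CLIENTS.DBF structure shown earlier, and the XML is held in a string rather than read from the sample file.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# The two client records in element form.
xml_text = """
<clients>
  <client>
    <clientid>104</clientid>
    <company>Smith Electronics</company>
    <phone>311-4032</phone>
    <balance>189.00</balance>
  </client>
  <client>
    <clientid>107</clientid>
    <company>Family Clinic</company>
    <phone>289-2904</phone>
    <balance>225.12</balance>
  </client>
</clients>
"""

def xml_to_rows(text):
    """Convert each <client> element into a dictionary of typed fields."""
    rows = []
    for client in ET.fromstring(text).findall("client"):
        rows.append({
            "clientid": int(client.findtext("clientid")),       # Numeric (6)
            "company": client.findtext("company"),              # Char (20)
            "phone": client.findtext("phone"),                  # Char (10)
            "balance": Decimal(client.findtext("balance")),     # Numeric (8,2)
        })
    return rows

rows = xml_to_rows(xml_text)
for row in rows:
    print(row)
```

Without a schema, the type of each field has to be hard-coded as it is here, which is one practical reason for including a schema with the data.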

You can also open this file up and view it as a table in Visual Studio. Select File, Open, File from the menu, and then select Chapter7Code\Simple.XML, and you'll see the screen shown in Figure 7.1. Select the Data tab at the lower-left corner of the screen to display the table.

Figure 7.1. A simple XML file displayed as a table.

A console program to read this file in Visual Basic .NET is approximately the same code, given Visual Basic .NET's syntax requirements:

 

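 Imports System.Xml

 Module Module1
     Sub Main()
         ' Load the XML file into a DOM document
         Dim oXML As New XmlDocument
         oXML.Load("D:93SAMS\Chapter7Code\Simple.xml")
         ' Display the complete XML text of the document
         MsgBox(oXML.InnerXml)
     End Sub
 End Module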



Visual FoxPro to Visual Basic .NET
ISBN: 0672326493
Year: 2004
Pages: 130
Authors: Les Pinter
