You can create XML content in many different ways, including
Typing XML content in a text or XML editor
Generating XML content with a server-side file
Extracting XML content from software such as Office 2003
Consuming XML generated by a web service or news feed
Each of the XML documents that you create will have different content and structure. The only thing they'll have in common is the rules that you use to create them. At the very minimum, all XML documents must be well formed. Later on, we'll look at creating valid documents with a DTD or schema.
You can use a text editor like Notepad or SimpleText to type your XML content. You'll need to enter every line using your keyboard, which could take a long time if you're working with a large document. When you've finished, save the file with an .xml extension and you'll have created an XML document.
You can also use a text editor to create a DTD, schema, or XSL style sheet. Just remember to use the correct file extension: .dtd for DTDs, .xsd for schemas, and .xsl for XSL style sheets.
Don't forget that if you're using Notepad, you'll probably need to change Save as type to All Files before you save the document. Otherwise, you could end up with a file called address.xml.txt by mistake. Figure 3-1 shows the correct way to do this.
Text editors are easy to use, but they don't offer any special functionality for XML content. Text editors won't tell you if your tag names don't match, if you've mixed up the cases of your element names, or if you've nested them incorrectly. There are no tools to check if your XML document meets the rules set down in a DTD or schema. Text editors don't automatically add color to your markup. In fact, you may not find any errors in your XML documents until you first try to use an XML parser.
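If you do work in a plain text editor, a few lines of script can stand in for the missing well-formedness check. The following is a minimal sketch in Python, using the standard library's xml.etree.ElementTree parser. The function name check_well_formed and the two sample strings are my own illustrations rather than part of the book's resource files.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        # The parser's message reports where in the text parsing failed.
        return False, str(err)


# Hypothetical samples: the second one closes <name> with </Name>.
good = "<phoneBook><contact id='1'><name>Sas Jacobs</name></contact></phoneBook>"
bad = "<phoneBook><contact id='1'><name>Sas Jacobs</Name></contact></phoneBook>"

print(check_well_formed(good))
print(check_well_formed(bad))
```

A check like this only tells you whether the document is well formed. It says nothing about whether the document is valid against a DTD or schema.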
You can also use HTML editors like HomeSite and BBEdit to create XML documents. The advantage of these over text editors is that they can automate the process a little. HTML editors often come with extensions for working specifically with XML documents. For example, they can add the correct declarations to the file and auto-complete your tag names. They'll also add coloring to make it easier to read your content.
However, you'll still have to type in most of your content line by line. Again, most HTML editors don't include tools to validate content and to apply transformations. You can only expect that functionality from an XML editor.
An XML editor is a software program designed to work specifically with XML documents. Most XML editors include tools that auto-complete tags, check for well-formedness, and validate XML documents. You can use XML editors to create XSL style sheets, DTDs, and schemas.
The category XML editors includes both free and for-purchase software packages. With such a range of great XML tools available, you'd have to wonder why people would want to create XML documents with a text or HTML editor.
Common XML editors include
SyncRO Soft <oXygen/>
WebX Systems UltraXML
RustemSoft XMLFox (freeware)
You can find a useful summary of XML editors and their features at www.xmlsoftware.com/editors.html.
Although it isn't mandatory to use an XML editor when creating XML documents, it's likely to save you time, especially if you work with long documents.
Altova XMLSpy 2005 is one of the most popular XML editors for PCs. You can download a free home user edition of the software from www.altova.com/download_spy_home.html. You can also purchase a version with additional professional level features.
As we'll be using XMLSpy in this section of the book, it's probably a good idea to download it and install it on your computer. If you're working on a Macintosh, you'll need to get access to a PC if you want to try out the examples.
You can work with any type of XML content in XMLSpy, including XHTML documents. It includes a text editor interface as well as graphical features. XMLSpy offers features such as checking for well-formedness and validity. It also helps out with tag templates if you've specified a DTD or schema.
You can use XMLSpy to create DTDs and schemas as well as XSL style sheets. It also allows you to apply style sheets to preview transformations of your XML documents.
We'll look at some of the features of this software package in a little more detail as an illustration of what's possible with XML editing software.
To start with, when you create a new document, XMLSpy allows you to choose from many different types. Figure 3-2 shows you some of the choices.
Depending on the type of document you choose, XMLSpy automatically adds the appropriate content. For example, choosing the type XML Document automatically adds the following line to the new file:
<?xml version="1.0" encoding="UTF-8"?>
When you create a new XML document, XMLSpy will ask you if you want to use an existing DTD or schema. Figure 3-3 shows the prompt.
If you choose either a DTD or schema and select a file, XMLSpy will create a reference to it in your XML document:
<phoneBook xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="addressSchema.xsd">
If you don't include a DTD or schema reference, you can always add one later by using the DTD/Schema menu.
You can use XMLSpy in Text view, like a text editor, or in Authentic view, which has WYSIWYG features. Schemas can also use the Schema/WSDL view, a graphical presentation that simplifies the creation process. The final option is Browser view, which simulates how a document would display in a web browser.
XML documents with a referenced DTD or schema will show you extra information when you work in Text view. Clicking on an element or attribute in the main window will display information about it in the Info panel on the left side. You can see this in Figure 3-4.
The Entry Helpers panel on the right shows a list of the available elements. The panel also shows you common entities. One very useful feature is the ability to add an element template to the main window from the Elements panel.
Position your cursor in the XML document, double-click the appropriate tag name, and XMLSpy adds an element template to the code. This is very handy if the element you've chosen contains child elements, as XMLSpy adds the complete tree from that point, including attributes.
Open the resource file address.xml in XMLSpy to test these features. Click to the left of the closing </phoneBook> tag and press ENTER. Position your cursor in the blank line and double-click the <contact> element in the Elements panel. XMLSpy will insert a <contact> element, complete with child elements, into the document.
Another feature of XMLSpy is checking whether an XML document is well formed. If you are using a text editor, you'd have to do this by loading the document into an XML parser and checking for errors. Not only is this time consuming, but the error messages are often not as detailed as you'd like them to be!
In XMLSpy, you can check the document by clicking the button with the yellow tick or by using the F7 key. XMLSpy then checks all the requirements for well-formed documents, including a single root node, tag case, element ordering, and quotes on attributes. I covered the requirements for well-formed documents in Chapter 2.
If XMLSpy finds an error, you'll see a message at the bottom of the screen with a Recheck button, as shown in Figure 3-5.
If you want to see it in action, change the address.xml file to introduce a deliberate mistake and check it again for well-formedness. You could change the case of one of the closing tags or remove the quotation marks from an attribute value. You'll see a detailed error message that will help you to pinpoint where you went wrong.
XMLSpy can also check if an XML document is valid against a DTD or schema. Click the button with the green tick or use the F8 key. Figure 3-6 shows an invalid document after it's been checked in XMLSpy.
You can test this feature by checking if address.xml is valid against its schema addressSchema.xsd. You might want to open up the schema file to have a look at the content. It will make a lot more sense to you later in the book!
Finally, if you're going to transform your XML document with XSLT, you can use XMLSpy to create the style sheet and to preview the transformation.
Once you've added a style sheet reference to your XML document, use the F10 key to apply the transformation. XMLSpy will create an XSLOutput.html file and display your transformed content.
You can add a style sheet reference by choosing XSL/XQuery > Assign XSL and selecting the file listStyle.xsl. Make sure you check the Make path relative to address.xml check box before clicking OK. XMLSpy adds the style sheet reference to the XML document.
<?xml-stylesheet type="text/xsl" href="listStyle.xsl"?>
Press the F10 key to see the transformation. Figure 3-7 shows the XSLOutput.html file created by XMLSpy.
Hopefully, some of the preceding examples have shown you how XML editors can help you to work with XML documents. A full-featured product like XMLSpy can save you a lot of time by validating and transforming your documents at the click of a button.
You can use content from any server-side file that generates XML. That means you can use a ColdFusion, PHP, or .NET file to create the XML content for you dynamically. For example, you might query a database and receive the response as an XML document. You might also use a server-side file to query the files and folders within your computer. Server-side code can create an XML document that describes the folder structures and file names.
The following listing shows some VB .NET code that generates a list of folders and files in XML format. The resource file MP3List.aspx contains the complete listing.
<%@ Page Language="vb" Debug="true" %>
<%@ import Namespace="System" %>
<%@ import Namespace="System.IO" %>
<%@ import Namespace="System.XML" %>
<script runat="server">
  Dim strDirectoryLocation as String = "e:\mp3z\"
  Dim dirs As String(), fileInfos as String()
  Dim i as Integer, j as Integer
  sub Page_Load
    Dim MP3Xml as XmlDocument = new XmlDocument()
    Dim folderElement as XMLElement
    Dim songElement as XMLElement
    Dim writer As New XmlTextWriter(Console.Out)
    writer.Formatting = Formatting.Indented
    MP3Xml.AppendChild(MP3Xml.CreateXmlDeclaration("1.0", "UTF-8", "no"))
    Dim RootNode As XmlElement = MP3Xml.CreateElement("mp3s")
    MP3Xml.AppendChild(RootNode)
    if Directory.Exists(strDirectoryLocation) then
      dirs = Directory.GetDirectories(strDirectoryLocation)
      for i = 0 to Ubound(dirs)
        dirs(i) = replace(dirs(i), strDirectoryLocation, "")
      next
      Array.sort(dirs)
      for i=0 to Ubound(dirs)
        folderElement = MP3Xml.CreateElement("folder")
        folderElement.SetAttribute("name", dirs(i))
        RootNode.AppendChild(folderElement)
        fileInfos = Directory.GetFiles(strDirectoryLocation & dirs(i) & "\", "*.mp3")
        for j = 0 to Ubound(fileInfos)
          fileInfos(j) = replace(fileInfos(j), strDirectoryLocation & dirs(i) & "\", "")
        next
        Array.sort(fileInfos)
        for j = 0 to Ubound(fileInfos)
          songElement = MP3xml.CreateElement("song")
          songElement.SetAttribute("filename", fileInfos(j))
          folderElement.AppendChild(songElement)
        next
      next
    End If
    dim strContents as String = MP3Xml.outerXML
    response.write (strContents)
  end sub
</script>
The server-side file returns a list of folders and MP3 files in an XML document. Figure 3-8 shows how the file looks when viewed in a web browser. Note that because the file contains server-side code, you'll have to run it through a web server like Microsoft Internet Information Services (IIS). If you check the address bar in the screenshot, you'll see that the file is running through http://localhost/.
This is an example of an XML document that doesn't exist in a physical sense. I didn't save a file with an .xml extension. Instead, the server-side file creates a stream of XML data. The VB .NET file transforms the file system into an XML document.
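The same idea isn't tied to VB .NET. As a rough sketch, here is how you might build an equivalent <mp3s> document in Python with the standard library. The function name folder_to_xml and the temporary folder and file names are assumptions made for the illustration. The element and attribute names (mp3s, folder, song, name, filename) follow the VB .NET listing.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def folder_to_xml(root_dir, extension=".mp3"):
    """Build an <mp3s> document listing each subfolder and its matching files."""
    root = ET.Element("mp3s")
    for folder_name in sorted(os.listdir(root_dir)):
        folder_path = os.path.join(root_dir, folder_name)
        if not os.path.isdir(folder_path):
            continue  # only first-level folders, as in the VB .NET listing
        folder_el = ET.SubElement(root, "folder", name=folder_name)
        for file_name in sorted(os.listdir(folder_path)):
            if file_name.lower().endswith(extension):
                ET.SubElement(folder_el, "song", filename=file_name)
    return ET.tostring(root, encoding="unicode")


# Demonstration with a throwaway folder structure (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "AlbumA"))
    open(os.path.join(tmp, "AlbumA", "track1.mp3"), "w").close()
    open(os.path.join(tmp, "AlbumA", "notes.txt"), "w").close()
    xml_text = folder_to_xml(tmp)

print(xml_text)
```

As with the server-side page, the XML here only ever exists as a string in memory. Nothing is saved with an .xml extension unless you choose to write it out.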
Believe it or not, Microsoft Office can be a source of XML content. For PCs, Microsoft Office 2003 has built-in XML support within Word, Excel, and Access. Unfortunately for Macintosh users, Office 2004 doesn't provide the same level of support. Macintosh users can use Excel 2004 to read and write XML documents, but they can't use schemas and style sheets.
Most people wouldn't think of Office documents as containers for structured XML information. Normally, when we work with Office documents we are more concerned with the appearance of data. Word, Excel, and Access 2003 all offer support for information exchange via XML. These applications can open, generate, and transform XML documents.
Word 2003 creates WordprocessingML (previously called WordML) while Excel writes SpreadsheetML. Both are markup languages that conform to the XML specification. You can find out more about these languages at www.microsoft.com/office/xml/default.mspx.
Whenever you use Save as and select XML format in Word or Excel, you're automatically generating one of those markup languages. Unfortunately, both languages are quite verbose as they include tags for everything: document properties and styling as well as the data itself. The resulting XML document can be quite heavy.
An alternative is to use a schema or XSL style sheet to format the output. You can extract the data to produce a much more concise XML document. Applying a schema to Word or Excel allows other people to update the content in Office without seeing a single XML tag.
Access also allows you to work with data in XML format, but it doesn't have its own built-in XML language. You just export straight from a table or query into an XML document that replicates the field structure.
In this section, I'll show you how to generate XML from Office 2003. The examples use sample files from the book's resources, so you can open them and follow along if you'd like. They are illustrations of the functionality that is available in Office 2003 rather than step-by-step tutorials. We'll do some more hands-on work with Office 2003 XML in Chapters 5, 6, and 7.
The stand-alone and professional versions of Word 2003 provide tools that you can use to work with XML documents. The trial edition of Word doesn't give you the same functionality. Let's look at the different ways that you can create and edit XML information in Word.
Creating an XML document using Save As
The simplest way to generate an XML document from Word 2003 is to use the File > Save As command and choose XML Document as the type. Figure 3-9 shows how to do this.
You can see a before and after example in your resource files. I've saved the Word document simpledocument.doc as simpledocument.xml. The source Word file contains three lines, each a different heading type. You can open simpledocument.xml in Notepad or an XML editor to see the WordprocessingML generated by Word.
The following listing shows the first few lines of simpledocument.xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
  xmlns:v="urn:schemas-microsoft-com:vml"
  xmlns:w10="urn:schemas-microsoft-com:office:word"
  xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core"
  xmlns:aml="http://schemas.microsoft.com/aml/2001/core"
  xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint"
  xmlns:o="urn:schemas-microsoft-com:office:office"
  xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
  w:macrosPresent="no" w:embeddedObjPresent="no" w:ocxPresent="no"
  xml:space="preserve">
<o:DocumentProperties><o:Title>Heading 1</o:Title>
<o:Author>Sas Jacobs</o:Author>
The listing I've shown doesn't display all of the content of the Word document; it only lists the introductory declarations. Feel free to repeat the test yourself to see the enormous amount of XML generated by Word.
You'll notice that there is a processing instruction on the second line of the XML document that instructs it to open in Word. If you double-click the file name, the XML document will probably open in Word 2003. As I have XMLSpy installed, this doesn't happen on my computer. However, if I tried to use this XML document within Flash, the document would probably open in Word 2003 and skip Flash altogether. I'd have to delete the processing instruction first.
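Deleting the processing instruction by hand is simple enough, but you could also script it. The following is a minimal sketch in Python using xml.dom.minidom from the standard library. The source string is a stand-in: it keeps the two opening lines from the listing but uses a placeholder <doc> root element rather than real WordprocessingML.

```python
from xml.dom import minidom

# Stand-in document: real declaration and processing instruction,
# hypothetical placeholder root element.
source = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<?mso-application progid="Word.Document"?>'
    '<doc>placeholder root, not real WordprocessingML</doc>'
)

dom = minidom.parseString(source)

# Processing instructions that appear before the root element are
# children of the document node, so that is where we look for them.
for node in list(dom.childNodes):
    is_pi = node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    if is_pi and node.target == "mso-application":
        dom.removeChild(node)

cleaned = dom.toxml()
print(cleaned)
```

The rest of the document is left untouched; only the instruction that associates the file with Word is removed.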
A number of namespaces are listed in the XML document. These identify the elements in the document. Each namespace has a unique prefix. For example, the prefix o refers to the namespace urn:schemas-microsoft-com:office:office. The elements <o:DocumentProperties>, <o:Title>, and <o:Author> use the prefix o, so they come from this namespace. More information about namespaces is available in Chapter 2.
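To see what the prefix really stands for, it can help to read the elements with a parser. The following is a small sketch in Python using xml.etree.ElementTree. The sample is a cut-down stand-in for the listing: only the o namespace is declared, and the <root> element name is a placeholder of my own.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for the listing; <root> is a hypothetical wrapper.
sample = """
<root xmlns:o="urn:schemas-microsoft-com:office:office">
  <o:DocumentProperties>
    <o:Title>Heading 1</o:Title>
    <o:Author>Sas Jacobs</o:Author>
  </o:DocumentProperties>
</root>
"""

# Map the prefix we want to use in our paths to the namespace URI.
ns = {"o": "urn:schemas-microsoft-com:office:office"}
root = ET.fromstring(sample)

title = root.find("o:DocumentProperties/o:Title", ns).text
author = root.find("o:DocumentProperties/o:Author", ns).text
print(title, "/", author)

# The parser stores the namespace URI with the element name, not the prefix.
print(root[0].tag)
```

The prefix is just shorthand. You could map a different prefix to the same URI in the ns dictionary and the lookups would still work.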
The document also includes a declaration to preserve space: xml:space="preserve". The last lines in the listing are elements, and you'll recognize the information contained in tags like <o:Title> and <o:Author>.
Scroll through the document and you'll see that it has sections such as <o:DocumentProperties>, <w:fonts>, <w:styles>, and <w:docPr>. The actual content of the document doesn't start until the <w:body> tag. WordprocessingML is descriptive, but contains a lot of information about the styling applied within the document. It is concerned with both the data and the presentation of the data.
If you knew how to write WordprocessingML, you could create a document in an XML editor and open it in Word. You could also edit the WordprocessingML from the Word document in your XML editor as an alternative way to make changes to the document.
Working with the XML Toolbox
You can download a tool to work with XML directly in Word 2003. It is a plug-in called XML Toolbox, which you can download from the Microsoft website at www.microsoft.com/downloads/details.aspx?familyid=a56446b0-2c64-4723-b282-8859c8120db6&displaylang=en. You'll need to have a full version of Word 2003 and the .NET Framework installed before you can use the Toolbox. Installing the plug-in is very simple. You need to accept the license agreement and click the Install button.
Once you've installed the Toolbox, you'll have an extra toolbar called the Word XML Toolbox. Figure 3-10 shows this toolbar. You can use XML Toolbox to view the XML elements within a document or to add your own content.
Choose the View XML command from the XML Toolbox drop-down menu to see the WordprocessingML from within Word 2003. Figure 3-11 shows the XML source.
You can use the XML document generated by Word 2003 in other applications. For example, you could use Word to manage content for a web application or a Flash movie.
You're a little limited in the types of XML documents that Word 2003 can produce. Word doesn't handle data that repeats very well. You'd be better off using Excel 2003 or Access 2003 instead. It's better to use Word 2003 documents as a template or form for XML data. You can create the document structure and set aside blank areas for the data.
Creating XML content by using a schema
If you have the stand-alone or professional versions of Word 2003, you'll be able to use schemas to ensure that an XML document created in Word is valid according to your language rules. A schema will also allow you to reduce the number of XML elements created from the document.
You need to follow these steps to create an XML document in Word 2003 using a schema:
Create a schema for the XML document.
Create a Word 2003 template that uses the schema.
Create a new document from the template and save the data in XML format.
The result is a valid XML document that is much smaller than its WordprocessingML relative.
Let's look at this more closely in an example. Chapter 5 provides you with the step-by-step instructions that you'll need to work through an example. The next section gives you an overview of the main steps and isn't intended as a tutorial.
Creating the schema
I used the following schema to describe the XML structure for my news item. The resource file newsSchema.xsd contains the complete schema. You'll learn how to create schemas a little later in this chapter.
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="news">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="newsDate" type="xsd:string"/>
        <xsd:element name="newsTitle" type="xsd:string"/>
        <xsd:element name="newsContent" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
The schema describes the following structure. The root element <news> contains the <newsDate>, <newsTitle>, and <newsContent> elements. There can only be one of each of those elements, and they must be included in the order specified. The elements all contain string data.
Creating the Word 2003 template
I've created a simple template called newsXML.dot to show a news item. It is made up of three form fields to capture the date, title, and content of the news item. If you have Word 2003, you can open the file to see how it looks. Use the CTRL-SHIFT-X shortcut key to toggle the display of the XML tags.
This template already has the schema applied, but I've included the instructions here in case you want to re-create it yourself. We'll cover this in more detail in Chapter 5. After you open the template, you'll need to unlock it if you want to make any changes. Choose Tools > Unprotect Document.
To apply the schema to a Word 2003 template, choose Tools > Templates and Add-Ins and select the XML Schema tab. Click the Add Schema button and navigate to the schema file. Enter a URI or namespace for the schema and an alias, as shown in Figure 3-12.
When you've finished, the schema alias should appear in the Templates and Add-Ins dialog box, as shown in Figure 3-13.
To streamline the XML produced by this document, click the XML Options button and choose the Save Data Only option. This excludes formatting information from the output. Make sure that Validate document against attached schemas is also checked.
You can only apply the XML tags if you have selected the Show XML tags in the document option in the Task Pane. If you can't see the Task Pane, choose View > Task Pane and choose XML Structure from the drop-down menu at the top.
First, you need to apply the root element to the entire document. Select all of the content, right-click, and choose Apply XML element. Select the news element. When prompted, choose Apply to Entire Document. You should see the content surrounded by a shaded tag, as shown in Figure 3-14.
Then you can apply the other elements to each part of the Word document. Select the fields, one by one, and apply the tags by right-clicking and selecting Apply XML element. When you've finished, the document should look similar to the one shown in Figure 3-15.
The result is a template that maps to an XML schema. Don't forget to lock the fields before you save and close the template. Choose View > Toolbars > Forms and click the padlock icon.
Creating XML content from a new document
Once you've created the template, you can generate XML content from documents based on this template. Choose File > New and select the news template. When the new document is created, all you have to do is fill in the fields. You can hide the XML tags by deselecting the Show XML tags in the document option in the Task Pane.
Output the XML by choosing File > Save and selecting the XML document type. Make sure you check the Save data only option before you save. You'll see the warning shown in Figure 3-16. Click Continue.
The resource file NewsItem.xml contains the completed XML document from Word 2003. The following listing shows the content:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<news xmlns="newsSchema">
  <newsDate>July 4, 2005</newsDate>
  <newsTitle>Fireworks extravaganza!</newsTitle>
  <newsContent>US expats in Australia celebrated the 4th of July with firework demonstrations throughout the country.</newsContent>
</news>
Compare the structure and content of this XML document with the one that didn't use a schema, simpledocument.xml. The tag names in this document are more descriptive, and it is significantly shorter than the WordprocessingML document. It would be very easy to use this XML document within a Flash movie. Using simpledocument.xml would be much harder.
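To get a feel for how little work the data-only document needs, here is a short sketch that reads it with Python's xml.etree.ElementTree. The news text is abbreviated, and the variable names are my own. Note that the elements sit in the default namespace newsSchema, so the parser reports each tag with that namespace attached.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the data-only news item (content shortened).
news_xml = """<news xmlns="newsSchema">
  <newsDate>July 4, 2005</newsDate>
  <newsTitle>Fireworks extravaganza!</newsTitle>
  <newsContent>US expats in Australia celebrated the 4th of July.</newsContent>
</news>"""

root = ET.fromstring(news_xml)

# Tags arrive as "{newsSchema}newsDate"; strip the namespace part
# to get plain field names.
item = {child.tag.split("}")[1]: child.text for child in root}

for name, value in item.items():
    print(name, "=", value)
```

Three elements in, three values out. Doing the same with the WordprocessingML version would mean digging through the styling and document property sections first.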
We'll cover the step-by-step instructions for creating XML from Word 2003 in much more detail in Chapter 5.
If you own Excel Professional or Enterprise edition, you'll be able to work with XML documents. Again, you can't use the trial edition of Excel 2003. As with Word, you can save an Excel file in XML format so that you can use it on the Web or in Flash. You can also use Excel to open an XML document so that you can update or analyze the information.
Excel document structures are very rigid. They always use a grid made up of rows and columns. This means that the structure of XML data generated from Excel will match this format. In Word, it's possible for you to include elements within other elements or text. For example, you could display this XML structure using Word:
<title>This is a title by <author>Sas Jacobs</author></title>
In Excel, the smallest unit of data that we can work with is a cell. Cells can't contain other cells, so our XML document structure with mixed content can't display properly in Excel. Any XML document generated from Excel will include grid-like data.
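If you're curious how a parser sees mixed content, the following sketch shows one view of it, using Python's xml.etree.ElementTree. The parent element keeps the text that comes before its child, and the child keeps its own text, so a single "cell" of data ends up split across two places.

```python
import xml.etree.ElementTree as ET

# The mixed-content example: text and a child element inside <title>.
title = ET.fromstring(
    "<title>This is a title by <author>Sas Jacobs</author></title>"
)
author = title.find("author")

print(repr(title.text))    # text that comes before the child element
print(repr(author.text))   # text inside the child element

# itertext() walks the text pieces in document order.
print("".join(title.itertext()))
```

A grid has nowhere obvious to put the text that belongs to <title> but sits alongside <author>, which is why this structure doesn't translate well to rows and columns.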
Excel uses a document map to describe the structure of XML documents. A document map is like a simpler version of a schema.
In this section, I'll show you how to work with existing XML documents in Excel. It's an overview of the functionality that's available rather than a complete tutorial. You'll find more detailed information in Chapter 6.
Creating an XML document using Save As
As with Word, the easiest way to create an XML document from Excel is to save it using the XML document type. Choose File > Save As and select XML in the Save as type drop-down box. I've done this with the file simplespreadsheet.xls; you can see the resulting XML document saved as simplespreadsheet.xml.
You'll notice that a simple Excel document has created a large XML document. This listing shows the first few lines of the XML document:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
  xmlns:o="urn:schemas-microsoft-com:office:office"
  xmlns:x="urn:schemas-microsoft-com:office:excel"
  xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
  xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author>Sas Jacobs</Author>
<LastAuthor>Sas Jacobs</LastAuthor>
The second line of the file is a processing instruction that instructs the file to open in Excel:
<?mso-application progid="Excel.Sheet"?>
As with Word, a number of namespaces are referenced in the XML document. The element names <DocumentProperties> and <Author> are self-explanatory. The XML document includes information about each sheet in a <Worksheet> element. There are descriptions for <Table>, <Column>, <Row>, <Cell>, and <Data>, and Excel methodically describes the contents of each worksheet by column and by row. This is how Excel translates the grid style of Excel documents into XML. What we're most interested in is the contents of the <Data> elements; they contain the values from each cell.
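As a rough illustration of pulling the <Data> values back out, here is a sketch in Python using xml.etree.ElementTree. The sample workbook is hypothetical and heavily trimmed: it keeps only the <Workbook>, <Worksheet>, <Table>, <Row>, <Cell>, and <Data> elements and the spreadsheet namespace from the listing, and leaves out the attributes and other sections that a real SpreadsheetML file contains.

```python
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"

# Hypothetical, trimmed-down workbook: element names and the namespace
# come from the listing; the cell values are made up.
sample = f"""<Workbook xmlns="{SS}">
  <Worksheet>
    <Table>
      <Row>
        <Cell><Data>Name</Data></Cell>
        <Cell><Data>Phone</Data></Cell>
      </Row>
      <Row>
        <Cell><Data>Sas Jacobs</Data></Cell>
        <Cell><Data>123 456</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>"""

ns = {"ss": SS}
root = ET.fromstring(sample)

# One list per <Row>, one value per <Data> element inside that row.
rows = [
    [data.text for data in row.findall("ss:Cell/ss:Data", ns)]
    for row in root.findall("ss:Worksheet/ss:Table/ss:Row", ns)
]

for row in rows:
    print(row)
```

Even in this trimmed form, you have to walk four levels of elements to reach each value, which gives you a sense of why the full document is so long.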
As with Word, you'll notice that Excel generates a long XML document. It's hard for humans to read, and extracting the content would be a lengthy process. Again, using a schema will reduce the quantity of data generated by Excel 2003.
You can use Excel to open an existing XML document. Before displaying the data, Excel will ask you how you want to open the file, as shown in Figure 3-17. The process will be a little different depending on whether the document references a schema.
Opening an XML document with a schema
You use these steps to work with an XML document in Excel:
Optionally create a schema for the XML document.
Open the file in Excel.
Make changes to the content and export the XML file.
If you open the file as an XML list, Excel will use any related schema to determine how to display data. The following listing shows address.xml. It uses the schema addressSchema.xsd. Figure 3-18 shows how this file translates when opened in Excel.
<?xml version="1.0" encoding="UTF-8"?>
<phoneBook xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="addressSchema.xsd">
  <contact id="1">
    <name>Sas Jacobs</name>
    <address>123 Some Street, Some City, Some Country</address>
    <phone>123 456</phone>
  </contact>
  <contact id="2">
    <name>John Smith</name>
    <address>4 Another Street, Another City, Another Country</address>
    <phone>456 789</phone>
  </contact>
  <contact id="3">
    <name>Jo Bloggs</name>
    <address>7 Different Street, Different City, UK</address>
    <phone>789 123</phone>
  </contact>
</phoneBook>
Excel automatically creates a document map for the elements from the schema. You can see the document map in the XML Source Task Pane. Excel has also added an automatic filter to the column headings. You can select specific content from the XML document by choosing values from the drop-down lists.
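The mapping from repeating <contact> elements to rows, and the filtering that the drop-down lists give you, can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how Excel does it internally. The sample omits the schema reference from the root element, and the variable names are my own.

```python
import xml.etree.ElementTree as ET

# address.xml content, without the schema reference on the root element.
address_xml = """<phoneBook>
  <contact id="1">
    <name>Sas Jacobs</name>
    <address>123 Some Street, Some City, Some Country</address>
    <phone>123 456</phone>
  </contact>
  <contact id="2">
    <name>John Smith</name>
    <address>4 Another Street, Another City, Another Country</address>
    <phone>456 789</phone>
  </contact>
  <contact id="3">
    <name>Jo Bloggs</name>
    <address>7 Different Street, Different City, UK</address>
    <phone>789 123</phone>
  </contact>
</phoneBook>"""

root = ET.fromstring(address_xml)

# Each <contact> becomes one row; the attribute and the child
# elements become the columns.
rows = [
    {"id": contact.get("id"), **{child.tag: child.text for child in contact}}
    for contact in root.findall("contact")
]

# Roughly what picking a value from a column's drop-down list does.
filtered = [row for row in rows if row["name"] == "John Smith"]
print(filtered)
```

Because every <contact> has the same children, the data fits a grid without any loss.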
You can make changes to the existing data in Excel or even add new data. Be careful how you generate the XML document. If you use Save As and choose the XML type, you'll re-create the current content using SpreadsheetML. It will produce a large document that doesn't match your schema. Instead, you should export the data as shown in the next section.
Exporting XML data with a document map
Before exporting the data, you'll want to make sure the changes you've made are valid against the schema. Right-click inside your data and choose XML > XML Map Properties. Check the Validate data against schema for import and export option. This option isn't checked by default. You can also find XML Map Properties in the Data > XML menu. Figure 3-19 shows the XML Map Properties dialog box.
To export the XML document, right-click in the data and choose XML > Export. Enter a file name, choose a location, and click Export. Excel will generate an XML document that is valid according to your schema.
I used Excel to update the address.xml file and exported the data to the resource file addressExportedFromExcel.xml. If you look at the XML structure, you'll see that it's almost identical to that of the address.xml file. Figure 3-20 shows them side by side in XMLSpy.
Opening an XML document without a schema
If you open an XML document that doesn't specify a schema, Excel will create one based on the data. Figure 3-21 shows the warning that Excel will display.
When the data is imported, Excel creates a document map and figures out how to display the data. You can try this with the resource file excelImport.xml. This listing shows a simple XML document without a schema:
<?xml version="1.0"?>
<ImportData>
  <Column>
    <title>Jan</title>
    <data>1234</data>
  </Column>
  <Column>
    <title>Feb</title>
    <data>5678</data>
  </Column>
  <Column>
    <title>Mar</title>
    <data>9123</data>
  </Column>
</ImportData>
Figure 3-22 shows the XML document after importing it into Excel. I've saved the imported file as resource file excelImport.xls.
The document map created by Excel displays in the XML Source Task Pane. If it's not visible, you can show it by choosing View > Task Pane and selecting XML Source from the drop-down menu at the top of the Task Pane.
Working with mixed elements
If you use Excel to open an existing XML document, make sure that it conforms to a grid structure. Excel will have difficulty interpreting the structure of an XML document that contains text and child elements together in the same parent element.
This listing shows the file addressMixedElements.xml. As you can see, this document includes mixed content in the <address> element. It contains both text and a child element, <suburb>.
<?xml version="1.0" encoding="UTF-8"?>
<phoneBook>
  <contact id="1">
    <name>Sas Jacobs</name>
    <address>123 Some Street, <suburb>Some City</suburb>, Some Country</address>
    <phone>123 456</phone>
  </contact>
  <contact id="2">
    <name>John Smith</name>
    <address>4 Another Street, <suburb>Another City</suburb>, Another Country</address>
    <phone>456 789</phone>
  </contact>
</phoneBook>
Figure 3-23 shows the file opened in Excel 2003. The <address> element and text is missing; only the child element <suburb> displays.
If you export the data to an XML document, Excel will only save the elements displayed. The following listing shows the exported file addressMixedElementsExported.xml :
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <phoneBook xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <contact id="1"> <name>Sas Jacobs</name> <address> <suburb>Some City</suburb> </address> <phone>123 456</phone> </contact> <contact id="2"> <name>John Smith</name> <address> <suburb>Another City</suburb> </address> <phone>456 789</phone> </contact> </phoneBook>
The text within the <address> element is missing. Excel has also added a namespace to the root element.
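If you'd like to check a document for mixed content before opening it in Excel, a short script can do the job. The following Python sketch is one possible approach, not a feature of Excel. The function name find_mixed_elements is hypothetical, and the sample is a shortened copy of addressMixedElements.xml.

```python
import xml.etree.ElementTree as ET

# Shortened copy of addressMixedElements.xml.
SAMPLE = """<phoneBook>
  <contact id="1">
    <name>Sas Jacobs</name>
    <address>123 Some Street, <suburb>Some City</suburb>, Some Country</address>
    <phone>123 456</phone>
  </contact>
</phoneBook>"""


def find_mixed_elements(xml_text):
    """Return the tag names of elements holding both text and child elements."""
    root = ET.fromstring(xml_text)
    mixed = []
    for element in root.iter():
        children = list(element)
        if not children:
            continue
        # Text before the first child is stored in element.text; text that
        # follows a child is stored in that child's tail.
        pieces = [element.text] + [child.tail for child in children]
        if any(piece and piece.strip() for piece in pieces):
            mixed.append(element.tag)
    return mixed


mixed_tags = find_mixed_elements(SAMPLE)
print(mixed_tags)
```

Whitespace used only for indenting doesn't count as text in this check, so elements such as <contact> aren't reported.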
Using Excel VBA and XML
You can use VBA to work with XML documents. For example, you could handle the importing of XML documents automatically. Excel 2003 recognizes the XMLMaps collection, and you can use the Import and Export methods to work with XML documents programmatically.
Access 2003 works a little differently than the other Office applications when it comes to XML. The XML documents generated by Access come directly from the structure of your tables and queries. The names of the elements in the resulting XML document come from the Access field names.
This section gives you an overview of the XML functionality available within Access 2003. It isn't a complete reference or tutorial. I'll cover the topic in more detail in Chapter 7.
Exporting XML data
Getting data out of Access and into XML is easy; you just export it in XML format. You need to follow these steps:
Display the table or query objects.
Right-click a table or query and select Export .
Select XML as the file type and choose a destination and file name.
Optionally select options for export.
Figure 3-24 shows how to export a table.
After you choose the Export option, you'll have to enter a file name and choose a destination for the XML file. Don't forget to select XML from the Save as type drop-down list. When you click Export, you'll be asked to choose between exporting the data (XML), a schema (XSD), and presentation of the data (XSL). See Figure 3-25 for a view of the Export XML dialog box.
Setting export options
You have some extra options that you can view by clicking the More Options button. Figure 3-26 shows these options. You can also include related records from other tables and apply an XSL transformation to the data.
The Schema tab allows you to include or exclude primary key and index information. You can also embed the schema in your XML document or create an external schema. Figure 3-27 shows these options.
The Presentation tab, shown in Figure 3-28, allows you to generate HTML or ASP and an associated style sheet.
I used the Access database documents.mdb and exported the records from tblDocuments. I included the related records from other tables and created both an XML and an XSD file. The resulting files are called accessDocumentsExport.xml and accessDocumentsExport.xsd, respectively. If you have Access 2003, you can use the documents.mdb database to create the XML files yourself.
This listing shows a section of the sample XML document created by the export:
<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="accessDocumentsExport.xsd"
    generated="2005-03-04T18:10:06">
  <tblDocuments>
    <documentID>1</documentID>
    <documentName>Shopping for profit and pleasure</documentName>
    <authorID>1</authorID>
    <documentPublishYear>2002</documentPublishYear>
    <categoryID>4</categoryID>
  </tblDocuments>
  <tblAuthors>
    <authorID>1</authorID>
    <AuthorFirstName>Alison</AuthorFirstName>
    <AuthorLastName>Ambrose</AuthorLastName>
    <AuthorOrganization>Organization A</AuthorOrganization>
  </tblAuthors>
  <tblCategories>
    <categoryID>4</categoryID>
    <category>Shopping</category>
  </tblCategories>
</dataroot>
The only thing added by Access is the <dataroot> element. It contains two namespace references and an attribute called generated. This is a timestamp for the XML document.
Because I included records from tables related to tblDocuments, Access added the table references as separate elements at the end of the XML document. The one-to-many relationships between the tables aren't preserved. Figure 3-29 shows the relationships in the database.
Controlling the structure of XML documents
XML documents exported from Access are shorter than their Word and Excel equivalents. The elements in the XML document take their names from the field names in the table or query. Access replaces the spaces in field names with an underscore ( _ ) character.
If you don't want to use the default field names in the table, an alternative is to create a query first that joins all the data and then export that to an XML document. Access won't give you the option to export data in linked tables, but the rest of the process is much the same as for exporting tables.
The following listing shows a trimmed-down version of the XML document, accessQryBookDetailsExport.xml. You can also look at the schema file, accessQryBookDetailsExport.xsd.
<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="accessQryBookDetailsExport.xsd"
    generated="2005-03-04T18:50:47">
  <qryBookDetails>
    <documentID>2</documentID>
    <documentName>Bike riding for non-bike riders</documentName>
    <authorID>4</authorID>
    <AuthorFirstName>Saul</AuthorFirstName>
    <AuthorLastName>Sorenson</AuthorLastName>
    <AuthorOrganization>Organization D</AuthorOrganization>
    <documentPublishYear>2004</documentPublishYear>
    <categoryID>5</categoryID>
    <category>Bike riding</category>
  </qryBookDetails>
</dataroot>
This XML document organizes the data by document and shows the relationships between the related tables. You could also have organized the data by author or category.
For an example of documents organized by author, see the resource file accessQryAuthorDocuments.xml, shown in the listing that follows, and the resource file accessQryAuthorDocuments.xsd.
<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="accessQryAuthorDocuments.xsd"
    generated="2005-03-04T18:53:15">
  <qryAuthorDocuments>
    <authorID>1</authorID>
    <AuthorFirstName>Alison</AuthorFirstName>
    <AuthorLastName>Ambrose</AuthorLastName>
    <AuthorOrganization>Organization A</AuthorOrganization>
    <documentID>4</documentID>
    <documentName>Fishing tips</documentName>
    <documentPublishYear>1999</documentPublishYear>
  </qryAuthorDocuments>
  <qryAuthorDocuments>
    <authorID>1</authorID>
    <AuthorFirstName>Alison</AuthorFirstName>
    <AuthorLastName>Ambrose</AuthorLastName>
    <AuthorOrganization>Organization A</AuthorOrganization>
    <documentID>1</documentID>
    <documentName>Shopping for profit and pleasure</documentName>
    <documentPublishYear>2002</documentPublishYear>
  </qryAuthorDocuments>
</dataroot>
Writing queries still doesn't quite solve our problem. A better structure for the XML file from Access would have been to group the documents within each <authorID> element. Access doesn't do this automatically.
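One way around this is to regroup the exported rows yourself after the export. The following Python sketch shows the idea using a shortened copy of the accessQryAuthorDocuments.xml data. The <authors>, <author>, and <document> element names and the group_by_author function are my own inventions for illustration; Access doesn't produce them.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the flat export produced by the query.
FLAT = """<dataroot>
  <qryAuthorDocuments>
    <authorID>1</authorID>
    <AuthorFirstName>Alison</AuthorFirstName>
    <AuthorLastName>Ambrose</AuthorLastName>
    <documentID>4</documentID>
    <documentName>Fishing tips</documentName>
  </qryAuthorDocuments>
  <qryAuthorDocuments>
    <authorID>1</authorID>
    <AuthorFirstName>Alison</AuthorFirstName>
    <AuthorLastName>Ambrose</AuthorLastName>
    <documentID>1</documentID>
    <documentName>Shopping for profit and pleasure</documentName>
  </qryAuthorDocuments>
</dataroot>"""

AUTHOR_FIELDS = ("AuthorFirstName", "AuthorLastName")
DOCUMENT_FIELDS = ("documentID", "documentName")


def group_by_author(xml_text):
    """Build an <authors> tree with one <author> per distinct authorID."""
    flat_root = ET.fromstring(xml_text)
    grouped_root = ET.Element("authors")  # hypothetical root element name
    authors = {}
    for row in flat_root.findall("qryAuthorDocuments"):
        author_id = row.findtext("authorID")
        author = authors.get(author_id)
        if author is None:
            # First time we meet this author: copy the author fields once.
            author = ET.SubElement(grouped_root, "author", id=author_id)
            for name in AUTHOR_FIELDS:
                ET.SubElement(author, name).text = row.findtext(name)
            authors[author_id] = author
        # Every row contributes one nested <document> element.
        document = ET.SubElement(author, "document")
        for name in DOCUMENT_FIELDS:
            ET.SubElement(document, name).text = row.findtext(name)
    return grouped_root


grouped = group_by_author(FLAT)
print(ET.tostring(grouped, encoding="unicode"))
```

The author details now appear once, with each of the author's documents nested inside the same element.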
Using Access VBA and XML
You can automate XML importing and exporting with Access 2003 VBA. Access recognizes the Application.ImportXML and Application.ExportXML methods. You can trigger them from buttons on a form. It's important to note that VBA can't transform an XML document during the import process.
Office 2003 for PCs includes a new product called InfoPath that allows people to create and edit XML documents by filling in forms. The forms allow you to collect XML information and use it with your other business systems.
InfoPath is included in Microsoft Office Professional Enterprise Edition 2003, or you can buy it separately. There is no equivalent product for Macintosh Office 2004 users.
Office 2003 can generate XML documents for use by other applications, including Flash movies. If you set up Word, Excel, or Access properly, your users can maintain their own XML documents using Office 2003. Most people are familiar with these software packages, so it's not terribly demanding for them to use them as tools for maintaining their data.
As you can see from the previous sections, each of the Office applications works with particular data structures. Word 2003 works best with nonrepeating information, a bit like filling in a form to generate the XML elements. Excel 2003 is best with grid-like data structures that don't include mixed elements. Access 2003 works with relational data, and you can write queries to specify which data to export. Using a schema in Word and Excel greatly simplifies the XML documents that they produce. We'll look at creating schemas a little later in this chapter.
You've probably heard the term web services mentioned a lot. The official definition from the W3C at www.w3.org/TR/ws-gloss/#defs is
A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards.
In simpler terms, a web service is a way for you to access data on another system using an XML format. Web services operate over the Internet and are platform independent. In order to use a web service, you request information and receive a response in an XML document.
You can use web services to look up a variety of information, including television guides, movie reviews, and weather updates. As an author, I can use Amazon's web service to find out the sales ranking and database details for any books that I've written.
When you start reading about web services, you'll see the terms UDDI, WSDL, SOAP, and REST. A glossary for the main terms associated with web services is at www.w3.org/TR/2004/NOTE-ws-gloss-20040211/.
You can find out what web services are available through a company's Universal Description, Discovery, and Integration (UDDI) registry. The UDDI contains a description of the web services that are available and the way that you can access them.
Web Services Description Language (WSDL) describes web services in a standard XML format. In case you're interested, most people pronounce this as "whizdle". At the time of writing, the working draft for WSDL version 2 was available at www.w3.org/TR/2004/WD-wsdl20-20040803/.
The WSDL definition explains what is available through the web service, where it is located, and how you should make a request. It lists the parameters you need to include when requesting information, such as the fields and datatypes that the web service expects.
You can request information from a web service using a number of different protocols. The SOAP protocol is probably the most commonly used and has support within Flash. You can also use Representational State Transfer (REST), but Flash doesn't support this format natively.
SOAP, which stands for Simple Object Access Protocol, is a format for sending messages to web services. A SOAP message is an XML document with a specific structure. The request is contained within a part of the document called a SOAP Envelope.
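To make the envelope structure a little more concrete, here's a minimal Python sketch that builds a SOAP 1.1 message with an Envelope and a Body. The namespace URI is the standard SOAP 1.1 envelope namespace. Everything else is made up for illustration: the getQuote operation, the urn:example:quotes namespace, the symbol parameter, and the build_envelope function don't belong to any real service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_envelope(operation, operation_ns, params):
    """Wrap a request element and its parameters in a SOAP Envelope and Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The request itself sits inside the Body, in the service's own namespace.
    request = ET.SubElement(body, f"{{{operation_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(request, name).text = str(value)
    return envelope


# Hypothetical operation, namespace, and parameter, for illustration only.
envelope = build_envelope("getQuote", "urn:example:quotes", {"symbol": "XML"})
print(ET.tostring(envelope, encoding="unicode"))
```

A real service dictates the operation name, namespace, and parameters through its WSDL, so treat this only as a picture of the nesting.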
You can find more about SOAP by viewing the note submitted to the W3C at www.w3.org/TR/2000/NOTE-SOAP-20000508/. This document isn't a W3C recommendation. At the time of writing, a working draft of SOAP version 1.2 was available at www.w3.org/TR/2002/WD-soap12-part1-20020626/.
REST is another way to work with web services. It is not a W3C standard; rather, REST is a style for interacting with web services. REST allows you to make requests through a URL rather than by sending an XML document request. Flash doesn't support REST requests, but you'll see a little later on that they can be very useful if you need to add data from a web service to a Flash movie.
Amazon jumped into web services relatively early on. At the time of writing, the latest version of the Amazon E-Commerce Service (ECS) was version 4.0, which was released on October 4, 2004. You can find comprehensive information about ECS at www.amazon.com/gp/aws/landing.html. It's free to use, but you have to register with Amazon first to get a subscription ID before you can start making requests.
The ECS provides access to information about products, customer content, sellers, marketplace listings, and shopping carts. You could use ECS to build an Amazon search and purchase application on your own website.
The WSDL for the U.S. service can be found at webservices.amazon.com/AWSECommerceService/AWSECommerceService.wsdl. You can open the file in a web browser if you want to see what it contains. The schema for the U.S. service is at webservices.amazon.com/AWSECommerceService/AWSECommerceService.xsd. Again, you can view this file in a web browser. The other Amazon locations supported are the UK, Germany, Japan, France, and Canada.
The Application Programming Interface (API) for Amazon web services describes all the operations you can perform. This includes functions like ItemLookup and ItemSearch. You can also work with wish lists and shopping carts.
To make a REST query to search for an item at Amazon, you could use the following URL format:
http://webservices.amazon.com/onca/xml?Service=AWSECommerceService&SubscriptionId=[Your Subscription ID Here]&Operation=ItemSearch&SearchIndex=[A Search Index String]&Keywords=[A Keywords String]&Sort=[A Sort String]
The request can include other optional parameters, and you can find out more in the online documentation. You can also get help by using the Help operation.
In the sample request that follows, I'm using my own name to search for books in the U.S. Amazon database. I have replaced my subscription ID with XXXX; you'll need to use your own ID if you want to run the query.
http://webservices.amazon.com/onca/xml?Service=AWSECommerceService&SubscriptionId=XXXX&Operation=ItemSearch&SearchIndex=Books&Author=Sas%20Jacobs
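If you build REST URLs like this in a script, it's safer to let a library handle the encoding of spaces and other special characters than to type the string by hand. The following Python sketch assembles the same sample request. The build_item_search_url function is a hypothetical helper, and the parameter names are simply the ones that appear in the URL above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://webservices.amazon.com/onca/xml"


def build_item_search_url(subscription_id, search_index, **criteria):
    """Assemble an ItemSearch query string from the parameters shown in the text."""
    params = {
        "Service": "AWSECommerceService",
        "SubscriptionId": subscription_id,
        "Operation": "ItemSearch",
        "SearchIndex": search_index,
    }
    # Extra search criteria, such as Author or Keywords, go on the end.
    params.update(criteria)
    # quote_via=quote encodes a space as %20 rather than "+".
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


url = build_item_search_url("XXXX", "Books", Author="Sas Jacobs")
print(url)
```

Keeping the subscription ID as a parameter means you can swap in your own ID without touching the rest of the request.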
If I enter the URL into the address line of a web browser, the request will run and the results will display in the browser window. All Amazon responses have the same structure, as shown in this listing:
<?xml version="1.0" encoding="UTF-8"?>
<rootTag xmlns="http://webservices.amazon.com/AWSECommerceService/2004-03-19">
  <OperationRequest>
    ... XML header and HTTP request information
  </OperationRequest>
  <Items>
    ... XML data here
  </Items>
</rootTag>
The name of the root element will vary depending on the type of request that you made. For example, an ItemSearch request will use the root element name <ItemSearchResponse>. If there are any errors in your request, they'll be contained inside an <Errors> element.
When I made the preceding REST request, I received the response shown in the following listing. Note that I've removed the sections containing the subscription ID from the listing. I've saved the results in the file AmazonQueryResults.xml.
<?xml version="1.0" encoding="UTF-8" ?>
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2005-02-23">
  <OperationRequest>
    <HTTPHeaders>
      <Header Name="UserAgent"
          Value="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)" />
    </HTTPHeaders>
    <RequestId>05BXE60PQPM6P687J1PA</RequestId>
    <Arguments>
      <Argument Name="Service" Value="AWSECommerceService" />
      <Argument Name="SearchIndex" Value="Books" />
      <Argument Name="Author" Value="Sas Jacobs" />
      <Argument Name="Operation" Value="ItemSearch" />
    </Arguments>
    <RequestProcessingTime>0.0390307903289795</RequestProcessingTime>
  </OperationRequest>
  <Items>
    <Request>
      <IsValid>True</IsValid>
      <ItemSearchRequest>
        <Author>Sas Jacobs</Author>
        <SearchIndex>Books</SearchIndex>
      </ItemSearchRequest>
    </Request>
    <TotalResults>2</TotalResults>
    <TotalPages>1</TotalPages>
    <Item>
      <ASIN>8931435061</ASIN>
      <DetailPageURL>http://www.amazon.com/exec/obidos/redirect?
        tag=ws%26link_code=xm2%26camp=2025%26creative=165953%26path=
        http://www.amazon.com/gp/redirect.html%253fASIN=8931435061%252
        location=/o/ASIN/8931435061%25253F
      </DetailPageURL>
      <ItemAttributes>
        <Author>Sas Jacobs</Author>
        <Author>YoungJin.com</Author>
        <Author>Sybex</Author>
        <ProductGroup>Book</ProductGroup>
        <Title>Flash MX 2004 Accelerated: A Full-Color Guide</Title>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>
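Once you have a response like this, you'll usually want to pull out just a few values. The following Python sketch reads a trimmed copy of the response and collects the ASIN, title, and authors for each item. The namespace URI is copied from the listing, the summarize_items function is a hypothetical helper, and the check for an <Errors> element is my assumption about how you'd want to treat a failed request.

```python
import xml.etree.ElementTree as ET

# The default namespace declared on <ItemSearchResponse> in the listing.
NS = {"aws": "http://webservices.amazon.com/AWSECommerceService/2005-02-23"}

# Trimmed copy of the response shown in the listing.
RESPONSE = """<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2005-02-23">
  <Items>
    <Request><IsValid>True</IsValid></Request>
    <TotalResults>2</TotalResults>
    <Item>
      <ASIN>8931435061</ASIN>
      <ItemAttributes>
        <Author>Sas Jacobs</Author>
        <Title>Flash MX 2004 Accelerated: A Full-Color Guide</Title>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>"""


def summarize_items(xml_text):
    """Return one dictionary per <Item> with its ASIN, title, and authors."""
    root = ET.fromstring(xml_text)
    # Assumption: any <Errors> element means the request failed.
    if root.find(".//aws:Errors", NS) is not None:
        raise ValueError("the response contains an <Errors> element")
    items = []
    for item in root.findall("aws:Items/aws:Item", NS):
        authors = item.findall("aws:ItemAttributes/aws:Author", NS)
        items.append({
            "asin": item.findtext("aws:ASIN", namespaces=NS),
            "title": item.findtext("aws:ItemAttributes/aws:Title", namespaces=NS),
            "authors": [author.text for author in authors],
        })
    return items


items = summarize_items(RESPONSE)
print(items)
```

Because the response declares a default namespace, every element name in the search paths needs the prefix that maps to it; leaving it out is a common reason for finding nothing.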
It's common to query web services using a SOAP request. Usually some kind of server-side script generates the request for you. The WebServiceConnector data component in Flash can also generate SOAP requests.
Google provides an example of a web service that you can query with SOAP. At the time of writing, Google provided three different operations: doGetCachedPage, doSpellingSuggestion, and doGoogleSearch. You can see the WSDL at http://api.google.com/GoogleSearch.wsdl.
The W3C provides a sample SOAP message for Google at www.w3.org/2004/06/03-google-soap-wsdl.html. This listing shows an example based on the W3C sample. It does a search for the term Flash XML books:
<?xml version='1.0' encoding='UTF-8'?>
<soap11:Envelope xmlns="urn:GoogleSearch"
    xmlns:soap11="http://schemas.xmlsoap.org/soap/envelope/">
  <soap11:Body>
    <doGoogleSearch>
      <key>00000000000000000000000000000000</key>
      <q>Flash XML books</q>
      <start>0</start>
      <maxResults>10</maxResults>
      <filter>true</filter>
      <restrict></restrict>
      <safeSearch>false</safeSearch>
      <lr></lr>
      <ie>latin1</ie>
      <oe>latin1</oe>
    </doGoogleSearch>
  </soap11:Body>
</soap11:Envelope>
This listing shows a sample result XML document from the W3C site. I've shown the first result only to simplify the output:
<?xml version='1.0' encoding='UTF-8'?>
<soap11:Envelope xmlns="urn:GoogleSearch"
    xmlns:google="urn:GoogleSearch"
    xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:soap11="http://schemas.xmlsoap.org/soap/envelope/">
  <soap11:Body>
    <doGoogleSearchResponse>
      <return>
        <documentFiltering>false</documentFiltering>
        <estimatedTotalResultsCount>3</estimatedTotalResultsCount>
        <directoryCategories soapenc:arrayType="google:DirectoryCategory">
        </directoryCategories>
        <searchTime>0.194871</searchTime>
        <resultElements soapenc:arrayType="google:ResultElement">
          <item>
            <cachedSize>12k</cachedSize>
            <hostName></hostName>
            <snippet>Snippet for the first result would appear here</snippet>
            <directoryCategory>
              <specialEncoding></specialEncoding>
              <fullViewableName></fullViewableName>
            </directoryCategory>
            <relatedInformationPresent>true</relatedInformationPresent>
            <directoryTitle></directoryTitle>
            <summary></summary>
            <URL>http://hci.stanford.edu/cs147/examples/shrdlu/</URL>
            <title><b>SHRDLU</b></title>
          </item>
        </resultElements>
        <endIndex>3</endIndex>
        <searchTips></searchTips>
        <searchComments></searchComments>
        <startIndex>1</startIndex>
        <estimateIsExact>true</estimateIsExact>
        <searchQuery>shrdlu winograd maclisp teletype</searchQuery>
      </return>
    </doGoogleSearchResponse>
  </soap11:Body>
</soap11:Envelope>
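A SOAP response is still just an XML document, so you can read it with any XML parser. The following Python sketch pulls the URL and title of each result out of a trimmed copy of the sample response. The namespace URIs come from the listing; the g prefix and the extract_results function are my own choices for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "soap11": "http://schemas.xmlsoap.org/soap/envelope/",
    "g": "urn:GoogleSearch",  # the default namespace in the sample response
}

# Trimmed copy of the sample response.
RESPONSE = """<soap11:Envelope xmlns="urn:GoogleSearch"
    xmlns:soap11="http://schemas.xmlsoap.org/soap/envelope/">
  <soap11:Body>
    <doGoogleSearchResponse>
      <return>
        <estimatedTotalResultsCount>3</estimatedTotalResultsCount>
        <resultElements>
          <item>
            <URL>http://hci.stanford.edu/cs147/examples/shrdlu/</URL>
            <title><b>SHRDLU</b></title>
          </item>
        </resultElements>
      </return>
    </doGoogleSearchResponse>
  </soap11:Body>
</soap11:Envelope>"""


def extract_results(xml_text):
    """Return the URL and plain-text title of each result in the SOAP Body."""
    envelope = ET.fromstring(xml_text)
    path = "soap11:Body/g:doGoogleSearchResponse/g:return/g:resultElements/g:item"
    results = []
    for item in envelope.findall(path, NS):
        title = item.find("g:title", NS)
        results.append({
            "url": item.findtext("g:URL", namespaces=NS),
            # itertext() gathers the text inside nested markup such as <b>.
            "title": "".join(title.itertext()) if title is not None else "",
        })
    return results


results = extract_results(RESPONSE)
print(results)
```

The only SOAP-specific part is the path through the Envelope and Body; everything inside the Body belongs to the service.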
We'll look more closely at using Flash to generate SOAP requests later in the book.
You can use the data-binding capabilities in Flash to display the results from the web service. The downside to creating a SOAP request using Flash is that you have to include your key as a parameter in the movie. This is not really a very secure option.
For security reasons, you often can't query a web service using a REST request within Flash. You need some kind of server-side interaction to make the request and pass the results into Flash. You can also use Flash Remoting to work with web services.
REST is a useful tool for Flash developers. As part of its security restrictions, recent Flash players will only let you run SOAP requests on a web service that contains a cross-domain policy file specifying your address. You can imagine that Amazon isn't going to do this for every Flash developer in the world! REST requests are a good workaround; you can use a server-side language to work with the information locally or proxy the information. Again, we'll cover this in more detail in Chapter 9.