24.3. Excel and XML

XML is a great way to exchange data between different computer programs. But what does that have to do with Excel, which already has its own perfectly good file format? Here's the deal: More and more companies today use XML to pass data back and forth. For example, when companies exchange business orders, news organizations post stories, or real estate firms list properties for sale, chances are they're using an XML-based format. If you want to crack open these documents and analyze this data using all of Excel's features, including formulas and charts, then you'll need to use Excel's XML tools.

There's another side to this story. Instead of trying to get XML information into Excel, you may need a way to get your worksheet data out of Excel. You might want to take an expense worksheet, export it to XML, and then feed that XML into an automated expense-processing program. That program could then track your expenses, submit them to your supervisor for authorization, and notify the payroll department when a payment is required. In a small company, it could be just as easy to print out the expense report and deliver it by hand (or email it). But in a large company, an automated application can help the whole process flow seamlessly, without forcing anyone to sort through stacks of paper or dozens of email messages. In these situations, XML really shines.


Note: Experts estimate that Excel spreadsheets contain more data than all the world's relational databases combined. Excel's XML features can help you extract information that's trapped in your spreadsheet files and use it in other automated applications.

Because XML is so flexible, there's no single-step solution to importing and exporting XML. You can't just perform an Open XML command because Excel doesn't know which XML format you're using. Instead, you first need to give Excel some information about the specific format you're using, and tell Excel how to extract the data you need.

Excel makes this possible through a set of features called XML mapping. XML mapping lets you link a specific XML format to a specific spreadsheet. Once you set up this link, you can use it in two ways: to export data from your worksheet into an XML document, or to import the contents of an XML document into your worksheet.

Some of the options for Excel's XML features are tucked away on a special tab that doesn't ordinarily appear. Before you do anything with XML and Excel, you need to display this tab. To do so, choose Office button → Excel Options. In the Excel Options dialog box, choose the Popular section, and then, under the "Top options for working with Excel" heading, turn on the "Show Developer tab in the ribbon" checkbox.

24.3.1. Mapping a Simple Document

The simplest way to map an XML document is to link each element in the XML document to a corresponding cell in a worksheet. Then, when you import the document, the data flows out of the elements and into the linked cells.


Note: You can find all the XML files used in this chapter on the "Missing CD" page at www.missingmanuals.com. You can use these files to map your own worksheets.

To try this out, you can use the simple Student.xml document shown below, which stores the test and assignment scores for a single student.

    <?xml version="1.0" ?>
    <Student>
        <Name>Lisa Chen</Name>
        <StudentID>45349920</StudentID>
        <Test1_Score>75</Test1_Score>
        <Test2_Score>63.23</Test2_Score>
        <Assignment1_Score>94</Assignment1_Score>
        <Assignment2_Score>90</Assignment2_Score>
    </Student>

Keep in mind that in real life, you wouldn't create this document by hand. Instead, it might be extracted from a database or, even more likely, generated by some sort of automated student grading program.

Before you can map an XML document to an Excel workbook, you need to prepare the workbook. Simply follow these steps:

  1. Create a new blank Excel document.

    You could perform mapping with an existing workbook, but, in this case, it's easier to start from scratch.


    Note: Excel worksheets can contain a mix of linked cells (those that'll be receiving the XML file's content) and non-linked cells containing other information (like your own descriptive labels or formulas that use the imported data). Excel doesn't include these non-linked cells in any import or export operations.
  2. Choose Office button → Open.

    The Open File dialog box appears.

  3. Browse to the Student.xml file, and then open it.

    Excel shows an Open XML dialog box with three options.

  4. Select "Use the XML Source task pane", and then click OK.

    You have three choices when it comes to opening a basic XML document. You can import the data into a basic table, dump all its content into the worksheet in read-only mode, or take full control of the linking process. The last option is the best for this example because it lets you choose where you want to display the data.

  5. Excel warns you that you're mapping a document without a schema. Click OK to continue.

    A schema defines the structure of an XML document, as explained in Section 24.2.3. Using a schema is the best way to link XML documents to worksheets because its strict rules prevent errors. You can, however, get away without using a schema, and, because you don't have a schema file for the Student.xml file, you can proceed without it.

    When you click OK, Excel doesn't yet import the data into your document. Instead, the XML Source pane appears, showing a "tree" that includes all the elements Excel found in your XML document (see Figure 24-13). At this point, you're ready to start the mapping process.
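The element tree that the XML Source pane displays can be approximated outside Excel. The following is a minimal Python sketch, not a description of how Excel builds its tree: it parses the Student.xml content with the standard library and produces an indented outline of element names. The `outline` helper is a hypothetical name introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Student.xml content from this section, inlined so the sketch is self-contained.
STUDENT_XML = """<?xml version="1.0" ?>
<Student>
    <Name>Lisa Chen</Name>
    <StudentID>45349920</StudentID>
    <Test1_Score>75</Test1_Score>
    <Test2_Score>63.23</Test2_Score>
    <Assignment1_Score>94</Assignment1_Score>
    <Assignment2_Score>90</Assignment2_Score>
</Student>"""


def outline(element, depth=0):
    """Return one indented line per element, parent first (hypothetical helper)."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(STUDENT_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The outline lists the root `Student` element first, followed by its six child elements, which are the same elements you can drag onto the worksheet.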

To map the Student.xml elements to your spreadsheet, follow these steps:

  1. In the XML Source pane, select the element you want to link.

    Figure 24-13. Nothing's happened yet, but the XML Source pane shows a tree diagram with the structure of the Student.xml file. It's now up to you to start the linking process.


  2. Drag the element from the XML Source pane to the appropriate place on your spreadsheet.

    You may want to place the student's name in cell B1 (leaving room for a label in cell A1). Simply click Name in the XML Source pane, drag it over cell B1, and then release it, as shown in Figure 24-14. Excel outlines the cell in blue to indicate that it's linked.

  3. Optionally, add a caption to your element by clicking the smart tag icon that appears next to the newly mapped cell. Choose "Place XML Heading to the Left" or "Place XML Heading Above", as shown in Figure 24-15.

    When you choose one of these options, Excel inserts a piece of static text with the name of the linked element in bold formatting. You can use this technique to insert the caption "Name" in cell A1 after you link cell B1 to the <Name> element.

    Figure 24-14. To map the <Assignment2_Score> element to cell B6, begin by dragging the element to cell B6, creating the link.



    Note: Excel doesn't immediately insert any information in a linked cell. You won't see the student's name appear when you drag the <Name> element to cell B1. Instead, you'll need to import the XML data once you've finished the mapping process. You'll learn how to do that in a moment.
  4. Return to step 1, and then repeat the process until you've mapped all the elements.

    You don't need to map every element. It's quite possible that you want to work with just a portion of your XML file's data. In that case, you'd want to map only the elements you need. Of course, if you don't map an element, you can't extract its data and put it into your worksheet. Similarly, if you eventually export the Excel file as an XML document, that element won't be included.

    If you change your mind while creating a mapping, you can remove any element by right-clicking the corresponding element in the XML Source pane, and then choosing Remove.

Once you've finished mapping the document and adding your headings, it's time to import the XML. Choose Data → Connections → Refresh. Figure 24-16 shows the worksheet with the XML data imported.
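To make the idea of a mapping concrete, here is a rough Python sketch of what "link each element to a cell, then import" amounts to. It is an illustration only and does not use or reproduce Excel's internals: `CELL_MAP`, `import_xml`, and the dictionary standing in for a worksheet are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

STUDENT_XML = """<?xml version="1.0" ?>
<Student>
    <Name>Lisa Chen</Name>
    <StudentID>45349920</StudentID>
    <Test1_Score>75</Test1_Score>
    <Test2_Score>63.23</Test2_Score>
    <Assignment1_Score>94</Assignment1_Score>
    <Assignment2_Score>90</Assignment2_Score>
</Student>"""

# Hypothetical mapping: XML element name -> linked cell address.
CELL_MAP = {
    "Name": "B1",
    "StudentID": "B2",
    "Test1_Score": "B3",
    "Test2_Score": "B4",
    "Assignment1_Score": "B5",
    "Assignment2_Score": "B6",
}


def import_xml(xml_text, cell_map):
    """Copy the text of each mapped element into its linked cell.

    Elements that are not in cell_map are simply skipped, mirroring the
    point that unmapped elements never reach the worksheet.
    """
    root = ET.fromstring(xml_text)
    sheet = {}  # stand-in for a worksheet: cell address -> value
    for tag, cell in cell_map.items():
        element = root.find(tag)
        if element is not None:
            sheet[cell] = element.text
    return sheet


sheet = import_xml(STUDENT_XML, CELL_MAP)
name_only = import_xml(STUDENT_XML, {"Name": "B1"})  # a partial mapping
print(sheet)
```

With the partial mapping, only cell B1 receives data; the other elements stay in the XML file and never appear in the sheet.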

Figure 24-15. Once you link an element, a smart tag appears. You can use this tag to quickly insert a caption for the linked data. This step is optional because you can add your own labels. However, using the smart tag helps you quickly create the basic structure of your mapped worksheet. In this case, you still need to do some cleanup to make the XML element names more attractive; you might want to replace Assignment1_Score with Assignment Score #1.


24.3.2. Importing and Exporting XML

Now that you've mapped the Student.xml file, you've not only extracted some information from the XML document, you've also given yourself a range of options for managing the link between your spreadsheet data and the world of XML. The real magic of XML lies here.

Once you've defined a link between your worksheet and a specific XML format, you can perform three tasks:

  • Refresh. If the XML source changes, you can easily refresh your worksheet with the new content. Just click any linked cell, and then select Data → Connections → Refresh or Developer → XML → Refresh Data. (Both buttons have the same effect.)

  • Export. You can export the data to a new XML document. Click any linked cell, select Developer → XML → Export, and then choose a new XML file. Optionally, you can choose to replace the original XML source with the new XML file. (You might want to use this option if you've edited the content inside your Excel spreadsheet.)

    Figure 24-16. Once you've mapped the XML document, actually importing your data's a snap. In this example, the data's been imported into the worksheet. The cell coloring and borders have been added manually. XML doesn't support any formatting features; it's just a container for raw data.


  • Import. You can import XML data from another file that has the same structure. In this case, click any linked cell, select Developer → XML → Import, and then choose a new XML file. The new XML data flows into the existing worksheet, replacing the original data.

This ability to import means that in the Student.xml example, you could import a student's information, modify it, and then save it back to a different XML document. In fact, you could use your spreadsheet as a way to create dozens of different student documents in separate XML files, just by exporting different data to different files. A crafty developer could create a custom application that scans student XML files and then automatically generates and mails report cards.

All you'd need to do is export the XML. In fact, the whole process could get even easier if someone develops a little piece of Excel macro code to handle the XML export. In that case, you'd just click a button on the worksheet, and Excel would export the student data to XML and submit it to the custom application. Developing this type of workflow takes a fair bit of work, and the first step is learning Excel's VBA macro language, which is introduced in Chapter 28.
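The export direction can be sketched the same way. The Python below is a hedged illustration rather than Excel's export routine or VBA macro code: it takes a dictionary standing in for the linked cells, applies a hypothetical `CELL_MAP`, and builds a `<Student>` document. The `export_xml` function and its `root_tag` parameter are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: XML element name -> linked cell address.
CELL_MAP = {
    "Name": "B1",
    "StudentID": "B2",
    "Test1_Score": "B3",
    "Test2_Score": "B4",
    "Assignment1_Score": "B5",
    "Assignment2_Score": "B6",
}

# Stand-in for the worksheet after an import.
sheet = {
    "A1": "Name",  # a non-linked label cell; it is not part of the mapping
    "B1": "Lisa Chen",
    "B2": "45349920",
    "B3": "75",
    "B4": "63.23",
    "B5": "94",
    "B6": "90",
}


def export_xml(sheet, cell_map, root_tag="Student"):
    """Build an XML string from the linked cells only."""
    root = ET.Element(root_tag)
    for tag, cell in cell_map.items():
        if cell in sheet:
            ET.SubElement(root, tag).text = str(sheet[cell])
    return ET.tostring(root, encoding="unicode")


sheet["B3"] = 80  # edit a score in the "worksheet" before exporting
xml_text = export_xml(sheet, CELL_MAP)
print(xml_text)
```

Because only mapped cells are read, the label in A1 never reaches the exported document, while the edited score in B3 does.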


Note: Remember, if you save a mapped workbook, you're saving only the Excel spreadsheet file, with whatever data it currently contains. If you want to save the content to an XML file, you need to export the XML data. Choose Developer → XML → Export.

POWER USERS' CLINIC
XML Mapping with a Schema

Serious XML gurus don't map a worksheet using an XML document. There are too many possible problems, including optional elements that Excel might ignore, data type rules that it doesn't enforce, and certain types of structures that Excel might misunderstand.

A better way to map a worksheet to XML is to use an XML schema. A schema defines the structure of the XML document you want to import. If you're creating a worksheet to analyze lists of student grades, you could use a file called StudentList.xsd that defines the elements you'll use and the document structure.

XML mapping works more or less the same with a schema as it does with an actual document. The key difference is that the schema removes the possibility for error. To map a schema, follow these steps:

  1. Open the workbook in which you'd like to add the XML schema, or create a new workbook.

  2. Select Developer → XML → Source to show the XML Source pane. In the XML Source pane, click the XML Maps button.

  3. In the XML Maps dialog box, click Add, browse to your XML schema file, and then click OK.

  4. Click OK to close the XML Maps dialog box. You'll now see the schema-defined elements in the XML Source pane.

  5. Drag the elements from the XML Source pane to the worksheet.

  6. When you're finished, you need to import an XML document that has the same structure as the schema. To do so, select Developer → XML → Import, and then browse to the XML file.

When using an XML schema, Excel can perform validation with the schema data types to prevent invalid input. If you want to use this feature, you have to turn it on for the document. Select Developer → XML → Map Properties. In the list of options, turn on the first checkbox, "Validate data against schema for import and export". Now, every time you import or export XML data, Excel checks it against the rules defined in the schema.
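The kind of checking a schema makes possible can be illustrated with a small Python sketch. This is deliberately not real XML Schema (XSD) validation, which Python's standard library doesn't provide; it is a simplified stand-in that checks two of the problems mentioned above, missing elements and values of the wrong data type. The `EXPECTED` table and `validate` function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a schema: element name -> expected Python type.
# A real .xsd file expresses far more than this (order, optionality, ranges).
EXPECTED = {
    "Name": str,
    "StudentID": int,
    "Test1_Score": float,
}

GOOD_XML = (
    "<Student><Name>Lisa Chen</Name><StudentID>45349920</StudentID>"
    "<Test1_Score>75</Test1_Score></Student>"
)
BAD_XML = "<Student><Name>Lisa Chen</Name><StudentID>abc</StudentID></Student>"


def validate(xml_text, expected):
    """Return a list of problems; an empty list means the checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    for tag, kind in expected.items():
        element = root.find(tag)
        if element is None or element.text is None:
            problems.append(f"missing element: {tag}")
            continue
        try:
            kind(element.text)  # e.g. int("abc") raises ValueError
        except ValueError:
            problems.append(f"bad value for {tag}: {element.text!r}")
    return problems


good_problems = validate(GOOD_XML, EXPECTED)
bad_problems = validate(BAD_XML, EXPECTED)
print(good_problems, bad_problems)
```

The second document is flagged twice: its StudentID is not a number, and its Test1_Score element is missing altogether.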


24.3.3. Mapping Lists

Excel's XML features really get interesting when you need to map XML documents that contain lists of information, like product catalogs, order tables, and (as in the following example) a class report. As you've no doubt noticed, most Excel documents use lists of some kind. Whether you're tracking student grades, monthly expenses, or employee contact information, you rarely have just one piece of information.

XML documents are often designed to hold repeating elements. The Student.xml file, in contrast, only held the information for a single student. But you can readily create a document that holds a list of students, each one in separate <Student> element containers.

Here's an example (available in the StudentList.xml file, which you can find on the "Missing CD" page at www.missingmanuals.com). Only two students are shown here, but the actual StudentList.xml file contains many more students:

    <?xml version="1.0" ?>
    <Students>
        <Student>
            <Name>Lisa Chen</Name>
            <StudentID>45349920</StudentID>
            <Test1_Score>75</Test1_Score>
            <Test2_Score>63.23</Test2_Score>
            <Assignment1_Score>94</Assignment1_Score>
            <Assignment2_Score>90</Assignment2_Score>
        </Student>
        <Student>
            <Name>Edwin Albott</Name>
            <StudentID>45349921</StudentID>
            <Test1_Score>85</Test1_Score>
            <Test2_Score>73.23</Test2_Score>
            <Assignment1_Score>94</Assignment1_Score>
            <Assignment2_Score>95.6</Assignment2_Score>
        </Student>
    </Students>

When you try to map this document, Excel quickly notices that the <Student> element repeats. Instead of mapping the <Student> element to a single cell (as it did in the previous section), Excel creates a mapped table that you can use to manage the list of students with filtering, sorting, and searching.
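The step from a repeating element to a table can be pictured in a few lines of Python. This sketch is an analogy, not Excel's algorithm: each repeated `<Student>` element becomes one row, and each child element name becomes a column header. The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# The two-student excerpt of StudentList.xml shown in this section.
STUDENT_LIST_XML = """<?xml version="1.0" ?>
<Students>
    <Student>
        <Name>Lisa Chen</Name>
        <StudentID>45349920</StudentID>
        <Test1_Score>75</Test1_Score>
        <Test2_Score>63.23</Test2_Score>
        <Assignment1_Score>94</Assignment1_Score>
        <Assignment2_Score>90</Assignment2_Score>
    </Student>
    <Student>
        <Name>Edwin Albott</Name>
        <StudentID>45349921</StudentID>
        <Test1_Score>85</Test1_Score>
        <Test2_Score>73.23</Test2_Score>
        <Assignment1_Score>94</Assignment1_Score>
        <Assignment2_Score>95.6</Assignment2_Score>
    </Student>
</Students>"""

root = ET.fromstring(STUDENT_LIST_XML)

# One dictionary per repeated <Student> element: these are the table rows.
rows = [
    {field.tag: field.text for field in student}
    for student in root.findall("Student")
]

# The element names of the first row serve as the column headers.
columns = list(rows[0])

print(columns)
for row in rows:
    print([row[column] for column in columns])
```

Each additional `<Student>` element in the file would simply add another row, which is why a single linked cell can't hold this data.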


Note: This example, in fact, creates the same type of table you studied in Chapter 14. So why bother? It gives you all the fancy table tools, like sorting, filtering, alternating row formatting, and so on.

To map the StudentList.xml file, follow these steps:

  1. Choose Office button → Open.

    The Open File dialog box appears.

  2. Browse to the StudentList.xml file, and then open it.

    If you haven't already gotten this file, download it from the "Missing CD" page at www.missingmanuals.com. Excel shows an Open XML dialog box with three options for the file.

  3. Select "Use the XML Source task pane", and then click OK.

    Excel warns you that you're mapping a document without a schema.

  4. Click OK to continue.

    The XML Source pane appears, with the structure of the StudentList.xml file. It looks similar to the previous example, but there's a difference now. When you drag an element onto the worksheet, Excel creates a table column complete with a header that has a drop-down list. Why? Because Excel recognizes that the StudentList.xml file contains multiple students. And Excel can't store all these students in your worksheet in a single cell.

  5. Click the Student element, and then drag it to cell A1.

    When multiple columns belong to the same XML list, it's often easiest to drag these elements onto your spreadsheet in one operation. You can select multiple elements by holding down the Ctrl key while you select items in the XML Source pane or by selecting the parent element (the element that contains all the elements you want to insert).

    In the student list example, if you select the Student entry in the XML Source pane, you also select all the elements that contain student information. You can then drag them all at once. It's entirely up to you whether you create your table as a series of contiguous columns (the easiest approach) or as separate columns spread out over your spreadsheet. Either way, the data's equivalent.

  6. Choose Data → Connections → Refresh All.

    Now the student information flows into the table, filling it up automatically, as shown in Figure 24-17.

Figure 24-17. This example shows a mapped worksheet with the table of student information. You can insert new students, remove existing students, and edit student data before you export the information back to XML.



Tip: In this example, all the XML data is contained in a repeating list of students. However, XML documents often use hybrid structures where they include some repeatable information (like the list of students), and some information that occurs only once (like the name of the class and the instructor who is teaching it). In this case, you would probably link individual cells in the top portion of your worksheet, and then add the table a little lower down.

24.3.4. Gaining the Benefits of XML Mapping

Regardless of whether you want to import or export XML, you should always save a copy of your mapped spreadsheet file. Excel stores the mapping information in that file. Essentially, you should think of this spreadsheet as a window that lets you analyze any XML file, as long as it has the same structure as the XML document you mapped.

When a semester of classes finishes up, an automated student grading application might generate a new XML document with the most up-to-date information. Fortunately, you don't need to map this document. As long as it matches the structure of the grading document you used initially, you can import the new XML document with a couple of mouse clicks. The new information then flows seamlessly into your existing workbook.
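What "matches the structure" means can be approximated with a short Python check. This is a rough sketch under a simplifying assumption (two documents match when they contain the same set of element paths); the text doesn't spell out Excel's exact matching rules, and `element_paths` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

MAPPED_XML = (
    "<Students><Student><Name>Lisa Chen</Name>"
    "<Test1_Score>75</Test1_Score></Student></Students>"
)
NEW_SEMESTER_XML = (
    "<Students>"
    "<Student><Name>Edwin Albott</Name><Test1_Score>85</Test1_Score></Student>"
    "<Student><Name>Lisa Chen</Name><Test1_Score>91</Test1_Score></Student>"
    "</Students>"
)
DIFFERENT_XML = "<Orders><Order><Total>19.95</Total></Order></Orders>"


def element_paths(xml_text):
    """Return the set of slash-separated element paths found in a document."""
    paths = set()

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}"
        paths.add(path)
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


same_structure = element_paths(MAPPED_XML) == element_paths(NEW_SEMESTER_XML)
other_structure = element_paths(MAPPED_XML) == element_paths(DIFFERENT_XML)
print(same_structure, other_structure)
```

The new semester's file has different students and a different number of rows, but the same element paths, so under this assumption it fits the existing mapping; the order document does not.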

In a very real sense, you can reuse a mapped workbook to examine different XML files in the same way you can use a database query to get and analyze the most up-to-date information from a table in a database (as shown earlier).

For example, consider the StudentList.xml workbook that you created earlier. To make it a better tool for analyzing student grades, you would probably add a few extra ingredients. Here are some possibilities:

  • A calculated column in the table that determines each student's overall grade

  • A calculated field outside the table that determines the average or median grade

  • A chart that shows the distribution of grades in the class
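The calculations behind these three additions can be sketched in Python. This is an illustration only: in Excel you'd use worksheet formulas and a chart. The sketch assumes, purely for the example, that the overall grade is the unweighted mean of the four scores; the text doesn't specify a weighting. The rows are the two students from the StudentList.xml excerpt.

```python
from collections import Counter
from statistics import mean, median

# Rows as they might look after importing the two-student excerpt.
rows = [
    {"Name": "Lisa Chen", "Test1_Score": 75, "Test2_Score": 63.23,
     "Assignment1_Score": 94, "Assignment2_Score": 90},
    {"Name": "Edwin Albott", "Test1_Score": 85, "Test2_Score": 73.23,
     "Assignment1_Score": 94, "Assignment2_Score": 95.6},
]

SCORE_FIELDS = ["Test1_Score", "Test2_Score",
                "Assignment1_Score", "Assignment2_Score"]

# Calculated column: overall grade per student.
# Assumption: an unweighted mean of the four scores.
for row in rows:
    row["Overall"] = mean(row[field] for field in SCORE_FIELDS)

# Calculated fields outside the table: class average and median.
class_average = mean(row["Overall"] for row in rows)
class_median = median(row["Overall"] for row in rows)

# Data behind a distribution chart: count of students per ten-point band.
distribution = Counter(int(row["Overall"] // 10) * 10 for row in rows)

for row in rows:
    print(row["Name"], round(row["Overall"], 2))
print(round(class_average, 2), round(class_median, 2), dict(distribution))
```

Because every result is derived from `rows`, swapping in a different list of students recomputes everything, which is the same property that makes a mapped workbook reusable.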

The beauty of XML mapping is that once you've added these extra touches, you can reuse them with the data in other XML documents, provided these documents have the same structure as the document you used to map your worksheet. For example, you might receive a new file, called StudentList_Geography2007.xml, with a whole series of grades for another class. This document uses the same elements, so you don't need to go through the whole mapping process again.

Instead, you can just import this new information into your existing worksheet. All you need to do then is move to a mapped cell, select Developer → XML → Import, and then choose the StudentList_Geography2007.xml file. The student information flows into the linked cells on your worksheet, and Excel immediately updates the other information on your worksheet, including the total grade calculation, average grade calculation, and the chart that shows grade distribution (see Figure 24-18).

Figure 24-18. This worksheet enhances the linked table with a new column that calculates a student's final grade, performs a few summary calculations, and adds an attractive chart. Best of all, you can import another XML document with the exact same structure into your worksheet, potentially saving hours of work.




Excel 2007: The Missing Manual
ISBN: 596527594
EAN: N/A
Year: 2007
Pages: 173