Section 23.3. Linking Spreadsheets to XML

23.3. Linking Spreadsheets to XML

Because XML is so flexible, there's no single-step solution to importing and exporting XML. You can't, for example, just perform an Open XML command since Excel doesn't know which XML format you're using. Instead, you first need to give Excel some information about the specific format you're using and tell Excel how to extract the data you need.

Excel makes this possible through a set of features called XML mapping . XML mapping lets you link a specific XML format to a specific spreadsheet. Once you set up this link, you can use it in two ways: to export data from your worksheet into an XML document, or to import the contents of an XML document into your worksheet.

23.3.1. Mapping a Simple Document

The simplest way to map an XML document is to link each element in the XML document to a single cell in a worksheet. Then, when you import the document, the data flows out of the elements and into the linked cells .


Note: You can find all the XML files used in this chapter on the "Missing CD" page at MissingManuals.com . You can then use these files to map your own worksheets.

To try this out, you can use the simple Student.xml document shown below, which stores the test and assignment scores for a single student.

 <?xml version="1.0" ?> <Student>     <Name>Lisa Chen</Name>     <StudentID>45349920</StudentID>     <Test1_Score>75</Test1_Score>     <Test2_Score>63.23</Test2_Score>     <Assignment1_Score>94</Assignment1_Score>     <Assignment2_Score>90</Assignment2_Score> </Student> 

Keep in mind that in real life, you wouldn't create this document by hand. Instead, it might be extracted from a database or, even more likely, generated by some sort of automated student grading program.

Before you can map an XML document to an Excel workbook, you need to prepare the workbook. Simply follow these steps:

  1. Create a new blank Excel document.

    You could perform mapping with an existing workbook, but, in this case, it's easier to start from scratch.


    Note: Excel worksheets can contain a mix of linked cells (those that'll be receiving the XML file's content) and non-linked cells containing other information (like your own descriptive labels or formulas that use the imported data). These non-linked cells aren't included in any import or export operations.
  2. Choose File Open.

    The Open File dialog box appears.

  3. Browse to the Student.xml file, and open it.

    Excel shows an Open XML dialog box with three options.

  4. Select "Use the XML Source task pane", and click OK.

    You have three choices when it comes to opening a basic XML document. You can import the data into a basic list, dump all its content into the worksheet in read-only mode, or take full control of the linking process. The last option is the best for this example because it lets you choose where you want to display the data.

  5. Excel warns you that you are mapping a document without a schema. Click OK to continue.

    A schema defines the structure of an XML document, as explained on Section 23.1.3. Using a schema is the best way to link XML documents to worksheets because its strict rules prevent errors. You can, however, get away without using a schema, and, because you don't have a schema file for the Student.xml file, you can proceed without it.

    When you click OK, Excel doesn't yet import the data into your document. Instead, the Task Pane appears, displaying the XML Source task. The XML Source task shows a tree that includes all the elements Excel found in your XML document (see Figure 23-1). At this point, you're ready to start the mapping process.

    Figure 23-1. Nothing has happened yet, but the XML Source task shows a tree with the structure of the Student.xml file. It's now up to you to start the linking process.


    To map the Student.xml elements to your spreadsheet, follow these steps:

    In the XML Source pane, select the element you want to link.

  6. Drag the element from the XML Source task to the appropriate place on your spreadsheet.

    For example, you might want to place the student's name in cell B1 (leaving room for a label in cell A1). Simply click Name in the XML Source task, drag it over cell A1, and release it. Excel outlines the cell in blue to indicate that it's linked.

    Every time you create a link, a smart tag icon appears next to the newly mapped cell. You can use this icon to quickly insert a caption for the linked data. Just click the icon, and choose "Place XML Heading to the Left" or "Place XML Heading Above." When you do so, Excel inserts a piece of static text with the name of the linked element in bold formatting. For example, you can use this technique to insert the caption "Name" in cell A1 after you link cell B1 to the <Name> element. Figure 23-2 shows how this procedure works.

    Figure 23-2. Top : You need to use two steps to map the <Assignment1_Score> element to cell B5. First, drag the element to cell B5, creating the link.
    Bottom : Next, you use the smart tag to quickly insert a caption in cell A5. This step is optional, and you can add your own labels. However, using the smart tag helps you quickly create the basic structure of your mapped worksheet. In this case, you still need to do some cleanup to make the XML element names more attractive; for example, you might want to replace Assignment1_Score with Assignment Score #1.



    Note: Excel doesn't immediately insert any information in a linked cell. For example, you won't see the name of the student appear when you drag the <Name> element to cell B1. Instead, you'll need to import the XML data once you've finished the mapping process. You'll learn how to do that below.
  7. Return to step 1 and repeat the process until you've mapped all the elements.

    You don't need to map every element. It's quite possible that you want to work just with a portion of your XML file's data. In that case, you'd want to map only the elements you need. Of course, if you don't map an element, you won't be able to extract its data and put it into your worksheet. Similarly, if you eventually export the Excel file as an XML document, that element won't be included.

    If you change your mind while creating a mapping, you can remove any element by right-clicking the corresponding element in the XML Source pane, and then choosing Remove.

Once you've finished mapping the document and adding your headings, it's time to import the XML. Choose Data XML Refresh, or click the Refresh button on the List toolbar (which looks like an exclamation mark next to a globe). Figure 23-3 shows the worksheet with the XML data imported.

Figure 23-3. Once you've mapped the XML document, actually importing your data is a snap. In this example, the data has been imported into the worksheet. The cell coloring and borders have been added manually. XML doesn't support any formatting featuresit's just a container for raw data.


23.3.2. Importing and Exporting XML

Now that you've mapped the Student.xml file, you've not only extracted some information from the XML document, you've also given yourself a range of options for managing the link between your spreadsheet data and the world of XML. This is where the real magic of XML lies.

Once you've defined a link between your worksheet and a specific XML format, you can perform three tasks :

  • Refresh . If the XML source changes, it's easy to refresh your worksheet with the new content. Just click any linked cell and select Data XML Refresh XML Data.

  • Export . You can export the data to a new XML document. Click any linked cell, select Data XML Export, and then choose a new XML file. Optionally, you can choose to replace the original XML source with the new XML file. (You might want to do that if you've edited the content inside your Excel spreadsheet.)

  • Import . You can import XML data from another file that has the same structure. In this case, click any linked cell, select Data XML Import, and then choose a new XML file. The new XML data flows into the existing worksheet, replacing the original data.

This ability to import means that in the Student.xml example, you could import a student's information, modify it, and then save it back to a different XML document. In fact, you could use your spreadsheet as a way to create dozens of different student documents in separate XML files, just by exporting different data to different files. A crafty developer could create a custom application that scans student XML files and then generates and mails report cards automatically.

All you'd need to do is export the XML. In fact, the whole process could get even easier if someone develops a little piece of Excel macro code to handle the XML export. In that case, you'd just click a button on the worksheet, and Excel would export the student data to XML and submit it to the custom application. Developing this type of workflow takes a fair bit of work, and the first step is learning Excel's VBA macro language, which is introduced in Chapter 26.


Tip: Remember, if you save a mapped workbook, you're saving only the Excel spreadsheet file, with whatever data it currently contains. If you want to save the content to an XML file, you need to Export the XML data. Choose Data XML Export.
POWER USERS' CLINIC
XML Mapping with a Schema

Serious XML gurus don't map a worksheet using an XML document. There are too many possible problems, including optional elements that Excel might ignore, data type rules that it won't enforce, and certain types of structures that Excel might misunderstand.

A better way to map a worksheet to XML is to use an XML schema . A schema defines the structure of the XML document you want to import. For example, if you're creating a worksheet to analyze lists of student grades, you might use a file called StudentList.xsd that defines the elements you'll use and the document structure.

XML mapping works more or less the same with a schema as it does with an actual document. The key difference is that the schema removes the possibility for error. To map a schema, follow these steps:

  1. Open the workbook in which you'd like to add the XML schema, or create a new workbook.

  2. Select XML Data XML Source.

  3. In the XML Source pane, click the Workbook Maps button.

  4. In the XML Maps dialog box, click the Add button.

  5. Browse to your XML schema file, and click OK.

  6. Click OK to close the XML Maps dialog box.

  7. You'll now see the schema-defined elements in the XML Source pane.

  8. Drag the elements from the XML Source pane to the worksheet.

  9. When you're finished, don't click the Refresh button on the toolbar; there's no data in the schema for you to import! Instead, you need to import an XML document that has the same structure as the schema. To do so, select Data XML Import, and browse to the XML file.

When using an XML schema, Excel can perform validation with the schema data types to prevent invalid input. If you want to use this feature, you have to turn it on for the document. Select Data XML XML Map Properties. In the list of options, turn on the first checkbox, "Validate data against schema for import and export." Now, every time you import or export XML data, Excel checks it against the rules defined in the schema.


XML documents are often designed to hold repeating elements. The Student.xml file, in contrast, only held the information for a single student. But you can readily create a document that holds a list of students, each one in separate <Student> element containers.

Here's an example (available in the StudentList.xml file, which you can find on the "Missing CD" page at MissingManuals.com). Only two students are shown here, but the actual StudentList.xml file contains many more students:

 <?xml version="1.0" ?> <Students>     <Student>         <Name>Lisa Chen</Name>         <StudentID>45349920</StudentID>         <Test1_Score>75</Test1_Score>         <Test2_Score>63.23</Test2_Score>         <Assignment1_Score>94</Assignment1_Score>         <Assignment2_Score>90</Assignment2_Score>     </Student>     <Student>         <Name>Edwin Albott</Name>         <StudentID>45349921</StudentID>         <Test1_Score>85</Test1_Score>         <Test2_Score>73.23</Test2_Score>         <Assignment1_Score>94</Assignment1_Score>         <Assignment2_Score>95.6</Assignment2_Score>     </Student>     ... </Students> 

When you try to map this document, Excel quickly notices that the <Student> element repeats. Instead of mapping the <Student> element to a single cell (as it did in the previous section), Excel creates a mapped data list that you can use to manage the list of students with filtering, sorting, and searching.


Note: This example, in fact, creates the same data list you studied in Chapter 13. So why bother? This technique is handy if you have an XML document with list information because a data list lets you analyze it. If you don't have an XML document, you'll probably be happier creating a list on your own without getting XML involved at all.

To map the StudentList.xml file, follow these steps:

  1. Choose File Open.

    The Open File dialog box appears.

  2. Browse to the StudentList.xml file, and open it.

    If you haven't already gotten this file, download it from the "Missing CD" page at MissingManuals.com. Excel shows an Open XML dialog box with three options for the file.

  3. Select "Use the XML Source task pane", and click OK.

  4. Excel warns you that you are mapping a document without a schema. Click OK to continue.

    The XML Source task pane appears, with the structure of the StudentList.xml file. It looks similar to the previous example, but there's a difference now. When you drag an element onto the worksheet, Excel creates a list column complete with a header that has a drop-down list. Why? Because Excel recognizes that the StudentList.xml file contains multiple students. And Excel can't store all these students in your worksheet in a single cell.

  5. Click the Student element, and drag it to cell A1.

    When multiple columns belong to the same XML list, it's often easiest to drag these elements onto your spreadsheet in one operation. You can select multiple elements by holding down the Ctrl key while you select items in the XML Source pane or by selecting the parent e lement (the element that contains all the elements you want to insert).

    In the student list example, if you select the Student entry in the XML Source pane, you also select all the elements that contain student information. You can then drag them all at once. It's entirely up to you whether you create your XML list as a series of contiguous columns (the easiest approach) or as separate columns spread out over your spreadsheet. Either way, the data is equivalent.

  6. Choose Data XML Refresh.

    Now the student information flows into the list, filling it up automatically, as shown in Figure 23-4.

Figure 23-4. This example shows a mapped worksheet with the list of student information. You can insert new students, remove existing students, and edit student data before you export the information back to XML.



Note: In this example, all the XML data is contained in a repeating list of students. However, XML documents often use hybrid structures where they include some repeatable information (like the list of students), and some information that occurs only once (like the name of the class and the instructor who is teaching it). In this case, you would probably link individual cells in the top portion of your worksheet, and then add the data list a little lower down.

23.3.4. Gaining the Benefits of XML Mapping

Regardless of whether you want to import or export XML, you should always save a copy of your mapped spreadsheet file. Excel stores the mapping information in that file. Essentially, you should think of this spreadsheet as a window that lets you analyze any XML file, as long as it has the same structure as the XML document you mapped.

For example, when a semester of classes rounds up, an automated student grading application might generate a new XML document with the most up-to-date information. Fortunately, you don't need to map this documentas long as it matches the structure of the first grade document, you can import with a couple of mouse clicks. The new information then flows into your existing workbook seamlessly.

In a very real sense, you can reuse a mapped workbook to examine different XML files in the same way you can use a database query to get and analyze the most up-to-date information from a table in a database (as shown in Chapter 22).

In the next example, you'll round out the earlier StudentList.xml workbook to make it a better tool for analyzing student grades. To complete this spreadsheet, you would probably add a few extra ingredients . Here are some possibilities:

  • A calculated column in the list that determines each student's overall grade

  • A calculated field outside of the list that determines the average or median grade

  • A chart that shows the distribution of grades in the class

The beauty of XML mapping is that once you've added these extra touches, you can reuse them with the data in other XML documents, provided these documents have the same structure as the document you used to map your worksheet. For example, you might receive a new file, called StudentList_Geography2005.xml, with a whole series of grades for another class. Because this document uses the same elements, you don't need to go through the whole mapping process again.

Instead, you can just import this new information into your existing worksheet. All you then need to do is move to a mapped cell, select Data XML Import, and choose the StudentList_Geography2005.xml file. The student information flows into the linked cells on your worksheet, and Excel immediately updates the other information on your worksheet, including the total grade calculation, average grade calculation, and the chart that shows grade distribution. Figure 23-5 shows an example.

Figure 23-5. This worksheet enhances the linked XML list with a new column that calculates a student's final grade, performs a few summary calculations, and adds an attractive chart. Best of all, you can import another XML document with the exact same structure into your worksheet, potentially saving hours of work.




Excel. The Missing Manual
Excel 2010: The Missing Manual
ISBN: 1449382355
EAN: 2147483647
Year: 2003
Pages: 185

Similar book on Amazon

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net