An XML Walkthrough


There are myriad ways in which you might encounter XML files and Smart Documents, and it would be impossible to cover all of them. One likely possibility, however, is that your company will decide to start using XML and you will be asked to help with this process. This section uses this scenario to walk you through using Word 2003 to perform some common XML-related tasks. If you are not facing the prospect of converting documents to XML yourself and don't feel a need to investigate the XML-related nooks and crannies of the Word interface, you can skip to the "Where to Go Next" section at the end of this hour. If, however, you have been, or think you might be, asked to do some of this work, you will probably want to read this section and poke around the Word interface as you go.

To set the scene for this section, let's say you are a tech writer at a software company, and one of your tasks is to prepare technical notes for your field engineers and customers. The technical notes are usually distributed via your Web site and on paper at trade shows. Until now, you have used a Word template for these documents and have managed them by hand, tracking when the software they describe changes, tracking new releases of the software, getting approval for drafts, determining whether a technical note is ready to be publicly released, and so on.

Your company has now decided to start basing your technical notes on XML. A consultant or someone from your documentation support group has already created the schemas and style sheets you will need. Your job is to convert your template to XML, associate the appropriate files with it, protect the regions of the template that you do not want to be modified, and so on.

After you have converted your template to XML, you will use it to create a technical note. As a part of this process, you will learn how to work with an XML-structured document, fixing validation errors you encounter, editing the document with track changes, and saving the document with and without a transform.

Converting a Standard Word Template to an XML Template

The general steps for converting a standard Word template to an XML template are as follows:

  1. Open the Word template and attach the XML schema you want to associate with the new XML template.

  2. Insert XML tags in the template to label each element of the document. This enables Word to check that documents based on the template are properly structured and to perform some validation on the element contents.

  3. Protect the XML tags so that they cannot be revised or deleted.

  4. Optionally, replace the default placeholder text for each element with text that will provide element-specific guidance when using the template.

  5. Save the template.

Depending on the length of your template, this process can be pretty laborious. Luckily you only have to do it once.

Note:

In this section and the three that follow, you'll notice that the Word interface (and hence the text in this book) uses the term document in a general sense to refer to XML documents and templates.


These steps walk you through this process in more detail:

  1. After opening the template you're converting to XML, choose Tools, Templates and Add-Ins. In the Templates and Add-Ins dialog box, click the XML Schema tab. Mark the check box for the schema that you want to attach to your template, in this case Technical Notes (see Figure 24.1). (These steps assume that one or more schemas have already been created for you and stored in your schema library, so they appear in this list; this would typically be part of the work done by the folks managing the migration to XML.) Under Schema Validation Options, mark the Validate Document Against Attached Schemas check box.

    Figure 24.1. Select the schema that you want to attach to your template.


  2. Click OK. The selected schema is now attached to the template and the XML Structure task pane appears (see Figure 24.2). This task pane is organized in two sections. The top section, Elements in the Document, displays a map of the structure of the document. Initially there are no elements in the document, so the message "No XML elements have been applied to this document" appears.

    Figure 24.2. When the XML Structure task pane first appears, the Elements in the Document list is empty.


  3. If necessary, mark the Show XML Tags in the Document check box so that the XML tags you add to the template will be visible.

  4. The bottom section of the task pane contains a list box from which you select an XML element to apply to a selected region of the document. Initially, this box lists only the top-level element (also known as the root element) defined in the attached schema. If it does not, mark the List Only Child Elements of Current Element check box under the list box. When this check box is marked, the list of elements that appears in the list box is narrowed to those that could be used to tag the current selection. In this view, the elements are presented in the order in which they are declared in the schema. When the check box is cleared, all of the elements defined in the schema(s) are listed in alphabetical order, with a red slashed circle icon next to those elements that are not children of the element in which the current selection is located, and thus cannot be used in the current context. For complex schemas, this list can be very large.

    Note:

    Although Word does full validation of the XML elements applied to a document, in this release of Office 2003 it does not do the analysis necessary to show you only the elements that would be correct to add to the document at the insertion point, even when you have marked the List Only Child Elements of Current Element check box. This means that when this check box is marked, you may still see elements in the element selection list that will cause validation errors when you apply them to your document. For example, in the schema used in this section, the author element includes the child elements name, title, and email. Each of these elements can appear only once in the author element. When the insertion point is within the author element, Word will list all of the child elements even if one or more has already been applied and it would be an error to apply it again.


  5. Select the whole document (Edit, Select All), and then click the top-level (root) element displayed in the element selection list. The Apply to Entire Document dialog box appears because this is the first element to be applied to the document (see Figure 24.3). The top-level element is the "container" for the entire document, so click the Apply to Entire Document button.

    Figure 24.3. Click the Apply to Entire Document button to enclose the entire document in the root element tag.


    Tip:

    If you have made a selection smaller than the whole document (perhaps because some of the text in the old template will not be included in the new one), be sure to click the Apply Only to Selection button instead, and remember that the root element must enclose the whole document by the time you are done.


  6. Click anywhere to deselect the document. A pair of purple XML tags representing the root element of the schema now brackets the document, and the name of the root element appears in the Elements in the Document list. The element selection list changes to display the children of the root element. Because all but the simplest of documents will be incomplete, and thus violate their schema, a schema violation marker (a purple wavy line) will appear in the document window, and a yellow schema violation icon will appear next to the root element's name in the task pane (see Figure 24.4). If you find the schema violation markers distracting, you can turn them off by marking the Hide Schema Violations in This Document check box in the XML Options dialog (accessible from the bottom of the XML Structure task pane).

    Figure 24.4. XML tags now bracket the top-level element of your document.


  7. To investigate the schema violation, right-click the purple line. A message box appears describing the violation and offering some possible actions (see Figure 24.5). When the schema violation is within an individual element, Word may display the marker under the element (though if the element is in a table, the marker remains in the margin). In the task pane, you can hover the mouse pointer over the schema violation icon to display a brief description of the error.

    Figure 24.5. A message box gives you information and suggestions about the schema violation.


    Note:

    As you work through the document tagging individual elements, the purple line will eventually disappear; the way in which the schema was written determines the exact manner in which the marker behaves. If the schema specifies constraints on the contents of individual elements, those elements will continue to be marked as violations until their contents are acceptable. You can either fill in the elements with an acceptable default or leave them empty so that Word will flag the omission.


    Tip:

    It can be difficult to position the insertion point between the tags for two adjacent elements by clicking the mouse. Word seems to prefer to place the insertion point within one of the elements. However, you can easily move the insertion point in between two tags using the arrow keys.


  8. Continue marking up the document by selecting elements in the document window and then clicking the element names in the element selection list. Each time you do this, the element selection list changes to reflect the list of element tags that can be added to the document in the context of the current element. In some cases, the element selection list will be empty. As you add tags, the element names will appear in the Elements in the Document list (see Figure 24.6), showing the structure of the document. When you are finished, continue with the next set of steps to protect the tags you've added.

    Figure 24.6. This document has been marked up.

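To get a feel for the kind of structure map that the Elements in the Document list shows, the following minimal Python sketch prints an indented outline of a small piece of XML. It is only an illustration and has nothing to do with how Word builds its list. The author, name, title, and email elements come from the schema used in this section; the technote and summary element names and all of the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data for a tagged technical note. Only author, name, title,
# and email are named in the text; technote and summary are invented here.
SAMPLE = """\
<technote>
  <author>
    <name>Pat Lee</name>
    <title>Technical Writer</title>
    <email>pat@example.com</email>
  </author>
  <summary>Explains the new installer.</summary>
</technote>
"""


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE)
structure = outline(root)
print("\n".join(structure))
```

Each level of indentation in the printed outline corresponds to one level of nesting, which is roughly what you see in the task pane as you add tags.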

Note:

Word uses a variety of icons to indicate errors in the structure of an XML document. If one or more errors are present in your document, you can't save the document as an XML document unless you mark the Allow Saving as XML Even If Not Valid check box in the XML Options dialog (accessible from the bottom of the XML Structure task pane). You can still save the file as a Document Template (*.dot) or as a Word Document (*.doc) when there are validation errors. Until a document validates, it cannot be processed by other XML-aware applications.
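The note in step 4 mentions that name, title, and email can each appear only once in the author element, and that Word will still offer them after they have been applied. If you end up with data you want to double-check outside Word, a rough Python sketch like the one below can flag repeated children. It assumes the data is available as plain XML without namespaces, and it checks only this one rule; it is not a substitute for validating against the schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Child elements that, according to the text, may appear only once in author.
ONCE_ONLY = {"name", "title", "email"}


def repeated_children(author):
    """Return the once-only child tags that occur more than once."""
    counts = Counter(child.tag for child in author)
    return sorted(tag for tag in ONCE_ONLY if counts[tag] > 1)


# Sample values are made up for illustration.
good = ET.fromstring(
    "<author><name>Pat Lee</name><title>Writer</title>"
    "<email>pat@example.com</email></author>"
)
bad = ET.fromstring(
    "<author><name>Pat Lee</name><name>P. Lee</name>"
    "<email>pat@example.com</email></author>"
)

print(repeated_children(good))  # []
print(repeated_children(bad))   # ['name']
```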


Protecting the XML Template

After you have tagged all of the XML elements in the template and decided how you want to resolve any schema violations (by supplying acceptable defaults, or by leaving them to ensure that someone creating a document based on the template will have to supply valid contents for the elements), you could save your work, but because you are creating a document template, it makes sense to protect the XML tags first. By doing this you ensure that the tags will not be accidentally deleted when they are hidden from view. (For most users, you will want to hide the tags from view to make the document "friendlier.") To protect the template, follow these steps:

  1. Choose Tools, Protect Document to display the Protect Document task pane.

  2. Under Editing Restrictions, mark the Allow Only This Type of Editing in the Document check box, and leave the default option selected in the drop-down list below the check box (No Changes [Read Only]).

  3. Create exceptions where you want the user to be able to enter text. One by one, click a tag for each element in the document where text must be entered. This will select the area between the element's tags. Then, in the Protect Document task pane, mark the Everyone check box in the Groups list under Exceptions. For nested elements, you will usually want to create exceptions only for the innermost elements (for example, the name element within the author element). This ensures that text can be entered between the tags, while protecting the tags themselves.

  4. After you have marked all of the exceptions, scroll down to the Start Enforcement section, and click the Yes, Start Enforcing Protection button.

  5. In the resulting Start Enforcing Protection dialog box (see Figure 24.7), select the desired protection method. Password requires you to enter a password (twice) to restrict the users that are allowed to remove document protection. For many purposes this will be adequate. If your site is using Microsoft Information Rights Management technology, you can also choose the User Authentication option. This provides significantly stronger protection for your document. When you are finished, click OK.

    Figure 24.7. Specify the type of protection you want to use and enter a password.


  6. The Protect Document task pane now displays information about what regions the user can edit (see Figure 24.8). When you click in different areas of the document, the text at the top of the task pane updates to inform the user whether the area is editable.

    Figure 24.8. The Protect Document task pane tells the user whether he or she can edit at the location of the insertion point.


Note:

To preserve protection, you must save the document as an XML document (*.xml) with WordML included (the Save Data Only check box must be cleared in the Save As dialog box), as a Word document (*.doc), or as a document template (*.dot). If you save the document as an XML document with the Save Data Only check box marked, the protection information will be lost.


Displaying and Customizing Placeholder Text

By default, when you hide the XML tags in your document, placeholder text appears to help the user edit the document. Follow these steps to display the placeholder text.

  1. Display the XML Structure task pane. (If the Protect Document task pane is still displayed, click the Back button at the top of the task pane. Otherwise, choose View, Task Pane and if necessary, click the down arrow in the upper-right corner of the task pane and choose XML Structure from the Other Task Panes list that appears.)

  2. Click the XML Options link at the bottom of the task pane. In the resulting XML Options dialog box (see Figure 24.9), mark the Show Placeholder Text for All Empty Elements check box and click OK.

    Figure 24.9. The XML Options dialog box gives you options for customizing the behavior of your XML document.


  3. Clear the Show XML Tags in the Document check box in the XML Structure task pane (or press Ctrl+Shift+X). For many users this view may be easier to work with. By default, the empty XML tag pairs contain placeholder text (highlighted in purple) that lists the name of the element. If you marked the Highlight the Regions I Can Edit check box in the Protect Document task pane (refer back to Figure 24.8), the regions of the document that can be edited will be enclosed in beige square brackets, as shown in Figure 24.10. If an element is empty and placeholder text is not displayed, the square brackets will overlap, forming an I-beam shape.

    Figure 24.10. The XML tags are hidden and the default placeholder text appears instead.


  4. Click the Forward button to return to the Protect Document task pane.

Note:

If you clear the Show XML Tags in the Document check box without marking the Show Placeholder Text for All Empty Elements and Highlight the Regions I Can Edit check boxes, you will be unable to tell where to enter text in the document.


In some cases you might want to use more informative placeholder text. For example, the XML date format would display the date October 30, 2003 like this: 2003-10-30. You might want to provide placeholder text that reminds users of this format, or provide any other information that would help them in working with the template. To create custom placeholder text, follow these steps:

  1. Choose Tools, Unprotect Document (or click the Stop Protection button at the bottom of the Protect Document task pane) to temporarily remove protection from the document. Enter the document password if you are using password protection.

  2. Press Ctrl+Shift+X to display the XML tags if they are not already showing.

  3. Right-click the tag to which you want to add custom placeholder text, and choose Attributes in the context menu (you can also right-click on the element in the Elements in the Document list of the XML Structure task pane).

  4. In the Attributes for [element name] dialog box (see Figure 24.11), type the text you want to appear as a prompt for the element in the Placeholder Text text box. You can enter up to 253 characters. When you are finished, click OK. The text you entered is now displayed as a placeholder when the element is empty and XML tags are not displayed.

    Figure 24.11. Use the Attributes for [element name] dialog box to specify custom placeholder text.


Note:

To preserve custom placeholders, you must save the document as an XML document (*.xml) with WordML included (the Save Data Only check box must be cleared in the Save As dialog box), as a Word document (*.doc), or as a document template (*.dot). If you save the document as an XML document with the Save Data Only check box marked, the custom placeholders will be lost.
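The XML date format mentioned earlier (2003-10-30 for October 30, 2003) is the year-month-day form that a custom placeholder might remind users about. As a small illustration, the Python sketch below produces that form and checks whether a piece of text matches it. The function name is made up for this example, and the check covers only the plain YYYY-MM-DD form, which may be narrower than what a particular schema accepts.

```python
from datetime import date


def is_xml_date(text):
    """Return True if text is a real calendar date written as YYYY-MM-DD."""
    try:
        parsed = date.fromisoformat(text)
    except ValueError:
        return False
    # Round-trip so that only the exact YYYY-MM-DD spelling is accepted.
    return parsed.isoformat() == text


print(date(2003, 10, 30).isoformat())   # 2003-10-30
print(is_xml_date("2003-10-30"))        # True
print(is_xml_date("October 30, 2003"))  # False
print(is_xml_date("2003-13-01"))        # False (there is no month 13)
```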


At this point you have added the XML structure to your template, protected the template, hidden the XML tags, and added custom placeholders. Now all that's left to do is save the template in the usual way (the template will remain a *.dot file). After you've saved the template, you can put it into production, as described in the next section.

Creating a Source Document Based on the XML Template

When you work with an XML template, you'll typically want to use it to create one document that you'll fill in, save, and then use as both your print output and as the basis for other documents that you'll create using File, Save As with various transforms to create different types of output.

To create this source document, start a new document based on your XML template (choose File, New, click On My Computer in the New Document task pane, select the template in the Templates dialog box, and click OK). You'll see a document that looks much like a normal Word document, with the editable regions highlighted and/or enclosed in brackets (depending on whether the Highlight the Regions I Can Edit check box is marked in the Protect Document task pane, and on whether the document is displaying placeholder text). Each of the regions that you can edit is between a pair of hidden XML tags. The tags themselves are protected, so you don't need to worry about accidentally deleting a tag.

Start filling in the editable regions. Because the XML schema of the document may specify constraints on the content of the tags, you might see schema violations (purple wavy lines) as you fill in the various elements. To investigate a violation, follow these steps:

  1. Display the XML Structure task pane.

  2. In the Elements in the Document list, point to the yellow schema violation icon. A ScreenTip appears with a description of the problem (see Figure 24.12).

    Figure 24.12. A ScreenTip gives you information about the nature of the violation. This tip shows the effect of marking Show Advanced XML Error Messages in the XML Options dialog (compare to Figure 24.5).


Note:

You can also investigate a schema violation by right-clicking the purple wavy line in the document instead of using the XML Structure task pane.

The amount and type of detail about the violation that Word provides is controlled by the Show Advanced XML Error Messages check box in the XML Options dialog.
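One common reason for a violation while you are still filling in a document is an element that has been left empty. Whether an empty element is actually an error depends on the schema, so treat the following Python sketch only as a rough aid for spotting unfilled elements in data you have saved as plain XML. The technote and summary names and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def empty_leaves(root):
    """Return the tags of childless elements that contain no visible text."""
    return [
        element.tag
        for element in root.iter()
        if len(element) == 0 and not (element.text or "").strip()
    ]


# Hypothetical partially filled-in technical note.
draft = ET.fromstring(
    "<technote><author><name>Pat Lee</name><title/>"
    "<email> </email></author><summary>Ready.</summary></technote>"
)

unfilled = empty_leaves(draft)
print(unfilled)  # ['title', 'email']
```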


If your document contains regions of "free" text (for example, in an annual report, a technical manual, or even the descriptive text of a catalog), you will want to tag any elements that you add to this section. To do this, follow these steps:

  1. Select the text that you want to tag.

  2. Either display XML tags (Ctrl+Shift+X) or display the XML Structure task pane.

  3. If XML tags are displayed, you can right-click the selection. If the attached schema contains elements that can be applied to the selection, the Apply XML Element command will appear in the context menu. Point to this command to display the submenu of available elements, and then click the one you want to tag (see Figure 24.13). If you're using the XML Structure task pane, select the item from the Choose An Element To Apply To Your Current Selection box in the XML Structure task pane.

    Figure 24.13. Select the element you want to tag in the Apply XML Element submenu.


Saving Your Source Document

When you have finished editing your source document, follow these steps to save it:

  1. Choose File, Save As.

  2. In the Save As dialog box, select either Word Document (*.doc) or XML Document (*.xml) in the Save As Type drop-down list. This choice depends on two factors:

    • If the document will be processed by other XML-aware applications, you must save it as an XML document.

    • If the document is partially edited and contains schema violations, you must save it as a Word document (unless you have allowed the option of saving documents that contain schema violations as XML documents).

  3. Navigate to the folder in which you want to save the file, and type the filename in the File Name text box.

  4. If you have chosen XML Document (*.xml) in the Save As Type drop-down list, the Apply Transform and Save Data Only check boxes will appear in the Save As dialog box (see Figure 24.14). Mark the Save Data Only check box if you want Word to discard all of the information contained in WordML. Because this includes document metadata such as tracked changes, XML placeholder text, document properties, and document protection information (in addition to Word's formatting), you will rarely want to do this when saving your source document. Marking the Apply Transform check box activates the default transform and also allows you to select an alternate transform using the Transform button (which displays the Choose an XML Transform dialog box). In most cases you will want to save the source document "as is" to preserve all of its content and then apply any necessary transforms later.

    Figure 24.14. You can select options for saving an XML document in the Save As dialog box.


    Tip:

    If the Save Data Only check box is marked by default in the Save As dialog, you can change this by going to the XML Options dialog and clearing the Save Data Only check box.


  5. When you've made your selections, click the Save button.
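If you later need to tell whether a given .xml file was saved with WordML included or as data only, one rough check is to look at the root element. The Python sketch below assumes that a full WordML file has a wordDocument root element in the Word 2003 WordprocessingML namespace shown in the code; that detail is not stated in this section, so confirm it against a file saved from your own copy of Word. The function name and sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace used by Word 2003 for WordML documents.
WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def is_wordml(xml_text):
    """Return True if the root element is wordDocument in the WordML namespace."""
    root = ET.fromstring(xml_text)
    return root.tag == "{%s}wordDocument" % WORDML_NS


# Tiny stand-ins for the two kinds of saved file.
full_save = '<w:wordDocument xmlns:w="%s"><w:body/></w:wordDocument>' % WORDML_NS
data_only = "<technote><author><name>Pat Lee</name></author></technote>"

print(is_wordml(full_save))  # True
print(is_wordml(data_only))  # False
```

A data-only file contains just the elements from your own schema, which is why tracked changes, placeholders, and protection settings have nowhere to live in it.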

Using Transforms to Output Your Source Document in Different Ways

After you have created a source document, you can use transforms (also referred to as XSLTs, for Extensible Stylesheet Language Transformations) to create additional views of your document. This process enables you to use one document for many purposes. For example, you could use the same source document to create a printed catalog, populate a database for an online catalog, and load short product descriptions into your company's order processing system. The process for doing this is simple once the style sheets that define the transforms have been created and installed on your computer.

Caution:

When you apply a transform to an XML document, any content not used by the transform is omitted in the resulting document. Thus you always want to take care when applying a transform to a document so that you don't unintentionally overwrite your source document with the transformed one.


To create an output document, follow these steps:

  1. Open the source document in Word and (if you want) edit it.

  2. Choose File, Save As.

  3. Type a new name for the file so that you don't overwrite your source document.

  4. Choose XML Document (*.xml) in the Save As Type drop-down list.

  5. Mark the Apply Transform check box.

  6. If you only use one transform, a default transform may be available. If a default transform is not available or it is not the one you want to use, click the Transform button. In the Choose an XML Transform dialog box, navigate to and select the transform file that you want to use, and click Open.

  7. Click the Save button. If you have marked Save Data Only or Apply Transform in the Save As dialog box, you will be prompted to confirm that you want to do this (see Figure 24.15). Click Continue to confirm your choice.

    Figure 24.15. Click Continue to proceed with your save.

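To see why content not used by a transform disappears from the output, consider the following Python sketch. It is not XSLT and it is not what Word runs; it is a plain-Python stand-in that copies only two elements from a source tree into a new one, which mimics the effect of a transform that selects only part of a document. All element names other than author, name, title, and email are hypothetical, as are the sample values.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document in data-only form.
SOURCE = (
    "<technote><author><name>Pat Lee</name><title>Technical Writer</title>"
    "<email>pat@example.com</email></author>"
    "<summary>Explains the new installer.</summary></technote>"
)


def contact_card(source_root):
    """Build a new tree that keeps only the author's name and email."""
    card = ET.Element("contact")
    for tag in ("name", "email"):
        found = source_root.find("author/" + tag)
        if found is not None:
            ET.SubElement(card, tag).text = found.text
    return card


source_root = ET.fromstring(SOURCE)
output = ET.tostring(contact_card(source_root), encoding="unicode")
print(output)
# <contact><name>Pat Lee</name><email>pat@example.com</email></contact>
```

The title and summary are absent from the output but still present in the source tree, which is the reason for saving transformed output under a new filename.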



Sams Teach Yourself Microsoft Office Word 2003 in 24 Hours
ISBN: 067232556X
Year: 2003
Pages: 315
Authors: Heidi Steele