Getting to Know the Office Open XML Formats


This section will show you how to access the ZIP package for an Office Open XML Format document and how to begin to make sense of what you find there. For the best results, I suggest that you take each subsection that follows step by step and be sure you understand and feel comfortable with the content before continuing on to the next.

Breaking into Your Document

Because each of your 2007 Office release Word, Excel, and PowerPoint documents is actually a ZIP package in disguise, you can just change the file extension to .zip to access all of the files in the package. There are a few ways to go about this.

Caution 

If you have software installed that extracts files from a ZIP package, you might be able to look at the files in the ZIP package by using that extraction software, without first changing the file extension. However, you’re unlikely to see the folder structure of the package when you do this, which is an essential part of the package integrity. Changing the extension takes just a second and enables you to view and manage your files in Windows Explorer, for familiar file access options.

  • Append the .zip file extension to the existing file name. To do this in Windows Explorer, or on the Windows desktop, just click to select the file and then click again on the file name (this is slower than a double-click) to enter editing mode for the file name. For the same result, you can also press F2 once you select the file. Leave the existing file name and extension intact and just add .zip, so that you can open the package in Windows Explorer to see its content.

    When you change the file extension in Windows Explorer or from the Windows desktop, you’ll see a warning that changing the file extension may make the file unusable. Just disregard this message and click Yes to confirm that you want to continue. (However, to protect your files, it’s a good idea to save a copy of the document with its original file extension before appending the .zip extension or beginning to make changes in the XML.)

    Note 

    Renaming the file to a .zip extension is easier to do if you are viewing file extensions. If you don’t see the extension for your Office Open XML Format file (such as .docx), change your setting in Windows Explorer to view all file extensions. To do this in Windows Vista, in any Windows Explorer window, click the Organize button and then click Folder And Search Options. On the View tab, turn off the option Hide Extensions Of Known File Types and then click OK. To find the same option when working in Windows XP, in Windows Explorer, on the Tools menu, click Folder Options and then click View.

  • You can save a copy of your file with the .zip extension, while it’s open in its source program, to bypass the step of changing the extension later. In the Save As dialog box, type the entire file name followed by .zip inside quotation marks. The file is still saved in whatever format is listed (so you still need to choose a macro-free or macro-enabled file format, for example, as needed), just as if you saved it first and appended the .zip extension later. The only difference is that the file’s ZIP package is immediately available to you without taking an additional step after you close the file. For example, to save a file named sample.docx as sample.zip, type “sample.zip” in the File Name box of the Save As dialog box.
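If you’re comfortable with a little scripting, you can also peek inside the package without renaming anything, because ZIP tools look at the file’s content rather than its extension. The following is a minimal sketch that uses Python’s standard zipfile module. The function name list_package_parts and the in-memory demo package are illustrative assumptions, not part of the 2007 Office release.

```python
import io
import zipfile


def list_package_parts(package):
    """Return the names of all files in an Office Open XML package.

    `package` may be a file path (for example "sample.docx") or a
    file-like object; the file extension does not need to be .zip.
    """
    with zipfile.ZipFile(package) as zf:
        return zf.namelist()


# Demonstration with a tiny in-memory stand-in for a real document.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("[Content_Types].xml", "<Types/>")
    zf.writestr("_rels/.rels", "<Relationships/>")
    zf.writestr("word/document.xml", "<document/>")
demo.seek(0)

for name in list_package_parts(demo):
    print(name)
```

With a real document, you would pass the path instead, such as list_package_parts("sample.docx").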

Inside Out: That ZIP Package Is Still a Document

When you’re editing the files in the ZIP package, you might not want to spend the time switching back and forth between the Office Open XML file extension (such as .docx) and the .zip extension. Well, you don’t have to!

From the Open dialog box in Word, Excel, or PowerPoint, you can open documents that belong to the applicable program even when they’re using the .zip file extension. To see your ZIP package file, just select All Files from the Files Of Type drop-down list beside the File Name box and then select and open the file as you would when using its original extension. There’s nothing else to it. Word, Excel, and PowerPoint know that the Office Open XML Formats are ZIP packages and read the XML within those packages whether the file is saved using .zip or a file extension that belongs to the program.

Note that you can also open the ZIP package in the appropriate program through the Open With options available when you right-click the ZIP package on the Windows desktop or in Windows Explorer. If you do this, just be careful not to accidentally set the applicable program as the default for opening this file type, or you’ll add an extra step for yourself every time you want to access the document parts in the ZIP package.

However, for ease of use as well as for sharing documents with Microsoft Office users of all experience levels, it’s a good idea to make sure the file extension is changed back to its original state once you’ve finished editing the files in the ZIP package.

The Office Open XML File Structure

Once you change the file name to have the .zip extension, open the file in Windows Explorer. The example that follows walks you through the ZIP package of a simple Word document, originally saved with the .docx extension.

When you first view the ZIP package for a Word document in Windows Explorer, it will look something like the following.

image from book

Note that, at the top level of the ZIP package that you see in the preceding example, Excel and PowerPoint files look very similar except that the folder named word in the example is named xl or ppt, respectively, for the applicable program.

  • The docProps folder is exactly what it sounds like: it contains the files for the document properties and application properties, ranging from author name to word count and software version.

  • The _rels folder contains a file named .rels, which defines the top-level relationships between the folders in the package. Note that additional relationship files may exist, depending on the document content, for files within a specific folder of the package (explained later in this section).

    The relationship files are among the most important in the package because, without them, the various document parts in the package don’t know how to work together.

  • The file [Content_Types].xml also exists at the top level of every document’s ZIP package. This file identifies the content types included in the document. For example, in a Word document, this list typically includes such things as the main document, the fonts, styles, Theme, document properties, and application properties. Files with additional content types, such as diagrams or other graphics, will have additional content types identified.

Exploring a bit further, when you open the folder named word, you see something similar to the following image.

image from book

  • A new Word document contains XML files for the fonts, styles, settings (such as the saved zoom setting and default tab stops), and Web settings, whether or not formatting related to these items has been applied in the document. If headers, footers, footnotes, graphics, comments, or other content types have been added, each of them will have its own XML document part as well.

    In the ZIP packages for Excel and PowerPoint files, you’ll see a similar organization, with XML document parts for file components (such as styles.xml in Excel or tableStyles.xml in PowerPoint). Additionally, the xl folder in an Excel ZIP package contains a worksheets folder by default, because there is a separate XML document part for each sheet in the workbook. The ppt folder in a PowerPoint ZIP package also contains folders named slides, slideLayouts, and slideMasters, by default.

  • In addition to the XML document parts you see in the preceding image, notice the theme folder, which exists in the program-specific folder (word, xl, or ppt) for Word, Excel, and PowerPoint ZIP packages. The file in this theme folder stores all document Theme settings applied in the document. It is because of this file that you’re able to share custom Themes by sharing documents, using the Browse For Themes feature at the bottom of each Themes gallery.

  • The _rels folder inside the program-specific folder defines the relationships between the parts inside the program-specific folder. The relationship file contained in this _rels folder is called document.xml.rels for Word documents, presentation.xml.rels for PowerPoint documents, and workbook.xml.rels for Excel documents.

    Depending on the content in a given folder, its _rels folder might contain more than one file. For example, if a header exists in a Word document, the word folder contains a part named header.xml, and its _rels folder contains a file named header.xml.rels.

  • Content in your document from other sources (such as embedded objects, media files, or macros) is stored either in its original format (as is the case for picture files) or as a binary file (.bin file extension). Because of this, you can save time on many tasks related to working with media files (such as pictures), as discussed in “Editing and Managing Documents Through XML” on page 1164.

  • As mentioned at the beginning of this section, the ZIP package shown in the two preceding images is for a .docx file. Remember that the x at the end of the file extension indicates that it’s a macro-free file format. If this were, instead, the package for a .docm file, you would also see a file named vbaData.xml and one named vbaProject.bin.
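The macro-related file names mentioned in the last bullet give you a quick way to tell the two kinds of packages apart from a list of part names. The following is a small sketch under that assumption. The function name contains_macro_parts and both part listings are hypothetical examples.

```python
import posixpath

# File names the text associates with macro-enabled packages.
MACRO_FILE_NAMES = {"vbaProject.bin", "vbaData.xml"}


def contains_macro_parts(part_names):
    """Return True if any part in the package has a macro-related file name."""
    return any(
        posixpath.basename(name) in MACRO_FILE_NAMES for name in part_names
    )


# Hypothetical part listings for a .docx and a .docm package.
docx_parts = ["[Content_Types].xml", "_rels/.rels", "word/document.xml"]
docm_parts = docx_parts + ["word/vbaData.xml", "word/vbaProject.bin"]

print(contains_macro_parts(docx_parts))  # False
print(contains_macro_parts(docm_parts))  # True
```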

If you return to the top level of the ZIP package and then open the docProps folder, the following is what you’ll see.

image from book

By default, this folder contains the files app.xml (for application properties such as word count and program version) and core.xml (for document properties, such as the author and subject shown in the Document Properties summary information). Additionally, if you use the options to save a preview picture or a thumbnail for your document, you see a thumbnail image file in the docProps folder. For Word and Excel, this will be a .wmf file, and for PowerPoint it will be a JPEG file.
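To see how the author and subject are stored, you can read them out of core.xml with a few lines of Python. This is a sketch only: it assumes the author is held in the Dublin Core creator element and the subject in the subject element, and the sample XML is made up for illustration (a real core.xml usually holds more properties). The function name read_summary_properties is hypothetical.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, assumed here for the creator and subject elements.
DC_NS = "http://purl.org/dc/elements/1.1/"


def read_summary_properties(core_xml):
    """Return the author and subject stored in a core.xml part, if present."""
    root = ET.fromstring(core_xml)

    def text_of(local_name):
        element = root.find(f"{{{DC_NS}}}{local_name}")
        return element.text if element is not None else None

    return {"author": text_of("creator"), "subject": text_of("subject")}


# Made-up sample content for illustration only.
SAMPLE_CORE_XML = (
    "<cp:coreProperties"
    ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:creator>Sample Author</dc:creator>"
    "<dc:subject>Sample subject</dc:subject>"
    "</cp:coreProperties>"
)

print(read_summary_properties(SAMPLE_CORE_XML))
```

For a real document, the XML would come from the package itself, for example zipfile.ZipFile("sample.docx").read("docProps/core.xml").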

Note 

If you’re running the 2007 Office release on Windows Vista, you’ll find an option in the Save As dialog box in Word or Excel to save a thumbnail image of your document. In PowerPoint, or in all three programs when running Windows XP, you’ll see the option Save Preview Picture in the Document Properties dialog box.

Taking a Closer Look at Key Document Parts

Let’s take a look at the XML contained in a few of the essential document parts, to help accustom you to reading this file content.

The image you see below is the [Content_Types].xml file for the sample ZIP package shown under the preceding heading, as seen in Windows Explorer.

  <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
  <Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
    <Default Extension="wmf" ContentType="image/x-wmf" />
    <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml" />
    <Default Extension="xml" ContentType="application/xml" />
    <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml" />
    <Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml" />
    <Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml" />
    <Override PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml" />
    <Override PartName="/word/theme/theme1.xml" ContentType="application/vnd.openxmlformats-officedocument.theme+xml" />
    <Override PartName="/word/fontTable.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml" />
    <Override PartName="/word/webSettings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml" />
    <Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml" />
  </Types>

  • The first line that you see in this or any XML file in an Office Open XML Format ZIP package will look very much like the first line in the preceding image. This line simply defines the type of XML structure being used.

  • Notice that the second line, which begins <Types, is the first half of a paired code for which the end code is at the bottom of this document. All other lines in this file are the definitions of the content types in this document.

    • On the second line, inside the Types code, you see xmlns followed by a URL. The reference xmlns refers to an XML namespace, which is a required component in several document parts. Technical though this term might sound, a namespace is nothing more than a way to uniquely identify a specified item. The reason for this is that there can be no ambiguous names in the ZIP package (that is, the same name can’t be used to refer to more than one item). So, the namespace essentially attaches itself to the content it identifies to become part of that content’s name.

  • It’s standard to use a Web address as the namespace, but note that the file doesn’t attempt to read any data from the specified address. In fact, if you try to access some of the URLs you see in the files of an Office Open XML ZIP package, you’ll find that some are not even valid addresses. Typically, the address in a namespace identifies the location of the source schema or other definitions used to define the structure of the items assigned to that namespace, and the Web page associated with that address may actually contain those definitions. But any URL can be used as a namespace; the address itself is actually irrelevant to the code.

  • For the lines between the paired Types codes, notice that each defines one of the document parts you saw in the images of this sample ZIP package, under the preceding heading.

    • The first three lines in that group define the three file extensions included in this particular package, .rels (the relationship files), .wmf (the Windows metafile picture used for the document thumbnail), and .xml.

    • The remaining lines in that group, each of which begins with Override PartName, define the content type for each of the XML document parts that you saw in the word and docProps folders for this ZIP package. Take a look at just the first of the Override PartName lines, shown below. This one is for the main document content: the file document.xml that resides in the word folder.

       <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml" />

  • Notice that the PartName value that appears in quotation marks is actually the path to the specified file within the ZIP package. The ContentType value that appears in quotation marks, as the second half of that line of code, refers to a content type defined in the applicable schema.
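If you want to explore a content types file programmatically, the points above can be sketched in Python with the standard xml.etree.ElementTree module. The sample below is a shortened copy of the file discussed here, and the function name read_content_types is an illustrative assumption. Notice how the parser reports the root element’s name with the namespace attached, which is the “namespace becomes part of the name” idea in action.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# A shortened copy of the sample [Content_Types].xml discussed above.
SAMPLE_CONTENT_TYPES = (
    f'<Types xmlns="{CT_NS}">'
    '<Default Extension="rels"'
    ' ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml"'
    ' ContentType="application/vnd.openxmlformats-officedocument'
    '.wordprocessingml.document.main+xml"/>'
    "</Types>"
)


def read_content_types(xml_text):
    """Return (defaults, overrides) dictionaries from a content types file.

    defaults maps a file extension to a content type;
    overrides maps a part path to a content type.
    """
    root = ET.fromstring(xml_text)
    defaults = {
        element.get("Extension"): element.get("ContentType")
        for element in root.findall(f"{{{CT_NS}}}Default")
    }
    overrides = {
        element.get("PartName"): element.get("ContentType")
        for element in root.findall(f"{{{CT_NS}}}Override")
    }
    return defaults, overrides


# The namespace is reported as part of the element's name.
print(ET.fromstring(SAMPLE_CONTENT_TYPES).tag)

defaults, overrides = read_content_types(SAMPLE_CONTENT_TYPES)
print(defaults)
print(overrides)
```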

The following image shows you the content of the .rels file in the top-level _rels folder shown earlier for the sample ZIP package.

   <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
   <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
     <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml" />
     <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/thumbnail" Target="docProps/thumbnail.wmf" />
     <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml" />
     <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml" />
   </Relationships>

Note 

The .rels file should open without issue in Internet Explorer. But, if this doesn’t work for you, append the .xml file extension to a copy of the .rels file, just for viewing purposes. Also note that, when in the ZIP package, files will only open in their default assigned program. To be able to open a document part in both Internet Explorer and Notepad, as needed, copy the file out of the ZIP package. Then, right-click the file and point to Open With to select the program you need.

  • Notice that, although the content of the .rels file is very different from the content of the [Content_Types].xml file, the concept of the structure is the same. That is, the first line defines the XML standard being used, and the second line opens the paired code that stores the core file content and specifies a namespace for the content that appears between the lines of the paired code.

  • Take a look at one of the relationship definitions from the .rels file: the one for the main document.xml document part. Notice that each relationship contains three parts: the ID, the Type, and the Target.

     <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml" />

    • An ID is typically named rId# (rId followed by a number). This structure is not required, however, so you might occasionally see relationships with different IDs.

    • The Type uses a type defined in the applicable schema, which appears as a Web address. As with an XML namespace, the document doesn’t need to read data from that address. However, in this case, the Type is a specified element of the applicable schema and does need to be a relationship type recognized by the Office Open XML structure.

    • The Target, as you likely recognize, is the address within the package, where the referenced file appears. When you create a relationship yourself, it’s essential that this be correct, because the relationship will do no good if it can’t find the specified file.

Depending on the content in your files, you might run across defined relationships in your .rels files that aren’t used to specify files in the ZIP package and therefore might take on a slightly different structure for the relationship target. For example, notice the following relationship from a document.xml.rels file for a document that contains a hyperlink to the Microsoft home page.

 <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink" Target="http://www.microsoft.com" TargetMode="External" />

Though the relationship ID and Type have the same structure as a relationship to a document part, notice that the target in this case is to an external hyperlink instead of a file in the package.

When you open a file in its originating program (Word, Excel, or PowerPoint), keep in mind that the .rels files are the first place the program looks to know how to put the pieces together for the purpose of opening that file.
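To make the three-part structure concrete, here is a minimal Python sketch that reads the ID, Type, and Target from relationship XML and flags external targets. The function name read_relationships is hypothetical. For brevity, the sample combines the main document relationship and the hyperlink relationship in one file, although in a real package they live in different .rels files.

```python
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Illustrative sample; a real package keeps these in separate .rels files.
SAMPLE_RELS = (
    f'<Relationships xmlns="{REL_NS}">'
    f'<Relationship Id="rId1" Type="{OFFICE_REL}/officeDocument"'
    ' Target="word/document.xml"/>'
    f'<Relationship Id="rId4" Type="{OFFICE_REL}/hyperlink"'
    ' Target="http://www.microsoft.com" TargetMode="External"/>'
    "</Relationships>"
)


def read_relationships(xml_text):
    """Return one dictionary per Relationship element in a .rels file."""
    root = ET.fromstring(xml_text)
    relationships = []
    for rel in root.findall(f"{{{REL_NS}}}Relationship"):
        relationships.append(
            {
                "Id": rel.get("Id"),
                "Type": rel.get("Type"),
                "Target": rel.get("Target"),
                # True only when TargetMode="External" is present.
                "External": rel.get("TargetMode") == "External",
            }
        )
    return relationships


for relationship in read_relationships(SAMPLE_RELS):
    print(relationship["Id"], relationship["Target"], relationship["External"])
```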

Building a Basic Word Document from Scratch

The document shown in the preceding portions of this section is a simple Word document with all the defaults you get when you use Word to create a new document in the DOCX file format. Now, it’s time to build a .docx file yourself, without using Word.

If you’re thinking about skipping over the rest of this section because it sounds either too complicated or unnecessary for your needs, please wait. This exercise is important for three reasons.

  1. You might be amazed at how easy this is to do. And, discovering the simplicity for yourself can help you master the tasks in this chapter that you want to learn.

  2. You can find all the XML code you need for this section in a provided sample file (explained in a note that precedes the first part of the following exercise), if you prefer not to type out the XML for yourself.

  3. This exercise is included early in the primer because doing a similar exercise when I was first learning about the Office Open XML Formats was the most helpful thing I did toward understanding the basics of how the parts in an Office Open XML ZIP package work and fit together.

That said, the exercise that follows walks you through creating a simple, essentials-only Word document. Though it’s good practice for anyone creating Office Open XML Format documents through code to include all of the defaults that the source program (Word, Excel, or PowerPoint) includes when it creates a new document, only a few of those files are actually required for the source program (Word in our example) to be able to recognize and open the file. If you create a file that only contains the required bare basics, Word will recognize the missing pieces and add the document parts and relationships needed as you begin to use Word features in your document.

Every Office Open XML document requires [Content_Types].xml as well as a top-level _rels folder containing the .rels file. Each file also requires its program-specific folder with the main program-specific content file that goes in that folder (document.xml in a folder named word, in the case of a Word document). For a Word document, such as we’re about to build, these are the only three files you must have in your ZIP package to create a .docx file that Word will recognize and open without an error. In Excel and PowerPoint, a few other files are required.

  • An Excel .xlsx file also requires the worksheets folder inside the xl folder, with an XML document part for at least one sheet. This is because an Excel workbook must contain at least one worksheet. Because of that worksheets folder, the xl folder also needs its own _rels folder containing a workbook-level .rels file that defines the relationship between worksheets and workbook.

  • A PowerPoint .pptx file also requires the slideLayouts, slideMasters, and theme folders (each of which contains required files), because a presentation must contain a Theme, at least one slide master, and at least one slide layout. These folders, all of which reside in the ppt folder, also require a _rels folder in that ppt folder to define the relationships between the presentation, slide master, and Theme. Note that the master and layout folders contain their own _rels folders, which is why there is no reference to the slide layouts in the presentation-level relationships file.
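The three-file minimum for a Word document lends itself to a quick check. The sketch below assumes you already have a list of the file names in a package (for example, from a ZIP library) and reports which of the three required files are missing. The names REQUIRED_WORD_PARTS and missing_word_parts are hypothetical, and the check covers only the Word case described above.

```python
# The three files the text identifies as required for a .docx package.
REQUIRED_WORD_PARTS = (
    "[Content_Types].xml",
    "_rels/.rels",
    "word/document.xml",
)


def missing_word_parts(part_names):
    """Return the required Word package files that are not in part_names."""
    present = set(part_names)
    return [name for name in REQUIRED_WORD_PARTS if name not in present]


# A hypothetical package that is missing its top-level relationships file.
incomplete = ["[Content_Types].xml", "word/document.xml"]
print(missing_word_parts(incomplete))  # ['_rels/.rels']
```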

To create your first Word document from scratch, you’ll need to create the files [Content_Types].xml, .rels, and document.xml, and place them in the correct folder structure. The steps that follow will walk you through getting this done.

Note 

In the sample files that are included on this book’s CD, find the Copy XML.txt file, which contains all of the code in this and subsequent sections of this chapter, that you can copy into the files you create in Notepad if you prefer not to type the XML yourself.

Create the Folder Structure

On your Windows desktop, or in any convenient location, create a folder named First Document (or any name you like; this name is for identification purposes in this exercise only). This folder will store the structure for your new .docx file. In that folder, create two subfolders, one named _rels and the other named word. It is essential that these two folders are correctly named. When you’re done with this step, your folder structure should look like the following image.

image from book
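If you would rather script this step, the same folder structure can be created with Python’s pathlib module. This is a sketch only: the function name create_package_folders is hypothetical, and the demonstration uses a temporary directory so that nothing is left behind on your disk.

```python
import tempfile
from pathlib import Path


def create_package_folders(parent, name="First Document"):
    """Create the root folder with its _rels and word subfolders."""
    root = Path(parent) / name
    for subfolder in ("_rels", "word"):
        (root / subfolder).mkdir(parents=True, exist_ok=True)
    return root


# Demonstration in a throwaway temporary directory.
with tempfile.TemporaryDirectory() as scratch:
    root = create_package_folders(scratch)
    created = sorted(child.name for child in root.iterdir())

print(created)  # ['_rels', 'word']
```

To create the folders on your desktop instead, pass that location as the parent argument.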

Create the Main Document File

The main document file, document.xml, needs to reside in the word folder you just created. To create this file, do the following.

  • Open Notepad and save a new file as document.xml, inside the word folder you created. Be sure to type the .xml file extension as part of the file name, so that Notepad doesn’t save the file in the .txt file format. (Notepad will save the file correctly when you type document.xml in the File Name box, even though the Save As Type list indicates a .txt file.)

  • In Notepad, add the following code to the document.xml file. This code is shown below first in Internet Explorer, so that you can see the organization of it, and then in Notepad, to see how it looks without the tree structure applied.

    If you’re typing this text from scratch, it’s easier to copy from the version shown in the tree structure. If you do, note that you need a space between each XML namespace (xmlns definition) because those definitions all appear together inside the same code (the same pair of angle brackets). However, you don’t need spaces between any codes that are enclosed in their own pair of angle brackets. Remember, however, that you can copy this code from the sample file Copy XML.txt, if you prefer.

       <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
       <w:document xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
           xmlns:o="urn:schemas-microsoft-com:office:office"
           xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
           xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
           xmlns:v="urn:schemas-microsoft-com:vml"
           xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
           xmlns:w10="urn:schemas-microsoft-com:office:word"
           xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
           xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
         <w:body>
           <w:p>
             <w:r>
               <w:t>This is the first Word document I've created from scratch.</w:t>
             </w:r>
           </w:p>
           <w:sectPr>
             <w:pgSz w:w="12240" w:h="15840" />
             <w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="720" w:footer="720" w:gutter="0" />
             <w:cols w:space="720" />
             <w:docGrid w:linePitch="360" />
           </w:sectPr>
         </w:body>
       </w:document>

Caution 

To accommodate the page layout for the book, code in the unstructured XML samples throughout this chapter may break to a new line in the middle of a term or use a hyphen to start a new line. When you view code in Notepad, it might appear to break in the middle of a word as well, but it won’t use hyphens. Remember that all of the code between a single paired code (such as the <w:document> code shown here) is considered a single line and should not get manual line breaks when you type the code.

If you are typing this code yourself, double-check your syntax against the structured version of the same code that appears along with each unstructured sample. If copying the code instead of typing it, do so from the sample file named Copy XML.txt referenced earlier.

 <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <w:document xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"><w:body><w:p><w:r><w:t>This is the first Word document I've created from scratch.</w:t></w:r></w:p><w:sectPr><w:pgSz w:w="12240" w:h="15840"/><w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="720" w:footer="720" w:gutter="0"/><w:cols w:space="720"/><w:docGrid w:linePitch="360"/></w:sectPr></w:body></w:document>

Once you’re satisfied that your code is accurate, you can save and close this file. Notice that this code contains items you saw in the example from the preceding chapter section.

  • The first line of code provides the XML version definition.

  • The second line is the open code for the overall document content, where the namespaces are defined. Notice that a document file contains multiple namespaces to cover different content types.

  • The document content in this file is the single-line paragraph of text contained inside the paired <w:body> code.

  • The last piece of content is the paired <w:sectPr> code, which you can see stores the basic section formatting (page setup) information. You can omit this information and the document will open in Word using default settings. Note that the formatting settings and values you see here are explained in the next section of this chapter.
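One way to double-check your typing is to let an XML parser read the file and hand back the paragraph text. The following Python sketch does that for a trimmed version of the document.xml content. As a simplification, the sample declares only the w namespace and omits the section formatting; the function name paragraph_texts is hypothetical. If the XML has a syntax error, such as a missing angle bracket, ET.fromstring raises a ParseError that names the line and column.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Trimmed sample: only the w namespace, and no <w:sectPr> block.
SAMPLE_DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
    "<w:t>This is the first Word document I've created from scratch.</w:t>"
    "</w:r></w:p></w:body></w:document>"
)


def paragraph_texts(document_xml):
    """Return the text of each <w:p> paragraph in a document.xml string."""
    root = ET.fromstring(document_xml)
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread across its <w:t> elements.
        pieces = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs


print(paragraph_texts(SAMPLE_DOCUMENT_XML))
```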

Let’s look at that document in one more format to help clarify the content. The image below shows the same document.xml file opened on the Tree View tab of the XML Notepad editor.

1160 Chapter 32 Office Open XML Essentials

image from book

Create the Content_Types File

In Notepad, save a file named exactly [Content_Types].xml to the root of your First Document folder. As with the document.xml file, following are two versions of the code that you need to add to this file, first shown in Internet Explorer so that you can clearly see the tree structure, and next shown as run-of-text, similar to the way code appears in Notepad.

   <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
   <Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
     <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml" />
     <Default Extension="xml" ContentType="application/xml" />
     <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml" />
   </Types>

 <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types"><Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/><Default Extension="xml" ContentType="application/xml"/><Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/></Types>

As you see, this is a very simple file. It contains the XML version statement at the top, followed by the open code named Types, in which all other codes in the file are nested and where the namespace for the content types is defined. After that, you see the following.

  • The only file extensions present in your basic Word document are .xml and .rels, so they are the only file extensions that require definition here.

  • The only part name that requires definition as a content type is the main document (document.xml) because that is the only document part currently included, aside from the two structure-related files [Content_Types].xml and .rels.
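The rule behind these two bullets is that every file in the package needs a content type, either through its file extension (a Default entry) or through its own part name (an Override entry). The Python sketch below checks a list of file names against that rule. The function name and the image1.png part are hypothetical, and the check is a simplification that skips [Content_Types].xml itself.

```python
import posixpath


def parts_without_content_type(part_names, default_extensions, override_names):
    """Return the parts covered by neither a Default nor an Override entry."""
    uncovered = []
    for name in part_names:
        if name == "[Content_Types].xml":
            continue  # the content types file does not describe itself
        basename = posixpath.basename(name)
        # Take the text after the last dot, so ".rels" yields "rels".
        extension = basename.rpartition(".")[2].lower() if "." in basename else ""
        if "/" + name in override_names or extension in default_extensions:
            continue
        uncovered.append(name)
    return uncovered


parts = [
    "[Content_Types].xml",
    "_rels/.rels",
    "word/document.xml",
    "word/media/image1.png",  # hypothetical extra part with no definition
]
print(parts_without_content_type(parts, {"rels", "xml"}, {"/word/document.xml"}))
```

In this example, only the picture is reported, because no Default entry exists for the png extension.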

Create the .rels File

The relationship file for this new document is the simplest of the three you need to create. In Notepad, create a new file and save it as .rels, inside the _rels subfolder you created within the First Document folder. Then, add the following content to that file (shown in both structured format and in Notepad run-of-text format).

   <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
   <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
     <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml" />
   </Relationships>

 <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/></Relationships>

In the preceding code, the XML version is defined (as it is in every .xml or .rels format file within the package); followed by the open code for the relationships content along with its namespace definition; followed by the single required relationship in this case, which is to the part named document.xml. See “The Office Open XML File Structure” on page 1150 for details on the three-part structure (ID, Type, and Target) of a relationship definition.
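Because a relationship is useless if its Target doesn’t match a real file, a quick check for mistyped targets can save you some head-scratching. The Python sketch below compares the targets from the top-level .rels file against the file names in the package. The function name unresolved_targets and the misspelled sample target are hypothetical, and the check assumes targets are written relative to the package root, as they are in the top-level .rels file.

```python
def unresolved_targets(relationships, part_names):
    """Return internal relationship targets that match no file in the package.

    `relationships` is a list of (target, is_external) pairs from the
    top-level .rels file; external targets (such as hyperlinks) are skipped.
    """
    parts = set(part_names)
    return [
        target
        for target, is_external in relationships
        if not is_external and target.lstrip("/") not in parts
    ]


package_files = ["[Content_Types].xml", "_rels/.rels", "word/document.xml"]
sample_relationships = [
    ("word/document.xml", False),
    ("word/documnet.xml", False),  # deliberately misspelled target
    ("http://www.microsoft.com", True),
]
print(unresolved_targets(sample_relationships, package_files))
```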

Compile and Open Your New Document

Once you save and close the .rels file, you can exit Notepad. You’re now ready to put your ZIP package together and open the file in Word, using the following steps.

  1. Open the First Document folder in Windows Explorer.

  2. Select the file [Content_Types].xml as well as the two subfolders (_rels and word).

  3. Right-click, point to Send To, and then click Compressed (Zipped) Folder.

  4. When the .zip folder is created, change the name (including the file extension) to First document.docx. Then, press Enter to set the new name and click Yes to confirm when you see the warning about changing file extensions.

Double-click to open your new Word document. It should open in Word without error. If it does not, see the Troubleshooting and Inside Out tips at the end of this section for help finding the problem.
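The same packaging steps can also be sketched in Python, which may be handy if you rebuild the package often. This is a minimal sketch, not a replacement for the steps above: the function name build_package is hypothetical, and it assumes the output file is saved outside the folder being zipped. Note that it zips the contents of the folder (the file and two subfolders), not the folder itself, just as you do when selecting the items in Windows Explorer.

```python
import os
import zipfile


def build_package(source_dir, output_path):
    """Zip the contents of source_dir into output_path (for example, a .docx).

    Assumes output_path is NOT inside source_dir.
    """
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as package:
        for folder, _subfolders, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                # Names inside the package are relative to the package root
                # and use forward slashes, regardless of operating system.
                part_name = os.path.relpath(full_path, source_dir)
                part_name = part_name.replace(os.sep, "/")
                package.write(full_path, part_name)
    return output_path


# Example call (folder and file names taken from the steps above):
# build_package("First Document", "First document.docx")
```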

Add More Content Types, Document Parts, and Relationships

Even though you didn’t add all of the default content types and relationships that Word adds to a new document, all Word functionality is available to your new file. Make any edit (you can even type just a space if you like) and then save the file while it’s open in Word. Then, close it, change the file extension to .zip, and take a look at what Word did to your files.

What you’ll find is that Word added the default files it provides when it creates a new .docx file, and it added the necessary content type and relationship definitions to go along with them. Review the changes that Word made to your file. Once you’re comfortable with the ZIP package content, you’re ready to start working directly with the XML behind your Office Open XML documents.
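If you want a quick before-and-after comparison, a short Python sketch like the following can list the files in a package without changing the file extension first. The function names list_parts and added_parts are hypothetical, and the file names in the example call are assumptions; the sketch only reports what it finds in your own files.

```python
import zipfile


def list_parts(package_path):
    """Return the sorted names of all files in the ZIP package."""
    with zipfile.ZipFile(package_path) as package:
        return sorted(package.namelist())


def added_parts(before, after):
    """Return the names present in 'after' but not in 'before'."""
    return sorted(set(after) - set(before))


# Example (assumes you kept a copy of the package from before Word saved it):
# before = list_parts("First document - original.docx")
# after = list_parts("First document.docx")
# for name in added_parts(before, after):
#     print("Added by Word:", name)
```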

Troubleshooting

How can I find the error when my ZIP package won’t open in Word, Excel, or PowerPoint?

When an Office Open XML Format document won’t open in Word, Excel, or PowerPoint, the problem can be as simple as a missing space, angle bracket, or another single character. But, when you have ZIP packages with multiple long files, how do you even begin to find the problem? Actually, in most cases you don’t have to: Word, Excel, or PowerPoint will do it for you.

When you try to open the file and an error message appears, click the Details button on the error message. In most cases, the precise location of the error will be listed, and the error type might be included as well. Take a look at the following example.

image from book

In this example, I left the quotation mark off following one of the namespace definitions in the document.xml part. Notice that the detail here shows you the document part, the line within that part, and the location in that line where the error occurs. See the Inside Out tip that follows for more on interpreting the location references.

Note that if you’re using Internet Explorer to view your XML document parts and Notepad to edit them, Internet Explorer will most likely be unable to display a part in the tree structure when that part contains an error. You can use this to your advantage. After you use the error detail to find the error and correct it in Notepad, try opening the part in Internet Explorer. If it opens, the error is corrected, and you can return the file to the ZIP package and change the file extension back to its original state.
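A comparable check can be sketched in Python, which reports whether a part is well formed and, if not, roughly where parsing stopped. The function name find_xml_error is hypothetical. Also treat the reported column as an approximation: Python’s parser counts in its own way, so the number may not match the position that Word or another tool displays.

```python
import xml.etree.ElementTree as ET


def find_xml_error(xml_bytes):
    """Return None if the XML parses, else (line, column, message)."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as error:
        # error.position is a (line, column) pair from the parser.
        line, column = error.position
        return line, column, str(error)
    return None


# Example (the path is an assumption; point it at the part you edited):
# with open("word/document.xml", "rb") as part:
#     problem = find_xml_error(part.read())
# print(problem if problem else "No syntax errors found.")
```

This checks only XML syntax. A part can be well formed and still be rejected by Word if its content isn’t valid for that part.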

See the Inside Out tip that follows for some help on how to locate the error in your code without any wasted time or effort.

Inside Out: Using XML Notepad and Word to Help Find Syntax Errors

Perhaps you tried to open a file in Word, as discussed in the preceding Troubleshooting tip, and got an error. Or, maybe you just created one of the XML parts for a new document, such as a document.xml file, and then tried to open it in Internet Explorer only to get a syntax error at that point.

The error message you see may indicate a line and position number, or it may indicate a line and column number. Note that column and position are not the same thing. Position is the easier of the two to identify, as it corresponds to characters.

One easy way to find the line and position number of the error is to try to open the file in XML Notepad, the free utility program mentioned earlier in this chapter (this is not the same as the Windows Notepad utility). So, if the Word error message tells you that the error occurred in the document.xml part, for example, or if the error occurred in Internet Explorer when you tried to open an individual XML part, you can try opening that document part in XML Notepad to instantly see the line and position number where the problem exists.

Keep in mind that everything within a paired code is considered a line of code. So, for example, in document.xml, line 2 refers to everything inside the paired <w:document> code. Line 1 is the code that indicates the XML version. If, for example, the XML Notepad error tells you that the error is located at line 2 and position 645, you’re looking for character 645 in the second line of code. Copy that line of code (you can open the part in Windows Notepad to do this) and paste it into a blank Word document. Then, open the Visual Basic Editor (Alt+F11), press Ctrl+G to open the Immediate window, and type the following code in that window. (You may want to turn off Word Wrap from the Format menu in Notepad before copying text to Word, to avoid copying unwanted formatting marks.)

 ActiveDocument.Characters(645).Select

Replace 645, of course, with the position of the error in your code. Press Enter at the end of that line of code and then switch back to the document (Alt+F11), and you’ll see the character causing the error selected on screen. No fuss, no muss, and no tearing your hair out because you can’t find the error when you look at the amorphous blob of code that appears in Windows Notepad.
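If you’d rather not paste the code into Word, the same lookup can be sketched in Python. This is only an illustration under two assumptions that you should verify against your own error message: that the reported line corresponds to a line as separated by line breaks in the file, and that both line and position are counted starting from 1. The function name show_position is hypothetical.

```python
def show_position(text, line, position, context=20):
    """Return (character, snippet) for a 1-based line and position.

    The snippet includes up to 'context' characters on each side,
    which makes the problem character easier to spot.
    """
    lines = text.splitlines()
    target = lines[line - 1]      # convert the 1-based line to a list index
    index = position - 1          # convert the 1-based position to an index
    start = max(0, index - context)
    snippet = target[start:index + context + 1]
    return target[index], snippet


# Example (the path, line, and position are assumptions for illustration):
# with open("word/document.xml", encoding="utf-8") as part:
#     character, snippet = show_position(part.read(), 2, 645)
# print(repr(character), "in", repr(snippet))
```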




2007 Microsoft Office System Inside Out
ISBN: 0735623244
Year: 2007
Pages: 299