Over the last several decades, organizations have created thousands, if not millions, of documents, resulting in large volumes of unstructured content. SharePoint Server 2007 includes functionality to manage this existing content and to organize future content. In addition, it can serve as an Official File Repository, known as a Records Center. This Records Center enables organizations to meet legal and regulatory needs for managing business records. Finally, SharePoint Server can be used to manage Web content, effectively allowing an organization to manage all content, whether documents or Web pages, with the same set of tools.
Content types allow you to separate the declaration of list metadata from the list itself so you can reuse the same metadata in more than one list. List metadata is the collection of fields associated with the columns in the list. Content types consist of site columns, which in turn are bound to fields. To understand content types in the context of document management, it is helpful to think of each document as an item in a list in which the list columns map to document properties. This is a fundamental concept: document properties map directly to site column definitions. Windows SharePoint Services uses the site column definitions to create document properties, to copy data to and from documents as they move into and out of SharePoint document libraries, to associate information management policies and templates with documents, and to manage the state of workflow instances that may be associated with a given document. This ability to capture workflow state extends the scope of content types to include document behavior as well as static properties.
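The separation between a content type and the lists that use it can be illustrated with a small model. This is only a sketch under stated assumptions: the classes SiteColumn, ContentType, and DocumentLibrary, and the names "Contract", "Legal Documents", and "Sales Agreements", are hypothetical and are not part of the SharePoint object model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SiteColumn:
    """A reusable column definition (hypothetical model, not a SharePoint class)."""
    name: str
    data_type: str


@dataclass
class ContentType:
    """A named bundle of site columns, declared apart from any list."""
    name: str
    columns: List[SiteColumn] = field(default_factory=list)


@dataclass
class DocumentLibrary:
    """A list of documents whose columns come from its attached content types."""
    title: str
    content_types: List[ContentType] = field(default_factory=list)

    def column_names(self) -> List[str]:
        # Collect column names across all attached content types, without repeats.
        seen: List[str] = []
        for content_type in self.content_types:
            for column in content_type.columns:
                if column.name not in seen:
                    seen.append(column.name)
        return seen


# One content type declaration, reused by two separate libraries.
contract = ContentType(
    "Contract",
    [SiteColumn("Title", "Text"), SiteColumn("ExpiryDate", "DateTime")],
)
legal = DocumentLibrary("Legal Documents", [contract])
sales = DocumentLibrary("Sales Agreements", [contract])
```

In this model, both libraries report the same columns because they share one declaration, which is the reuse that content types are meant to provide.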
You can create a content type by completing the following steps:
Browse to the Site Content Type Gallery, which is located under Site Actions > Site Settings > Modify All Site Settings > Site Content Types and click the Create button.
You should now see the New Site Content Type page, as shown in Figure 10-1.
Enter the desired name and description for the new content type into the appropriate fields.
Choose an existing content type as the parent by selecting it from the drop-down list labeled Parent Content Type. To locate the desired parent type more easily, first filter the list by group using the drop-down menu labeled Select Parent Content Type From. Every content type is derived either from the base system content type or from one of its children. This built-in inheritance mechanism enables one content type to extend its functionality by incorporating all of the columns declared in its parent.
To make it easier for users to find your new content type, select an existing group or enter the name of a new group that best describes how your content type is to be used. Press the OK button to return to the Site Content Type summary page.
Figure 10-1: Create a new content type from Site Content Types in Site Settings.
Once the content type has been created, you can add the columns that best describe the metadata you want to use in your documents. Use the Add From Existing Site Columns link to select from the existing site columns, or create a new column if the existing columns do not meet your needs.
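The inheritance and column-addition behavior described above can be sketched as a simple model. The ContentType class here is hypothetical, and the choice to copy the parent's columns at creation time is an assumption of this sketch rather than a statement about how SharePoint stores the schema.

```python
from typing import List, Optional


class ContentType:
    """Hypothetical model of a content type derived from a single parent."""

    def __init__(self, name: str, parent: Optional["ContentType"] = None) -> None:
        self.name = name
        self.parent = parent
        # Assumption: a child starts with a copy of the columns its parent
        # has at the moment the child is created.
        self.columns: List[str] = list(parent.columns) if parent else []

    def add_column(self, column: str) -> None:
        # Skip a column that is already present, whether inherited or added earlier.
        if column not in self.columns:
            self.columns.append(column)


item = ContentType("Item")
item.add_column("Title")

document = ContentType("Document", parent=item)
document.add_column("Name")

invoice = ContentType("Invoice", parent=document)
invoice.add_column("InvoiceNumber")
invoice.add_column("Title")  # already inherited, so nothing changes
```

Here the hypothetical Invoice type ends up with the columns Title, Name, and InvoiceNumber: two inherited through its ancestors and one added directly.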
SharePoint Server ships with a default collection of predefined site columns. These site columns are organized into groups that map loosely to the way each column is typically used. To see the available site columns, browse to the Site Column Gallery, which is found under Site Actions > Site Settings > Modify All Site Settings > Site Columns.
It is possible to select or create site columns directly from the content type creation page, skipping the Site Column Gallery altogether. However, it's a good idea to browse through the gallery a few times initially to become familiar with the available columns before creating new ones, thus avoiding unnecessary duplication of columns.
From the Site Column Gallery, you can view or edit the definition of existing columns or create new columns for the site you are currently viewing. The name of the column appears as a hyperlink under the Site Column heading. If it does not appear as a hyperlink, it means that the column is declared in a parent site of the site you are viewing. To modify column definitions declared within a parent site, you must first go to the parent site and then to its Site Column Gallery. To modify an existing column definition, click its hyperlink. To create a new column definition, click the Create button. Columns created in the Site Column Gallery are available only within the current site collection. To create a site column that is available across multiple site collections, you can declare the field in an XML definition file and then deploy the field definition using a Windows SharePoint Services Feature.
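As an illustration of the XML definition approach, the following sketch generates a minimal element manifest that declares one field. It assumes the usual Feature element manifest layout (an Elements root in the http://schemas.microsoft.com/sharepoint/ namespace containing a Field element). The field name, display name, group, and GUID are hypothetical, and the accompanying Feature.xml file and deployment steps are not shown.

```python
import uuid
import xml.etree.ElementTree as ET

SHAREPOINT_NS = "http://schemas.microsoft.com/sharepoint/"


def build_field_manifest(name: str, display_name: str, field_type: str,
                         group: str, field_id: uuid.UUID) -> str:
    """Return element manifest XML that declares a single site column."""
    # Emit the SharePoint namespace as the default namespace (no prefix).
    ET.register_namespace("", SHAREPOINT_NS)
    root = ET.Element(f"{{{SHAREPOINT_NS}}}Elements")
    ET.SubElement(root, f"{{{SHAREPOINT_NS}}}Field", {
        "ID": "{" + str(field_id) + "}",  # field IDs are GUIDs wrapped in braces
        "Name": name,
        "DisplayName": display_name,
        "Type": field_type,
        "Group": group,
    })
    return ET.tostring(root, encoding="unicode")


# Hypothetical values, for illustration only.
manifest_xml = build_field_manifest(
    name="ProjectCode",
    display_name="Project Code",
    field_type="Text",
    group="Contoso Columns",
    field_id=uuid.UUID("12345678-1234-5678-1234-567812345678"),
)
print(manifest_xml)
```

Generating the manifest from a script is only one option. The same XML can be written by hand, and in either case the GUID should be created once and kept stable so that every site collection refers to the same field.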
From the New Site Column page, you can specify the name, data type, and group affiliation for the new column. Although every column belongs to a group, the groups are used only to organize the columns. You can change the group affiliation at any time. In actual practice, it is often necessary to use columns from many different groups when creating a new content type. Although the New Site Column page contains a description text box that can hold informative text about how a given field should be used, the built-in site columns do not make use of this property. It is good practice, however, to include a brief description when creating new site columns to make it easier to match a given column to its intended use.
When choosing an existing site column, you should be aware that this list includes both sealed and unsealed columns that have been added by various features that have been enabled on your site. Using sealed site columns may cause problems with your content type declarations because they cannot be removed through the user interface once they have been added to a content type. This problem is exacerbated when modifying an existing content type from which other content types have been derived. Table 10-1 lists some of the sealed columns that are added by the publishing feature. Use caution when adding them to custom content types.
Gathering document metadata is an important part of an effective document management solution. However, most users focus on the document content and not the metadata. Consequently, important metadata are often captured inconsistently or not at all. Document Information Panels (DIPs) help avoid this problem by enabling users to enter metadata at any time during the editing process and also by enforcing the entry of data values for required fields. Document Information Panels are displayed in professional and enterprise versions of the Microsoft Office System. These client applications support integrated enterprise content management features, and automatically generate a default DIP for any document that is created or opened from within a Windows SharePoint Services document library. The data fields in the form are derived from the content type associated with the document. SharePoint Server adds the option of creating a custom DIP using Microsoft Office InfoPath 2007. Although you are limited to one custom DIP per content type, each DIP may contain multiple views.
You can create a custom DIP either from the SharePoint user interface or from within the Microsoft Office InfoPath 2007 application. To create a DIP from the SharePoint user interface, perform the following steps:
Go to the Content Type Settings page for the content type you wish to edit. Click the Change Document Information Panel Settings link.
From the Document Information Panel Settings page, click Create A New Custom Template to launch InfoPath 2007.
When InfoPath opens, the Data Source Wizard opens automatically. Click Finish to enter edit mode.
InfoPath displays the auto-generated form that contains the data fields defined by the content type along with a default set of controls.
Edit and save the form.
Publish the form to update the content type.
You have the option of publishing DIP templates directly to the content type resource folder or to whatever location you want. SharePoint Server updates the content type to reference the location you choose. If you publish to a location other than the content type resource folder, then additional security restrictions may be applied to the form that cause it to open in Restricted mode. In that case, the data connections between the form and SharePoint Server may not work properly. To ensure that the form opens in Full Trust mode, either digitally sign the form or create a Windows Installer that registers the form on each client machine.
The ability to modify a content type independently of any documents that were derived from it means that in some situations the schema that describes the metadata associated with those documents may get out of sync with the metadata associated with the content type. By default, SharePoint Server regenerates the form template used by the DIP that is displayed in Microsoft Office client applications so that the fields declared in the associated content type match the fields that are displayed in the form. If you have created a custom DIP, then you must update the form manually.
To update the DIP schema from within InfoPath, perform the following steps:
From within InfoPath 2007, select File > Open in Design Mode.
Go to the template form associated with the DIP and click Open.
Select Tools > Convert Main Data Source to open the Data Source Wizard.
Click Next to advance through all of the wizard screens and then click Finish on the last screen. The form schema is then re-synchronized with the content type.
Edit and save the form as usual.
Republish the form to make it available to users.
To update the DIP schema from within SharePoint Server, perform the following steps:
From the Content Type Settings page, click Change Document Information Panel Settings.
On the Document Information Panel Settings page, click Edit This Template to launch Microsoft InfoPath 2007, which opens the Data Source Wizard automatically.
Click Next to advance through all of the wizard screens and then click Finish.
If the content type has changed since the form was last edited, you are prompted to confirm the update. Click Yes to confirm.
Edit and save the form.
Republish the form to make it available to users.
Content types may be based on other content types. When changes are made to a parent content type, those changes are not reflected automatically in child content types that derive from it unless those changes are explicitly pushed down to the derived content types. Pushing down the changes from a parent content type to its children means that the schema associated with each child is overwritten with the new schema defined in the parent. Because the DIP is stored as an embedded XML document within the content type schema, pushing down the schema also overwrites any custom DIP that may be associated with the child content type. Prevent this overwriting by marking the child content type as sealed. Sealed content types are not affected by push-down operations. To mark a content type as sealed, open the Site Content Type page from the Content Type gallery and click the Advanced Settings link. From the Site Content Type Advanced Settings page, select the Yes button in the Read Only section.
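The effect of a push-down operation on sealed and unsealed children can be modeled as follows. This is a simplified sketch: the ContentType class, the content type names, and the .xsn file names are hypothetical, only direct children are handled, and the whole child schema is overwritten as the text describes.

```python
from typing import List, Optional


class ContentType:
    """Hypothetical model of a content type that can push changes to children."""

    def __init__(self, name: str, columns: List[str],
                 custom_dip: Optional[str] = None, sealed: bool = False) -> None:
        self.name = name
        self.columns = list(columns)
        self.custom_dip = custom_dip  # stands in for the embedded DIP reference
        self.sealed = sealed
        self.children: List["ContentType"] = []

    def derive(self, name: str, custom_dip: Optional[str] = None,
               sealed: bool = False) -> "ContentType":
        # A child starts as a copy of the parent's current columns.
        child = ContentType(name, self.columns, custom_dip, sealed)
        self.children.append(child)
        return child

    def push_down(self) -> None:
        for child in self.children:
            if child.sealed:
                continue  # sealed children are not affected by push-down
            # The child's schema, including its DIP, is overwritten by the parent's.
            child.columns = list(self.columns)
            child.custom_dip = self.custom_dip


parent = ContentType("Policy Document", ["Title", "Owner"])
open_child = parent.derive("HR Policy", custom_dip="hr-form.xsn")
sealed_child = parent.derive("Legal Policy", custom_dip="legal-form.xsn", sealed=True)

parent.columns.append("ReviewDate")  # not visible in children until pushed down
parent.push_down()
```

After the push-down in this model, the unsealed child gains the ReviewDate column but loses its custom DIP, while the sealed child keeps both its original columns and its custom DIP.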
To enable support for metadata stored in custom file types, Windows SharePoint Services 3.0 provides a special framework for handling the transfer of data between any type of document and the document library in which it is stored. A document parser is a custom Component Object Model (COM) component that extracts data from a given document type and passes that data to SharePoint Server, which then promotes the data into columns of the document library. The document parser is also responsible for receiving data from SharePoint Server and demoting it into properties within the document. You configure a custom document parser by establishing an association between it and a given file type. This association is maintained in the DocParse.xml file, which is located in the %SystemDrive%\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\CONFIG folder. The default associations are shown in Figure 10-2. To add a custom document parser, you can edit this file to include a reference to the PROGID or CLSID of the COM component that implements the document parser interface.
Figure 10-2: The DocParse.xml file maps file extensions to the PROGIDs of document parsers recognized by SharePoint Server.
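The idea of mapping file extensions to parser identifiers can be sketched with a short script that reads such a mapping and looks up the parser for a file name. The XML layout in SAMPLE_XML is an assumption made for illustration, as are the element and attribute names, the "xyz" extension, and both ProgId values. Check the actual DocParse.xml file on the server for the real structure before editing it.

```python
import os
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Assumed, simplified layout for illustration; not copied from a real server.
SAMPLE_XML = """<DocParsers>
  <ByExtension>
    <DocParser Name="docx" ProgId="Example.OpenXmlParser" />
    <DocParser Name="xyz" ProgId="Contoso.XyzParser" />
  </ByExtension>
</DocParsers>"""


def load_parser_map(xml_text: str) -> Dict[str, str]:
    """Build a lookup from lowercase file extension to parser identifier."""
    root = ET.fromstring(xml_text)
    mapping: Dict[str, str] = {}
    for node in root.iter("DocParser"):
        extension = node.get("Name", "").lower()
        identifier = node.get("ProgId")
        if extension and identifier:
            mapping[extension] = identifier
    return mapping


def parser_for(filename: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the parser identifier registered for the file's extension, if any."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return mapping.get(extension)


parser_map = load_parser_map(SAMPLE_XML)
print(parser_for("report.XYZ", parser_map))
print(parser_for("notes.txt", parser_map))
```

A file whose extension has no entry simply gets no parser in this sketch.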
It is important to understand how SharePoint Server parses metadata for documents that are associated with a content type. The content type associated with a document may not match any of the content types associated with the document library to which the document is being uploaded. This can happen if a document is checked out from one document library and is then uploaded to another one. Promoting and demoting a content type are defined as follows.
Promoting: Promoting a content type onto a document library means that the content type is added to the collection of content types that the document library accepts.
Demoting: Demoting a content type into a document means that the columns of the content type are added as metadata fields within the document properties.
The basic rule of thumb is that SharePoint Server never promotes the content type from a document onto a document library, but instead attempts to demote the content type from the document library into the document before it is uploaded. The rules governing this process are as follows:
If the document is not associated with any content type, then SharePoint Server demotes the default list content type into the document.
If the document is associated with a content type, then SharePoint Server checks whether that content type is already associated with the document library. If it is, then SharePoint Server demotes the designated content type from the list into the document. If the content type definition has changed since the creation of the document, the new fields are added to the document properties.
If the content type is not already associated with the document library, then SharePoint Server checks whether the document library allows any content type to be uploaded. If it does, then the content type is left intact within the document. If not, SharePoint Server demotes the default list content type into the document.
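The rules above amount to a small decision procedure, sketched below. The function name, its parameters, and the action labels it returns are hypothetical and chosen only to mirror the three rules. Content types are represented by plain strings.

```python
from typing import List, Optional, Tuple


def resolve_upload(document_type: Optional[str],
                   library_types: List[str],
                   default_type: str,
                   allows_any_type: bool) -> Tuple[str, str]:
    """Return (content type the document ends up with, action taken)."""
    # Rule 1: no content type on the document, so the list default is demoted.
    if document_type is None:
        return default_type, "demote default"
    # Rule 2: the library already has this content type, so its definition
    # is demoted into the document (adding any new fields).
    if document_type in library_types:
        return document_type, "demote matching"
    # Rule 3: unknown content type; keep it only if the library accepts any type.
    if allows_any_type:
        return document_type, "leave intact"
    return default_type, "demote default"


library = ["Document", "Contract"]
print(resolve_upload(None, library, "Document", False))
print(resolve_upload("Contract", library, "Document", False))
print(resolve_upload("Invoice", library, "Document", True))
print(resolve_upload("Invoice", library, "Document", False))
```

In no branch is the library's own list of content types modified, which reflects the rule of thumb that SharePoint Server never promotes a content type from a document onto the library.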