The Information Lifecycle | Data Protection and Information Lifecycle Management


Information has a lifecycle. It is created, it changes, and finally it is destroyed. ILM manages this lifecycle to optimize the use of resources, meet regulatory requirements, and ensure the integrity of the information. When a lifecycle has been developed for a class of information, it can be expressed as a series of policies.

A General Model

There is no set information lifecycle. Some products will impose a particular lifecycle on an organization, but ILM does not dictate this. An information lifecycle is dependent on the needs of an organization and the nature of the information.

All information lifecycles can be derived from a general model (Figure 8-4). The model states that information is created, its state changes in some fashion, an action may occur due to that state change, and eventually it is destroyed. Creation is the initial action and destruction the final one.

Figure 8-4. The general case of the information lifecycle

ILM policy must define which state changes trigger actions and what those actions will be. Some changes that may trigger an action are:

Aging. The difference between the current state's timeframe and a previous one has exceeded a threshold.
Copied. There is a new, additional information path associated with the information.
Moved. An information path has changed.
Transformed. The information has been changed from one class to another.
Relationship. A relationship with another piece of information has been changed, added, or removed.
Content. Any alteration in the content of the information should trigger an event, even a null event. Comparing the current hash of the content with the previous one shows whether the content has changed.

Changes in metadata or content represent a change in state. This in turn may trigger actions under ILM policies. This continues with changes in state and new actions until the last action possible is taken: destruction.
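The general model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Snapshot` record, the change names, the `POLICY` table, and the action names are all hypothetical and do not come from any ILM product. The sketch compares two snapshots of the same piece of information, names the state changes between them, and looks up the actions a policy attaches to those changes. Relationship changes are left out to keep it short.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Snapshot:
    """State of one piece of information at a point in time (hypothetical model)."""
    taken_at: datetime
    paths: frozenset      # information paths where the item can be found
    info_class: str       # class the information currently belongs to
    content_hash: str     # digest of the content


def digest(content: bytes) -> str:
    """Hash the content so two states can be compared cheaply."""
    return hashlib.sha256(content).hexdigest()


def detect_changes(prev: Snapshot, curr: Snapshot, max_age: timedelta) -> set:
    """Compare two snapshots and name the state changes between them."""
    changes = set()
    if curr.taken_at - prev.taken_at > max_age:
        changes.add("aging")
    if curr.paths > prev.paths:          # old paths kept, new ones added: a copy
        changes.add("copied")
    elif curr.paths != prev.paths:       # a path was replaced or removed
        changes.add("moved")
    if curr.info_class != prev.info_class:
        changes.add("transformed")
    if curr.content_hash != prev.content_hash:
        changes.add("content")
    return changes


# Hypothetical policy: which state changes trigger which actions.
POLICY = {"aging": "archive", "copied": "protect_copy", "content": "log_change"}


def actions_for(changes: set, policy: dict) -> list:
    """Return the actions the policy attaches to the detected changes."""
    return sorted(policy[c] for c in changes if c in policy)
```

A change with no entry in the policy table simply triggers no action, which matches the idea that only some state changes lead to actions.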

A Standard Lifecycle

The difficulty with using a general model for a lifecycle is that it generates a lot of work for the people designing it. Analysis has to be performed to determine the lifecycles for different classes of information. Policies then need to be developed to express these in concrete terms. Developing policies can be tough enough, but having to determine individual lifecycles for many classes of information adds time and complexity to policy development.

Set lifecycles have been proposed, mostly by vendors. These were fairly simple affairs, with set stages and actions for all information. The Storage Networking Industry Association's Information Lifecycle Management Initiative group (www.snia.org/tech_activities/dmf/ilm/ilm) is also working on a definition of ILM and the information lifecycle. This will provide a good starting point but should not be adhered to religiously. Information lifecycles are unique to organizations and classes of information.

Life and Death of Information

What if Widget Corporation, a maker of high-quality widgets, is no longer happy with the results of its Data Lifecycle Management e-mail policy? Too often, e-mails that should be retained are not, and others that were supposed to be destroyed have not been. Now the company has angry customers and upset lawyers. The costs of storage and e-mail management continue to rise, though at a slower rate.

The problem is not that Widget Corporation can't manage e-mails in general. What it cannot control, with the DLM policies in place, is information that doesn't fit the rules the company has set up. Widget Corporation has discovered, for example, that many employees in Sales copy e-mails into documents not covered by the e-mail policy. Conversely, many e-mails are destroyed, but not the original documents attached to the e-mails. The company also realizes that many customer e-mails really aren't important and should not be protected. To lower costs and better protect the company, attention must turn to what the e-mails mean.

Widget Corporation turns to ILM to solve some of these problems. The object "e-mail" is too coarse for Widget's purposes. Instead, e-mails and other documents must be classified, a lifecycle determined, and policies written.

IT and Customer Service have decided that only three categories will be needed initially: Orders, Proposals, and Other. Classification is based on content, especially specific clues inside the e-mail text. Orders can be identified by the order number in the e-mail, for example. Other metadata items that IT and Customer Service feel are important to the ILM process are:

Location. Information paths will help identify copies.
Type. Object types will be tracked to look for transformations from e-mail to documents.
Relationships. This is especially important for tracking the source of attachments.
State. Being able to compare changes in content and metadata at different points in time will allow for more directed actions. The company can also guard against changes in Order e-mails after they have arrived.
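As a rough sketch of how this classification and metadata might look in code, the Python below sorts e-mail text into the three categories and keeps a small metadata record per item. Everything specific here is an assumption made for illustration: the order-number pattern, the "proposal" keyword clue, and the `ItemMetadata` fields are hypothetical, not Widget Corporation's actual rules.

```python
import hashlib
import re
from dataclasses import dataclass, field

# Hypothetical content clues; real clues would come from IT and Customer Service.
ORDER_NUMBER = re.compile(r"\border\s*(?:number|no\.?|#)\s*:?\s*(\d{4,})", re.IGNORECASE)
PROPOSAL_CLUE = re.compile(r"\bproposal\b", re.IGNORECASE)


def classify(text: str) -> str:
    """Assign one of the three initial categories based on content clues."""
    if ORDER_NUMBER.search(text):
        return "Orders"
    if PROPOSAL_CLUE.search(text):
        return "Proposals"
    return "Other"


@dataclass
class ItemMetadata:
    """Metadata tracked for one e-mail or document (hypothetical layout)."""
    category: str
    object_type: str                                    # Type: "email", "document", ...
    locations: set = field(default_factory=set)         # Location: information paths
    relationships: dict = field(default_factory=dict)   # e.g. attachment -> source document
    state_history: list = field(default_factory=list)   # State: content hashes over time

    def record_state(self, content: bytes) -> None:
        """Append the hash of the current content to the state history."""
        self.state_history.append(hashlib.sha256(content).hexdigest())

    def changed_since_arrival(self) -> bool:
        """True if the latest recorded content differs from the first one."""
        return len(self.state_history) > 1 and self.state_history[-1] != self.state_history[0]
```

Checking `changed_since_arrival()` on an item in the Orders category is one simple way to notice that an Order e-mail was altered after it arrived.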

With this in hand, Widget will be able to apply different levels of protection to different types of e-mails. Rules can be applied to attachments and their source documents (and vice versa). A history of changes in state will show when content and other metadata have changed. Finally, when it is time to make decisions about destroying e-mail, all copies and references to the e-mail can be considered in the decision-making process.
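The destruction decision can be sketched the same way. The Python below is a minimal illustration, assuming a hypothetical in-memory catalog of items and a hypothetical rule that nothing is destroyed while it, or anything related to it, belongs to a retained category. When destruction is allowed, every known copy is listed so that none is left behind.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One tracked e-mail or document (hypothetical catalog entry)."""
    item_id: str
    category: str
    locations: set = field(default_factory=set)    # every known copy
    related_ids: set = field(default_factory=set)  # attachments, sources, references


def destruction_plan(item_id: str, catalog: dict, retained=frozenset({"Orders"})) -> dict:
    """Decide whether an item may be destroyed, considering copies and relationships."""
    item = catalog[item_id]
    candidates = [item_id, *sorted(item.related_ids)]
    # The item itself or any related item in a retained category blocks destruction.
    blockers = [i for i in candidates if i in catalog and catalog[i].category in retained]
    if blockers:
        return {"destroy": False, "blocked_by": blockers, "paths": []}
    # Otherwise every known location of the item must be purged together.
    return {"destroy": True, "blocked_by": [], "paths": sorted(item.locations)}
```

Returning the blocking items rather than a bare yes or no gives the people reviewing the decision something concrete to check.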
