Unstructured and Structured Information

Operating systems and file systems know only about blocks and files; they cannot tell what those files contain. They may surmise from its extension or MIME type that a file is a word processor document, but they cannot be certain, because extensions and MIME types can be changed. Even applications recognize information only in a coarse fashion, as a document or a spreadsheet. An application does not know whether a file is an important document, a financial report, or a letter to a friend. Files and similar constructs are therefore considered unstructured: operating systems, file systems, and applications have no means of their own for understanding the meaning of the data.
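The weakness of extension-based guessing can be illustrated with a minimal Python sketch. It uses the standard library's `mimetypes` module, which maps a file name to a MIME type without ever reading the file. The helper `guess_kind` and the file names are hypothetical, chosen only for illustration.

```python
import mimetypes

def guess_kind(filename: str) -> str:
    """Guess a MIME type from the file name alone; the contents are never read."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime or "unknown"

# Hypothetical file names: the same bytes stored under two different names
# produce two different guesses, because only the extension is consulted.
for name in ("q3_financials.pdf", "q3_financials.txt"):
    print(name, "->", guess_kind(name))
```

Renaming the file changes the guess, yet neither answer says whether the contents are a financial report or a letter to a friend.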

The most difficult type of unstructured data to deal with is images. Photographs, x-rays, scanned documents, and other images contain no internal clues that help determine what the information is. Any identification of the object is external and must be provided by an outside source. This makes managing images through traditional means, such as keyword searching, almost meaningless.

Databases, XML files, and other structured systems are different. They arrange data into information by using a schema. A schema is a description of the data that provides context. Applying the schema imposes order on the data, and the data becomes information. Anyone who looks at the schema (a well-designed one, at least) and applies it to the data will understand what the data represents.
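How a schema turns data into information can be shown with a small Python sketch. The schema, field names, and sample values below are assumptions made up for this illustration; they do not come from any particular database or XML vocabulary.

```python
from datetime import date
from decimal import Decimal

# Hypothetical schema: each entry pairs a field name with a converter.
INVOICE_SCHEMA = [
    ("invoice_id", int),
    ("issued_on", date.fromisoformat),
    ("amount_due", Decimal),
]

def apply_schema(schema, raw_values):
    """Attach names and types to raw values, turning data into information."""
    if len(schema) != len(raw_values):
        raise ValueError("raw data does not match the schema")
    return {
        name: convert(value)
        for (name, convert), value in zip(schema, raw_values)
    }

# Without the schema these are three strings with no context.
raw = ["10442", "2005-03-14", "1299.00"]

record = apply_schema(INVOICE_SCHEMA, raw)
print(record)
```

On their own, the three strings could mean anything. Once the schema is applied, each value has a name and a type, which is the kind of context a policy can act on.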

The advantage of a structured system, as far as Information Lifecycle Management (ILM) is concerned, is that context is already provided. No separate description of the information is needed, because the schema provides the necessary context. ILM policies, however, rarely have the luxury of dealing only with structured data.


    Data Protection and Information Lifecycle Management
    ISBN: 0131927574
    Year: 2005
    Pages: 122
