TECHNIQUES FOR HANDLING INCOMPLETENESS

In this section we present techniques for handling incompleteness in the base data, derived data, and metadata.

In the base data, a measure could be incomplete. The basic problem posed by incomplete measures is arithmetic. For instance, what is the result of an aggregate, such as max, computed on a set of values if some of those values are unknown or otherwise incomplete? The two basic strategies are to replace the incomplete measure with a complete value and then compute the aggregate, or to manufacture an incomplete result and insert it into the hierarchy. The chief advantage of a replacement strategy is that it is easy to implement since multidimensional databases are designed to handle complete data. The disadvantage is that no matter what replacement value is used, some useful information is lost.
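The two strategies for max can be contrasted in a minimal sketch. It assumes an unknown measure is represented by None; the function names and the replacement value are illustrative choices and do not come from the text.

```python
from typing import Optional, Sequence


def max_with_replacement(values: Sequence[Optional[float]], default: float) -> float:
    """Replacement strategy: substitute `default` for every unknown measure,
    then compute the aggregate over complete values only."""
    return max(default if v is None else v for v in values)


def max_with_incomplete_result(values: Sequence[Optional[float]]) -> Optional[float]:
    """Incomplete-result strategy: if any input is unknown, return an
    incomplete (None) result rather than a possibly wrong number."""
    if any(v is None for v in values):
        return None
    return max(values)


# Hypothetical daily sales with one unknown measure.
sales = [12, None, 7]
replaced = max_with_replacement(sales, default=0)
incomplete = max_with_incomplete_result(sales)
```

Here the replacement strategy reports 12 even though the unknown measure might have been larger, which is the kind of information loss described above. The second strategy keeps that uncertainty visible, at the cost of an incomplete value that later processing has to handle.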

The base data could also have incomplete "grouping" attributes. This will result in facts that group at more than one base node. For example, if a multidimensional database is constructed to count how many apples are sold each day and the database records a sale on an unknown date, to which base node should the sale belong? We discuss techniques for calculating "bounds" on groups and for modifying the hierarchy to introduce new nodes for incompletely specified groups.
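For the apple-sales example, both ideas can be sketched briefly. The sketch assumes each sale is recorded only by its date, that an unknown date is represented by None, and that a sale with an unknown date could belong to any day in the hierarchy. The function names and the "unknown" node label are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional, Sequence, Tuple


def count_bounds(sale_dates: Sequence[Optional[str]],
                 days: Sequence[str]) -> Dict[str, Tuple[int, int]]:
    """Bounds on each day's count: the lower bound counts only sales known
    to fall on that day; the upper bound also admits every undated sale."""
    known = Counter(d for d in sale_dates if d is not None)
    undated = sum(1 for d in sale_dates if d is None)
    return {day: (known[day], known[day] + undated) for day in days}


def count_with_unknown_node(sale_dates: Sequence[Optional[str]]) -> Dict[str, int]:
    """Alternative: add a separate base node that collects undated sales."""
    return dict(Counter("unknown" if d is None else d for d in sale_dates))


sale_dates = ["Mon", "Mon", "Tue", None]
bounds = count_bounds(sale_dates, days=["Mon", "Tue", "Wed"])
with_node = count_with_unknown_node(sale_dates)
```

With these inputs, Monday's count is bounded between 2 and 3, while the second function reports one sale under the extra "unknown" node. The bounds approach leaves the hierarchy unchanged; the extra-node approach keeps every count exact but changes the set of base nodes.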

The derived data (i.e., the values in the hierarchy) could be incomplete. The chief source of incomplete derived data is the repair of incomplete base data, since the repair techniques often generate incomplete values. The basic problem and its solutions are similar to those for incomplete measures. While there are many similarities with incomplete base data, additional techniques can utilize complete values that are "nearby" in the hierarchy.
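One possible way to use "nearby" complete values is sketched below, under strong assumptions that the text does not state: the aggregate is a sum, the parent node's value is complete, and exactly one child value is incomplete (represented by None). The function name and data layout are hypothetical.

```python
from typing import Dict, Optional


def infer_missing_child(parent_total: Optional[float],
                        children: Dict[str, Optional[float]]) -> Dict[str, Optional[float]]:
    """If the parent's sum is complete and exactly one child is incomplete,
    recover that child as parent_total minus the sum of its complete siblings.
    Otherwise return the children unchanged."""
    missing = [name for name, value in children.items() if value is None]
    if parent_total is None or len(missing) != 1:
        return dict(children)
    known_sum = sum(v for v in children.values() if v is not None)
    filled = dict(children)
    filled[missing[0]] = parent_total - known_sum
    return filled


# Hypothetical weekly total with one incomplete daily value.
week = infer_missing_child(30, {"Mon": 10, "Tue": None, "Wed": 5})
```

This works only for aggregates that can be inverted in this way, such as sum or count. For an aggregate such as max, the parent value and the complete siblings generally give only a bound on the missing child.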

Finally, the metadata can be incomplete. We present several examples of incomplete metadata and discuss techniques for completing incomplete metadata specifications.



Multidimensional Databases: Problems and Solutions
ISBN: 1591400538
Year: 2003
Pages: 150
