We have focused on dimension issues, which are a common cause of low-quality data. There are situations where the measures are of poor quality. Here are some tips that you might use to solve some of these problems.
Out of Range Values
Data quality might be affected by out-of-range values, but you need to have clear business rules regarding the ranges. You would use a Conditional Split transform to direct a fact row to the normal data flow or to a flow to handle the out-of-range condition. For an example of how to implement this, look at how the customer merge was handled, where we used a Conditional Split transform to decide whether a sufficient match existed between two customer records.
You might not necessarily know the true range for each measure, or the range might depend on some complex combinations of the values of some of the dimensions. Often the rules are far too complex to be explicitly stated. In these cases, using a data mining query to assess the probability of a value being correct based on known occurrences of good values and related dimension members is recommended. Because this only provides you with a probability, you will want to route low-probability rows through a manual process for review. The threshold of "low probability" should in most cases be fairly high to minimize false positives. Chapter 10, "Data Mining," provides an introduction to data mining techniques.
Null and Zero
A null measure is considered empty by Analysis Services. Depending on the settings of your cube browser, empty cells might not display. Zero values, on the other hand, are not empty and will display. You should consider whether to convert null measures to zero. You can use a Derived Column transform to check for and convert nulls to zeros.