Additive Measures


The design of a BI system needs to address the difference between additive measures, which can be stored relationally and then aggregated (either in SQL queries or by Analysis Services when building a cube), and nonadditive measures, which must be handled separately as calculations in Analysis Services. Without storing all of the transaction detail in a relational table, it would be difficult to use nonadditive measures by means of only a relational data warehouse.

This section will look at the difference between additive and nonadditive measures. As you know, an aggregation is the simple calculation of a series of values into a single, scalar value. Common aggregations are summed, counted, and averaged. In less-common cases, aggregation can be the first, last, lowest, or highest value in the series. Aggregation can also be performed using statistical algorithms like standard deviation or variance over the entire range or over a population.

Sums can be calculated from any level. In other words, you can take the sum of a sum and yield an accurate total. The same is true of counts. Once you count something, you can derive a total count by summing all the counts. However, the same is not true of a Distinct Count, in which you do not want to count any duplicate values in the series. If you rolled up your detailed transaction records with order counts at the quarterly level, you could still get an accurate total for the year. Figure 12-2 shows four quarterly sales totals that add up to the total sales for the year.

2003

Qtr 1

3800

Qtr 2

5300

Qtr 3

5500

Qtr 4

4500

2003 Total

19100

Open table as spreadsheet


Figure 12-2: Quarterly sales sums add up to the sum of sales for the year

Instead of counts and sums, if your measure were the average orders by quarter and, if you are not storing the detailed order records, you could have a problem. Assuming that you have arrived at the quarterly average figure correctly, you cannot use these numbers to calculate an average at the year mark, overall, or at any other level accurately.

If you pre-calculate and store the sum for four quarters, it is easy to add them together to get the annual total. Alternatively, averages cannot be stored and then summed to the annual average. This would give you an average of averages and wouldn't properly account for weightings, as Figure 12-3 shows.

2003

Qtr 1

950

Qtr 2

1325

Qtr 3

1375

Qtr 4

1125

2003 Total

1194

Open table as spreadsheet


Figure 12-3: The average of averages yields an inaccurate yearly total

In a relational database, you don't typically store a number that can be calculated in real time because it would require redundant data and storage. Analysis Services, by contrast, is focused on analyzing data rather than on storing data, and it is much more efficient at storage, too. During analysis, you need speed, so it is often faster to retrieve a stored aggregated amount, such as sales by quarter, than it is to calculate it on the fly from daily details.

In Figure 12-3, you can see the average sales orders in the four quarters of 2003. Here, the system might store the quarterly average, but it would need to calculate the true average for the annual average rather than to rely on the stored quarterly averages. The 2003 quarterly average is not 1,194, as shown here; it should be computed as 19,100/4 quarters, or 4,775. When designing a BI system, you need to distinguish between additive measures, which can be stored relationally and aggregated by Analysis Services, and nonadditive measures, which must be handled separately as calculations in Analysis Services.




Microsoft SQL Server 2005 Integration Services Step by Step
MicrosoftВ® SQL Server(TM) 2005 Integration Services Step by Step
ISBN: 0735624054
EAN: 2147483647
Year: 2007
Pages: 152

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net