Aggregate Awareness

 < Day Day Up > 



An aggregate table is a summary table that DBAs build to execute queries faster. Aggregate tables are smaller than detail tables and can be aggregated in a number of ways. Most DBAs will strive for a certain compression ratio-for example, 10:1; the aggregate or summary table should be ten times smaller than the detail table or for every ten detail rows, there is one summary row.

Figure 10-3 shows two sample dimensions, Time and Product. In the Time dimension, there are five years of history, with four levels going from Year to Day. Within a given year, there are 365 days. If the Time dimension contains five years of history, this results in 1,825 rows of data. The Product dimension has four levels ranging from Total Product to SKU. There are 210 SKUs (this is a very small product dimension).

click to expand
Figure 10-3: Two sample dimensions showing the number of potential rows at each level

A DBA may create an aggregate table summarizing data to any of these levels, across any dimension, as shown in Figure 10-4. The SALES_FACT_DAY table contains daily sales figures for five years, at the SKU level for all customers. The SALES_FACT_ MONTH table aggregates sales by month and by product line. Customer is not aggregated in any way.

click to expand
Figure 10-4: Aggregate tables are summary tables that allow for faster queries.

The following table shows the potential number of rows in such a fact table, if you have 30,000 customers. Typically, not every product sells daily, nor does every customer buy every product on a daily basis, so the potential rows represent strictly a worse-case scenario. However, the table illustrates the purpose of using aggregate tables for faster queries. If a user wants to analyze customer or product line sales only on a monthly or quarterly basis, the queries will run much faster against a 21.6 million row table than an 11.497 billion row table.

 

Time

Product

Customer

Potential Rows

Day, SKU

1825

210

30,000

11,497,500,000

Month, product line

60

12

30,000

21,600,000

The existence of aggregate tables does not help user queries. Users must access the summary tables, ideally automatically with awareness built into the BI tool or the RDBMS to force the query to be processed in the most efficient way. In recent years, BI vendors have debated if aggregate awareness belongs in the BI tool or in the RDBMS. BusinessObjects is the only BI tool that has built-in functionality to take advantage of RDBMS aggregate tables in a way that is seamless to users. Relational OLAP (ROLAP) vendors such as MicroStrategy and Microsoft can use aggregate tables, but tools such as Cognos Impromptu, Brio, and Crystal can use them only if a user specifically tells the tool to access a summary table. Leading RDBMS vendors now include an aggregate navigator in the database. However, it is a feature the DBA must tell the RDBMS to use; it is not automatic, and for extremely large databases, some customers have struggled to enable it. Fortunately, Business Objects has kept aggregate awareness in the tool and improved it to work also with dimension tables. Figures 10-1 and 10-2 focus on creating an aggregate for the fact tables. For extremely large dimensions, you also may have a smaller dimension table that removes dimension details lower than month and product line. The following example will show how to add aggregate awareness in both facts and dimension objects:

click to expand

Creating Aggregate Fact Tables

Business Objects has provided two aggregate tables as part of the EFASHION.MDB. You will focus on the one table:

AGG_YR_QT_MT_MN_WK_RG_CY_SN_SR_QT_MA

The naming of the table is a bit cumbersome, so hereafter, I will refer to it as the aggregate fact table. The table name reveals some information on the contents of the table:

  • AGG Aggregate

  • YR Year

  • QT Quarter

  • MT Month Text

  • MN Month Number

  • WK Week

  • RG Maybe region was intended but not in the final table?

  • CY City

  • SN Store Name

  • SR Sales Revenue

  • QT Quantity

  • MA Margin

I do have some concerns with the design of this table, but for demonstration purposes it will work fine. Notice that the number of rows for the aggregate table is 1,982 compared to 89,171 in the detailed SHOP_FACTS table. Whenever possible, you want queries to run against the smaller AGG_YR_QT_MT_MN_WK_RG_CY_SN_SR_QT_MA rather than the larger, detailed SHOP_FACTS.

Identifying Dimension Tables Irrelevant to Aggregate Tables

In looking at Figure 10-5, you see that the aggregate table does not include any information on Articles. So these two dimension tables will become incompatible with the summary objects. The fact table includes a lot of dimensional information. Time information is aggregated to the week level. You can either retrieve dimensional information from CALENDAR_YEAR_LOOKUP or the aggregate fact table. Outlet information is not aggregated; the aggregate tables contain details on the STORE_NAME only.

click to expand
Figure 10-5: A detail fact table and a summary aggregate table may share some of the same dimension tables.

Adding Aggregate Awareness to Measure Objects

You are now ready to define aggregate awareness to the individual measure objects Sales Revenue, Quantity Sold, and Margin. Aggregate awareness uses a BusinessObjects internal function:

 @Aggregate_Aware(sum(smallest_table.column), sum(medium_table.column), sum(biggest_table.column))

You can have multiple tables in the SELECT statement, with the smallest table first and the largest or most detailed table last. The @Aggregate_Aware function allows you to use any of the SQL Aggregate commands (SUM, COUNT, AVG, MIN, MAX); however, the aggregate command must be specified for each column as shown here.

Right:

@Aggregate_Aware(sum(Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma.Sales_revenue), sum(Shop_facts.Amount_sold))

Wrong:

@Aggregate_Aware(sum(Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma.Sales_revenue,  Shop_facts.Amount_sold))

To follow the steps, use the Test Fashion universe created in earlier chapters.

  1. If you normally have Designer automatically create objects when you add a table, turn this feature off for adding aggregate tables: Tools | Options | Database. Remove the check mark from Create Default Classes And Objects From Tables.

  2. Insert the aggregate table into the universe structure. Click Table Browse or use the pull-down menu to select Insert | Table | AGG_YR_QT_MT_ MN_WK_RG_CY_SN_SR_QT_MA .

  3. Expand the class Test Fashion Measures.

  4. Select the object Sum of Amount Sold or Revenue.

  5. Click the >> next to sum(Shop_facts.Amount_sold)to invoke the SQL Editor.

  6. If you wish to save re-creating the SQL for the detail SHOP_FACTS table, ensure your mouse is positioned at the start of the SQL statement.

  7. Under Functions, click the + sign next to @Functions to expand the group and display the BusinessObjects internal functions.

  8. Double-click @Aggregate_Aware to insert the syntax in the SQL statement box.

  9. Under Functions, click the + sign next to Number to display the SQL RDBMS functions. Scroll to sum() and click to insert the syntax into the SQL statement.

  10. Under Tables, expand the aggregate table AGG_YR_QT_MT_MN_WK_RG_CY_ SN_SR_QT_MA and double-click SALES_REVENUE to insert the table .column between the sum().

  11. Delete the right parenthesis and move it to the end to close the statement. Your SQL statement should appear as follows:

    click to expand

  12. Click Parse to verify you have entered the correct syntax.

    Note 

    According to Business Objects, Parse on aggregate aware objects only partially checks the SQL syntax. If you use @Select with @Aggregate_Aware, the parsing may be slower and less accurate. For complete universe integrity, be sure to test the objects by running queries and viewing the full SQL generated.

  13. Click OK to close the SQL Editor then OK again to save the object definitions.

  14. Repeat steps 4-13 for Quantity Sold and Margin, selecting the appropriate objects and columns.

  15. At this point, there are no joins between the aggregate fact table and the dimension tables, nor is the aggregate part of a context. The aggregate table would normally have keys to join to the dimension tables that Designer may have detected; however, the design of this table is a bit unusual, so you must manually add the following two joins:

    Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma.Store_name=Outlet_Lookup.Shop_name Calendar_year_lookup.Year=Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma.Year And Calendar_year_lookup.Week_in_Year=Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma.Week

  16. Either draw join lines between the respective tables and columns or use the pull-down menu Insert | Join. Note that the second join is a complex join that must be entered manually.

  17. Add these two new joins to a new context. If you have correctly defined the complex join, you can use the Detect Context button or use the pull-down menu Insert | Context to include the two joins in a new context. Complete the entries as follows:

  18. Click OK to accept the modified context.

  19. From the pull-down menu, select File | Save or click Save to save the changes to the universe.

Specifying Which Objects Are Not Compatible with the Aggregate Table

The table now exists in the universe, and you have told BusinessObjects to use this table for certain measure objects. You now have to tell BusinessObjects that the aggregate table cannot be used when certain dimension objects exist in the query. This can be very confusing. Try to stay focused on two things:

  1. You are essentially telling BusinessObjects when to use the detail table.

  2. You need to worry about only one context or star schema.

Refer to Figure 10-4. Note that all article objects are incompatible with the aggregate table. If a user includes any article information in a query, it is not available in the aggregate table, and you must tell BusinessObjects to use the detail SHOP_FACTS table. You are only concerned with this one context and do not need to worry about the promotion objects. (Wouldn't it be a nice enhancement request if Designer only displayed those objects that were part of the relevant contexts!) This second point is especially confusing, because when Designer detects incompatibility, it unnecessarily and incorrectly marks objects from other contexts. If you mark promotion objects as incompatible here, you prevent users from constructing a query that compares sales by store (using the aggregate table) with promotions by store.

Remember 

You only need to mark objects as incompatible that exist in the one context or star schema.

  1. To define incompatible objects, use the pull-down menu Tools | Aggregate Navigation.

  2. Highlight the table for which you will define incompatible objects, AGG_YR_ QT_MT_MN_WK_RG_CY_SN_SR_QT_MA.

  3. Under Associated Incompatible Objects, you can select entire classes or individual objects. Click the Article Lookup class.

  4. If you created a measure Number of Article Codes in Chapter 6, set this object to incompatible as well.

    click to expand

    Tip 

    If you toggle to display tables that contain incompatible objects, only aggregate tables should be displayed. If more than just the aggregates are displayed, you have a problem.

  5. You can now use the toggle button Incompatibles Only to filter those objects that have been marked as incompatible against the aggregate table.

  6. Click OK to close the Aggregate Navigation box.

  7. From the pull-down menu, select File | Save or click Save to save the changes to the universe.

Testing and Viewing SQL Statements

Now that you have defined which objects are incompatible with the aggregate fact table, you need to make sure that BusinessObjects generates the correct SQL. To test this, use Figure 10-4 to develop a test plan as shown in Table 10-1. Take each measure that contains @Aggregate_Aware and add a dimension object from each of the dimension tables to the query. As you create a query for each scenario, verify that the SQL generated uses the appropriate table.

Table 10-1: Test Plan for Aggregate Awareness

Scenario

If Your Query Contains
These Objects

BusinessObjects Should Use This Table

1

Sum of Amount Sold

AGG_YR_QT_MT_MN_WK_RG_CY_SN_SR_QT_MA

2

Sum of Amount Sold, Month Name

AGG_YR_QT_MT_MN_WK_RG_CY_SN_SR_QT_MA

3

Sum of Amount Sold, Month Name, Shop Name

AGG_YR_QT_MT_MN_WK_RG_CY_SN_SR_QT_MA

4

Sum of Amount Sold, Promotion Cost, Month, Shop Name

2 SQL Select statements:
AGG_YR_QT_MT_MN_WK_RG_CY_SN_SR_QT_MA PRODUCT_PROMOTION_
FACTS

5

Sum of Amount Sold, Month Name, Store Name, Article

SHOP_FACTS

6

Sum of Amount Sold, Article Color

SHOP_FACTS

7

Sum of Amount Sold, Number of Article Code

SHOP_FACTS

  1. To view the BusinessObjects SQL statement, launch BusinessObjects from Designer by selecting Tools | Run | BusinessObjects.

  2. When prompted, Select Generate A Standard Report and click Begin.

  3. Select Universe as the way you want to access the data and click Next.

  4. Choose the universe that contains the aggregate objects, Test Fashion, andclick Finish.

  5. Follow the test plan outlined in Table 10-1 to add the Result Objects to the query. The following screen shows a query based on Scenario 3 in Table 10-1:

    click to expand

  6. To verify the correct fact table is used in the SQL, click the SQL button on the Query Panel toolbar. Note that for Scenario 3, the aggregate fact table was correctly used.

    SELECT   sum(Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma.Sales_revenue),   Calendar_year_lookup.Month_Name,   Outlet_Lookup.Shop_name FROM   Calendar_year_lookup,   Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma,   Outlet_Lookup WHERE   ( Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma.Store_name=Outlet_Lookup.Shop_name  )   AND  ( Calendar_year_lookup.Year=Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma.Year And Calendar_year_lookup.Week_in_Year=Agg_yr_qt_mt_mn_wk_rg_cy_sn_sr_qt_ma.Week  ) GROUP BY   Calendar_year_lookup.Month_Name,   Outlet_Lookup.Shop_name

  7. When you add Article Code to the query as in Scenario 5, BusinessObjects automatically selects the SHOP_FACTS table without any user intervention. (Isn't it wonderful!)

    SELECT   sum(Shop_facts.Amount_sold),   Calendar_year_lookup.Month_Name,   Outlet_Lookup.Shop_name,   Article_lookup.Article_code FROM   Shop_facts,   Calendar_year_lookup,   Article_lookup,   Outlet_Lookup WHERE   ( Shop_facts.Article_code=Article_lookup.Article_code  )   AND  ( Calendar_year_lookup.Week_key=Shop_facts.Week_key  )   AND  ( Shop_facts.Shop_code=Outlet_Lookup.Shop_code  ) GROUP BY   Calendar_year_lookup.Month_Name,   Outlet_Lookup.Shop_name,   Article_lookup.Article_code

For Scenario 4 in Table 10-1, BusinessObjects will generate two SQL statements and seamlessly stitch the results together in the report. This step can go wrong for two reasons: 1) if your File | Parameters,  SQL tab, Multiple SQL statements for each context are not enabled, or 2) if you used Tools | Aggregate Navigation | Detect Incompatibility; in the latter case, Designer incorrectly marked objects from other contexts as being incompatible with an aggregate table.

click to expand

The most obvious error in this step of building aggregate awareness is if the wrong table is used in any instance. However, the ultimate goal is to get correct results. So as a final test, you should run a query to ensure that you get the same data when either the aggregate table or the detail table is used. In the preceding examples, Sales for September 2001 is $300,848, according to both the aggregate table and the SHOP_FACTS table.

There are a few reasons why you may not get the same result for an aggregate and a detail table:

  • One of the tables is incorrect If the DBA has not built the aggregate table correctly, you may not be able to fix it, but you must communicate this issue to both DBAs and end users. When something is wrong, BusinessObjects will always get the blame, because most end users don't know and don't care about which component in the information flow actually has the problem. If the summary table is incorrect in all circumstances, don't include it in the universe design. If it is correct in most circumstances and the DBA is working to resolve one minor inconsistency, use the table but clearly explain in the object description when the data may be incorrect.

  • The aggregate table contains dimensional information that is different from details in the lookup or dimension tables As a general rule, I don't like when dimensional information is stored in the fact table. It's even worse when there is a difference between dimensional information in a fact table and a dimension table. Yes, in a perfect world the data is clean and consistent. However, I've yet to see a company that had such clean data. In the example of AGG_YR_QT_ MT_MN_WK_RG_CY_SN_SR_QT_MA, the store name exists in both this aggregate fact table and the dimension table OUTLET_ LOOKUP. What if the fact table contained data for a store that did not exist in OUTLET_LOOKUP? Users would get different answers between queries that use the aggregate table and queries that use the detail table that joins to the dimension table that is missing information for a store.

  • The universe contains a mistake in the join or in a dimension definition An error in the design is completely within your control and must be resolved before deploying to end users.



 < Day Day Up > 



Business Objects(c) The Complete Reference
Cisco Field Manual: Catalyst Switch Configuration
ISBN: 72262656
EAN: 2147483647
Year: 2005
Pages: 206

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net