Chapter 16: Multi-Path and Ragged Hierarchies


This chapter discusses the multi-path hierarchy, expanding on the single-path hierarchy you learned in Chapter 15, “Dimension Hierarchies.” You might recall from Chapter 15, the month dimension in our data warehouse has one hierarchy path, the year-quarter-month path. In this chapter we will add a new level, campaign session, and add a new hierarchy path of year-campaign-month. The month dimension will then have two hierarchy paths and therefore has a multi-path hierarchy.

Another topic of discussion in this chapter is the ragged hierarchy, which is a hierarchy that does not have data in one or more of its levels.

Adding A Hierarchy

I explain in this section how to add a hierarchy in a dimension that already has a hierarchy, thus forming a multi-path hierarchy. I also show how to populate the new hierarchy and verify that the population is successful.

First of all, you need to add a new column called campaign_session to the month_dim table. Figure 16.1 shows the schema after the addition.

image from book
Figure 16.1: The schema after adding campaign_session

You can use the script in Listing 16.1 to add the new column.

Listing 16.1: Adding the campaign_session column

image from book
 /*****************************************************************/ /*                                                               */ /* add_campaign_session.sql                                      */ /*                                                               */ /*****************************************************************/ USE dw; ALTER TABLE month_dim ADD campaign_session CHAR (30) AFTER month ; /* end of script                                                 */
image from book

Run the script in Listing 16.1 now.

 mysql> \. c:\mysql\scripts\add_campaign_session.sql

You should see this on your console upon running the script.

 Database changed Query OK, 82 rows affected (0.63 sec) Records: 82  Duplicates: 0  Warnings: 0

To understand how the campaign session works, look at the sample campaign sessions in Table 16.1.

Table 16.1: 2005 Campaign Sessions
Open table as spreadsheet

Campaign Session

Month

2005 First Campaign

January-April

2005 Second Campaign

May-July

2005 Third Campaign

August-August

2005 Last Campaign

September-December

Each campaign session lasts one or more months. A campaign session might not run exactly in one quarter. This means, campaign session levels do not roll up to the quarter (the campaign session’s next higher level). Rather, the campaign sessions roll up to the year level.

Now you need to populate the campaign_session column. I’ve provided the following campaign session data for 2006 in the campaign_session.csv file.

 CAMPAIGN SESSION, MONTH, YEAR 2006 First Campaign, 1, 2006 2006 First Campaign, 2, 2006 2006 First Campaign, 3, 2006 2006 First Campaign, 4, 2006 2006 Second Campaign, 5, 2006 2006 Second Campaign, 6, 2006 2006 Second Campaign, 1, 2006 2006 Third Campaign, 8, 2006 2006 Last Campaign, 9, 2006 2006 Last Campaign, 10, 2006 2006 Last Campaign, 11, 2006 2006 Last Campaign, 12, 2006

As usual, you don’t load data from a text file directly to a data warehouse table. Instead, you use a staging table. Listing 16.2 shows a script that creates a campaign_session_stg table.

Listing 16.2: Creating the campaign_session_stg table

image from book
 /*****************************************************************/ /*                                                               */ /* create_campaign_stg. sql                                      */ /*                                                               */ /*****************************************************************/ USE dw; CREATE TABLE campaign_session_stg ( campaign_session CHAR (30) , month CHAR (9) , year INT (4) ) ; /* end of script                                                 */
image from book

Run the script by calling the script name this way.

 mysql> \. c:\mysql\scripts\create_campaign_stg.sql

Now you can load the 2006 campaign sessions into the month dimension. Listing 16.3 shows the script to do that.

Listing 16.3: Campaign session population

image from book
 /*****************************************************************/ /*                                                               */ /* campaign_session.sql                                          */ /*                                                               */ /*****************************************************************/ USE dw; TRUNCATE campaign_session_stg; LOAD DATA INFILE 'campaign_session.csv' INTO TABLE campaign_session_stg FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY "" LINES TERMINATED BY '\r\n' IGNORE 1 LINES (   campaign_session , month Adding A Hierarchy , year ) ; UPDATE month_dim a, campaign_session_stg b SET a.campaign_session = b.campaign_session WHERE     a.month = b.month AND a.year = b.year ; /* end of script                                                 */
image from book

Run the script using this command.

 mysql> \. c:\mysql\scripts\campaign_session.sql

Here is what you should see on the console.

 Database changed Query OK, 1 row affected (0.05 sec) Query OK, 12 rows affected (0.09 sec) Records: 12  Deleted: 0  Skipped: 0  Warnings: 0 Query OK, 12 rows affected (0.05 sec) Rows matched: 12  Changed: 12  Warnings: 0

Now query the month_dim table to confirm the table has been correctly populated.

 mysql> select month_sk, month_name, year, campaign_session     -> from month_dim     -> where year = 2006;

Here is the result.

 +----------+-----------+------+----------------------+ | month_sk | month_name| year | campaign_session     | +----------+-----------+------+----------------------+ |       11 | January   | 2006 | 2006 First Campaign  | |       12 | February  | 2006 | 2006 First Campaign  | |       13 | March     | 2006 | 2006 First Campaign  | |       14 | April     | 2006 | 2006 First Campaign  | |       15 | May       | 2006 | 2006 Second Campaign | |       16 | June      | 2006 | 2006 Second Campaign | |       17 | July      | 2006 | 2006 Second Campaign | |       18 | August    | 2006 | 2006 Third Campaign  | |       19 | September | 2006 | 2006 Last Campaign   | |       20 | October   | 2006 | 2006 Last Campaign   | |       21 | November  | 2006 | 2006 Last Campaign   | |       22 | December  | 2006 | 2006 Last Campaign   |  +----------+-----------+------+----------------------+ 12 rows in set (0.00 sec)

Note 

You load the campaign session CSV file in January every year when you get the data from the user and must do so before the population of the month_end_sales_order_fact table.



Dimensional Data Warehousing with MySQL. A Tutorial
Dimensional Data Warehousing with MySQL: A Tutorial
ISBN: 0975212826
EAN: 2147483647
Year: 2004
Pages: 149

Similar book on Amazon
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition)
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition)
The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleanin
The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleanin
Successful Business Intelligence: Secrets to Making BI a Killer App
Successful Business Intelligence: Secrets to Making BI a Killer App

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net