A hierarchy that does not have data in one or more level is called a ragged hierarchy. For example, if your users don’t have any campaign session for certain months, then the month dimension is said to have a ragged campaign hierarchy. In this section I explain the ragged hierarchy and how to apply it on campaign sessions.
Here’s a sample ragged campaign (in the ragged_campaign.csv file) that does not have January, April, September, October, November, and December 2006 campaign sessions.
CAMPAIGN SESSION,MONTH,YEAR NULL,1,2006 2006 Early Spring Campaign,2,2006 2006 Early Spring Campaign,3,2006 NULL,4,2006 2006 Spring Campaign,5,2006 2006 Spring Campaign,6,2006 2006 Last Campaign,7,2006 2006 Last Campaign,8,2006 NULL,9,2006 NULL,10,2006 NULL,11,2006 NULL,12,2006
The script in Listing 16.6 loads the campaign session into the month_dim table.
Listing 16.6: Ragged campaign session
/*****************************************************************/ /* */ /* ragged_campaign.sql */ /* */ /*****************************************************************/ USE dw; TRUNCATE campaign_session_stg; LOAD DATA INFILE 'ragged_campaign.csv' INTO TABLE campaign_session_stg FIELDS TERMINATED BY ', ' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES ( campaign_session , month , year ) ; UPDATE month_dim a , campaign_session_stg b SET a.campaign_session = b.campaign_session WHERE a.month = b.month AND a.year = b.year AND b.campaign_session IS NOT NULL ; UPDATE month_dim a , campaign_session_stg b SET a.campaign_session = a.month_name WHERE a.month = b.month AND a.year = b.year AND b.campaign_session IS NULL ; /* end of script */
To see how it works, clean up the existing campaign session column (set its value to NULL) by running the script in Listing 16.7.
Listing 16.7: Nullifying the campaign_session column
/*****************************************************************/ /* */ /* nullify_campaign_session.sql */ /* */ /*****************************************************************/ USE dw; UPDATE month_dim SET campaign_session = NULL ; /* end of script */
Run the script by using this command:
mysql> \. c:\mysql\scripts\nullify_campaign_session.sql
You should see a message similar to this on your console.
Database changed Query OK, 12 rows affected (0.05 sec) Rows matched: 96 Changed: 12 Warnings: 0
Now, run the ragged_campaign.sql script in Listing 16.7 to load the campaign session CSV file by calling the script file name.
mysql> \. c:\mysql\scripts\ragged_campaign.sql
Here is what should be printed on the console.
Database changed Query OK, 12 rows affected (0.05 sec) Query OK, 12 rows affected (0.05 sec) Records: 12 Deleted: 0 Skipped: 0 Warnings: 0 Query OK, 6 rows affected (0.07 sec) Rows matched: 6 Changed: 6 Warnings: 0 Query OK, 6 rows affected (0.06 sec) Rows matched: 6 Changed: 6 Warnings: 0
To verify that the month_dim table was populated, query the table using this SQL statement.
mysql> select month_sk, month_name, year, campaign_session -> from month_dim -> where year = 2006;
The correct result is as follows.
+----------+------------+------+----------------------------+ | month_sk | month_name | year | campaign_session | +----------+------------+------+----------------------------+ | 11 | January | 2006 | January | | 12 | February | 2006 | 2006 Early Spring Campaign | | 13 | March | 2006 | 2006 Early Spring Campaign | | 14 | April | 2006 | April | | 15 | May | 2006 | 2006 Spring Campaign | | 16 | June | 2006 | 2006 Spring Campaign | | 17 | July | 2006 | 2006 Last Campaign | | 18 | August | 2006 | 2006 Last Campaign | | 19 | September | 2006 | September | | 20 | October | 2006 | October | | 21 | November | 2006 | November | | 22 | December | 2006 | December | +----------+------------+------+----------------------------+ 12 rows in set (0.00 sec)
You can then run the campaign_session_path.sql script using this command.
mysql> \. c:\mysql\scripts\campaign_session_path.sql
Here is the query result.
+------------+----------------------------+----------+------+ | pc | time | amt | qty | +------------+----------------------------+----------+------+ | Monitor | 2005 | 44500.00 | 125 | | Monitor | January | 1000.00 | NULL | | Monitor | January | 1000.00 | NULL | | Monitor | April | 2500.00 | NULL | | Monitor | April | 2500.00 | NULL | | Monitor | 2005 Last Campaign | 4000.00 | NULL | | Monitor | July | 4000.00 | NULL | | Monitor | October | 5000.00 | NULL | | Monitor | October | 5000.00 | NULL | | Monitor | November | 32000.00 | 125 | | Monitor | November | 32000.00 | 125 | | Peripheral | 2005 | 25000.00 | 70 | | Peripheral | November | 25000.00 | 70 | | Peripheral | November | 25000.00 | 70 | | Storage | 2005 | 69000.00 | 285 | | Storage | 2005 Early Spring Campaign | 3000.00 | NULL | | Storage | February | 1000.00 | NULL | | Storage | March | 2000.00 | NULL | | Storage | 2005 Spring Campaign | 6500.00 | NULL | | Storage | May | 3000.00 | NULL | | Storage | June | 3500.00 | NULL | | Storage | 2005 Last Campaign | 4500.00 | NULL | | Storage | August | 4500.00 | NULL | | Storage | September | 1000.00 | NULL | | Storage | September | 1000.00 | NULL | | Storage | October | 8000.00 | NULL | | Storage | October | 8000.00 | NULL | | Storage | November | 37500.00 | 200 | | Storage | November | 37500.00 | 200 | | Storage | December | 8500.00 | 85 | | Storage | December | 8500.00 | 85 | +------------+----------------------------+----------+------+ 31 rows in set (0.00 sec)
The query output shows that the rows of year and month levels are the same as the output of the non-ragged hierarchy’s campaign path. However, the months with no campaign are listed right above the months themselves. This is to say, the months roll themselves up to their non-existing campaign levels. For example, January does not have a campaign, so you see two January rows for Monitor (2nd and 3rd rows). The third row is for the month, the second represents the non-existing campaign row. The sales order amount (the amt column in the query output) for the non-existing campaign is the same as the month amount.
February and March for Storage belong to the same campaign, namely ‘2005 Early Spring Campaign’. Therefore, both months have one row each that roll up to their campaign; the sum of their sales order amounts is the amount for the campaign.