This chapter teaches you dimension role-playing, a technique you use when a fact needs a dimension more than once. For example, if the sales order fact has more than one date, say an order date and a request delivery date, you need to use the date dimension more than once.
In this chapter you also learn two types of dimension role playing implementations, table alias and database view. Both types use MySQL functions. The table alias type uses the dimension more than once in an SQL statement by assigning an alias for each use. As for the database view type, you create as many views as the number of roles you need the dimension on the fact.
In this section I show you how to add a request delivery date, revise the data warehouse regular population, and test the revised regular population script. This preparation is for showing the implementation of date dimension role-playing in the next section.
The first thing you need to do is add a request_delivery_date column to the sales_order_fact table. The schema now looks like that in Figure 13.1.
Figure 13.1: Adding request_delivery_date to sales_order_fact
You can use the script in Listing 13.1 to add the request_delivery_date column.
Listing 13.1: Adding the request_delivery_date_sk column
![]() |
/**********************************************************************/ /* */ /* request_delivery_date_sk.sql */ /* */ /**********************************************************************/ USE dw; ALTER TABLE sales_order_fact ADD request_delivery_date_sk INT AFTER order_date_sk ; /* end of script */
![]() |
You run the script in Listing 13.1 using this command.
mysql> \. c:\mysql\scripts\request_delivery_date_sk.sql
MySQL will indicate that there are 44 rows affected by the command.
Database changed Query OK, 44 rows affected (0.39 sec) Records: 44 Duplicates: 0 Warnings: 0
Since the structure of the sales_order_fact table has changed, you also need to update the script for regularly populating this table. The new script is shown in Listing 13.2.
Listing 13.2: The revised daily DW regular population
![]() |
/**********************************************************************/ /* */ /* dw_regular_13.sql */ /* */ /**********************************************************************/ USE dw; /* CUSTOMER_DIM POPULATION */ TRUNCATE customer_stg; LOAD DATA INFILE 'customer.csv' INTO TABLE customer_stg FIELDS TERMINATED BY ' , ' OPTIONALLY ENCLOSED BY "" LINES TERMINATED BY '\r\n' IGNORE 1 LINES ( customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , shipping_address , shipping_zip_code , shipping_city , shipping_state ) ; /* SCD 2 ON ADDRESSES */ UPDATE customer_dim a , customer_stg b SET a.expiry_date - SUBDATE (CURRENT_DATE, 1) WHERE a.customer_number = b.customer_number AND ( a.customer_street_address <> b.customer_street_address OR a.customer_city <> b.customer_city OR a.customer_zip_code <> b.customer_zip_code OR a.customer_state <> b.customer_state OR a.shipping_address <> b.shipping_address OR a.shipping_city <> b.shipping_city OR a.shipping_zip_code <> b.shipping_zip_code OR a.shipping_state <> b.shipping_state OR a.shipping_address IS NULL OR a.shipping_city IS NULL OR a.shipping_zip_code IS NULL OR a.shipping_state IS NULL) AND expiry_date - '9999-12-31' ; INSERT INTO customer_dim SELECT NULL , b.customer_number , b.customer_name , b.customer_street_address , b.customer_zip_code , b.customer_city , b.customer_state , b.shipping_address , b.shipping_zip_code , b.shipping_city , b.shipping_state , CURRENT_DATE , '9999-12-31' FROM customer_dim a , customer_stg b WHERE a.customer_number = b.customer_number AND ( a.customer_street_address <> b.customer_street_address OR a.customer_city <> b.customer_city OR a.customer_zip_code <> b.customer_zip_code OR a.customer_state <> b.customer_state OR a.shipping_address <> b.shipping_address OR a.shipping_city <> b.shipping_city OR a.shipping_zip_code <> b.shipping_zip_code OR a.shipping_state <> b.shipping_state OR a.shipping_address IS NULL OR a.shipping_city IS NULL OR a.shipping_zip_code IS NULL OR a.shipping_state IS NULL) AND EXISTS ( SELECT * FROM customer_dim x WHERE b.customer_number = x.customer_number AND a.expiry_date = SUBDATE (CURRENT_DATE, 1)) AND NOT EXISTS ( SELECT * FROM customer_dim y WHERE b.customer_number = y.customer_number AND y.expiry_date - '9999-12-31') ; /* END OF SCD 2 */ /* SCD 1 ON NAME */ UPDATE customer_dim a, customer_stg b SET a.customer_name = b.customer_name WHERE a.customer_number = b.customer_number AND a.expiry_date = '9999-12-31' AND a.customer_name <> b.customer_name ; /* ADD NEW CUSTOMER */ INSERT INTO customer_dim SELECT NULL , customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , shipping_address , shipping_zip_code , shipping_city , shipping_state , CURRENT_DATE , '9999-12-31' FROM customer_stg WHERE customer_number NOT IN( SELECT a.customer_number FROM customer_dim a , customer_stg b WHERE b.customer_number = a.customer_number ) ; /* RE-BUILD PA CUSTOMER DIMENSION */ TRUNCATE pa_customer_dim; INSERT INTO pa_customer_dim SELECT customer_sk , customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , shipping_address , shipping_zip_code , shipping_city , shipping_state , effective_date , expiry_date FROM customer_dim WHERE customer_state = 'PA' ; /* END OF CUSTOMER_DIM POPULATION */ /* PRODUCT_DIM POPULATION */ TRUNCATE product_stg ; LOAD DATA INFILE 'product.txt' INTO TABLE product_stg FIELDS TERMINATED BY '' OPTIONALLY ENCLOSED BY '' LINES TERMINATED BY '\r\n' IGNORE 1 LINES ( product_code , product_name , product_category ) ; /* SCD2 ON PRODUCT NAME AND GROUP */ UPDATE product_dim a , product_stg b SET expiry_date = SUBDATE (CURRENT_DATE, 1) WHERE a.product_code = b.product_code AND ( a.product_name <> b.product_name OR a.product_category <> b.product_category AND expiry_date = '9999-12-31' ; INSERT INTO product_dim SELECT NULL , b.product_code , b.product_name , b.product_category , CURRENT_DATE , '9999-12-31' FROM product_dim a , product_stg b WHERE a.product_code = b.product_code AND ( a.product_name <> b.product_name OR a.product_category <> b.product_category ) AND EXISTS ( SELECT * FROM product_dim x WHERE b.product_code = x.product_code AND a.expiry_date = SUBDATE (CURRENT_DATE, 1)) AND NOT EXISTS ( SELECT * FROM product_dim y WHERE b.product_code = y.product_code AND y.expiry_date = '9999-12-31') ; /* END OF SCD 2 */ /* ADD NEW PRODUCT */ INSERT INTO product_dim SELECT NULL , product_code , product_name , product_category , CURRENT_DATE , '9999-12-31' FROM product_stg WHERE product_code NOT IN( SELECT y.product_code FROM product_dim x, product_stg y WHERE x.product_code = y.product_code ) ; /* END OF PRODUCT_DIM POPULATION */ /* ORDER_DIM POPULATION */ INSERT INTO order_dim ( order_sk , order_number , effective_date , expiry_date ) SELECT NULL , order_number , order_date , '9999-12-31' FROM source.sales_order WHERE entry_date = CURRENT_DATE ; /* END OF ORDER_DIM POPULATION */ /* SALES_ORDER_FACT POPULATION */ INSERT INTO sales_order_fact SELECT order_sk , customer_sk , product_sk , e.date_sk , f.date_sk , order_amount , order_quantity FROM source.sales_order a , order_dim b , customer_dim c , product_dim d , date_dim e , date_dim f WHERE a.order_number = b.order_number AND a.customer_number = c.customer_number AND a.order_date >= c.effective_date AND a.order_date <= c.expiry_date AND a.product_code = d.product_code AND a.order_date >= d.effective_date AND a.order_date <= d.expiry_date AND a.order_date = e.date AND a.request_delivery_date = f.date AND a.entry_date = CURRENT_DATE ; /* end of script */
![]() |
Before you can test the revised regular population for the sales_order_fact table, you need to modify the sales_order table in the source database. To be precise, you need to add a request_delivery_date column to this table. The script in Listing 13.3 can help you achieve this.
Listing 13.3: Adding the request_delivery_date column to the sales_order table
![]() |
/*********************************************************************/ /* */ /* request_delivery_date.sql */ /* */ /*********************************************************************/ USE source; ALTER TABLE sales_order ADD request_delivery_date DATE AFTER order_date ; /* end of script */
![]() |
Run the script in Listing 13.3 using this command.
mysql> \. c:\mysql\scripts\request_delivery_date.sql
MySQL will indicate that there are 46 rows affected.
Database changed Query OK, 46 rows affected (0.41 sec) Records: 46 Duplicates: 0 Warnings: 0
In addition to changing the sales_order table structure, you also need to add more sales orders to the sales_order table using the script in Listing 13.4.
Listing 13.4: Adding three sales orders with request delivery dates
![]() |
/**********************************************************************/ /* */ /* request_delivery_date_source.sql */ /* */ /**********************************************************************/ USE source; INSERT INTO sales_order VALUES (47, 1, 1, '2007-03-04', '2007-03-30', '2007-03-04', 7500, 75) , (48, 2, 2, '2007-03-04', '2007-03-30', '2007-03-04', 1000, 10) , (49, 3, 3, '2007-03-04', '2007-03-30', '2007-03-04', 1000, 10) ; /* end of script */
![]() |
Now run the script in Listing 13.4.
mysql> \. c:\mysql\scripts\request_delivery_date_source.sql
Finally, set your MySQL date to March 4, 2007 (the order date of the three recently added sales orders), and run the dw_regular_13.sql script in Listing 13.2.
mysql> \. c:\mysql\scripts\dw_regular_13.sql
Here is how the response should look like.
Database changed Query OK, 9 rows affected (0.07 sec) Query OK, 9 rows affected (0.05 sec) Records: 9 Deleted: 0 Skipped: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 Query OK, 19 rows affected (0.08 sec) Query OK, 19 rows affected (0.06 sec) Records: 19 Duplicates: 0 Warnings: 0 Query OK, 4 rows affected (0.05 sec) Query OK, 4 rows affected (0.06 sec) Records: 4 Deleted: 0 Skipped: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 Query OK, 3 rows affected (0.07 sec) Records: 3 Duplicates: 0 Warnings: 0 Query OK, 3 rows affected (0.12 sec) Records: 3 Duplicates: 0 Warnings: 0
If you query the sales_order_fact table, you will learn that the three new sales orders have values for the request_delivery_date_sk column, whereas the older records don't.
Here is a command for querying the sales_order_face table.
mysql> select a.order_sk, request_delivery_date_sk -> from sales_order_fact a, date_dim b -> where a.order_date_sk = b.date_sk;
Here is the result of the query.
+----------+--------------------------+ | order_sk | request_delivery_date_sk | +----------+--------------------------+ | 1 | NULL | | 2 | NULL | | 3 | NULL | | 5 | NULL | | 4 | NULL | | 6 | NULL | | 7 | NULL | | 8 | NULL | | 9 | NULL | | 10 | NULL | | 11 | NULL | | 12 | NULL | | 13 | NULL | | 14 | NULL | | 15 | NULL | | 16 | NULL | | 17 | NULL | | 18 | NULL | | 19 | NULL | | 20 | NULL | | 21 | NULL | | 22 | NULL | | 23 | NULL | | 24 | NULL | | 25 | NULL | | 26 | NULL | | 27 | NULL | | 28 | NULL | | 29 | NULL | | 30 | NULL | | 31 | NULL | | 32 | NULL | | 33 | NULL | | 34 | NULL | | 35 | NULL | | 36 | NULL | | 37 | NULL | | 38 | NULL | | 39 | NULL | | 40 | NULL | | 41 | NULL | | 42 | NULL | | 43 | NULL | | 44 | NULL | | 45 | 760 | | 46 | 760 | | 47 | 760 | +----------+--------------------------+ 47 rows in set (0.01 sec)
You can verify that the date for the request_delivery_date_sk 760 is March 30, 2007; using this query.
mysql> select date_sk, date from date_dim where date_sk = 760;
The response shows that 760 has a value of 2007-03-30.
+---------+------------+ | date_sk | date | +---------+------------+ | 760 | 2007-03-30 | +---------+------------+ 1 row in set (0.00 sec)
Now that you have revised the schema and the regular population script, you're ready to implement the date dimension role-playing. The next two sections show how you can use each type of dimension role playing implementation types, table alias and database view.