Accumulating Snapshots


In this section I demonstrate how to implement an accumulating snapshot on sales orders. This accumulating snapshot tracks five sales order milestones: Order, Allocate, Pack, Ship, and Receive.

The five dates and their quantities come from the sales_order table in the source database. An order with a completed life cycle is described by five rows: one for the time the order is placed, one for the time the ordered products are allocated, one for the time the products are packed, one for the time the order is shipped, and one for the time the order is received by the customer. Each milestone also contains a status, whose value can be one of these: N for New, A for Allocate, P for Pack, S for Ship, and R for Receive.

For the sales_order table to handle the five different statuses, its structure must be changed. The script in Listing 14.3 changes the order_date column to status_date, adds a new column called order_status, and changes the order_quantity column to quantity. As the name implies, the order_status column is used to store one of N, A, P, S, or R. The status describes the value of the status_date column. If a record has a status of N, the value in the status_date column contains the order date. If the status is R, the status_date column contains the received date.

Listing 14.3: Modifying the sales order table

image from book
 /*****************************************************************/ /*                                                               */ /* order_status.sql                                              */ /*                                                               */ /*****************************************************************/ USE source; ALTER TABLE sales_order   CHANGE order_date status_date DATE , ADD order_status CHAR (1) AFTER status_date , CHANGE order_quantity quantity INT ; /* end of script                                                 */
image from book

Run the order_status.sql script in Listing 14.3 by using this command.

 mysql> \. c:\mysql\scripts\order_status.sql

Here is how the response on your console should look like.

 Database changed Query OK, 49 rows affected (0.40 sec) Records: 49  Duplicates: 0  Warnings: 0

Having changed the source database, you now need to add four new quantities and four date surrogate keys to the existing sales_order_fact table. Here are the new columns:

  • allocate_date_sk

  • allocate_quantity

  • packing_date_sk

  • packing_quantity

  • ship_date_sk

  • ship_quantity

  • receive_date_sk

  • receive_quantity

Figure 14.2 shows the schema with the eight new columns.

image from book
Figure 14.2: The schema for accumulating snapshots

One sales order row in the fact table can now take the five milestones. In other words, the five milestone dates and quantities are accumulated in one sales order fact, hence the term accumulating snapshot.

You use the script in Listing 14.4 to add the four quantities and four date surrogate keys to the sales_order_fact table.

Listing 14.4: Adding four date surrogate keys

image from book
 /*****************************************************************/ /*                                                               */ /* add_four_milestones.sql                                       */ /*                                                               */ /*****************************************************************/ USE dw; ALTER TABLE sales_order_fact   ADD allocate_date_sk INT AFTER order_date_sk , ADD allocate_quantity INT , ADD packing_date_sk INT AFTER allocate_date_sk , ADD packing_quantity INT , ADD ship_date_sk INT AFTER packing_date_sk , ADD ship_quantity INT , ADD receive_date_sk INT AFTER ship_date_sk , ADD receive_quantity INT ; /* end of script                                                 */
image from book

Run the add_four_milestones.sql script to add the new fact columns using this command.

 mysql> \. c:\mysql\scripts\add_four_milestones.sql

The following should be printed on the console as the response.

 Database changed Query OK, 47 rows affected (0.36 sec) Records: 47  Duplicates: 0  Warnings: 0

You must now apply the database view role-playing technique on the date dimension for the new four dates. The script in Listing 14.5 creates the four date views you need.

Note 

You created the order_date_dim view in Chapter 13.

Listing 14.5: Creating four date views

image from book
 /*****************************************************************/ /*                                                               */ /* create_four_date_views.sql                                    */ /*                                                               */ /*****************************************************************/ USE dw; CREATE VIEW allocate_date_dim (   allocate_date_sk , allocate_date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date ) AS SELECT   date_sk , date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date FROM date dim ; CREATE VIEW packing_date_dim (   packing_date_sk , packing_date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date ) AS SELECT   date_sk , date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date FROM date dim ; CREATE VIEW ship_date_dim (   ship_date_sk , ship_date , month_name , month , quarter / year , promo_ind , effective_date , expiry_date ) AS SELECT   date_sk , date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date FROM date_dim ; CREATE VIEW receive_date_dim (   receive_date_sk , receive_date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date ) AS SELECT   date_sk , date , month_name , month , quarter , year , promo_ind , effective_date , expiry_date FROM date dim ; /* end of script                                                 */
image from book

Run the script in Listing 14.5 using this command.

 mysql> \. c:\mysql\scripts\create_four_date_views.sql

Now, you need to revise your regular population script as well since the structure of the fact table has changed. The script in Listing 14.6 is the new regular population script. The five statements that handle the five milestone statuses and dates are commented. More than one transaction for a sales order can be recorded on the same date, in which case the relevant milestone dates are updated at the same time.

Listing 14.6: The revised daily DW regular population

image from book
 /*****************************************************************/ /*                                                               */ /* dw_regular_14.sql                                             */ /*                                                               */ /*****************************************************************/ USE dw; /* CUSTOMER_DIM POPULATION                                       */ TRUNCATE customer_stg; LOAD DATA INFILE 'customer.csv' INTO TABLE customer_stg FIELDS TERMINATED BY ' , ' OPTIONALLY ENCLOSED BY "" LINES TERMINATED BY '\r\n' IGNORE 1 LINES ( customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , shipping_address , shipping_zip_code , shipping_city , shipping_state ) ; /* SCD 2 ON ADDRESSES                                            */ UPDATE   customer_dim a , customer_stg b SET   a.expiry_date = SUBDATE (CURRENT_DATE, 1) WHERE   a.customer_number = b.customer_number AND (   a.customer_street_address <> b.customer_street_address      OR a.customer_city <> b.customer_city      OR a.customer_zip_code <> b.customer_zip_code      OR a.customer_state <> b.customer_state      OR a.shipping_address <> b.shipping_address      OR a.shipping_city <> b.shipping_city      OR a.shipping_zip_code <> b.shipping_zip_code      OR a.shipping_state <> b.shipping_state      OR a.shipping_address IS NULL      OR a.shipping_city IS NULL      OR a.shipping_zip_code IS NULL      OR a.shipping_state IS NULL) AND expiry_date = '9999-12-31' ; INSERT INTO customer_dim SELECT   NULL , b.customer_number , b.customer_name , b.customer_street_address , b.customer_zip_code , b.customer_city , b.customer_state , b.shipping_address , b.shipping_zip_code , b.shipping_city , b.shipping_state , CURRENT_DATE , '9999-12-31' FROM   customer_dim a , customer_stg b WHERE     a.customer_number = b.customer_number AND (   a.customer_street_address <> b.customer_street_address      OR a.customer_city <> b.customer_city      OR a.customer_zip_code <> b.customer_zip_code      OR a.customer_state <> b.customer_state      OR a.shipping_address <> b.shipping_address      OR a.shipping_city <> b.shipping_city      OR a.shipping_zip_code <> b.shipping_zip_code      OR a.shipping_state <> b.shipping_state      OR a.shipping_address IS NULL      OR a.shipping_city IS NULL      OR a.shipping_zip_code IS NULL      OR a.shipping_state IS NULL) AND EXISTS ( SELECT * FROM customer_dim x WHERE b.customer_number = x.customer_number AND a.expiry_date - SUBDATE (CURRENT_DATE, 1)) AND NOT EXISTS ( SELECT * FROM customer_dim y WHERE     b.customer_number = y.customer_number       AND y.expiry_date = '9999-12-31') ; /* END OF SCD 2                                                  */ /* SCD 1 ON NAME                                                 */ UPDATE customer_dim a, customer_stg b SET a.customer name = b.customer name WHERE a.customer_number = b.customer_number AND   a.expiry_date = '9999-12-31' AND   a.customer_name <> b.customer_name ; /* ADD NEW CUSTOMER                                              */ INSERT INTO customer_dim SELECT   NULL , customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , shipping_address , shipping_zip_code , shipping_city , shipping_state , CURRENT_DATE , '9999-12-31' FROM customer_stg WHERE customer_number NOT IN( SELECT a.customer_number FROM   customer_dim a , customer_stg b WHERE b.customer_number = a.customer_number ) ;  /* RE-BUILD PA CUSTOMER DIMENSION                                */  TRUNCATE pa_customer_dim;  INSERT INTO pa_customer_dim SELECT   customer_sk , customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , shipping_address , shipping_zip_code , shipping_city , shipping_state , effective_date , expiry_date FROM customer_dim WHERE customer_state = 'PA' ; /* END OF CUSTOMER_DIM POPULATION                                */ /* PRODUCT_DIM POPULATION                                        */ TRUNCATE product_stg ; LOAD DATA INFILE 'product.txt' INTO TABLE product_stg FIELDS TERMINATED BY '' OPTIONALLY ENCLOSED BY '' LINES TERMINATED BY '\r\n' IGNORE 1 LINES ( product_code , product_name , product_category ) ; /* SCD2 ON PRODUCT NAME AND GROUP                                */ UPDATE   product_dim a , product_stg b SET   expiry_date = SUBDATE (CURRENT_DATE, 1) WHERE     a.product_code = b.product_code AND (   a.product_name <> b.product_name      OR a.product_category <> b.product_category      AND expiry_date - '9999-12-31' ; INSERT INTO product_dim SELECT   NULL , b.product_code , b.product_name , b.product_category , CURRENT_DATE , '9999-12-31' FROM   product_dim a , product_stg b WHERE     a.  product_code = b.product_code AND (   a.product_name <> b.product_name      OR a.product_category <> b.product_category ) AND EXISTS ( SELECT * FROM product_dim x WHERE     b.product_code = x.product_code              AND a.expiry_date = SUBDATE (CURRENT_DATE, 1) AND NOT EXISTS ( SELECT * FROM product_dim y WHERE    b.product_code = y.product_code              AND y.expiry_date = '9999-12-31') ; /* END OF SCD 2                                                  */ /* ADD NEW PRODUCT                                               */ INSERT INTO product_dim SELECT   NULL , product_code , product_name , product_category , CURRENT_DATE , '9999-12-31' FROM product_stg WHERE product_code NOT IN( SELECT y.product_code FROM product_dim x, product_stg y WHERE x.product_code = y.product_code ) ; /* END OF PRODUCT_DIM POPULATION                                  */ /* ORDER_DIM POPULATION                                           */ INSERT INTO order_dim (   order_sk , order_number , effective_date , expiry_date ) SELECT   NULL , order_number , status_date , '9999-12-31' FROM source.sales_order WHERE     order_status = 'N' AND entry_date = CURRENT_DATE ; /* INSERTING New ORDER                                           */ INSERT INTO sales_order_fact SELECT   order_sk , customer_sk , product_sk , e.order_date_sk , NULL , NULL , NULL , NULL , f.request_delivery_date_sk , order_amount , quantity , NULL , NULL , NULL , NULL FROM   source.sales_order a , order_dim b , customer_dim c , product_dim d , order_date_dim e , request_delivery_date_dim f WHERE     a.order_status = 'N' AND a.order number = b.order_number AND a.customer_number = c.customer_number AND a.status_date >= c.effective_date AND a.status_date <= c.expiry_date AND a.product_code = d.product_code AND a.status_date >= d.effective_date AND a.status_date <= d.expiry_date AND a.status_date = e.order_date AND a.request_delivery_date = f.request_delivery_date AND a.entry_date = CURRENT_DATE ; /* UPDATING the new sales order to Allocated status           */ UPDATE   sales_order_fact a , source.sales_order b , allocate_date_dim c , order_dim g SET   a.allocate_date_sk = c.allocate_date_sk , a.allocate_quantity = b.quantity WHERE     order_status = 'A' AND b.entry_date = CURRENT_DATE AND b.order_number = g.order_number AND a.order_sk = g.order_sk AND c.allocate_date = b.status_date ; /* UPDATING the allocated order to Packed status              */ UPDATE   sales_order_fact a , source.sales_order b , packing_date_dim d , order_dim g SET   a.packing_date_sk = d.packing_date_sk , a.packing_quantity = b.quantity WHERE     order_status = 'P' AND b.entry_date = CURRENT_DATE AND b.order_number = g.order_number AND a.order_sk = g.order_sk AND d.packing_date = b.status_date ; /* UPDATING the packed order to Shipped status                */ UPDATE   sales_order_fact a , source.sales_order b , ship_date_dim e , order_dim g SET   a.ship_date_sk = e.ship_date_sk , a.ship_quantity = b.quantity WHERE     order_status = 'S' AND b.entry_date = CURRENT_DATE AND b.order_number = g.order_number AND a.order_sk = g.order_sk AND e.ship_date = b.status_date ; /* UPDATING the shipped order to Received status              */ UPDATE   sales_order_fact a , source.sales_order b , receive_date_dim f , order_dim g SET   a.receive_date_sk = f.receive_date_sk , a.receive_quantity = b.quantity WHERE     order_status = 'R' AND b.entry_date = CURRENT_DATE AND b.order_number = g.order_number AND a.order_sk = g.order_sk AND f.receive_date = b.status_date ; /* end of script                                                 */
image from book



Dimensional Data Warehousing with MySQL. A Tutorial
Dimensional Data Warehousing with MySQL: A Tutorial
ISBN: 0975212826
EAN: 2147483647
Year: 2004
Pages: 149

Similar book on Amazon
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition)
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition)
The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleanin
The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleanin
Successful Business Intelligence: Secrets to Making BI a Killer App
Successful Business Intelligence: Secrets to Making BI a Killer App

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net