Initial Population Script


Now that you have identified your data sources, it’s time to write a script that can be used for initial population. Assuming your data warehouse will start operation on March 1, 2007 and your user wants to load two years of historical data, you need to load the source data dated from March 1, 2005 to February 28, 2007.

The script in Listing 7.1 serves as an example. Note that:

  • The customer dimension's and the product dimension's effective date is March 1, 2007. No sales order loaded has an earlier date, meaning no earlier customer and product dimensions are required.

  • The effective date of the order dimension is, of course, the order date.

  • The surrogate key columns of the sales order fact table are populated from the dimension surrogate keys.

  • You need to pre-populate the date_dim table separately using pre-population with dates from March 1, 2005 to, say, December 31, 2010.

Listing 7.1: DW initial population

image from book
 /*****************************************************************/ /*                                                               */ /* dw_initial.sql                                                */ /*                                                               */ /*****************************************************************/ USE dw; LOAD DATA INFILE 'customer.csv' INTO TABLE customer_stg FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES ( customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state ) ; INSERT INTO customer_dim SELECT   NULL , customer_number , customer_name , customer_street_address , customer_zip_code , customer_city , customer_state , '2005-03-01' , '9999-12-31' FROM customer_stg ; LOAD DATA INFILE 'product.txt' INTO TABLE product_stg FIELDS TERMINATED BY '' OPTIONALLY ENCLOSED BY '' LINES TERMINATED BY '\r\n' IGNORE 1 LINES ( product_code , product_name , product_category ) ; INSERT INTO product_dim SELECT   NULL , product_code , product_name , product_category , '2005-03-01' , '9999-12-31' FROM product_stg ; INSERT INTO order_dim SELECT   NULL , order_number , order_date , '9999-12-31' FROM source.sales_order WHERE order_date >= '2005-03-01' AND order_date < '2007-02-28' ; INSERT INTO sales_order_fact SELECT   order_sk , customer_sk , product_sk , date_sk , order_amount FROM   source.sales_order a , order_dim b , customer_dim c , product_dim d , date_dim e WHERE     a.order_number = b.order_number AND a.customer_number = c.customer_number AND a.product_code = d.product_code AND a.order_date = e.date AND order_date >= '2005-03-01' AND order_date < '2007-02-28' ; /* end of script                                                 */
image from book



Dimensional Data Warehousing with MySQL. A Tutorial
Dimensional Data Warehousing with MySQL: A Tutorial
ISBN: 0975212826
EAN: 2147483647
Year: 2004
Pages: 149

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net