Loading Dates from the Source


In this section, I explain and present an example of yet another date population technique. Using this technique you populate the date dimension by loading dates from the source.

When you populate the date_dim table by loading dates from the source, your date_dim table will store only the dates that are used, saving you disk space. Unfortunately, this method is more complex because you must load all dates to the date dimension from your data sources that have dates.

The script in Listing 6.3 loads the sales order dates from the sales_order table in the source database into the date_dim table. You use the DISTINCT keyword in the script to make sure no duplicates are loaded.

Listing 6.3: Loading dates from the source

image from book
 /*****************************************************************/ /*                                                               */ /* source_date.sql                                               */ /*                                                               */ /*****************************************************************/ USE dw; INSERT INTO date_dim SELECT DISTINCT   NULL , order_date , MONTHNAME (order_date) , MONTH (order_date) , QUARTER (order_date) , YEAR (order_date) , '0000-00-00' , '9999-12-31' FROM source.sales_order WHERE order_date NOT IN (SELECT date FROM date_dim) ; /* end of script                                                 */
image from book

Before you run the script in Listing 6.3, truncate the date_dim table. Then, run the source_date.sql script using this command.

 mysql> \. c:\mysql\scripts\source_date.sql

The response should be similar to this.

 Database changed Query OK, 1 row affected (0.23 sec) Records: 1 Duplicates: 0 Warnings: 0

Query the sales_order source table and the date_dim table to confirm correct population. All source dates must get into the date_dim table. You query the source table using this command.

 mysql> select * from source.sales_prder \G

The result is as follows.

 *************************** 1. row ***************************          order_number: 17       customer_number: 1          product_code: 1            order_date: 2007-02-06            entry_date: 2007-02-06          order_amount: 1000.00 *************************** 2. row ***************************          order_number: 18       customer_number: 2          product_code: 1            order_date: 2007-02-06            entry_date: 2007-02-06          order_amount: 1000.00 *************************** 3. row ***************************          order_number: 19       customer_number: 3          product_code: 1            order_date: 2007-02-06            entry_date: 2007-02-06          order_amount: 4000.00 *************************** 4. row ***************************          order_number: 20       customer_number: 4          product_code: 1            order_date: 2007-02-06            entry_date: 2007-02-06          order_amount: 4000.00 4 rows in set (0.05 sec)

Now, query the date_dim table using this command.

 mysql> select * from date_dim \G

You’ll see

 *************************** 1. row ***************************        date_sk: 1           date: 2007-02-06     month_name: February          month: 2        quarter: 1           year: 2007 effective_date: 0000-00-00    expiry_date: 9999-12-31 1 row in set (0.00 sec)

You happen to have only one date in this example source data, February 6, 2007, therefore only this date is copied to the date_dim table.

Now, add some sales orders by running the script in Listing 6.4, then run the population script (source_date.sql) again.

Listing 6.4: Adding more dates from additional sales orders

image from book
 /*****************************************************************/ /*                                                               */ /* more_sales_order.sql                                          */ /*                                                               */ /*****************************************************************/ USE source; INSERT INTO sales_order VALUES   (21, 1, 3, '2007-02-07', '2007-02-07', 1000) , (22, 2, 3, '2007-02-08', '2007-02-08', 1000) , (23, 3, 3, '2007-02-09', '2007-02-09', 4000) , (24, 4, 3, '2007-02-10', '2007-02-10', 4000) ; /* end of script                                                 */
image from book

You run the script in Listing 6.4 using this command.

 mysql> \. c:\mysql\scripts\more_sales_order.sql

Then, run the source_date.sql script again using this command.

 mysql> \. c:\mysql\scripts\source_date.sql

Finally, confirm that the four dates from the source are correctly loaded into the date_dim table.

 mysql> select * from date_dim \G

Here is how your result should look like.

 *************************** 1. row ***************************        date_sk: 1           date: 2007-02-06     month_name: February          month: 2        quarter: 1           year: 2007 effective_date: 0000-00-00    expiry_date: 9999-12-31 *************************** 2. row ***************************        date_sk: 2           date: 2007-02-07     month_name: February          month: 2        quarter: 1           year: 2007 effective_date: 0000-00-00    expiry_date: 9999-12-31 *************************** 3. row ***************************        date_sk: 3           date: 2007-02-08     month_name: February          month: 2        quarter: 1           year: 2007 effective_date: 0000-00-00    expiry_date: 9999-12-31 *************************** 4. row ***************************        date_sk: 4           date: 2007-02-09     month_name: February          month: 2        quarter: 1           year: 2007 effective_date: 0000-00-00    expiry_date: 9999-12-31 *************************** 5. row ***************************        date_sk: 5           date: 2007-02-10     month_name: February          month: 2        quarter: 1           year: 2007 effective_date: 0000-00-00    expiry_date: 9999-12-31 5 row in set (0 .00 sec)



Dimensional Data Warehousing with MySQL. A Tutorial
Dimensional Data Warehousing with MySQL: A Tutorial
ISBN: 0975212826
EAN: 2147483647
Year: 2004
Pages: 149

Similar book on Amazon
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition)
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition)
The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleanin
The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleanin
Successful Business Intelligence: Secrets to Making BI a Killer App
Successful Business Intelligence: Secrets to Making BI a Killer App

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net