Running the Initial Population Script


Before you run the initial population script in Listing 7.1, you need to perform the following steps.

  • Clearing all the tables, including the staging tables

  • Pre-populating the date dimension

  • Preparing customer and product test data

  • Creating test data in the source database

These steps are discussed in the subsections below.

Clearing the Tables

To clear all tables, use the script in Listing 7.2.

Listing 7.2: Script for truncating the tables

image from book
 /*****************************************************************/ /*                                                               */ /* truncate_tables.sql                                           */ /*                                                               */ /*****************************************************************/ USE dw; TRUNCATE customer_dim; TRUNCATE product_dim; TRUNCATE order_dim; TRUNCATE date_dim; TRUNCATE sales_order_fact; TRUNCATE customer_stg; TRUNCATE product_stg; USE source; TRUNCATE sales_order; /* end of script                                                 */
image from book

You run the script in Listing 7.2 by invoking this command.

 mysql> \. c:\mysql\scripts\truncate_tables.sql

MySQL will respond by telling you how many records are deleted from each table.

 Database changed Query OK, 7 rows affected (0.17 sec) Query OK, 4 rows affected (0.05 sec) Query OK, 20 rows affected (0.06 sec) Query OK, 5 rows affected (0.06 sec) Query OK, 20 rows affected (0.05 sec) Query OK, 7 rows affected (0.06 sec) Query OK, 3 rows affected (0.06 sec) Database changed Query OK, 8 rows affected (0.07 sec)

Pre-populating the Date Dimension

Once all the tables are empty, you can pre-populate the date dimension for the period of March 1, 2005 to December 31, 2010 by executing the pre_populate_date stored procedure discussed in Chapter 6, “Populating the Date Dimension.” Make sure that the current database is dw by calling:

 mysql> use dw;

Then, call the stored procedure, passing the start date and the end date as arguments.

 mysql> call pre_populate_date ('2005-03-01', '2010-12-31');

Preparing the Customer and Product Test Data

The next step after pre-populating the date_dim table is to prepare the customer and product test data. I've prepared the customer data in a CSV file and the product data in a fixed-width text file.

Here is the customer source data.

 CUSTOMER NO,CUSTOMER NAME,STREET ADDRESS,ZIP CODE,CITY,STATE 1,Really Large Customers, 7500 Louise Dr.,17050, Mechanicsburg,PA 2,Small Stores, 2500 Woodland St.,17055, Pittsburgh,PA 3,Medium Retailers,1111 Ritter Rd.,17055,Pittsburgh,PA 4,Good Companies,9500 Scott St.,17050,Mechanicsburg,PA 5,Wonderful Shops,3333 Rossmoyne Rd.,17050,Mechanicsburg,PA 6,Loyal Clients,7070 Ritter Rd.,17055,Pittsburgh,PA 7,Distinguished Partners,9999 Scott St.,17050,Mechanicsburg,PA

And here is the product data.

 PRODUCT CODE,PRODUCT NAME,PRODUCT GROUP 1          Hard Disk Drive               Storage 2          Floppy Drive                  Storage 3          LCD Panel                     Monitor

Preparing the Sales Orders

The last set of test data you need is sales orders. The script in Listing 7.3 can be used to insert 21 sales orders into the sales_order table in the source database.

Listing 7.3: Sales orders for testing initial population

image from book
 /*****************************************************************/ /*                                                               */ /* sales_order_initial.sql                                       */ /*                                                               */ /*****************************************************************/ USE source; INSERT INTO sales_order VALUES   (1, 1, 1, '2005-02-01', '2005-02-01', 1000) , (2, 2, 2, '2005-02-10', '2005-02-10', 1000) , (3, 3, 3, '2005-03-01', '2005-03-01', 4000) , (4, 4/ 1, '2005-04-15', '2005-04-15', 4000) , (5, 5, 2, '2005-05-20', '2005-05-20', 6000) , (6, 6, 3, '2005-07-30', '2005-07-30', 6000) , (7, 7, 1, '2005-09-01', '2005-09-01', 8000) , (8, 1, 2, '2005-11-10', '2005-11-10', 8000) , (9, 2, 3, '2006-01-05', '2006-01-05', 1000) , (10, 3, 1, '2006-02-10', '2006-02-10', 1000) , (11, 4, 2, '2006-03-15', '2006-03-15', 2000) , (12, 5, 3, '2006-04-20', '2006-04-20', 2500) , (13, 6, 1, '2006-05-30', '2006-05-30', 3000) , (14, 7, 2, '2006-06-01', '2006-06-01', 3500) , (15, 1, 3, '2006-07-15', '2006-07-15', 4000) , (16, 2, 1, '2006-08-30', '2006-08-30', 4500) , (17, 3, 2, '2006-09-05', '2006-09-05', 1000) , (18, 4, 3, '2006-10-05', '2006-10-05', 1000) , (19, 5, 1, '2007-01-10', '2007-01-10', 4000) , (20, 6, 2, '2007-02-20', '2007-02-20', 4000) , (21, 7, 3, '2007-02-28', '2007-02-28', 4000) ; /* end of script                                                 */
image from book

Now, run the sales_order_initial.sql script by using this command.

 mysql> \. c:\mysql\scripts\sales_order_initial.sql

You’ll see that 21 new rows were inserted.

 Database changed Query OK, 21 rows affected (0.05 sec) Records: 21  Duplicates: 0  Warnings: 0 

Note 

The historical data to load is from March 1, 2005. Therefore, the first two orders will not be loaded.

Running and Confirming the Initial Population

Now that the test data is ready, it's time to perform the initial population. First of all, however, you need to set your MySQL date to 28 February, 2007, one day before the start of your data warehouse operation. You run the initial population at the end of the day, when no more data gets in and the source data is ready.

You are now ready to test the initial population. Run the dw_initial.sql population script in Listing 7.1 by using this command.

 mysql> \. c:\mysql\scripts\dw_initial.sql

On your console, you should see the following messages.

 Database changed Query OK, 7 rows affected (0.06 sec) Records: 7  Deleted: 0  Skipped: 0 Warnings: 0 Query OK, 7 rows affected (0.06 sec) Records: 7  Duplicates: 0  Warnings: 0 Query OK, 3 rows affected (0.06 sec) Records: 3  Deleted: 0  Skipped: 0  Warnings: 0 Query OK, 3 rows affected (0.07 sec) Records: 3  Duplicates: 0  Warnings: 0 Query OK, 19 rows affected (0.06 sec) Records: 19  Duplicates: 0  Warnings: 0 Query OK, 19 rows affected (0.10 sec) Records: 19  Duplicates: 0  Warnings: 0

You can use the query in Listing 7.4 to check that nineteen sales orders have been loaded correctly.

Listing 7.4: Query to confirm the sales orders are loaded correctly

image from book
 /*****************************************************************/ /*                                                               */ /* confirm_initial_population.sql                                */ /*                                                               */ /*****************************************************************/ USE dw; SELECT   order_number on , customer_name , product_name , date , order_amount amount FROM   sales_order_fact a , customer_dim b , product_dim c , order_dim d , date_dim e WHERE     a.customer_sk = b.customer_sk AND a.product_sk = c.product_sk AND a.order_sk = d.order_sk AND a.order date sk = e.date_sk ; /* end of script                                                 */
image from book

Run the query in Listing 7.4 by using this command.

 mysql> \. c:\mysql\scripts\confirm_initial_population.sql

The result should be as follows.

 Database changed +--+------------------------+-----------------+-----------+---------+ |no| customer_name          | product _name   | date      | amount  | +--+------------------------+-----------------+-----------+---------+ | 3| Medium Retailers       | LCD Panel       | 2005-03-01| 4000.00 | | 4| Good Companies         | Hard Disk Drive | 2005-04-15| 4000.00 | | 5| Wonderful Shops        | Floppy Drive    | 2005-05-20| 6000.00 | | 6| Loyal Clients LCD      | Panel           | 2005-07-30| 6000.00 | | 7| Distinguished Partners | Hard Disk Drive | 2005-09-01| 8000.00 | | 8| Really Large Customers | Floppy Drive    | 2005-11-10| 8000.00 | | 9| Small Stores           | LCD Panel       | 2006-01-05| 1000.00 | |10| Medium Retailers       | Hard Disk Drive | 2006-02-10| 1000.00 | |11| Good Companies         | Floppy Drive    | 2006-03-15| 2000.00 | |12| Wonderful Shops        | LCD Panel       | 2006-04-20| 2500.00 | |13| Loyal Clients          | Hard Disk Drive | 2006-05-30| 3000.00 | |14| Distinguished Partners | Floppy Drive    | 2006-06-01| 3500.00 | |15| Really Large Customers | LCD Panel       | 2006-07-15| 4000.00 | |16| Small Stores           | Hard Disk Drive | 2006-08-30| 4500.00 | |17| Medium Retailers       | Floppy Drive    | 2006-09-05| 1000.00 | |18| Good Companies         | LCD Panel       | 2006-10-05| 1000.00 | |19| Wonderful Shops        | Hard Disk Drive | 2007-01-10| 4000.00 | |20| Loyal Clients          | Floppy Drive    | 2007-02-20| 4000.00 | |21| Distinguished Partners | LCD Panel       | 2007-02-28| 4000.00 | +--+------------------------+-----------------+-----------+---------+ 19 rows in set (0.00 sec)



Dimensional Data Warehousing with MySQL. A Tutorial
Dimensional Data Warehousing with MySQL: A Tutorial
ISBN: 0975212826
EAN: 2147483647
Year: 2004
Pages: 149

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net