Chapter 11: On-Demand Population


You’ve learned initial population in Chapter 7 and regular population in Chapter 8. There is another type of population, on-demand population, that you need to familiarize yourself with. You apply on-demand population when you need to load source data outside the normal schedule, at the time the source data is available or as needed by the data warehouse. For example, sales promotion source data might be available only when a promotion is being scheduled and is not available otherwise.

The date pre-population discussed in Chapter 6, “Populating the Date Dimension” can be regarded as a type of on-demand population. You pre-populate your data warehouse with dates and when you almost run out of dates, you run the pre-population again.

On-demand population is the topic of this chapter. Here you learn how to enhance the schema and perform on-demand population for the dw database.

Enhancing the Schema

In this section I explain on-demand population using a sales promotion scenario. Regular population is not appropriate here as the data loading cannot be scheduled. The following is the content of a CSV flat file that needs to be loaded.

 PROMOTION CODE,PROMOTION NAME,START DATE,LAST DATE SO,Special Offer,2007-04-01,2007-04-10 DP,Disk Promotion,2007-05-05,2007-05-20 MS,Month Special,2007-06-01,2007-06-30 MS,Monitor Promotion,2007-07-10,2007-07-15 BS,Back to School,2007-08-10,2007-08-30

Note that the source data provides promotion periods, not individual promotion dates. Assuming your user wants to load only new promotions in the future, they don’t need promotion history in the data warehouse.

The first thing you need to do is add a new column, promo_ind (for promotion indicator) to the date_dim table. The new schema after the addition is shown in Figure 11.1.

image from book
Figure 11.1: The new promo_ind column in date_dim

You use the promotion_indicator.sql script in Listing 11.1 to add the promo_ind column in the date_dim table.

Listing 11.1: Promotion indicator

image from book
 /*****************************************************************/ /*                                                               */ /* promotion_indicator.sql                                       */ /*                                                               */ /*****************************************************************/ USE dw; ALTER TABLE date_dim Performing On-Demand Population ADD promo_ind CHAR (1) AFTER year ; /* end of script                                                 */
image from book

You run the script using this command.

 mysql> \. c:\mysql\scripts\promotion_indicator.sql

After you run the script, you’ll see this on the console.

 Database changed Query OK, 2132 rows affected (0.45 sec) Records: 2132  Duplicates: 0  Warnings: 0

Using the following SQL statement, query the date dimension to confirm there’s no data in the new promo_ind column.

 mysql> select * from date_dim where promo_ind IS NOT NULL;

All values of the column must be NULL, as shown here.

 Empty set (0.01 sec)



Dimensional Data Warehousing with MySQL. A Tutorial
Dimensional Data Warehousing with MySQL: A Tutorial
ISBN: 0975212826
EAN: 2147483647
Year: 2004
Pages: 149

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net