Chapter 8: Regular Population


This chapter covers regular population. Unlike the initial population, which you perform only once before the data warehouse goes into operation, a regular population is scheduled to load source data at regular intervals.

In this chapter I show you how to prepare the data before running a script that performs the regular population of our dw database.

Identifying Data Sources and Loading Types

The first step in scheduling a regular population is to identify what source data is needed and available for every fact and every dimension of the data warehouse. You then decide on the extraction mode and the loading type suitable for each population. Table 8.1 shows a sample document that summarizes this information.

Table 8.1: Data sources and loading types of regular population

Source Data              Data Warehouse Table  Extraction Mode    Loading Type
-----------------------  --------------------  -----------------  -----------------------------
Customer                 customer_dim          Whole, Pull        SCD2 on address, SCD1 on name
Product                  product_dim           Whole, Pull        SCD2
Sales order transaction  order_dim             CDC (daily), Pull  Unique order number
Sales order transaction  sales_order_fact      CDC (daily), Pull  Daily sales orders
n/a                      date_dim              n/a                Pre-populate

Note: You learned about extraction mode in Chapter 5, “Source Extraction,” and SCD in Chapter 2, “Dimension History.”
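A summary such as Table 8.1 can also be kept in machine-readable form so that a scheduling script can read it. The following is a minimal Python sketch of that idea, not part of the book's scripts. The `PopulationSpec` class and the `tables_for_regular_run` helper are hypothetical names introduced here, and the sketch assumes that a row with no source data (date_dim, which is pre-populated) is skipped by the regular run.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PopulationSpec:
    """One row of Table 8.1 (hypothetical structure, for illustration only)."""
    source: Optional[str]           # None where the table shows n/a
    target_table: str
    extraction_mode: Optional[str]  # None where the table shows n/a
    loading_type: str


# The rows of Table 8.1, transcribed as data.
TABLE_8_1 = [
    PopulationSpec("Customer", "customer_dim", "Whole, Pull",
                   "SCD2 on address, SCD1 on name"),
    PopulationSpec("Product", "product_dim", "Whole, Pull", "SCD2"),
    PopulationSpec("Sales order transaction", "order_dim",
                   "CDC (daily), Pull", "Unique order number"),
    PopulationSpec("Sales order transaction", "sales_order_fact",
                   "CDC (daily), Pull", "Daily sales orders"),
    PopulationSpec(None, "date_dim", None, "Pre-populate"),
]


def tables_for_regular_run(specs: List[PopulationSpec]) -> List[str]:
    """Return the target tables that have source data to extract.

    Assumption: a row without a source is not loaded by the regular run.
    """
    return [spec.target_table for spec in specs if spec.source is not None]


print(tables_for_regular_run(TABLE_8_1))
```

Keeping the dimension rows ahead of the fact row in the list reflects the usual practice of loading dimensions before the facts that reference them.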

Another aspect of the source data that might affect your design is the window of time during which a particular data source is available for the regular population. This is especially important for transactional source data, such as sales orders, which is usually large.
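An availability window can be checked before a load starts. The sketch below is one way to do this in Python, assuming the window is expressed as a start and end time of day. The `in_window` function and the 22:00 to 02:00 window are hypothetical, since the text does not specify any actual window. The one subtlety is a window that crosses midnight, which the second branch handles.

```python
from datetime import time


def in_window(now: time, start: time, end: time) -> bool:
    """Return True if `now` falls inside the window [start, end).

    A window whose start is later than its end is treated as
    crossing midnight (for example 22:00 to 02:00).
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


# Hypothetical window in which the sales order source is available.
SALES_ORDER_WINDOW = (time(22, 0), time(2, 0))

print(in_window(time(23, 30), *SALES_ORDER_WINDOW))
print(in_window(time(3, 0), *SALES_ORDER_WINDOW))
```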

In addition, you need to know the detailed characteristics of each data source, such as its file type and record structure, down to the individual field or column.
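Writing the record structure down as data makes it possible to check a source file against it before loading. The following Python sketch illustrates this for a comma-separated file. The field names, types, and lengths in `CUSTOMER_LAYOUT` are hypothetical and are not taken from the book's source files, and `check_record` is a helper invented for this example.

```python
import csv
import io

# Hypothetical record layout: (field name, type, maximum length or None).
CUSTOMER_LAYOUT = [
    ("customer_number", int, None),
    ("customer_name", str, 50),
    ("customer_address", str, 50),
]


def check_record(row, layout):
    """Return a list of problems found in one parsed row (empty if none)."""
    if len(row) != len(layout):
        return [f"expected {len(layout)} fields, got {len(row)}"]
    problems = []
    for value, (name, kind, max_len) in zip(row, layout):
        if kind is int:
            try:
                int(value)
            except ValueError:
                problems.append(f"{name}: not an integer: {value!r}")
        elif max_len is not None and len(value) > max_len:
            problems.append(f"{name}: longer than {max_len} characters")
    return problems


# Made-up sample data: one well-formed record and one with a missing field.
sample = io.StringIO("1,Sample Customer,1 Example Street\nx,Short Row\n")
for row in csv.reader(sample):
    print(row, check_record(row, CUSTOMER_LAYOUT))
```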



Dimensional Data Warehousing with MySQL: A Tutorial
ISBN: 0975212826
Year: 2004
Pages: 149
