Chapter 7: Initial Population


Overview

In Chapter 6, “Populating the Date Dimension,” you learned how to populate the date dimension. In this chapter, you learn how to populate the fact table and the other dimension tables.

Right before your data warehouse goes into operation, you need to load historical data. This historical data is the first set of data you load into the data warehouse, and this first load is referred to as the initial population.

Your data warehouse users determine how much historical data they want to have in the data warehouse. For example, if your data warehouse should start on March 1, 2007 and your users want two years of historical data, you load the source data dated from March 1, 2005 through February 28, 2007. Then, at the start of the data warehouse on March 1, 2007, you load the March 1, 2007 data.
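The date arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function name historical_window and its parameters are hypothetical, and the sketch assumes the start date is not February 29.

```python
from datetime import date, timedelta


def historical_window(start, years):
    """Return the (first, last) source dates to load before `start`.

    Hypothetical helper for illustration only. Assumes `start` is not
    February 29, because date.replace() would fail for a non-leap target year.
    """
    first = start.replace(year=start.year - years)  # same month and day, `years` earlier
    last = start - timedelta(days=1)                # the day before the warehouse starts
    return first, last


first, last = historical_window(date(2007, 3, 1), 2)
print(first, last)  # 2005-03-01 2007-02-28
```

The data dated March 1, 2007 itself is not part of this window; it is loaded on the day the data warehouse starts.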

You must load all dimension tables before you load the fact table, because the fact table needs the dimensions’ surrogate keys. This is true not only during the initial population, but also during regular population. (Regular population is discussed in Chapter 8.)
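To see why the order matters, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names (customer_dim, sales_order_fact, sales_order_source, customer_sk, customer_number, order_amount) are assumptions made for this illustration, and the dimension rows are derived from the order source only to keep the example short.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (
    customer_sk     INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    customer_number INTEGER NOT NULL                    -- natural key from the source
);
CREATE TABLE sales_order_fact (
    customer_sk  INTEGER NOT NULL REFERENCES customer_dim (customer_sk),
    order_amount NUMERIC NOT NULL
);
CREATE TABLE sales_order_source (
    customer_number INTEGER NOT NULL,
    order_amount    NUMERIC NOT NULL
);
INSERT INTO sales_order_source VALUES (101, 250.00), (102, 99.50), (101, 40.00);
""")


def load_dimension(conn):
    # One dimension row per distinct natural key; SQLite assigns the surrogate key.
    conn.execute("""
        INSERT INTO customer_dim (customer_number)
        SELECT DISTINCT customer_number FROM sales_order_source
    """)


def load_fact(conn):
    # The fact load looks up each surrogate key by joining on the natural key.
    conn.execute("""
        INSERT INTO sales_order_fact (customer_sk, order_amount)
        SELECT d.customer_sk, s.order_amount
        FROM sales_order_source AS s
        JOIN customer_dim AS d ON d.customer_number = s.customer_number
    """)


def fact_count(conn):
    return conn.execute("SELECT COUNT(*) FROM sales_order_fact").fetchone()[0]


load_fact(conn)                  # wrong order: the dimension is still empty
rows_before = fact_count(conn)   # the join matched nothing

load_dimension(conn)             # right order: dimension first...
load_fact(conn)                  # ...then the fact table
rows_after = fact_count(conn)

print(rows_before, rows_after)   # 0 3
```

Loading the fact table against an empty dimension inserts nothing because no surrogate key can be found, which is the dependency described above.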

In this chapter I explain the steps to perform initial population, including identifying the source data, developing the initial population script, and testing the script.



Dimensional Data Warehousing with MySQL. A Tutorial
ISBN: 0975212826
Year: 2004
Pages: 149
