Part II: Extract, Transform, and Load


Chapter List

Chapter 5: Source Extraction
Chapter 6: Populating the Date Dimension
Chapter 7: Initial Population
Chapter 8: Regular Population
Chapter 9: Regular Population Scheduling

Part Overview

This part discusses the process that populates a dimensional data warehouse. This process is known as ETL, short for Extract, Transform, and Load. Extract gets the data the data warehouse needs from the source. Transform prepares that data for the data warehouse. Load stores the prepared data in the data warehouse.

The E, T, and L are not always distinct steps. For instance, if the source data is already in a MySQL database, the entire ETL can be a single INSERT ... SELECT SQL statement. In other cases, the Transform portion can be quite involved: in addition to adding surrogate keys and preparing the history maintenance, it may have to integrate multiple sources, handle source data errors, and aggregate data.
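As a rough illustration of the single-statement case, the sketch below runs one INSERT ... SELECT that extracts, transforms, and loads in one pass. It makes several assumptions that are not from the text: the table and column names (customer_source, customer_dim, customer_sk) are hypothetical, the TRIM/UPPER cleanup is only a placeholder transformation, and an in-memory SQLite database stands in for MySQL so that the example runs without a server. On MySQL the surrogate key column would be declared with AUTO_INCREMENT rather than SQLite's AUTOINCREMENT.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source table
    CREATE TABLE customer_source (
        customer_number INTEGER,
        customer_name   TEXT
    );
    INSERT INTO customer_source VALUES (1, ' big customers '), (2, 'small stores');

    -- Hypothetical dimension table; customer_sk is the surrogate key
    CREATE TABLE customer_dim (
        customer_sk     INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_number INTEGER,
        customer_name   TEXT
    );
""")

# Extract (SELECT), transform (TRIM/UPPER), and load (INSERT) in one statement.
# The surrogate key is generated by the database, not supplied by the source.
conn.execute("""
    INSERT INTO customer_dim (customer_number, customer_name)
    SELECT customer_number, UPPER(TRIM(customer_name))
    FROM customer_source
    ORDER BY customer_number
""")
conn.commit()

rows = conn.execute(
    "SELECT customer_sk, customer_number, customer_name "
    "FROM customer_dim ORDER BY customer_sk"
).fetchall()
print(rows)
```

When the transformation needs more than a few column expressions, for example merging several sources or rejecting bad rows, the three steps usually have to be separated again.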

This part covers the following topics:

  • Extracting source data
  • Populating the date dimension
  • Initial population
  • Regular population
  • Regular population scheduling



Dimensional Data Warehousing with MySQL: A Tutorial
ISBN: 0975212826
Year: 2004
Pages: 149
