This part, Part II, discusses the process that populates a dimensional data warehouse. This process is known as ETL, short for Extract, Transform, and Load. Extract is getting the data you need for the data warehouse from the source. Transform is the process of preparing the data. And Load is the process of storing the data in the data warehouse.
The E, T, or L is not always a distinct step. For instance, if the source data is in a MySQL database, the ETL can be a single "INSERT SELECT" SQL statement. In other cases, the Transform portion can be quite involved, requiring not only adding surrogate keys and preparing the history maintenance, but also integrating multiple sources, handling data source errors, and aggregating.
This part covers the following topics.
Extracting source data
Populating the date dimension
Regular job and scheduling