Source Data


I explain in this section what key aspects of source data you need to identify before you can begin writing the initial population script.

The first step is to identify what source data is needed and available for every fact and every dimension of the data warehouse. You also need to know the characteristics of the data source, such as its file type, record structure, and accessibility.

A sample document that summarizes the key information of the source data for the sales order data warehouse, including the file types, formats, and data warehouse target tables, is shown in Table 7.1. This type of table is often called the data source map because it maps every data from the source to the target The activity to produce the table is called data source mapping.

Note 

The source data for customer and product dimensions map directly to their target data warehouse tables, the customer_dim and product_dim tables. On the other hand, the sales order transaction is the source for more than one data warehouse table. You learn a more complex source data in Chapter 20, “Non-Straight Sources.”

Table 7.1: Your data warehouse’s data sources
Open table as spreadsheet

Source Data

Data Type

File/Table Name

Target Table in Data Warehouse

Customer

CSV text file

customer.csv

customer_dim

Product

Fixed-width text file

product.txt

product_dim

Sales order transaction

MySQL

sales_order in source database

order_dim

sales_order_fact

date_dim, if you use “Loading the Date from Source” option, but you’ll use pre-population here



Dimensional Data Warehousing with MySQL. A Tutorial
Dimensional Data Warehousing with MySQL: A Tutorial
ISBN: 0975212826
EAN: 2147483647
Year: 2004
Pages: 149

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net