Planning a Data Warehouse

 < Day Day Up > 



The following steps outline one way of implementing a data warehouse.

  1. Determine your initial needs:

    1. Generate a list of business questions that you would like to answer.

    2. Specify data collections (data marts or dimensional data) that will provide answers to your business questions.

    3. Determine how and when you would like to receive information. Information can be delivered based on events, such as supply shortages, on time, such as monthly reports, or simply on demand.

  2. Map the data in your enterprise:

    • Locate existing storage locations for data that can be used to populate your data collections.

    • Determine storage format, data columns, and operating environments.

  3. Create a data model for your central data warehouse:

    • Combine selected enterprise data sources into a denormalized database that is optimized for efficient data extraction and ad hoc queries. SAS ETL Studio resolves issues surrounding the extraction and combination of source data.

    • Consider a generalized collection of data that might extend beyond your initial scope, to account for unanticipated business requirements.

  4. Estimate and order hardware and software:

    • Include storage, servers, backup systems, and disaster recovery.

    • Include the staging area, the central data warehouse, and the data marts or dimensional data model.

  5. Based on the data model, develop a plan for extracting data from enterprise sources into a staging area. Then specify a series of SAS ETL Studio jobs that put the extraction plan into action:

    • Consider the frequency of data collection based on business needs.

    • Consider the times of data extraction based on system performance requirements and data entry times.

    • Note that all data needs to be cleansed and validated in the staging area to avoid corruption of the data warehouse.

    • Consider validation steps in the extraction jobs to ensure accuracy.

  6. Plan and specify SAS ETL Studio jobs for data cleansing in the staging area:

    • SAS ETL Studio contains all of the data cleansing capabilities of the SAS Data Quality Server software.

    • Column combination and creation are readily available through the data quality functions that are available in the SAS ETL Studio's Expression Builder.

  7. Plan and specify SAS ETL Studio jobs for data validation and load:

    • Ensure that the extracted data meets the data mode of the data warehouse before the data is loaded into the data warehouse.

    • Load data into the data warehouse at a time that is compatible with the extraction jobs that populate the data marts.

  8. Plan and specify SAS ETL Studio jobs that populate data marts or a dimensional model out of the central data warehouse.

  9. Plan and specify SAS ETL Studio jobs that generate reports out of the data marts or dimensional model. These jobs and all SAS ETL Studio jobs can be scheduled to run at specified times.

  10. Install and test the hardware and software that was ordered previously.

  11. Develop and test the backup and disaster recovery procedures.

  12. Develop and individually test the SAS ETL Studio jobs that were previously specified.

  13. Perform an initial load and examine the contents of the data warehouse to test the extract, cleanse, verify, and load jobs.

  14. Perform an initial extraction from the data warehouse to the data marts or dimensional model, then examine the smaller data stores to test that set of jobs.

  15. Generate and publish an initial set of reports to test that set of SAS ETL Studio jobs.



 < Day Day Up > 



SAS Institute - SAS 9.1.3 ETL Studio. User's Guide
SAS 9.1.3 ETL Studio: Users Guide
ISBN: 1590476352
EAN: 2147483647
Year: 2004
Pages: 127
Authors: SAS Institute

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net