Chapter 4: Dimension ETL with SSIS


The next three chapters focus on the discussion of ETL for data warehousing and business intelligence processing. This chapter examines the processing of dimension tables in SSIS. Dimension tables are a data warehouse concept, which this chapter describes and then discusses how to move data from your data sources to your data warehouse dimension tables. Similar to this, Chapter 5 reviews the same things, but only applied for fact tables. Chapter 6 also covers business intelligence, but looks at the integration of SSIS with SQL Server 2005 Analysis Services (SSAS). Integration between SSIS and SSAS involves Online Analytical Processing (OLAP) cube and dimension processing, as well as SSAS data mining querying and training.

Dimension ETL Overview

Arguably, when looking at the development time investment required for a data warehouse ETL solution, dimension processing takes the longest, and is the most complex component. You may have experienced this, especially when the requirements call for tracking the history of changes that a dimension goes through.

Does this analyst requirement sound familiar: “I want to be able to look at the history of list price changes for a product to be able to understand the supply and demand curve”? In other words, what the business user is really asking is how the change in list price affects sales historically. To handle this type of request, a single product must have a new version created in the dimension table any time the list price changes. Furthermore, the sales activity (tracked in the appropriate fact table) must be associated with the right dimension record for the sales (the one with the list price at that point in time). Sound complicated? Add to that a second or third dimension attribute change (like department) that must be tracked historically, and pretty soon you will be pulling you hair out!

The good news is that SSIS comes with out-of-the-box functionality to handle dimension scenarios just like this. Specifically, it comes with a data flow transformation called the Slowly Changing Dimension (SCD) Wizard. As you will see, there are advantages to using the wizard, as well as some limitations you must consider, and ways to work through those limitations to create a scalable dimension process in your ETL.

But before going any further with how SSIS can be applied to dimension table ETL, you need to step back and review the basics of dimension, both in design and the tracking of history.



Expert SQL Server 2005 Integration Services
Expert SQL Server 2005 Integration Services (Programmer to Programmer)
ISBN: 0470134119
EAN: 2147483647
Year: 2004
Pages: 111

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net