Summary

In this chapter, we presented the ETL system as the core of the business intelligence systems reliability. ETL system development is the most time-consuming and resource- intensive activity of your DW/BI project.

We described at a high level what your ETL system will look like: a system of master packages calling child packages. Actual data manipulation occurs in the child packages. We talked about how to set up a template package to simplify development.

We walked slowly through some simple packages, to become familiar with the toolset and the way SQL Server Integration Services works. We extracted data, performed some simple transformations, and loaded that data into a dimension table.

Developing the logic to manage dimension changes is one of the challenges of an ETL system. That challenge is greatly simplified by Integration Services SCD Wizard. We also discussed how to modify the results of the SCD Wizard and how you might go about replicating that logic using more basic Integration Services transforms.

Fact table processing usually has more to do with data volumes than with the complexity of the transformations. We showed how to aggregate and allocate data, and perform other calculations.

The surrogate key pipeline, where the fact tables natural keys are replaced by surrogate keys, is the core fact table processing element. We described how to build the surrogate key pipeline, focusing on alternative methods of handling errors. Next , we described some of the issues around loading transactional fact data, both for the historical load and the ongoing incremental load.

We finished the chapter with a detailed view of the topics we began with: master packages, intra-system communication and event handling, and the auditing system.

Our goal with this chapter was to describe how to use Integration Services to build an ETL system. In Chapter 7, well talk about the next step in the DW/BI system: Analysis Services databases. Later, in Chapters 14 and 15, well return to Integration Services to discuss issues of Deployment and Operations. And in Chapter 17 well return to talk about how to populate a low latency DW/BI system.



Microsoft Data Warehouse Toolkit. With SQL Server 2005 and the Microsoft Business Intelligence Toolset
The MicrosoftВ Data Warehouse Toolkit: With SQL ServerВ 2005 and the MicrosoftВ Business Intelligence Toolset
ISBN: B000YIVXC2
EAN: N/A
Year: 2006
Pages: 125

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net