Chapter 3: Data Extraction


Overview

A natural first discussion point for ETL is the extraction, the E in ETL. This chapter applies the concepts of the data extraction with using SSIS. As discussed in Chapter 1, ETL applies to a broad spectrum of applications beyond just data warehousing and ETL loading. Therefore, the discussion of this topic will include both generalized extraction concepts and data warehouse–specific concepts.

Data extraction is the process of moving data off of a source system, potentially to a staging environment or into the transformation phase of the ETL. Figure 3-1 shows the extraction process separated out on the left. An extraction process may pull data from a variety of sources, including files or database systems, as this figure highlights.

image from book
Figure 3-1: Extraction process

A few common objectives of data extraction include the following:

  • Consistency in how data is extracted across source systems

  • Performance of the extraction

  • Minimal impact on the source to avoid contention with critical source processes

  • Flexibility to handle source system changes

  • The ability to target only new or changed records

This chapter is structured into the following three sections:

  • SSIS data flow source adapters- Beyond just a review of the data flow sources, this chapter looks at performance and configuration options.

  • Incremental extraction- Part of applying data extraction in SSIS involves seeing how to design SSIS packages that perform incremental extractions-extracting only changed records from a source.

  • Data lineage- Many applications require that the source data being extracted include a mechanism that points back to the very records that were extracted, which provides valuable data validation.



Expert SQL Server 2005 Integration Services
Expert SQL Server 2005 Integration Services (Programmer to Programmer)
ISBN: 0470134119
EAN: 2147483647
Year: 2004
Pages: 111

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net