Integration Services Package Components


Integration Services packages are organized collections of DTP and DTR components. The package is the unit of execution for a Integration Services transfer operation. In other words, to use Integration Services to perform data transfers and ETL operations, you must create a package that contains all of the DTP components that define the source target and destinations for the data and the transformations that will take place, as well as the DTR components that define the work flow or sequence of actions that will be performed by the Integration Services package. Once the package has been created, you execute the package to perform the data transfer. You can see an overview of a sample Integration Services package shown in Figure 9-4.

image from book
Figure 9-4: Integration Services package overview

Integration Services packages can be created using the set of tools provided with SQL Server 2005, or they can be created programmatically using the DTR API. Integration Services packages can be saved either in SQL Server’s msdb database or as XML files that reside in the file systems. You can get a preview of the Integration Services design tools in the section “Integration Services Tools” that is presented later in this chapter.

Integration Services Package Features

Not surprisingly, the new Integration Services architecture adds some significant new capabilities to Integration Services that make it a more robust enterprise data transformation tool. Next, let’s look at some of the major enhancements that this new version of Integration Services brings to the table.

Setting Run-Time Properties of Integration Services Packages

One of the major limitations of the earlier versions of Integration Services was the fact that packages were essentially tied to a given source and destination. That meant that it was really difficult to take a given package and easily reuse it on a different server, even if that server supported the same databases. While it was possible to use global variables and other complex workarounds to get around this limitation, these workarounds weren’t the most graceful or robust solutions to the problem. The new SQL Server 2005 Integration Services enables one package to handle multiple source and destination endpoints. You can use this feature to set package properties at run time from parameters files, the Registry, or XML documents.

Logging

The original Integration Services also didn’t support any built-in logging capabilities. Logging the operation of packages is particularly useful for auditing and troubleshooting. While you could add them yourself, this required development work on your part. The new SQL Server 2005 Integration Services provides built-in logging options for packages, tasks, and transformations.

Checkpoint Restart

Another great new feature in SQL Server 2005’s Integration Services is support for checkpoints and restarts within Integration Services packages. This new feature enables different checkpoints to be associated with multiple steps in complex Integration Services packages. Thus if a package fails that’s using checkpoints, the entire package doesn’t need to be rerun from the very beginning. Instead, the package can be restarted at the first unsuccessful checkpoint. For long-running packages, this can be a huge time saver, as the entire package doesn’t need to be reprocessed. Instead, the Integration Services package can resume processing where it left off. When combined with event handling, this can be a powerful tool for automatically managing your Integration Services jobs.

Variables

Support for variables is another new feature found in SQL Server 2005 Integration Services. With SQL Server 7 and SQL Server 2000, DTS packages supported variables for use in transformation scripts but didn’t support package-scoped variables. This lack of global variables made it difficult to reuse DTS packages on multiple databases and on multiple servers. To be sure, there were workarounds, but since DTS packages were not designed to be used in this fashion, these workarounds were somewhat complex and not particularly robust. With SQL Server 2005 the new Integration Services design provides support for package, task, and event-level variables, greatly increasing the flexibility of Integration Services packages and also significantly facilitating the reuse of Integration Services packages. As their name implies, package-scoped variables can be seen by all of the components in the Integration Services package. Task-scoped variables can be seen and accessed by the components of that particular task but not by any other components that are part of that Integration Services package. One example usage of a package-scoped Integration Services variable might be to contain the name of the current SQL Server system, allowing the package to be easily run on another system by substituting that system’s server name into the Integration Services variable at run time.

System Variables

All Integration Services packages have a number of built-in system variables that represent different aspects of a Integration Services package. Table 9-1 lists the new Integration Services System variables.

Table 9-1: Integration Services System Variables

System Variable

Description

BreakpointTargetDescription

Contains a description of the break point

Cancel

Indicates that execution should be canceled

CountDone

Contains the transfer progress counter

CreationDate

Contains the package creation date

CreatorComputerName

Contains the name of the computer used to create the package

CreatorName

Contains the name of the user who created the package

CustomEventDescription

Describes a custom event

CustomEventInfo

Contains custom event information

CustomEventName

Contains a custom event name

CustomEventValue

Contains the custom event value

CustonEventGUID

Contains the custom event's GUID

ErrorCode

Contains an error code

ErrorDescription

Contains an error description

ExecutionInstanceGuid

Contains a GUID identifying the current package instance

ExecutionStatus

Contains the package's execution status

LocaleId

Contains the Locale identification code

MachineName

Contains the current machine name

MaxCount

Contains the maximum number of items to be transferred

PackageId

Contains the current package identification

PackageName

Contains the current package name

PercentComplete

Contains the transfer progress status

ProgressCountHigh

Contains the transfer progress high count

ProgressCountLow

Contains the transfer progress low count

ProgressDescription

Contains a transfer progress description

ProgressEvent

Contains a transfer progress event identification

Propagate

Indicates if an event can be propagated

SourceDescription

Contains an event source description

SourceID

Contains an event source identification

SourceName

Contains an event source name

StartTime

Contains the package's execution start time

TaskID

Contains a task identification

TaskName

Contains a task name

UserName

Contains the name of the user running the package

VersionBuild

Contains the package's build number

VersionComment

Contains the package's description

VersionGUID

Contains the package's GUID

VersionMajor

Contains the package's major version number

VersionMinor

Contains the package's minor version number

Complex Flow Control

SQL Server 2005’s Integration Services also supports complex flow control. Within each Integration Services package you can specify the path that will be taken if a given operation succeeds or fails. For instance, if an operation succeeds you can set up your Integration Services package to proceed along with the next subsequent action. Otherwise, if the operation fails you can take an alternative action. More information about the new error flow feature is presented in the following section. You can also set up the control flow in such a way that multiple tasks can be executed in parallel, or you can force tasks to execute sequentially by specifying that the next task will not be executed until the current task has completed. You can also use a new Integration Services construct known as a container to group together related Integration Services tasks. Each container can have its own internal control flow and variables. There are also multiple looping structures that enable you to set up repeated actions. There’s a For Each Loop container that is capable of iterating over a group of objects, taking action on those objects on every iterations. In addition, there’s a For Loop container that can evaluate an expression and conditionally perform repeated actions. More information about all of the available tasks is presented later in this chapter.

Error Flows

One of the big new features with SQL Server 2005 Integration Services is its ability to support error flows. The new Error Flows feature essentially enables you to add error handling to your Integration Services packages. With the new Error flows feature, when a Integration Services transform encounters rows that produce error conditions, instead of halting the process with an error, Integration Services can route the problem row according to the error flow that has been set up. For instance, the error flow may indicate that the row is simply written out to a log file, or it can also direct the package into a task that performs much more sophisticated error routines that can even manipulate the data and place the row back into the pipeline for reprocessing.

Immediate Mode and Project Mode

Similar to the way that SQL Server 2000 DTS worked, where DTS had both a wizard interface that was primarily designed to execute ad hoc data transfers and a DTS Designer that was used to build more complex DTS packages, SQL Server 2005’s Integration Services supports both an immediate mode and a project mode. To use Integration Services in the immediate mode, you can run the Integration Services Import/Export Wizard from the menu. The Integration Services Import/Export Wizard can be used to build, execute, and optionally save Integration Services packages that perform simple transfers. SQL Server 2005’s new Integration Services Import/Export Wizard looks and acts much like the version found in SQL Server 2000.

While immediate mode is useful for quick one-time data transfer operations, the project mode is useful for building more sophisticated Integration Services packages using the Business Intelligence Development Studio. The Business Intelligence Development Studio contains an all-new Integration Services Designer that supports an entirely new set of Data Flow, Control Flow, and Event handlers that can be used to build Integration Services packages. The new Integration Services Designer also provides full support for debugging Integration Services packages. You can see the new Integration Services Designer in more detail in the section “Integration Services Tools” later in this chapter.

Digital Signing of Integration Services Packages

Using a method much like the digital signing feature found that’s available for Microsoft .NET applications, Integration Services packages can now be signed as well. This enables you to verify the authenticity of a package when it is executed. Packages can be digitally signed during the design process using the Integration Services Designer. Once a package had been digitally signed, that package is read-only and can no longer be modified.

Data Adapters

The DTP uses data adapters to connect data source and destination endpoints. As their names suggest, source data adapters connect to the source of the data and provide the input for Integration Services packages. Destination data adapters connect to the data target and output the data. SQL Server 2005’s provides a number of source and destination data adapters. Table 9-2 lists the set of built-in data adapters that are supplied with SQL Server 2005.

Table 9-2: Built-in Data Adapters

Data Adapter

Description

Flat File Destination Adapter

A file system adapter that writes text-delimited data to a file

Flat File Source Adapter

A file system adapter that reads text-delimited data from a file

OLE DB Destination Adapter

An OLE DB provider that writes data to an OLE consumer

OLE DB Source Adapter

An OLE DB consumer that reads data from an OLE DB provider

Raw File Destination Adapter

A file system adapter that writes data to a file

Raw File Source Adapter

A file system adapter that reads data from a file

SQL Server Destination Adapter

A SQL Server adapter that's used to write data to a table or view

Web Service Source Adapter

A Web Service adapter that reads data from an XML web service

Containers

Containers are a new construct that Microsoft has added to SQL Server 2005’s Integration Services. The primary purpose for Integration Services containers is to add structure and flow control to your Integration Services packages. Containers group together related tasks and are designed to be used to execute repeated tasks or to provide scope for variables. SQL Server 2005 Integration Services supports the types of containers shown in Table 9-3.

Table 9-3: Integration Services Container Types

Container

Description

Package Container

A collection of package elements

Foreach Loop Container

Provides iterative control flow in a package

For Loop Container

Provides support for repeated actions in a package

Sequence Container

Groups related tasks and containers in package

TaskHost Container

Provides services to a task

Container Properties

Holds values that are common to the container

Container Collections

A collection of containers

Tasks

The Integration Services tasks are the most basic elements of the Integration Services package. The Integration Services task essentially defines an action that will be performed. These actions range from copying files, executing T-SQL statements, and running scripts, to performing FTP transfers and running data mining models. Multiple related tasks can be grouped into containers. Table 9-4 presents the tasks that are included in SQL Server 2005’s Integration Services.

Table 9-4: Integration Services Tasks

Task

Description

ActiveX Script Task

Executes an ActiveX script that performs a specified action

Analysis Services Execute DDL Task

Executes T-SQL DDL statements

Analysis Services Processing Task

Processes Analysis Services objects

Bulk Insert Task

Inserts data from a text file into a table

Data Flow Task

Copies and transforms data between data sources

Data Mining Query Task

Executes data mining queries

Execute Package Task

Executes other packages

Execute Process Task

Executes a program or script

Execute SQL Task

Executes T-SQL statements

File System Task

Executes actions on the file system

File Transfer Protocol Task

Executes FTP data transfers

Message Queue Task

Sends and receives messages from MSMQ data queues

Script Task

Executes scripts written in VB.NET using the Microsoft Visual Studio for Applications (VSA) environment.

Send Mail Task

Sends an e-mail message

XML Task

Accesses data in XML documents

Transformations

Integration Services transformations control what happens to the data as it’s moved from the source data adapter to the destination data adapter. SQL Server 2005 supports both a number of built-in transformations and user-defined custom transformations. You can create custom transformations using the API provided by the DTP object model. SQL Server 2005’s Integration Services provides an extensive list of built-in standard transformations, as shown in Table 9-5.

Table 9-5: Built-in Standard Transformations

Transformation

Description

Aggregate Transformation

Performs aggregations

Allocation Transformation

Spreads the value of an input column across multiple output columns

Character Map Transformation

Applies string functions to character data

Conditional Split Transformation

Evaluates data and routes it to different outputs

Copy/Map Transformation

Creates new output columns by copying input columns

Data Mining Model Accuracy Transformation

Calculates the accuracy of data mining models

Data Mining Model Training Transformation

Trains data mining models

Data Mining Query Transformation

Runs data mining prediction queries

Data Conversion Transformation

Converts the data type of an input column to a different output data type

Derived Column Transformation

Creates an output column from the results of expressions

Dimension Processing Transformation

Processes OLAP cube dimensions

File Extractor Transformation

Reads data from a package's data flow and writes that data to file

File Injector Transformation

Reads data from a file and adds that data to a package's data flow

Fuzzy Grouping Transformation

Standardizes values in input column data

Fuzzy Lookup Transformation

Looks up values in a reference table using fuzzy matching

Logged Lineage Transformation

Provides environment information to the package's data flow

Lookup Transformation

Looks up values in a reference table using exact matching

Merge Transformation

Merges two sorted datasets

Merge Join Transformation

Joins two datasets using a FULL, LEFT, or INNER join

Multicast Transformation

Distributes input data to multiple outputs

Partition Processing Transformation

Processes OLAP partitions

Pivot Transformation

Pivots the input data according to an input column value

Row Count Transformation

Counts the input rows and stores the count in a variable

Sampling Transformation

Creates a representative sampling of the input dataset

Script Transformation

Executes a script to transform the input

Slowly Changing Dimension Transformation

Coordinates updating and insert rows into OLAP dimensions

Sort Transformation

Sorts input data and copies the sorted data to the transformation output

Surrogate Key Transformation

Provides additional custom properties to the Integration Services package

Union All Transformation

Merges multiple datasets

UnPivot Transformation

Unpivots input data according to an input column value

Event Handling

The ability to raise and handle events is another new feature found SQL Server 2005 Integration Services. Event handling enables Integration Services packages to respond to events that are raised at run time by containers and tasks. Events can be fired by the Integration Services package elements to signal a number of different states, including error conditions, when a task has started, when a task completes, or a change in variable status. Table 9-6 lists the Integration Services Event Handlers.

Table 9-6: Integration Services Event Handlers

Event Handler

Description

OnCustomerEvent

Raised on demand by a task or package

OnError

Raised by a task or container on an error

OnExecStatusChanged

Raised by a task or container when its execution status changes

OnPostExecute

Raised by a task or container after it has run

OnPostValidate

Raised by a task after it has been validated

OnPreExecute

Raised by a task or container before it has run

OnPreValidate

Raised by a task before it has been validated

OnProgress

Raised by a task or container when a specified progress metric has been met

OnQueryCancel

Raised by a task or container to determine if it [ok] should stop running

OnTaskFailed

Raised by a task when it fails

OnVariableValueChanged

Optionally raised by a variable when its value changes

OnWarning

Raised by a task or container when a warning is generated

Log Providers

Log providers are another completely new feature that Microsoft has added to Integration Services packages. As their name suggests, Integration Services log providers enable you to add logging to your packages, containers, and tasks. Logging is used to record error information or other important run-time or status information. Table 9-7 lists the log providers that Microsoft ships with SQL Server 2005.

Table 9-7: Log Providers

Log Provider

Description

The Text File log provider

Writes log entries to ASCII text files using a comma-separated value (CSV) format using a default file extension of .log

The SQL Profiler log provider

Writes SQL trace data to a trace file using a default file extension of .trc

The SQL Server log provider

Writes log entries to the sysdtslog90 table in a SQL Server database

The Windows Event log provider

Writes entries to the Windows Application

The XML File log provider

Writes log files to an XML file using a default file extension of .xml




Microsoft SQL Server 2005 New Features
Microsoft SQL Server 2005 New Features
ISBN: 0072227761
EAN: 2147483647
Year: 2005
Pages: 80
Authors: Michael Otey

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net