Lesson 2: Introducing Microsoft Data Transformation Services (DTS)

3 4

DTS is a powerful set of graphical tools (and programmable objects) that you can use to import, export, and transform data to and from a wide variety of data sources and destinations. In this lesson, you learn about the components of DTS packages, including connections, tasks, transformations, and workflow. You are also introduced to storage options for DTS packages. Finally, this lesson introduces the DTS tools, including the DTS Import/Export Wizard (Dtswiz.exe), DTS Designer, and the DTS package execution utilities. Subsequent lessons will teach you how to use each of these tools.


After this lesson, you will be able to

  • Describe the components of a DTS package
  • Describe the data sources to which DTS can establish direct connections
  • Describe the type of data transformations that are available
  • Describe the DTS workflow options
  • Describe the DTS package storage options
  • Describe each of the DTS tools

Estimated lesson time: 15 minutes


Understanding a DTS Package

You can create a DTS package to connect to a data source, copy and manage data and database objects, run tasks as jobs from within a package, transform the data, and then store the transformed data and the database objects to the same data source or to a different data destination. You can also notify a user (or process) of the success or failure of the package, including attaching a file to an e-mail message. You define these tasks as discrete steps (known as DTS tasks) and control the execution using precedence constraints (such as performing a certain task if a task succeeds and a different task if the task fails). You create a DTS package in the DTS Import/Export Wizard, in DTS Designer, or programmatically.

DTS Connections

A DTS package must have a valid data source and a valid data destination to which it connects. It can connect to additional data sources during execution (such as lookup tables located on a third data connection). Table 7.2 describes the variety of connections that you can establish during the package creation process.

Table 7.2 Connection Types

Connection Type

Description

Data source connection

A connection to a standard database (such as Microsoft SQL Server 2000, Microsoft Access 2000, Oracle, dBase, or Paradox), an OLE DB connection to an ODBC data source, a Microsoft Excel 2000 spreadsheet, an HTML source, or any other OLE DB provider. The properties of the data source connection specify the necessary connection parameters.

File connection

A connection to a text file (normally delimited). The properties of the file connection specify the format of the text file. There are multiple formats you can use. For example, the text file format can be either a delimited or a fixed field format.

Data link connection

A connection to an intermediate file (.UDL file) that stores a connection string to create an OLE DB connection that is resolved at run time. The data link connection allows you to encapsulate the connection properties into a separate .UDL data link file. You can then edit the connection string in the data link file (from one data source to another) without changing the SQL Server 2000 DTS package.

DTS Tasks

A DTS package can perform a plethora of tasks, either sequentially or in parallel. Parallel tasks run independently of each other (using separate operating system threads to enhance performance), whereas sequential tasks execute in a defined order based on the success, failure, or completion of predecessor tasks. A DTS task is a discrete unit of work that is part of a data movement and transformation process (such as copying the contents of a text file to a SQL Server 2000 table). DTS tasks that transform data (such as the Parallel Data Pump, the Transform Data, and the Data Driven Query tasks) are based on an architectural component called the DTS data pump. The DTS data pump is an OLE DB service provider that provides the interfaces and the means to import, export, and transform data from an OLE DB data source to an OLE DB destination.

SQL Server 2000 provides a number of DTS tasks that transform data, copy and manage data and meta data, and function as jobs. These tasks are accessed through the DTS Designer tool (except the Parallel Data Pump task, which can only be accessed programmatically). Table 7.3 describes the tasks that are available through DTS Designer (all of which are also accessible programmatically).

DTS can perform column-level transformations on data during the importation process. These transformations can be as simple as changing column names and as complex as your transformation process requires. Using DTS Designer, you can use the Transform Data task or the Data Driven Query task, or you can write an ActiveX script. Using Visual Basic or Visual C++, you can also write transformations programmatically and use the Parallel Data Pump task. You perform a transformation by mapping a relationship between one or more columns in the data source with one or more columns in the data destination. You then define the transformations you want to occur during importation.

Table 7.3 Tasks Available through DTS Designer

Category

Task

Description

Tasks that copy and manage data and meta data

Bulk Insert task

Runs the BULK INSERT Transact-SQL statement from within a DTS package. This task provides the fastest way to copy information into a table or view, but it does not log error-causing rows. If you need to capture error-causing rows to an exception file, use the Transform Data task instead.

Execute SQL task

Runs Transact-SQL statements during package execution. You can perform a number of operations with Execute SQL task, which include dropping a table and running stored procedures.

Copy SQL Server Objects task

Copies SQL Server database objects (meta data) from one SQL Server instance to another. This task can transfer objects from one instance of SQL Server 7.0 to another; from an instance of SQL Server 7.0 to SQL Server 2000; or from one instance of SQL Server 2000 to another instance of SQL Server 2000.

Transfer Database Objects tasks

A collection of tasks that copy server-wide information (the Copy SQL Server Objects task copies only database-specific information) from one SQL Server instance to another. These tasks include the Transfer Database task, the Transfer Error Messages task, the Transfer Logins task, the Transfer Jobs task, and the Transfer Master Stored Procedures task. These tasks are used by the Copy Database Wizard.

Tasks that transform data

Transform Data task

Copies, transforms, and inserts data from a data source to a data destination. This task is the most basic implementation of the data pump engine in DTS.

Data Driven Query task

Selects, customizes, and executes one of several Transact-SQL operations (such as an update or a delete) on a row based on the data in the row. Use this task if the Transform Data task and the Bulk Insert task do not meet the requirements of your application.

Tasks that function as jobs

ActiveX Script task

Runs an ActiveX script. You can use this task to write code to perform functions that are not available in DTS Designer.

Dynamic Properties task

Retrieves data from an outside source and assigns values retrieved to selected package properties. External sources can be an .INI file, data file, query, global variable, environmental variable, or a constant.

Execute Package task

Runs other DTS packages as part of a workflow. Do not use this task recursively because it could generate a stack overflow, which could result in MMC shutting down.

Execute Process task

Runs an executable program or batch file. This task can be used to open any standard application, such as Microsoft Excel, but it is used primarily to run batch files or business applications that work against a data source.

File Transfer

Downloads data from a remote server Protocol task or an Internet location using FTP. The FTP task and Ftp.exe use the same connection method.

Send Mail task

Sends an e-mail message as a task. For example, notification can be sent to an administrator about the success or failure of a backup operation. In order to use this task, you need to install a MAPI client on the instance of SQL Server you are running.

Table 7.4 describes the types of transformations that are available.

Table 7.4 Available Types of Transformations

Transformation Type

Description

Copy Column

Copies data from a single column to a single column (or multiple column to multiple column). By default, allows all possible data type conversions and automatically truncates text (when necessary) without error notification.

ActiveX Script

Uses an ActiveX script to transform (and truncate) data between one or more source and destination columns on a row-by-row basis.

Date Time String

Converts a date or time value in a source column to a different format in the destination column. Both columns must be a string data type.

Lowercase String

Converts string data to lowercase characters and to the destination data type (if necessary) from a source column to the destination column, truncating data as necessary. Both columns must be a string data type.

Uppercase String

Converts string data to uppercase characters and to the destination data type (if necessary) from a source column to the destination column, truncating data as necessary. Both columns must be a string data type.

Middle of String

Extracts a substring of string data from a source column and copies it to a destination column, truncating data if necessary. Can also perform case conversion. Both columns must be a string data type.

Trim String

Removes leading, trailing, and embedded white space from string data in a source column and copies it to a destination column, truncating data if necessary. Can also perform case conversion. Both columns must be a string data type.

Read File

Opens and copies the contents of a file specified in the source column to a destination column. The source column must be a string data type. The destination column must be a string or binary data type.

Write File

Copies the contents of a source column to a file specified in the destination column. The source column must be a string or binary data type.

DTS Package Workflow

DTS uses steps and precedence constraints to order tasks within a DTS package. Steps define the sequence in which tasks within a package execute. In DTS Designer (or programmatically), you use precedence constraints to control this sequence. Precedence constraints sequentially link each task within a package. A task can have multiple precedence constraints. Tasks without precedence constraints operate in parallel. There are three types of precedence constraints, as shown in Table 7.5.

Table 7.5 Precedence Constraints and Their Functions

Precedence Constraint

Description

Unconditional

If Task 2 is linked to Task 1 by an Unconditional constraint, Task 2 will wait until Task 1 completes and then will execute, regardless of the success or failure of Task 1.

On Success

If Task 3 is linked to Task 2 by an On Success constraint, Task 3 will wait until Task 2 completes, and then will only execute if Task 2 completed successfully.

On Failure

If Task 4 is linked to Task 2 by an On Failure constraint, Task 4 will wait until Task 2 completes and then will only execute if Task 2 failed to complete successfully.

For example, assume Task 1 is a Drop Table task, Task 2 is a Create Table task, Task 3 is a Populate Table task, and Task 4 is a Restore Table task. If the table does not exist, Task 1 will fail and Task 2 will create the table. If the table does exist, Task 1 will drop the table and Task 2 will create the table. Next, if Task 2 creates the table successfully, Task 3 will populate the table. However, if Task 2 fails to create the table successfully, Task 4 will restore the table.

DTS Package Storage

You can store a DTS package to SQL Server 2000, SQL Server 2000 Meta Data Services, a Microsoft Visual Basic file, or a structured storage file. When you save a DTS package, all DTS connections, tasks, transformations, and workflow steps are saved. Table 7.6 describes each of these storage options.

Introducing DTS Tools

You create a DTS package using the DTS Import/Export Wizard, DTS Designer, or programmatically. The DTS Import/Export Wizard is the simplest way to create DTS packages to copy data between data sources, but it limits the complexity of the transformations, the addition of multiple DTS tasks, and the complexity of DTS task workflows. The DTS Import/Export Wizard is available through SQL Server Enterprise Manager and from the Start menu in the Microsoft SQL Server program group. DTS packages created using the DTS Import/Export Wizard can be further customized using DTS Designer, Visual Basic, or Visual C++.

Table 7.6 DTS Storage Options

Storage Location

Description

SQL Server 2000

Stored as a table in the msdb database on any instance of SQL Server 2000. This is the default save option. Multiple packages and multiple versions can be stored. When you save a package to SQL Server 2000, you have the option of securing the packages with one or more passwords.

Meta Data Services

Stored in the repository database in Meta Data Services on your computer. Allows tracking of columns and tables used in the source and destination, including the lineage (ancestry or original source) of data in a particular row. You can secure a package saved to Meta Data Services by using its own security.

Visual Basic file

Stored in Visual Basic code that you can later open and modify using Visual Basic or Visual C++. You can secure packages saved to a Visual Basic file using a system such as Microsoft Visual SourceSafe.

Structured storage file

Stored in an operating system file. Allows storage and transfer of a DTS package independent of any SQL Server database. Multiple packages and multiple versions can be stored in a single file. When you save a package to a structured storage file, you have the option of securing the packages with one or more passwords.

DTS Designer allows you to modify existing DTS packages or create new DTS packages using graphical objects to help build DTS packages containing complex workflows (such as multiple connections and event-driven logic). DTS Designer is available through the Data Transformation Services container in the SQL Server Enterprise Manager console tree.

You can also create DTS packages using Visual Basic and Visual C++. This method is useful for developers who need to access the DTS object model directly and exert a fine degree of control over package operations. Packages created programmatically can be further customized using DTS Designer. Model templates designed for specific solutions are available for customization (such as data driven queries).

DTS also includes package execution utilities to run and schedule DTS packages from a command prompt. These utilities include the DTS Run (Dtsrunui.exe) utility and the Dtsrun command. The DTS Run utility is an interactive command prompt utility that uses a dialog box to execute a DTS package (and create a Dtsrun batch file), whereas Dtsrun can execute a DTS package from a command prompt using command switches (frequently stored in a batch file).

Lesson Summary

DTS uses packages to connect to and move data between a wide variety of OLE DB data sources. A DTS package can extract data from one or more of these data sources, perform simple or complex transformations on this data, and then store the transformed data to one or more of these data destinations. You can use workflow logic (precedence constraints) within the DTS package. The DTS package itself can be stored in SQL Server 2000, in SQL Server Meta Data Services, as a Visual Basic file, or as a structured storage file. You can create a DTS package using the DTS Import/Export Wizard, DTS Designer, Visual Basic, or Visual C++.



MCSE Microsoft SQL Server 2000 System Administration(c) Training Kit (c) Exam 70-228
MCSE Microsoft SQL Server 2000 System Administration(c) Training Kit (c) Exam 70-228
ISBN: N/A
EAN: N/A
Year: 2001
Pages: 126

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net