The packages you created in Chapter 1 contain the connection objects, tasks, and precedence constraints necessary to perform the data movement tasks you specified in the DTS Import/Export Wizard. The quickest and easiest way to learn about these package components is to open several of the packages in DTS Designer and review them.
The packages you created in Chapter 1 contain different types of connection objects. As you learned in Chapter 1, connection objects are used by package tasks to connect to various data sources. In Chapter 1, these data sources included SQL Server databases, a Microsoft Access database, a delimited text file, and a Microsoft Excel worksheet.
A connection object defines the connection parameters that enable a task to connect to a data store. A connection object in a package can represent a separate connection to a data store or the reuse of an existing connection to a data store. The data stores to which DTS can connect include relational database management systems (such as SQL Server and Oracle), file system databases (such as Access, dBase, or Paradox), applications (such as Excel), text files (such as comma-delimited or HTML files), and any other structured data source that has an OLE DB provider or ODBC driver (such as Sybase, DB2, Microsoft Exchange Server, and Active Directory).
In the following procedure, you will learn about SQL Server connection objects by opening and modifying those created by the DTS Import/Export Wizard in the SQL_DTS_SBS_1.1 package. This package copies the data as seen through a view from the SBS_OLTP database to a newly created table in the SBS_OLAP database.
In Microsoft Windows, click the Start button, point to All Programs, point to Microsoft SQL Server, and then click Enterprise Manager.
DTS Designer is available only from within SQL Server Enterprise Manager and cannot be started as a stand-alone application.
In the SQL Server Enterprise Manager console tree, expand Microsoft SQL Servers, expand SQL Server Group , expand your local SQL Server instance, right-click Data Transformation Services, and then click Open Package.
Navigate to C:\Microsoft Press\SQL DTS SBS\Ch1\WorkingFolder in the Look In list, and then double-click SQL_DTS_SBS_1.1.dts.
You are prompted to select the package and package version that you want to load. Structured storage files can contain multiple packages and multiple versions of each package. (See Chapter 3 for more information about package formats and version control.)
In the Select Package dialog box, click SQL_DTS_SBS_1.1 and then click OK.
Only users who know the owner password can view or edit a package that is secured with an owner password. If you also provide a user password, users who know it can execute a secured package (but not view or edit it). (See Chapter 3 for more information about package security.)
In the Password box, type mypassword and then click OK.
This package contains a number of different icons representing types of objects in the package, including two connection objects, three tasks, and two precedence constraints. The connection icons are labeled Connection 1 and Connection 2 . The DTS Import/Export Wizard does not generate descriptive names for connection objects in the packages that it creates. In this procedure, you will learn how to edit the properties of the connection objects in this package.
When there are many objects in a package, for easier viewing, you can use the Zoom icon on the DTS Designer toolbar to resize the objects that appear on the design sheet. You can also click and drag objects on the design sheet to rearrange them for easier viewing or to better represent task or data flow within a package.
On the design sheet, double-click Connection 1.
This connection object defines the connection properties to the SBS_OLTP database on your local server, which you configured in the DTS Import/Export Wizard. The properties do not define how the connection is used; as mentioned earlier, connection objects merely define how to connect to a data source and can be used by different tasks. You can easily change most of the properties except the connection name of an existing connection object. For example, you can change its security context, or you can configure it to access a completely different data source or database.
You must take great care when modifying a connection object that is referenced by an existing task because changing a connection on which a task is dependent can cause that task to fail or execute in unforeseen ways. In this procedure, you will not make any changes to this connection object using this interface.
Click Cancel and then double-click Connection 2 on the design sheet.
This connection object defines the connection properties to the SBS_OLAP database on your local server, which you configured in the DTS Import/Export Wizard. The only differences between Connection 1 and Connection 2 are the name of the connection and the database to which the object connects. In this procedure, you will not make any changes to this connection object using this interface.
Click Cancel and then click Disconnected Edit on the Package menu to display the Edit All Package Properties dialog box, which shows the properties and values of every object in this package.
Using this interface, you can directly edit package properties. In this procedure, you will use the Disconnected Edit feature of DTS Designer to change the names of the connection objects in this package.
You should modify a package property using the Disconnected Edit feature of DTS Designer only when you cannot modify the package property using another method, such as through an object s Properties dialog box. The reason for this is that when you edit the properties of an object in DTS Designer using one of the property sheets for the object, DTS validates the property value change and will invalidate an invalid change. However, if you make changes using the Disconnected Edit feature of DTS Designer, no validation of the change occurs and you can make an invalid change that will break the package. Editing a package using this feature is similar to editing the Windows registry using the Registry Editor rather than using one of the icons in Control Panel.
In the left pane, expand the Connections property node and then click the Connection 1 property group to display the Connection 1 property names and values in the right pane. The value for the Name property is Connection 1.
In the right pane, double-click the Name property so that you can enter a more descriptive name for this connection object.
The Name property is a string value that you can edit. In this procedure, you will provide a more descriptive name for this connection object.
In the Value box, select Connection 1, type DataSource , and then click OK to display the changed value for the Name property.
Notice that the change also applies to the name of the property collection in the left pane.
In the left pane, click the Connection 2 property group to display the Connection 2 property names and values in the right pane.
In the right pane, double-click the Name property.
In the Value box, select Connection 2, type DataDestination , and then click OK.
Once again, the changed value for the Name property appears in both the left and the right panes.
The connection object names have not been updated on the design sheet because you used the Disconnected Edit feature. To update the display of the package on the design sheet, you must open each connection object to update the connection object and its dependent objects.
On the design sheet, double-click Connection 1.
The renaming of Connection 1 to DataSource is now reflected in the Connection Properties dialog box.
The Task References dialog box appears because DTS detects that the Copy Data From Sales By Category To [SBS_OLAP].[dbo].[Sales By Category] task (the Copy Data step) references this connection object and that the connection object has been modified.
Since a modification to a connection object can invalidate transformation tasks that reference the connection object, you must tell DTS whether to keep or delete the existing transformations in the task. For example, if a connection object is modified to point to a different database with a different schema, you would delete and then create new transformations in the tasks that were using the modified connection object to reflect the schema of the new data source. However, since you changed only the name of the connection object, do not delete any transformations in this task.
The design sheet displays the new descriptive name for this connection object.
On the design sheet, double-click Connection 2.
The renaming of Connection 2 to DataDestination is now reflected in the Connection Properties dialog box.
Click OK, and then click OK again to retain all existing transformations.
The design sheet now displays the descriptive names for both connection objects in the package.
On the toolbar, click Save. Do not close this package.
Use a descriptive name for each connection object to enable you to easily determine its properties and function. As you add tasks to packages and increase their complexity, using descriptive names will make it easier to reference the appropriate connection object in each task. You can provide these descriptive names when you create the connection objects using DTS Designer rather than rename them later, as you have done here.
You have successfully viewed and modified your first package in DTS Designer and used two different interfaces to look at the properties of a connection object. This simple package required only one data source connection object and one data destination connection object to serially perform the three tasks in the package. Before you learn about these tasks and how to control the order of their execution, however, you need to learn about using multiple connection objects in a package to support the execution of multiple tasks in parallel.
In this procedure, you will open and review the SQL_DTS_SBS_1.3 package that you created in Chapter 1 to learn how it uses multiple connection objects to execute some tasks serially and some in parallel. This package creates 10 tables in the SBS_OLAP database in SQL Server based on the schema of these same tables in the AccessPubsDB database in Access. This package then copies data unchanged from the tables in the Access database into the newly created tables in the SQL Server database.
On the Window menu in SQL Server Enterprise Manager, click Console Root\Microsoft SQL Servers to display the SQL Server Enterprise Manager console root.
You cannot open a second package from within DTS Designer while you are viewing an existing package. You must first return your focus to the SQL Server Enterprise Manager console root to open a second package.
In the SQL Server Enterprise Manager console tree, right-click Data Transformation Services in your local SQL Server instance, and then click Open Package.
Navigate to C:\Microsoft Press\SQL DTS SBS\Ch1\WorkingFolder in the Look In list, and then double-click SQL_DTS_SBS_1.3 to display the Select Package dialog box.
In the Select Package dialog box, click SQL_DTS_SBS_1.3, and then click OK.
In the Password box, type mypassword , and then click OK.
In the initial view of this package in DTS Designer, the icons are too small to see, even when you maximize DTS Designer on your computer screen.
On the toolbar, click Zoom, and then click 75% to increase the size of the icons so that you can more easily see the objects in the package. You can use the scroll bar to see the icons that appear to the left and right of the current focus on the design sheet. (If your monitor still doesn t show the icons clearly with the 75 percent setting, increase the zoom percentage further.)
You can also rearrange the icons to fit within the display window of the design sheet. There are five Connection 1 , five Connection 2 , five Connection 3 , and five Connection 4 connection objects in this package.
On the design sheet, double-click one of the Connection 1 connection objects.
This connection object defines connection properties to the AccessPubsDB.mdb database file in the file system. Do not make any changes to this connection object.
Click Cancel and then double-click another Connection 1 connection object on the design sheet.
You ll see that this Connection 1 object displays identical connection information. Each of the Connection 1 objects in this package represents a shared connection to the AccessPubsDB database; Connection 1 is shared by five tasks. Do not make any changes to this connection object.
Click Cancel and then double-click one of the Connection 3 connection objects on the design sheet.
This connection object is a different object from Connection 1 , but both objects define a connection to the same AccessPubsDB.mdb database file. In this package, Connection 3 is shared by five separate tasks.
If you open the Connection 2 and Connection 4 connection objects, you will see that these connection objects define connections to the SBS_OLAP database in SQL Server. Connection 2 is shared by five tasks and Connection 4 is shared by five tasks. Do not make any changes to these connection objects.
that share the same connection object execute serially and can block one another. Tasks with separate connection objects execute in parallel and cannot block one another (unless they cause a lock on the same resource in the underlying data store). In a package, you need to determine how many tasks you want DTS to execute in parallel versus how many tasks you want DTS to execute serially. Each connection object must be maintained separately, and each separate connection consumes resources when opened. However, separate connections can yield superior performance in situations that benefit from parallel processing. Shared connections do not require separate maintenance and do not consume additional resources, but a task that shares a connection can be blocked by another task sharing the same connection.
The prototype of a data movement application that you will build during the course of this book illustrates how to create a complete data movement application using multiple packages and multiple connections to eliminate blocking and take advantage of parallel processing.
In these two packages, you looked at connection objects that connect to a SQL Server database and to an Access database. Open the SQL_DTS_SBS_1.4 DTS package to view a Text File (Source) connection object and open the SQL_DTS_SBS_1.5 DTS package to view an Excel 97-2000 connection object. Now that you understand connection objects, you are ready to learn about the tasks that the DTS Import/Export Wizard created in the packages you built in Chapter 1.
Tasks perform the real work of a package and divide that work into discrete units. This enables the package to be more easily understood and extended. DTS ships with many predefined tasks, but you can also create your own custom tasks using scripting. The packages you created in Chapter 1 contain simple Transform Data tasks and Execute SQL tasks that use connection objects to extract, query, and save data. In this chapter, you will look at these tasks and also learn how to use the Bulk Insert task to import data from a structured text file. You will learn about other tasks throughout the remaining chapters in this book.
The Transform Data task is an implementation of a DTS COM component called the data pump . The Transform Data task moves data between a data source and a data destination, mapping columns in the data source to columns in the data destination. This task supports the use of a query that uses set-based logic to limit and filter the source data before it is copied or transformed by the data pump.
This task also supports the use of complex scripting to change the data on a row-by-row basis as it is being copied from the data source to the data destination. Although row-by-row transformation offers additional functionality not easily provided by using queries, it comes at the cost of slower performance. The Transform Data task can also be customized to fire events before and after the transformation.
In this procedure, you will learn about the Transform Data task by reviewing one in the SQL_DTS_SBS_1.1 package. This Transform Data task copies data as seen through a view from the SBS_OLTP database to a newly created table in the SBS_OLAP database.
On the Window menu in SQL Server Enterprise Manager, click DTS Package: SQL_DTS_SBS_1.1 to switch to this package in DTS Designer.
Position your cursor over the black directional arrow that points from the DataSource connection object to the DataDestination connection object.
This black arrow represents a Transform Data task and indicates the direction of the data flow between two connection objects. Positioning your cursor over the black arrow enables you to read the description of the Transform Data task that transforms data between the DataSource and the DataDestination connection objects. You can also see the task description by viewing the properties of the task.
Double-click the Transform Data task in this package to view or edit its properties.
The data source is the DataSource connection object, and the data retrieved from this data source is determined by a SQL query.
Click the Destination tab.
The data destination is the DataDestination connection object, and the schema of the destination table created by this package is displayed. You cannot change the schema of the destination table for this task without creating a new table. If you create a new table from within this task, the Create Table task in this package will simply recreate the original table the next time the package is executed. If you want to change the schema for the Sales By Category SBS table, you should modify the Create Table task and then modify the transformations in this task to match the new schema.
Click the Transformations tab to show the data transformation properties. The transformation name, DirectCopyXform , indicates that the column data are copied, without any modification, from the data source to the data destination. This Transform Data task is merely used to insert the rows retrieved by the SQL query into the appropriate columns in the destination table.
The four columns in the source and the four columns in the destination are linked through a single multi-headed arrow. This indicates that a single COM object is used to perform this transformation operation. Multiple arrows indicate that multiple COM objects are used.
When you are transforming 20 or more columns in a Transform Data task, you can obtain a performance increase of up to approximately 15 percent through the use of a single COM object.
Lookups are discussed in Chapter 4.
Click the Options tab. The default Max Error Count value is 0, which means that the task will terminate if any errors are detected in the transformation (such as an invalid data type or a referential integrity violation). A higher value will permit a specified number of errors before the task will terminate. In Chapter 5 you will learn about using the exception file to capture error rows.
Notice that the Use Fast Load check box is selected by default, which means that this task will use high-speed, bulk-copy processing for better insert performance. This option is automatically selected and is available only when the data destination is the Microsoft OLE DB Provider for SQL Server. When this option is selected, you can use the Insert Batch Size text box to specify the loading of data in batches. A value of 0 indicates that the entire insert operation is performed using a single batch.
When the Use Fast Load option is selected, you also have the option to ignore constraints on the relational tables during data loading (which you should use only when you know the data is clean) and enable table-level locking rather than row-level locking. These options can increase data throughput during data loading.
You have learned how a simple Transform Data task copies data unchanged from a data source to a data destination, and you ve looked at some of the options you can set for the Transform Data task to improve its performance and control its operation. You are now ready to learn about the other two tasks in this package.
The Execute SQL task is a programming task that enables you to use Transact- SQL code to access data stored within a SQL Server database from within a package. This task also enables you to create objects in a SQL Server database before a transformation process commences or to update objects in a SQL Server database after a transformation process has completed. For example, you can run maintenance procedures after an import, such as rebuilding indexes or updating statistics. You can also store the results of an Execute SQL task into a global variable for use in another task, or pass values into an Execute SQL task from another task to control how the Execute SQL task functions.
In this procedure, you will learn about the Execute SQL task by reviewing the two Execute SQL tasks in the SQL_DTS_SBS_1.1 package. One of these tasks creates a table in the SBS_OLAP database, and the other deletes the same table.
On the design sheet for the SQL_DTS_SBS_1.1 package in DTS Designer, position your cursor over the Create Table [SBS_OLAP].[dbo].[Sales By Category SBS] Step task (the Create Table step). This enables you to read the description of this Execute SQL task. Take note of the icon that represents an Execute SQL task.
Double-click the Create Table step to view its properties.
This task is using the DataDestination connection object and executes a CREATE TABLE statement to create the Sales By Category SBS table. The table created by this task is used by the Copy Data step. If you make any schema changes to the Sales By Category SBS table in this task, you will need to modify the Transform Data task to incorporate these changes; if you do not, the package will either fail or execute in an unexpected manner. In this procedure, you will not change the Create Table step.
Click Cancel and then double-click the Drop Table [SBS_OLAP].[dbo].[Sales By Category SBS] Step task (the Drop Table step) to view its properties.
This task is using the DataDestination connection object and executes a DROP TABLE statement to drop the destination table each time the package is executed. In this procedure, you will not change the Drop Table step.
You have learned how this package uses one Execute SQL task to drop a table and another Execute SQL task to create a table in a SQL Server database. You are now ready to learn about how this package uses precedence constraints to ensure that the Drop Table step executes and completes before the Create Table step executes, and to ensure that the Create Table step completes successfully before the Transform Data task executes.
Precedence constraints enable you to control workflow in a package. A precedence constraint between two tasks links the tasks in a sequential order. There are three types of precedence constraints:
On Completion An On Completion precedence constraint (which is labeled as a Completion constraint on the design sheet) is represented by a blue and white directional arrow between two tasks (Task A and Task B) and specifies that Task B wait until Task A completes before Task B executes. As long as Task A completes, Task B will execute, regardless of the success or failure of Task A.
On Success An On Success precedence constraint (which is labeled as a Success constraint) is represented by a green and white directional arrow between two tasks and specifies that Task B will execute only after Task A completes successfully. If Task A never completes successfully, Task B will never execute.
On Failure An On Failure precedence constraint (which is labeled as a Failure constraint) is represented by a red and white directional arrow between two tasks and specifies that Task B will execute only after Task A fails to complete successfully. If Task A never fails, Task B will never execute.
In the absence of precedence constraints that limit when tasks can execute, tasks in a package execute without regard to the execution of any other task. As discussed previously, tasks that share a connection object execute serially through the connection object and tasks that have their own connections execute in parallel to each other. Without precedence constraints, the sequence of execution for serial tasks is based on the order in which the tasks request use of the connection object. This sequence can vary from execution to execution of the package, depending on how quickly any pre-connection processing occurs within the tasks before they request the use of the connection object.
In this procedure, you will learn about precedence constraints by reviewing and modifying them in the SQL_DTS_SBS_1.1 package, and then executing the package with and without them. This package contains both On Completion and On Success precedence constraints.
On the design sheet for the SQL_DTS_SBS_1.1 package in DTS Designer, position your cursor over the blue and white directional arrow between the Drop Table step and the Create Table step (the Completion constraint).
The Completion constraint dictates that when the Drop Table step completes; regardless of its success or failure, the Create Table step will execute.
On the design sheet for the SQL_DTS_SBS_1.1 package in DTS Designer, position your cursor over the green and white directional arrow between the Create Table step and the DataSource connection object (the Success constraint).
The Success constraint dictates that when the Create Table step completes successfully, the Copy Data step will execute. If the creation of the destination table in the destination database fails for any reason, the Transform Data task will not execute and data will not be copied into the destination table.
By default, the failure of a step in a package does not stop other steps in a package from executing unless the steps are linked by a precedence constraint or the step is configured to cause the entire package to fail when the step fails.
On the toolbar, click Execute (the green arrowhead ).
The Sales By Category SBS table is dropped, recreated, and then populated with data.
Click OK to close the Package Execution Results message box. Click Done to close the Executing DTS Package: SQL_DTS_SBS_1.1 dialog box.
To learn more about how precedence constraints function, you will delete the Completion constraint and then re-execute the package.
On the design sheet, right-click the Completion constraint between the Drop Table step and the Create Table step and then click Delete.
On the design sheet, right-click the Success constraint between the Create Table step and the DataSource connection object and then click Delete.
On the toolbar, click Execute and then click OK when the Package Execute Results message box appears.
Without the Completion and Success constraints, one or two of the three tasks might fail because of out-of-order task execution. For the package to execute successfully, the Drop Table step must execute first, followed by the Create Table step and then the Copy Data step. Execution of the steps without precedence constraints can fail for the following reasons:
If the Copy Data step fails, the Create Table step executed first and then the Drop Table step immediately deleted the destination table before the Copy Data step could populate it, or the Drop Table step dropped the table and the Copy Data step tried to execute before the Create Table step had re-created it.
If the Create Table step fails, the Create Table step executed before the Drop Table step had completed its task.
If the Drop Table step fails, the Drop Table step task executed first and the Sales By Category SBS table did not exist in the database.
To determine why a particular step failed, double-click the failed step in the Executing DTS Package dialog box.
You might be able to see more than one of these failure patterns by executing the package several times. The pattern of task failures and successes will depend upon the speed of your computer. Some computers will demonstrate all three failure patterns, and others will never demonstrate a failure.
You have now learned how precedence constraints control workflow and ensure that tasks execute in a predictable and repeatable manner.
In this procedure, you will learn three ways to create a precedence constraint.
Simultaneously select both of the tasks that will be related by the precedence constraint. To do this, on the design sheet, click the Drop Table step, hold down the Ctrl key, and then click the Create Table step.
You can select multiple items on the design sheet by holding down the Ctrl key while selecting them or by using your mouse to draw a box around them.
Right-click the Create Table step, point to Workflow, and then click On Success. The On Success constraint is created with the arrow pointing from the Drop Table step to the Create Table step (this is the order in which you want them to execute).
If you had right-clicked the Drop Table step rather than the Create Table step when defining the precedence constraint, the arrow would be pointing from the Create Table step to the Drop Table step (and the tasks would execute in the wrong order).
You can also create a precedence constraint by selecting the tasks and then selecting the precedence constraint from the Workflow menu. If you use this approach, the order in which you select the tasks on the design sheet will determine the direction of the precedence constraint arrow.
Double-click the On Success precedence constraint to show a dialog box that displays steps that must be completed for the constraint to be exercised. In this package, the Drop Table step must be completed for the On Success constraint to be exercised. In this procedure, you will change the On Success constraint to an On Completion precedence constraint.
In the Precedence list box, select Completion and then click OK to change the On Success precedence constraint between the Drop Table step and the Create Table step to an On Completion precedence constraint.
On the design sheet, select the Create Table step and the DataSource connection object by drawing a rectangle around them.
Right-click DataSource, point to Workflow, and then click On Success.
On the toolbar, click Execute.
The Package Execution Results message box appears telling you that the DTS Designer successfully completed the execution of the package.
Click OK to close the message box.
The Status box in the Executing DTS Package dialog box shows that all three steps completed successfully because they executed in the required order.
On the toolbar, click Save and then close the SQL_DTS_SBS_1.1 package in DTS Designer. Do not close SQL Server Enterprise Manager.
Now that you have learned about precedence constraints, connection objects, and simple tasks ”the building blocks of a data movement application ”you are ready to create your first package from scratch using DTS Designer.