The Data Driven Query task enables DTS to read each row of data from a data source and determine the action it should take based on the row contents. Whereas the Transform Data task supports the execution of only a single query in response to the data lookup, the DDQ task enables DTS to look at the data in a row and then select a particular query to execute based on the content of the row of data. The data source for a DDQ task can be a table, view, file, SQL query, or stored procedure. Although the DDQ task provides great flexibility and functionality, it performs more slowly than the Transform Data task because it does not use the Fast Load option. As a result, always try to use the Transform Data task rather than the DDQ task, unless you need the flexibility of the DDQ task.
When you perform a lookup query or a DDQ task, you are performing row-by- row processing rather than set-oriented processing. This means that DTS retrieves a single row, performs the transformation on that row, adds that row to the destination rowset, and then repeats this process for each row in the data source. If the data source supports the use of stored procedures, you can use a stored procedure to generate the rowset from the data source and store that result set in memory. The stored procedure uses set-oriented processing to generate the entire rowset. In this case, when DTS queries the data source for each row, it is querying a rowset stored in memory rather than retrieving rows one by one from the data source. Using set-oriented processing to generate the source rowset reduces the performance impact of row-oriented processing and enables you to increase the flexibility of your data movement application.
In the following procedures, you will learn how the DDQ task operates by creating one that updates the ProductDim table with information from the ProductStage table. You will begin with an existing package that loads new and updated data into the ProductStage table from the NewProducts.txt structured text file. This text file contains a column with a value that indicates whether the product information in the text file updates information about an existing product or contains information about a new product. To store this value for each row in the ProductStage table, you will add an InsertOrUpdate column to the ProductStage table. You will then add a DDQ task that contains two queries. You will create an insert query that inserts new product information into the ProductDim table from the ProductStage table if the value in the InsertOrUpdate column indicates that the product does not already exist in the ProductDim table. You will also create an update query in the DDQ task that updates existing product information in the ProductDim table using the updated product information in the ProductStage table if the value in the InsertOrUpdate column indicates that the product already exists in the ProductDim table. Finally, after you test this DDQ task, you will create a stored procedure that generates the source rowset rather than having the DDQ task retrieve one row at a time from the ProductStage table.
Although this DDQ task relies on a value in an indicator column to determine whether to execute an insert or an update query, you could also use a script to compare each row in the data source with the rows in the data destination to determine whether to execute an insert or an update query. You could compare it based on the value of a particular column or you could use a rowsum to compare the rows across all columns . Determining whether a row in the data source contains new data or updated data at execution time can be significantly slower than using an indicator column.
Switch to SQL Query Analyzer and then click Load SQL Script on the toolbar.
In the Look In list, navigate to C:\Microsoft Press\SQL DTS SBS\ Ch4\ChapterFiles and then double-click AddInsertOrUpdateColumn.sql.
This script file adds the InsertOrUpdate column to the ProductStage table.
On the toolbar, click Execute to add this column to the ProductStage table in the SBS_OLAP database.
Now that you have added this column to this table, you are ready to create the DDQ task that will load data into this table.
Switch to SQL Server Enterprise Manager.
In the SQL Server Enterprise Manager console tree, right-click Data Transformation Services in your local instance and then click Open Package.
Navigate to C:\Microsoft Press\SQL DTS SBS\Ch4\ChapterFiles in the Look In list, and then double-click UpdateProductDim.dts.
Double-click UpdateProductDim in the Select Package dialog box, type mypassword in the Password box, and then click OK.
This package loads new and updated product information from the NewProducts.txt file into the ProductStage table. This package also contains an unused connection to the SBS_OLAP database, called SBS_OLAPAdditionalConnection , that the DDQ task will use.
On the Task menu, click Data Driven Query Task.
Type Insert or Update ProductDim Table in the Description box, select ProductStage in the Connection list, select [SBS_OLAP].[dbo].[ProductStage] in the Table / View list, and then click the Bindings tab.
In this procedure, you will select the ProductStage table as the binding table for the DDQ task to use.
A binding table defines the schema for the data that the DDQ task will hold in memory during the execution of its queries; binding tables never store any actual data. You can use any table as the binding table, provided the table has the appropriate schema to work with the queries in the DDQ task. Because the data source can be a stored procedure or a SQL query that generates the source rowset by combining data from multiple tables, you might not be able to use a source table as the binding table. You might not be able to use the destination table as the binding table if the destination table does not contain all the columns in the source rowset. The destination table might not contain columns to store all the columns in the source rowset because some of the columns might be used by the DDQ task only to determine which of several queries are actually executed for each row. As a result, in some cases you might need to create a binding table that maps to the schema of the rowset generated for the DDQ task and that is used only to define the schema for the DDQ task.
DTS uses the columns in the binding table as placeholders for the transformed data; DTS does not actually insert data into the binding table. The binding table can be any table that contains the appropriate schema for the data being received from the data source by the DDQ task. The actual destination for the transformed data is specified by the query in the DDQ task. The connection object used to connect to the binding table must be a different connection object than DTS uses to connect to the data source.
Select SBS_OLAPAdditionalConnection in the Connection list, select [SBS_OLAP].[dbo].[ProductStage] in the Table Name list, and then click the Transformations tab.
In this procedure, you will delete the default transformation and then create a transformation using an ActiveX script. This ActiveX transformation script will determine which of two separate Transact-SQL statements the DDQ task executes for each row of data retrieved from the ProductStage table. If a row in the ProductStage table contains an I (denoting an insert record) in the InsertOrUpdate column, the ActiveX transformation script will call an insert query. If a row in the ProductStage table contains a U (denoting an update record) in the InsertOrUpdate column, the ActiveX transformation script will call an update query.
Click Delete All, click Select All, and then click New.
Select ActiveX Script, click OK, type DDQ in the Name box, and then click Properties.
The default transformation script copies all data, unchanged, from the data source to the data destination and passes these values to the insert query that it calls. However, in this case, the data destination is merely a location in memory. The final destination is specified by the queries that you will write for this DDQ task. In this procedure, you will replace this transformation script with a script that evaluates the value of the InsertOrUpdate column and tells the data pump to execute either the insert query or the update query for a particular row.
Click Browse, navigate to C:\Microsoft Press\SQL DTS SBS\ Ch4\ChapterFiles in the Look In list, and then double-click DDQ.bas.
This ActiveX transformation script still copies all data, other than the data in the InsertOrUpdate column, unchanged from the data source to the data destination. It uses a SELECT CASE statement to pass these values to either an insert or an update query based on the value of the InsertOrUpdate column in the data source. If neither an I nor a U is present in this column, the row is skipped entirely.
Click Parse, click OK to acknowledge that the script was parsed successfully, and then click OK.
Click OK again and then click the Queries tab.
You can provide up to four separate query types: insert, update, delete, and select. Since the ActiveX transformation calls an insert query and an update query, you will define these two queries in this procedure.
The queries that you define for each query type can be any that you deem appropriate. They do not have to match the query types. These names are just placeholders. For example, you could have four different types of update queries.
Verify that Insert is selected in the Query Type list, open C:\Microsoft Press\SQL DTS SBS\Ch4\ChapterFiles\Insert.txt in Microsoft Notepad, copy the script from this file to the clipboard, and then paste the script in the query window.
This query contains a Transact-SQL INSERT INTO statement that inserts data into columns in the ProductDim table. The data values for this statement are represented by parameters that you will map to the columns in the binding table.
Click Parse/Show Parameters and then click OK to acknowledge that the query was parsed successfully.
The binding columns are now mapped to parameters in the query. In this procedure, the binding columns are in the proper order, so you do not need to rearrange them to match the parameters in the query.
You must always ensure that the order of the parameters being passed from the binding table are in the proper order for the question marks in the queries you create. If the binding columns are not in the necessary order, you must resequence them appropriately.
Select Update in the Query Type list, open C:\Microsoft Press\SQL DTS SBS\Ch4\ChapterFiles\Update.txt in Notepad, copy the script in this file, and then paste the script in the query window.
This query contains a Transact-SQL UPDATE statement that updates existing data in the ProductDim table. The data values for this statement are represented by parameters that you will map to the columns in the binding table.
Click Parse/Show Parameters and then click OK to acknowledge that the query was parsed successfully.
In this case, the parameters are not mapped to the correct destination columns. ProductCode in the destination column is mapped to parameter 1 in the update query. However, ProductCode should be mapped to parameter 8. ProductName should be mapped to parameter 1, ReorderLevel should be mapped to parameter 2, and so on.
In the Destination To ParameterMapping box, map the following parameters:
Click OK to save the configuration of the DDQ task.
Click the ProductStage connection on the design sheet, hold the Ctrl key down, and click the Insert Or Update ProductDim Table step.
Right-click the Insert Or Update ProductDim Table step, point to Workflow, and then click On Completion.
You are using an On Completion constraint to enable the Insert or Update ProductDim Table step to execute even if the Load ProductStage Table step fails. For example, if no new data is available for loading, you might still want to update the ProductDim table for data previously loaded into the ProductStage table. Chapter 10 will demonstrate this concept more fully.
On the Package menu, click Save As.
Type mypassword in the Owner Password box, type C:\Microsoft Press\SQL DTS SBS\DataMovementApplication\UpdateProductDim.dts in the File Name box, and then click OK to save the UpdateProductDim package with the DDQ task into the data movement application folder.
Type mypassword in the Password box, click OK, and then click OK again to acknowledge that you cannot execute the package without the owner password.
Now that you have saved this package, you are ready to test its execution.
Switch to SQL Query Analyzer and then click Clear Window on the toolbar.
In the query pane, type the following statements:
SELECT * FROM SBS_OLAP.dbo.ProductStageSELECT * FROM SBS_OLAP.dbo.ProductDim WHERE ProductCode IN (6,78,79)
Click Execute on the toolbar. Do not close this query.
The ProductStage table is currently empty and the name of the product in the ProductDim table with a product code of 6 is Grandma s Boysenberry Spread.
Switch to the UpdateProductDim package in DTS Designer.
On the toolbar, click Execute.
Click OK to acknowledge that the package executed successfully, and then click Done.
Three rows of data were added from the NewProducts.txt file into the ProductStage table and three rows were either inserted or updated in the ProductDim table.
Switch to SQL Query Analyzer and then click Execute on the toolbar to execute the previous query.
The three records added to the ProductStage table from the text file appear in the first result set. The second result set shows the changes to the ProductDim table. The product name for product code 6 was changed from Grandma s Boysenberry Spread to Grandma s Blueberry Spread, and the products with the product codes of 78 and 79 were added to the ProductDim table.
You have successfully created and tested a DDQ task that will retrieve a row from the ProductStage table, read the value of the InsertOrUpdate column to determine whether to perform the insert or the update query, execute the insert or update query using the values from the row of data, and repeat the process until all rows of data have been processed . Now you are ready to learn how to improve the performance of this task by using a stored procedure rather than a table as the data source for the DDQ task.
When you use a stored procedure or a query as the data source, you can take advantage of the performance benefits of set-oriented processing to populate an entire rowset in memory from the source table or tables; in contrast, when you use a table as the data source, the DDQ task retrieves one row at a time from the table. Although the DDQ task retrieves only one row at a time from the rowset in memory, this is much faster than retrieving one row at a time from the underlying table. Using a stored procedure or query as the data source also enables you to perform a multi-table join to generate the rowset.
On the SQL Query Analyzer toolbar, click Load SQL Script. When prompted to save the query, click No.
In the Look In list, navigate to C:\Microsoft Press\SQL DTS SBS\ Ch4\ChapterFiles and then double-click CreateProductStageRowsetSP.sql.
This script creates a stored procedure that generates the rowset required by the DDQ task to populate the ProductDim table with new and updated values.
You can generate the source rowset using complex stored procedures that pull data from multiple tables across multiple servers, perform a variety of aggregations, and include parameters passed to it by the DDQ task upon execution. When you do so, you must create a table matching the schema generated by the stored procedure that the DDQ task can use for the binding table if the generated rowset does not exactly match the schema of the destination table. If it does match the schema of the destination table, you can use the destination table as the binding table.
Execute the CreateProductStageRowsetSP.sql script to create this stored procedure and then close SQL Query Analyzer.
Now that you have created this stored procedure, you are ready to modify the DDQ task in the UpdateProductDim package to use this stored procedure.
Switch to DTS Designer in SQL Server Enterprise Manager.
In the UpdateProductDim package, double-click the Insert Or Update ProductDim Table step.
On the Source tab, click SQL Query.
In the query window, type EXEC dbo.ProductStageRowsetSP .
Click the Bindings tab and verify that ProductStage is still selected as the binding table.
Click the Transformations tab and verify that the previously configured transformations are still valid.
Since the substitution of the stored procedure for the table as the data source generates the same rowset, you are not prompted to update the previously configured transformations.
If you change the data source or the binding table in a DDQ task in a manner that invalidates the previously configured transformations, a dialog box will appear and inform you that some of the previous transformations are invalid. If you made a change that you expected would invalidate previously configured transformations, you can continue and choose to remove the invalid transformations or all the transformations. If you made a change that you did not expect would invalidate previously configured transformations, you can click Cancel and close the task to retain all previous settings in the task and then fix the previously made change.
Click OK to save the changes to the DDQ task.
On the toolbar, click Save.
On the toolbar, click Execute.
Click OK, and then click Done.
Three products were added to the ProductStage table and six products were either inserted or updated in the ProductDim table. Six products were inserted or updated this time rather than three because the original three entries in the ProductStage table were not deleted before the identical three entries were added again from the NewProducts.txt file. In Chapter 7, you will create a delete phase for this package to solve this double-entry problem.
Close the UpdateProductDim package in DTS Designer.
You have modified the DDQ task to enable the row-by-row processing to retrieve one row of data at a time from memory rather than one row of data at a time from a table. This technique can substantially increase the performance of the DDQ task.