Lesson 2: Using Data-Driven Queries to Update the Data Warehouse

You can use a data-driven query task to update data incrementally after you have copied or transformed the data from a source to a destination. A data driven query can execute a user defined SQL statement on the destination for each source row. Up to four different statements can be executed for each row. An ActiveX script uses the values in the source row to determine which of the four SQL statements is executed on the destination.

After this lesson, you will be able to:

  • Describe the Data Driven Query task in DTS

Estimated lesson time: 30 minutes

What Is a Data-Driven Query?

A data-driven query allows you to scan rows in your source data and, based on the values that it finds in a row, automatically select a particular query to execute. You can create four queries for each data driven query. In the user interface, these are named the insert, update, delete, and user queries but they are not limited to being queries of these types, you can use any valid SQL statement for each query.

When a data driven query task executes, DTS individually processes each row in the source. For every row in the source the ActiveX script is called. Based on the result from the ActiveX script, DTS then executes one or more of the four queries on the destination, passing the parameter values from the current row in the source.

For example, using the Data Driven Query Task in DTS Designer, you can scan a product table and:

  • Update rows in which the product price is over $5,000.
  • Delete rows in which the product status is Discontinued.
  • Insert rows for new products.

You can perform all of these tasks by configuring a single data-driven query in a DTS package.

When to Use a Data-Driven Query

You can use the Data Driven Query Task in DTS Designer when:

  • Incremental updates must apply to the data warehouse along with the capability of using transformations.
  • The ability to handle complexity of the query and need for flexibility outweighs the need for performance. (The Bulk Insert Task and Transform Data Task with fast loading turned on are optimized for performance but do not provide the same amount of flexibility as a data-driven query.)
  • You want to use update or delete queries or execute stored procedures. Although a data-driven query can include inserts, the Transform Data Task generally is more efficient at inserting rows of transformed data into a destination table.
  • You want maximum flexibility in using queries. Data-driven queries achieve this goal by using parameters. To use a data-driven query, the destination data provider must support parameterized queries (ICommand interface).

Do not use data driven queries for bulk operations. Standard data transformations (the data pump) and the bulk insert task provide much better performance than data driven queries for bulk operations.

Performance Issues

In order to get the best possible performance when you use data driven queries, consider the following:

  • Reduce the number of rows that are processed on the source by creating a source query that returns only the rows that have actually changed. You can do this either by using a WHERE clause that tests for a status value in the source table or by joining the source and destination tables.
  • Call only one SQL statement per source row. If you need to perform more than one operation, call a stored procedure from a user query.
  • Do not use a data driven query task to populate two destinations at the same time. It is more efficient to create two separate data transformations.

Data-Driven Query Examples

The following examples show how you write data driven queries for each of the different types.

The ?'s in each of these example queries are input parameters. When the data driven query task executes, values from the destination query in the task are passed into these input parameters.

Example 1

This is an example of a data driven insert query:

 INSERT INTO Categories(CategoryID, CategoryName, Description, Picture)  VALUES (?,?,?,?)  

Example 2

This is an example of a data driven update query:

 UPDATE Categories SET Description = 'Mustard' WHERE CategoryID = (?)  AND  CategoryName = 'Produce'  

Example 3

This is an example of a data driven delete query:

 DELETE FROM Categories WHERE CategoryName = 'Produce'  AND Description = (?)  

Example 4

In this example, a data-driven user-defined query is a stored procedure that accepts parameters:

 sp_mystoredproc ?,? 

Creating Data Driven Query Tasks

Data driven query tasks in the DTS Package Designer use a source data connection and a destination data connection. However, unlike data transformations, the actual data updates are not performed on the destination table that you select on the Destination tab of the Data Driven Query Properties dialog box. The columns of the destination table are used as placeholders to pass the transformed data to the query parameters. The data operations are performed by the data driven queries. These queries are executed on the destination data connection.

The movement of data in a data driven query task is as follows:

  1. Data from the source columns is moved into the destination columns. Transformations move data from the source to the destination columns and one of the transformations must be an Active script. The Active script sets the value that determines which query to run for each row in the source.
  2. Data is passed from the destination columns into the parameters of the query that was selected in the Active script.
  3. The query is executed on the destination data connection. The query can be any valid SQL statement; it is not required to update the destination table.

Exercise: Using a Data Driven Query in a DTS Package

In this exercise you will use a data driven query with three sample products tables in the Northwind_Mart database. This exercise shows you how to use a data driven query to apply conditional updates from a data source to a data destination in your data warehouse. If there are problems updating a product, the product is logged in an exception table. The most common use for this type of DTS operation is to update dimension tables, such as a products dimension table, in your data warehouse. In a production environment the source table will be in an operational database and the destination table will be in the data warehouse database.

In the procedures in this exercise you will create the sample tables, create a new DTS package, add destination and source data connections, and add a Data Driven Query task to update the Destination_Products table from the Source_Products table.

  • Creating the sample tables
    1. In SQL Server Query Analyzer, open the C:\SQLDW\Exercise\Ch08\CrDDQTb.sql query file.
    2. Execute the Transact-SQL code. Executing this code creates the Source_Products, Destination_Products, and Exception Products tables, and the LogException stored procedure.

  • Query the data in the sample tables
    1. In SQL Server Query Analyzer, open the C:\SQLDW\Exercise\Ch08\ProductsQueries.sql query file.
    2. Execute the queries and note the following. The Source_Products table contains products with ProductIDs of 1 to 77 as well as two invalid products with ProductIDs of -1000 and -2000. There are products in the Source_Products table that are marked as discontinued. The Destination_Products table contains products with ProductIDs of 1 to 60. The UnitPrice of many of the products in the Source_Products table is different to the UnitPrice of the same products in the Destination_Products table. The Exception_Products table is empty.

  • Creating a package and adding connections
    1. In the console tree in SQL Server Enterprise Manager, expand your server.
    2. Right-click Data Transformation Services, and then select New Package. The DTS Package Designer is displayed.
    3. To create the first source data connection, drag the Microsoft OLE DB Provider for SQL Server icon to the left center of the workspace (the icon can be found on the Data tool palette on the left side of the DTS Designer window). The Connection Properties dialog box is displayed.
    4. In the New Connection field, type Source Products. Select Use Windows NT authentication. Click the Refresh button to refresh the list of databases. From the Database list, select Northwind_Mart. Click OK
    5. To create the destination data connection, drag the Microsoft OLE DB Provider for SQL Server icon to the workspace and place it to the right of the Source Products connection. The Connection Properties dialog box is displayed.
    6. In the New Connection field, enter Destination Products. Select Use Windows NT authentication. From the Database list, select Northwind_Mart. Click OK
    7. Drag and a Data Driven Query Task from the Task tool palette to the workspace and place it above the two data connections. The Data Driven Query Properties dialog box is displayed.
    8. On the Source tab, enter Product Table Refresh in Description and select Source Products in the Connection drop down box. Select [Northwind_Mart].[dbo].[Source_Products] in the Table drop down box.
    9. On the Destination tab, select Destination Products in the Connection drop down box and select [Northwind_Mart].[dbo].[Destination_Products] in the Table drop down box.
    10. On the Transformations tab, click Delete to delete the default transformations.
    11. Select all of the columns in the Source table and Destination table lists. Make sure that none of the columns is excluded.
    12. Select ActiveX Script in the New transformation drop down box and click the New button. The ActiveX Script Transformation Properties dialog box is displayed.
    13. Edit the Main function in the Script window as follows. You can copy the Main function from the C:\SQLDW\Exercise\Ch08\Code.txt file if you do not want to type it yourself.
    14.  Function Main()  DTSDestination("ProductID") = DTSSource("ProductID")  DTSDestination("ProductName") = DTSSource("ProductName")  DTSDestination("SupplierID") = DTSSource("SupplierID")  DTSDestination("CategoryID") = DTSSource("CategoryID")  DTSDestination("QuantityPerUnit") = DTSSource("QuantityPerUnit")  DTSDestination("UnitPrice") = DTSSource("UnitPrice")  DTSDestination("UnitsInStock") = DTSSource("UnitsInStock")  DTSDestination("UnitsOnOrder") = DTSSource("UnitsOnOrder")  DTSDestination("ReorderLevel") = DTSSource("ReorderLevel")  If DTSSource("Discontinued") Then    Main = DTSTransformstat_DeleteQuery  ElseIf DTSSource("ProductID") > 60 Then   ' Insert new records   Main = DTSTransformstat_InsertQuery  ElseIf DTSSource("ProductID") >= 1 And DTSSource("ProductID") <= 60  Then   ' Update records that have changed   Main = DTSTransformstat_UpdateQuery  Else   'Log Exception using a stored procedure   Main = DTSTransformstat_UserQuery  End if  End Function 

    15. Click OK to close the ActiveX Script Transformation Properties dialog box.

  • Set up the Insert Query for the Data Driven Query
    1. On the Queries tab, select Insert in the Query Type drop down list.
    2. Click the Build button. The Data Transformation Services Query Designer is displayed. Maximize this window and click the Show/Hide Grid Pane button on the toolbar to show the columns grid.
    3. Drag the [Northwind_Mart].[dbo].[Destination_Products] table onto the table panel as shown in Figure 8.13.
    4. Click the boxes next to the ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice, UnitsInStock, UnitsOnOrder, and ReorderLevel columns as shown in Figure 8.13.
    5. Type ? (a question mark) in the New Value column for each of the columns in the Grid Pane as shown in Figure 8.13. Click OK to close the Data Transformation Services Query Designer.
    6. click to view at full size

      Figure 8.13 Data Transformation Services Query Designer dialog box for the Insert query

    7. On the Queries tab, set the parameters to correspond to the destination columns. The default assignments are incorrect. Use the table below to set the Destination for each Parameter.
    8. Destination Parameter
      ProductName Parameter 1
      SupplierID Parameter 2
      CategoryID Parameter 3
      QuantityPerUnit Parameter 4
      UnitPrice Parameter 5
      UnitsInStock Parameter 6
      UnitsOnOrder Parameter 7
      ReorderLevel Parameter 8

    9. On the Queries tab, select Update in the Query Type drop down list. Repeat steps 2 to 6 for the Update query. Select ProductID in the Destination column for Parameter 9.

  • Set up the Update Query for the Data Driven Query
    1. On the Queries tab, select Update in the Query Type drop-down list.
    2. Click the Build button. The Data Transformation Services Query Designer is displayed. Maximize this window and click the Show/Hide Grid pane button on the toolbar to show the columns grid.
    3. Drag the [Northwind_Mart].[dbo].[Destination_Products] table onto the table panel.
    4. Click the boxes next to the ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice, UnitsInStock, UnitsOnOrder, and ReorderLevel columns.
    5. Type ? (a question mark) in the New Value column for each of the columns in the Grid pane.
    6. In the SQL Pane, edit the query to add a WHERE clause that tests the ProductID. The complete query is:
    7.  UPDATE Destination_Products SET ProductName = ?, SupplierID = ?, CategoryID = ?,      QuantityPerUnit = ?, UnitPrice = ?, UnitsInStock = ?,      UnitsOnOrder = ?,  eorderLevel = ? WHERE ProductID = ? 

    8. Click OK to close the Data Transformation Services Query Designer.
    9. On the Queries tab, set the parameters to correspond to the destination columns. The default assignments are incorrect. Use the table below to set the Destination for each Parameter.
    Destination Parameter
    ProductName Parameter 1
    SupplierID Parameter 2
    CategoryID Parameter 3
    QuantityPerUnit Parameter 4
    UnitPrice Parameter 5
    UnitsInStock Parameter 6
    UnitsOnOrder Parameter 7
    ReorderLevel Parameter 8
    ProductID Parameter 9

  • Set up the Delete Query for the Data Driven Query
  • As stated previously, the ?'s in each of these example queries are input parameters. When the data driven query task executes, values from the destination query in the task are passed into these input parameters.

    1. On the Queries tab, select Delete in the Query Type drop down list.
    2. Type the following query in the query box
    3.  DELETE   FROM Destination_Products  WHERE ProductID = ? 

    4. Click the Parse/Show Parameters button. The default parameter mapping maps ProductID to Parameter 1; this is correct.

  • Set up the User Query for the Data Driven Query
    1. On the Queries tab, select User in the Query Type drop down list.
    2. Type the following query in the query box
    3.  EXEC LogException ?, ?, ?, ?, ?, ?, ?, ?, ? 

    4. Click the Parse/Show Parameters button. The default parameter mappings are correct.
    5. Click OK to close the Data Driven Query Properties dialog box.

  • Save and run the package
    1. On the Package menu, select Save. Type Products DDQ Update in the Product name text box.
    2. Select File in the Location drop down list, and type C:\SQLDW\Exercise\Ch08\Products DDQ Update.DTS in the File name text box. Click OK to save the package.
    3. Click the Execute button on the toolbar to execute your package.
    4. The Executing DTS Package: Products DDQ Update dialog box is displayed and a confirmation message tells you when the package has finished executing. Click OK and then click Done to close these dialog boxes.
    5. Close the DTS Package Designer.

  • Query the data in the sample tables after the package has been executed
    1. In SQL Server Query Analyzer, open the C:\SQLDW\Exercise\Ch08\ProductsQueries.sql query file.
    2. Execute the queries and note the following. The Source_Products table has not changed. The Destination_Products table contains products with ProductIDs of 1 to 60 and new products have been added. The UnitPrice of the products in the Source_Products table is the same as the UnitPrice of the same products in the Destination_Products table. Products that are marked as discontinued in the Source_Products table have been deleted from the Destination_Products table. The Exception_Products table contains the two invalid products.

    Lesson Summary

    Data driven queries provide another way for you to transform data in your DTS packages. They are suited to cases where you need to make incremental updates to destination data based on changes to source data. Data driven queries execute a query on the destination for every row in the source.



    Microsoft Corporation - Microsoft SQL Server 7 Data Warehousing Technical Support Training Kit
    Microsoft SQL Server 7 Data Warehousing Technical Support Training Kit
    ISBN: 0735606706
    EAN: 2147483647
    Year: 1999
    Pages: 114

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net