Lookup queries enable DTS to query rows of data from an initial data source, query the same or an additional data source (the lookup query) to add to or modify some or all the rows retrieved from the original data source, and then write the combined rowset to the data destination. The lookup query can be a SELECT , INSERT , UPDATE , or DELETE statement; or it can be a stored procedure invocation. For example, suppose you have a column in your destination data source that stores each customer s country as part of his or her address, but the data source from which you are loading the new data stores the value for each customer s country in an inconsistent manner (using a variety of abbreviations for a particular country). You can solve this problem using a DTS lookup query. This lookup query matches inconsistent values with corrected values in a lookup table and ensures that the value for each country is stored consistently. If additional values in each row need to be resolved to a consistent state, or if additional rows need to be added, you can create additional lookup queries that perform these tasks during the same transformation operation. For example, your queries can add contact names or phone numbers .
As mentioned earlier in this chapter, a lookup query is very similar to a Transact- SQL JOIN statement. When you can accomplish the same task using a lookup query or a Transact-SQL join, use the Transact-SQL join whenever practical for better performance. For example, a Transform Data task that joins two tables using a Transact-SQL JOIN statement to generate the source dataset runs faster than a Transform Data task that reads data from the first table and performs a lookup to include data from the second table to generate the rowset that is inserted into the data destination.
However, a lookup query is appropriate when your source join is not practical to perform because the source data resides in a non “SQL Server database, or because the volume of data generated by a source join would exceed system capacity. A lookup query is also appropriate in cases in which it updates or deletes data in the data destination based on information it retrieves from the secondary source. Finally, use a lookup query when the need for clarity in your code outweighs performance issues.
When DTS performs a lookup query in a Transform Data or Data Driven Query task, it keeps the connection to the data source, lookup table, and data destination open for the duration of the transformation. Keeping these connections open is more efficient than closing the connection to the lookup table after each lookup. If this type of lookup was done using a COM object and global variables in an ActiveX or custom task, the connection to the lookup table would be closed as soon as the lookup values had been retrieved. Closing the lookup connection after each lookup is very inefficient if many lookups must be performed. DTS can also cache the lookup data retrieved during a lookup query for reuse during the transformation. Setting a cache value is useful when the number of rows being transformed is large and the number of rows in the lookup table is small. These optimization techniques increase the performance of lookup queries over most other lookup methods .
If you skipped Chapter 3, execute the IfYouSkippedChapter3.cmd batch file. This batch file restores the SBS_OLTP and SBS_OLAP databases and copies the DTS packages that would have been created in Chapters 1 through 3 into the appropriate folders. If you do not want this batch file to overwrite any packages that you created in Chapters 1 through 3, you must move them or rename them before you execute this batch file.
In the following procedures, you will add to an existing package a lookup query that imports a list of new customers from a delimited text file to the CustomerStage table in the SBS_OLAP database. This lookup query will modify the data being imported from the text file using the values in the CountryCodes lookup table to ensure the country of each customer is stored in a consistent manner. You will begin by creating the CountryCodes lookup table in the SBS_OLAP database.
Connect to your local SQL Server instance as a system administrator using SQL Query Analyzer.
On the toolbar, click Load SQL Script.
In the Look In list, navigate to C:\Microsoft Press\SQL DTS SBS\ Ch4\ChapterFiles and then double-click CountryCodes.sql.
This script file creates the CountryCodes table that the Transform Data task will use as a lookup table to ensure that country names are stored consistently in the CustomerStage table in the SBS_OLAP database. The CountryCodes table contains two columns for each row. The first column contains an abbreviated value for a country, and the second column contains the value for that country in the desired form for use in the data destination.
On the toolbar, click Execute to create this table in the SBS_OLAP database.
Now that you have created the CountryCodes lookup table, you are ready to create a lookup query that will use this table.
Open SQL Server Enterprise Manager and, in the console tree, expand Microsoft SQL Servers, expand SQL Server Group, and expand your local SQL Server instance.
Right-click Data Transformation Services and click Open Package.
Navigate to C:\Microsoft Press\SQL DTS SBS\Ch4\ChapterFiles in the Look In list, and then double-click UpdateCustomerDim.dts.
Double-click UpdateCustomerDim in the Select Package dialog box, type mypassword in the Password box, and then click OK.
This package uses a Transform Data task to copy data from the NewCustomers.txt delimited text file to the CustomerStage table in the SBS_OLAP database. Notice that an additional connection object has already been created to the SBS_OLAP database. You will use this connection object for the lookup query in the Transform Data task.
Double-click the Transform Data step on the design sheet, verify that C:\Microsoft Press\SQL DTS SBS\Ch4\ChapterFiles\NewCustomers.txt appears in the Table / View list, and then click Preview to view the data in the NewCustomers.txt file.
Notice that the country names are abbreviated inconsistently. In this procedure, you will create a lookup query that uses the CountryCodes lookup table that you created in the previous procedure to transform these country values as they are copied to the CustomerStage table.
Click OK to close the View Data dialog box and then click the Destination tab.
The destination for the transformed data is the CustomerStage table using the CustomerStage connection.
Click the Transformations tab.
The data is currently being copied directly from the data source to the data destination without any changes.
Click the Lookups tab.
On this tab, you will create a lookup query and use a separate connection for it. Although you could use the same connection for the lookup query and the destination, the task will perform better if the lookup query is given its own connection. Doing so avoids serializing the commands across a single shared connection. You will also define a value for the number of lookup values stored in cache. This will improve performance if there are a large number of rows being transformed compared to the number of rows in the lookup table.
Type CountryCodes in the Name box, select SBS_OLAPAdditionalConnection in the Connection list, type 20 in the Cache box, and then click the ellipsis in the Query box to define the lookup query.
You can use the DTS Query Designer to help you build the lookup query, or you can simply type (or paste) your own query into the query window. For this lookup you will type a parameterized query into the query window that selects the value in the CountryConsistent column for each row that contains a value in the corresponding CountryInconsistent column.
In the query window, delete the partial query, type SELECT CountryConsistent FROM CountryCodes WHERE CountryInconsistent = ? and then click OK to save the query.
The ? is a placeholder for the parameter that will be passed into this query by the transformation script.
Now that you have created this lookup query, you must modify the existing transformation in this Transform Data task in order for this task to use this lookup query. You will change the default transformation from a copy column transformation to an ActiveX script transformation.
Click the Transformations tab, click Delete All, click Select All, and then click New to create an ActiveX script transformation.
To execute the parameterized lookup query that you just created, you need to create an ActiveX transformation script that calls the lookup query for each row that is being transformed.
Select ActiveX Script, click OK, type CountryCodeLookup in the Name box, and then click Properties.
The default transformation script copies each column, unchanged, from the data source to the data destination. In this procedure, you will modify this script for the BillCountry column to call the parameterized lookup query that you created in the previous procedure.
The ActiveX script that you will use to call the lookup query uses the following format to pass a value from a column in the data source to the lookup query and accept the return value from the lookup query:
Click Browse, navigate to C:\Microsoft Press\SQL DTS SBS\ Ch4\ChapterFiles in the Look In list, and then double-click Lookup.bas.
This ActiveX script is changed in three ways. First, a CountryCode variable is declared as an intermediate variable. An intermediate variable is used to prevent type mismatches that can occur when you directly map lookup query outputs to destination columns. Second, the value for Col007 for each row is passed to the CountryCodeLookup query as its input parameter, and the intermediate variable is configured to hold the value returned from this query. Third, the results of the lookup query stored in the CountryCode variable are used to populate the BillCountry destination column. This ActiveX script and the lookup script execute for each row being transformed.
Click OK and then click OK again.
You have now configured the transformation script to use the lookup query.
Click OK to save your modifications of the Transform Data task.
On the Package menu, click Save As.
You will save this package to the DataMovementApplication folder, where you are saving all of the components of the data movement application.
Type mypassword in the Owner Password box, type C:\Microsoft Press\SQL DTS SBS\DataMovementApplication\UpdateCustomerDim.dts in the File Name box, and then click OK to save the UpdateCustomerDim package with the new lookup query into the folder for the data movement application.
Type mypassword in the Password box, click OK, and then click OK again to acknowledge that you cannot execute the package without the owner password.
Now that you have added this lookup query to the UpdateCustomerDim package and saved this package to the data movement application folder, you are ready to test its execution.
On the toolbar, click Execute.
Click OK to acknowledge that the package executed successfully, and then click Done.
Close the UpdateCustomerDim package in DTS Designer.
In the SQL Server Enterprise Manager console tree, expand SBS_OLAP in the Databases node, and then click Tables.
In the details pane, right-click CustomerStage, point to Open Table, and then click Return All Rows.
The inconsistently stored country names in the NewCustomers.txt file were transformed through the use of a lookup query and a lookup table before they were stored in the CustomerStage table. Each county name is now entered consistently in the CustomerStage table.
Close the [Data in Table ˜CustomerStage in ˜SBS_OLAP on ˜(Local) ] window in SQL Server Enterprise Manager.
You have successfully configured and executed a lookup query from within a Transform Data task. In the process, you have begun to learn a little about ActiveX transformation scripts. You will learn about the use of ActiveX transformation scripts using Microsoft Visual Basic Script in much more detail in Chapter 7, but you can already begin to see that some ActiveX scripting is required to take full advantage of the capabilities of DTS.