Chapter 5: Creating an End-To-End Package


Now that you've learned about all the basic tasks and transforms in SSIS, you can jump into some practical applications for SSIS. You'll first start with a normal transformation of data from a series of flat files into SQL Server. Next you'll add some complexity to a transformation by archiving the files automatically. The last example will show you how to make a package that handles basic errors and makes the package more dynamic. As you run through the tutorials, remember to save your package and project often to avoid any loss of work.

Basic Transformation Tutorial

As you can imagine, the primary reason that people use SSIS is to read the data from a source and write it to a destination after it's massaged. This tutorial will walk you through a common scenario where you want to copy data from a flat file source to a SQL Server table without massaging the data. Don't worry; things will get much more complex later in your next package.

Start the tutorial by going online to the Wiley Web site and downloading the sample extract that contains zip code information about cities in Florida. The zip code extract was retrieved from public record data from the 1990 census and has been filtered down to just Florida cities to save on your download time. You'll use this in the next tutorial as well, so it's very important not to skip this tutorial. You can download the sample extract file called ZipCode.txt from this book's Web page at www.wrox.com. Place the file into a directory called C:\SSISDemos.

Open Business Information Development Studio (BIDS) and select File New Project. Then select Integration Services Project as your project type. Type ProSSISChapter5 as the project name, and accept the rest of the defaults (as shown in Figure 5-1).

image from book
Figure 5-1

The project will be created, and you'll see a default Package.dtsx package file in the Solution Explorer. Right-click on the Package.dtsx file in the Solution Explorer and select Rename. Rename the file ZipLoad.dtsx. When you're asked if you'd like to rename the package as well, click Yes. If the package isn't opened yet, double-click on it to open it in the Package Designer.

Creating Connections

Now that you have the package ready to begin, you need to create a shared connection that can be used across multiple packages. In the Solution Explorer, right-click on Data Sources and select New Data Source. This opens the Data Source Wizard. Select the "Create a data source based on an existing or new connection" radio box and click New, which opens the window to create a new Connection Manager.

Note

There are many ways you could have created the connection. For example, you could have created it as you're creating each source and destination. Once you're more experienced with the tool, you'll find what works best for you.

Your first Connection Manager will be to SQL Server, so select Native OLE DB\SQL Native Client. For the Server Name option, type the name of your SQL Server and enter the authentication mode that is necessary for you to read and write to the database. Lastly, select the AdventureWorks database and click OK. If you don't have the AdventureWorks database, you can pick any other database on the server. You can optionally test the connection. You will then have a data source in the Data Source box that should be selected. Click Next and name the data source AdventureWorks.

You'll use other connections as well, but for those, you'll create connections that will be local to the package only and not shared. With the ZipLoad package open, right-click in the Connection Managers box below and select New Connection from Data Source. You should see the AdventureWorks data source you created earlier. Select that data source and click OK. Once the Connection Manager is created, right-click on it and rename it AdventureWorks if it's not already named that. This is, of course, optional and just keeps us all on the same page.

Next, create a Flat File connection and point it to the ZipCode.txt file in your C:\SSISDemos directory. Right-click in the Connection Manager area of Package Designer, and select New Flat File Connection. Name the connection ZipCode Extract and type any description you like. Point the File Name option to C:\SSISDemos\ZipCode.txt or browse to the correct location by clicking Browse.

You need to set the Format drop-down box to Delimited with <none> set for the Text Qualifier option. The Text Qualifier option allows you to specify that character data is wrapped in quotes or some type of qualifier. This helps you when you have a file that is delimited by commas and you also have commas inside some of the text data that you do not wish to separate by. Setting a Text Qualifier will ignore those commas inside the text data. Lastly, select "Tab {t}" from the Header Row Qualifier drop-down box and check the "Column names in first data row" option. This states that your first row contains the column names for the file. Your final configuration for this page should look like Figure 5-2.

image from book
Figure 5-2

You can go to the Columns page to view a preview of the first 101 rows and set the row and column delimiters. The defaults are generally fine for this screen. The Row Delimiter option should be set to {CR}{LF}, which means that a carriage return separates each row. The Column Delimiter option should have carried over from the first page and will again be set to "Tab {t}". In some extracts that you may receive, the header record may be different from the data records and the configurations won't be exactly the same as in the example.

The Advanced page is where you can specify the data types for each of the three columns. The default for this type of data is a 50-character string, which is excessive in this case. Click Suggest Types to comb through the data and find the best data type fit for the data. This will open the Suggest Column Types dialog box, where you should accept the default options and click OK.

You can now see that the data types in the Advanced page have changed for each column. One column in particular was incorrectly changed. When combing through the first 100 records, the Suggest Column Types dialog box selected a "two-byte signed integer [DT_I2]" for the zip code column. While this would work for the data extract you have, it won't work once you get to some states that have zip codes that begin with a zero. Change this column to a string by selecting string [DT_STR] from the DataType drop-down box, and change the length of the column to 5 by changing the OutputColumnWidth option (shown in Figure 5-3). The last configuration change is to change the TextQualified option to False and click OK.

image from book
Figure 5-3

Creating the Tasks

With the first few connections created, you can go ahead and create your first task. In this tutorial, you'll have only a single task, which will be the Data Flow task. In the Toolbox, drag the Data Flow task over to the design pane in the Control Flow tab. Next, right-click on the task and select Rename to rename the task "Load ZipCode Info."

Creating the Data Flow

Now comes the more detailed portion of almost all of your packages. Double-click on the task to drill into the data flow. This will automatically take you to the Data Flow tab. You'll see that "Load ZipCode Info" was transposed to the Data Flow Task drop-down box. If you had more than this one Data Flow task, then more would appear as options in the drop-down box.

Drag and drop a Flat File Source onto the data flow design pane, and then rename it "Florida ZipCode File." All the rename instructions in these tutorials are optional, but they will keep you on the same page and make your operational people happier because they'll understand what's failing. Open the "Florida ZipCode File" source and point it to the Connection Manager called ZipCode Extract. Go to the Columns page and take notice of the columns that you'll be outputting to the path. You've now configured the source, and you can click OK.

Next, drag and drop a SQL Server Destination onto the design pane and rename it AdventureWorks. Connect the path (green arrow) from the "Florida ZipCode File" source to the AdventureWorks destination. Double-click on the destination and select AdventureWorks from the Connection Manager drop-down box. For the Use a Table or View option, click the New button next to the drop-down box. This is how you can create a table inside BIDS without having to go back to SQL Server Management Studio. The default DDL for creating the table will use the destination's name (AdventureWorks), and the data types may not be exactly what you'd like, as shown below:

 CREATE TABLE [AdventureWorks] (     [Zipcode] VARCHAR(5),     [State] VARCHAR(2),     [ZipName] VARCHAR(16) ) 

Suppose this won't do for your picky DBA, who is concerned about performance. In this case, you should rename the table ZipCode (taking out the brackets) and change each column's data type to a more suitable size and type:

 CREATE TABLE ZipCode (     Zipcode CHAR(5),     State CHAR(2),     ZipName VARCHAR(16) ) 

Once you have completed changing the DDL, click OK and the table name will be transposed into the table drop-down box. Finally, go to the Mapping page to ensure that the inputs are mapped to the outputs correctly. SSIS attempts to map the columns based on name, and in this case, since you just created the table with the same column names, it should be a direct match, as shown in Figure 5-4.

image from book
Figure 5-4

Once you've confirmed that the mappings look like Figure 5-4, click OK.

Completing the Package

With the basic framework of the package now constructed, you need to add one more task into the control flow to ensure that you can run this package multiple times. To do this, click on the Control Flow tab and drag an Execute SQL task over to the design pane. Rename the task "Purge ZipCode Table." Double-click on the task and select AdventureWorks from the Connection drop-down box. Finally, type the following query for the SQLStatement option (you can also click the ellipsis button and enter the query):

 DELETE FROM ZipCode 

Click OK to complete the task configuration. Connect the task as a parent to the "Load ZipCode Info" task. To do this, click the "Purge ZipCode Table" task and drag the green arrow onto the "Load ZipCode Info" task.

Saving the Package

Your first package is now complete. Go ahead and save the package by clicking the Save icon in the top menu or by selecting File Save Selected Items. It's important to note here that by clicking Save, you're saving the .DTSX file to the project, but you have not saved it to the server yet. To do that, you'll have to deploy the solution or package. The tutorial will cover that in the last section of this chapter.

Executing the Package

With the package complete, you can attempt to execute it. Do this by selecting the green arrow in the upper menu. You can also right-click on the ZipCode.dtsx package file in the Solution Explorer and select Execute Package. The package will take a few moments to compile and validate, and then it will execute.

You can see the progress under the Progress tab or in the Output window. In the Controller Flow tab, you'll see the two tasks go from yellow to green (hopefully). If both turn green, then the package execution was successful. In the event your package failed, you can look in the Output window to see why. The Output window should be open by default, but in case it's not, you can open it by clicking View Other Windows Output.

You can go to the Data Flow tab to see how many records were copied over. You can see the Controller tab in the left image in Figure 5-5 and the Data Flow tab in the right image. Notice the number of records displays in the path as SSIS moves from transform to transform.

image from book
Figure 5-5

By default, when you execute a package, you'll be placed in debug mode. Changes you make in this mode will not be made available until you run the package again. To break out of this mode, click the square stop icon or click Stop Debugging under the Debug menu.



Professional SQL Server 2005 Integration Services
Wireless Java : Developing with Java 2, Micro Edition
ISBN: 189311550X
EAN: 2147483647
Year: 2006
Pages: 182

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net