Quick Start


If you're like most people, when you get a new product, you like to see it in action as soon as possible. You want to just load it up and try it out. This chapter is about getting you into Integration Services as quickly as possible without worrying about all the details of how to do it and with no particular claims to thoroughness or depth of coverage. If you are familiar with Integration Services, you might want to skip this chapter or quickly read through it to catch the high points. If you have used Data Transformation Services in the past, you should work through this chapter to see how things have changed. If you're new to Integration Services and you've never used Data Transformation Services, this chapter is essential to establishing a context on which to build a solid understanding of Integration Services.

The Scenario

You want to build a process that will do rudimentary reporting on sales numbers. The report you'll build is a yearly summary but could easily apply to any periodic sales depending on the query you use to retrieve the sales data. You also want to have the numbers sent to you via email. More complex scenarios are possible, but this scenario is representative of the kinds of things you can do in Integration Services and yet is still pretty easy to follow.

Note

Make sure you have Integration Services set up with the AdventureWorks sample database installed and attached.


Creating a New Integration Services Project

The first thing you'll need to do is create a project. Projects are like workspaces and provide a way to manage your packages.

  1. From the Start menu, select Microsoft SQL Server 2005, Microsoft Business Intelligence Development Studio.

  2. When the Business Intelligence Development Studio appears, select New, Project from the File menu.

  3. In the Project Types window, select Business Intelligence Projects.

  4. In the Templates window, select Integration Services Project.

  5. In the Name field, type Quickstart.

  6. In the Location field, type (or select using the Browse button) the folder where you want to create your project. When you've completed entering the information, your screen should look similar to Figure 4.1.

    Figure 4.1. Create a new Integration Services project with the New Project dialog box

  7. Click the OK button.

The first thing you'll do is get the package started. In Integration Services, the package is the main container. You build up the package by dropping tasks on it and configuring them.

Starting the Package

  1. Show the Toolbox by pressing [Ctrl+Alt+X] or by clicking the View menu and then clicking Toolbox.

  2. Drag and drop a Data Flow Task from the Toolbox onto the main designer window with the tab labeled Control Flow.

  3. Rename the Data Flow Task to "Get Sales Data" by clicking on it once to select it and then clicking on the name.

  4. Click the Format menu and select Autosize to size the Data Flow Task to the name. Figure 4.2 shows the Toolbox, the resulting Data Flow Task, and the Control Flow designer so far.

Figure 4.2. Build the package by dropping tasks onto the control flow


Adding the Foreach Loop

The Foreach Loop is a container that enables portions of the workflow to be executed multiple times as it enumerates items in a collection. The values of the items in the collection can be passed as parameters to the workflow so that the workflow inside of the Foreach Loop is effectively controlled by the collection. In the scenario, the Foreach Loop only iterates once because there is only one row in the collection of rows.

  1. Drag and drop a Foreach Loop Container from the Toolbox onto the same designer window.

  2. Rename the Foreach Loop to "Send Mail With Results."

  3. Drag and drop a Send Mail Task from the Toolbox onto the Foreach Loop so that it appears within the Foreach Loop.

  4. Rename the Send Mail Task to "Send Report."

  5. Click the Format menu and choose Autosize to resize the Send Mail Task.

  6. Click the Data Flow Task you renamed to "Get Sales Data."

  7. There should be a small, dangling arrow extending from the bottom of the Data Flow Task. Drag the arrow to the Foreach Loop title bar and drop it to create a precedence constraint.

  8. From the Format menu, select Autolayout and then Diagram.

  9. From the Tools menu, select Options. In the Options dialog box, select the Business Intelligence Designers node, then select Integration Services Designers, and then select General. Under Accessibility, select Show Precedence Constraint Labels. Click the OK button. Your package should now look similar to the package shown in Figure 4.3.

    Figure 4.3. The Quickstart package control flow

Adding the Variables

You use variables in Integration Services to store values. In this scenario, you'll use variables to store sales numbers. You'll also define the Results variable to hold an ADO recordset object.

  1. If the Variables window is not visible, right-click on the designer window and select Variables.

  2. In the Variables window, click the icon in the upper-left corner to create a new variable.

  3. Call it TopSales2004. Make sure the type is Int32.

  4. Create another variable and call it SubTotal2004. Make sure the type is Int32.

  5. Create a third variable and call it AvgSales2004. Make sure the type is Int32.

  6. Create a fourth variable in the same way and call it Results. Choose Object as the data type.

  7. In the Variables window, click the icon in the upper-right corner to show the Choose Variable Columns dialog box. Put a check in the Namespace check box to show the Namespace column in the Variables window.

  8. For each of the variables you just created, change the namespace to SAMPLES.

  9. Click the File menu, and then click Save All or press [Ctrl+Shift+S] to save the entire project.

Figure 4.4 shows the Variables window after creating the variables.

Figure 4.4. The Variables window shows all variables in the package


Configuring the Data Flow

The Data Flow Task is a high-performance, data-processing engine. In this scenario, you'll be using it to summarize annual sales data from the AdventureWorks sample database. You'll store the resulting summarized data in a row that you'll later use in the Foreach Loop.

The OLE DB Source

The OLE DB Source Adapterretrieves rows from an OLE DB connection and places the row data into the Data Flow Task. The Data Flow Task uses adapters to access data in different formats and locations, making it possible to centrally process heterogeneous data.

  1. Double-click the Data Flow Task or click on the Data Flow tab at the top of the designer.

  2. From the Toolbox, drag and drop an OLE DB Source Adapter from the Data Flow Sources group of components.

  3. Rename it to "Retrieve Fiscal Sales per Person."

  4. Double-click on it.

  5. When the Editor dialog box opens, click the New button to open the Configure OLE DB Connection Manager dialog box.

  6. Click the New button to create a new data connection. Select the server on which the AdventureWorks database is installed. Figure 4.5 shows the Connection Manager dialog box.

    Figure 4.5. Create a new connection manager with the Connection Manager dialog box

  7. Click the OK button twice to return to the OLE DB Source Editor dialog box.

  8. In the Data Access Mode field, select SQL command.

  9. In the SQL Command Text field, type in the query to retrieve the sales data. Figure 4.6 shows the OLE DB Source Editor dialog box correctly configured with this query.

    Figure 4.6. Retrieve data from a database table or view using a SQL command

    SELECT "2004" FROM Sales.vSalesPersonSalesByFiscalYears 

  10. Click the Columns tab on the left of the dialog box.

  11. Ensure that the check box next to the column named 2004 is checked.

  12. Click the OK button.

The Aggregate Transform

The Aggregate transform is used to perform one or more aggregations on the data in the Data Flow task. It supports simple aggregations such as average and sum and more complex operations such as group-by and count distinct.

  1. Drag an Aggregate component onto the data flow.

  2. Rename it to Figure Sum, Avg, and Max.

  3. Click the OLE DB Source you just configured.

  4. Drag the green (on the left) data flow output to the Aggregate component and drop it.

  5. Double-click on the Aggregate transform.

  6. Check the 2004 column in the Available Input Columns window.

  7. In the Input Column column, select the 2004 column.

  8. For the Output Alias, type in 2004Total.

  9. For the Operation, choose Sum.

  10. On the next row, choose 2004 for the Input Column again.

  11. For the Output Alias, type in 2004Avg.

  12. For the Operation, choose Average.

  13. On the third row, choose 2004 for the Input Column again.

  14. Type in 2004Max for the Output Alias.

  15. For the Operation, choose Maximum.

  16. Click the OK button.

Figure 4.7 shows the Aggregate Transformation Editor properly configured.

Figure 4.7. The Aggregate Transformation Editor can perform multiple aggregations per column


The Data Conversion Transform

Integration Services is very type aware and will fail if you attempt to set the value of a variable or column of one type with a different type. But the columns you're retrieving from the tables are of type Currency. So you need to convert the types. You'll use a Data Conversion transform to do this.

  1. Drag a Data Conversion transform to the data flow.

  2. Click on the Aggregate transform you just configured and drag its output to the Data Conversion transform.

  3. Double-click on the Data Conversion transform.

  4. Select all three columns in the Available Input Columns window.

  5. For each column, name the Output Alias and Data Type, as shown in Figure 4.8.

    Figure 4.8. The Data Conversion transform can change the data type of a column

  6. Click the OK button.

The Recordset Destination

The Recordset Destination will take the row that you've built using the Aggregate transform and place it into an ADO.NET recordset.

  1. Drag a Recordset Destination to the data flow.

  2. Click on the Data Conversion transform and drag the green output to the Recordset Destination.

  3. Double-click on the Recordset Destination.

  4. In the VariableName property, type in "Results," which is the name of the variable of type Object you created earlier.

  5. Click the Input Columns tab and check the columns you named with the AsInt suffix. There should be three: 2004TotalAsInt, 2004AvgAsInt, and 2004MaxAsInt.

  6. Click the OK button.

Figure 4.9 shows the resulting editor for the Recordset Destination.

Figure 4.9. The Recordset Destination transform creates an ADO recordset


Now, the Data Flow Task is built. It will retrieve the 2004 sales numbers and then find the sum, average, and maximum sales. Then, using the Data Conversion transform, you modify the type of the aggregated results to be integers instead of type Currency, which will make it a little easier for you in the control flow. Finally, put the resulting row in a recordset. Figure 4.10 shows the resulting data flow.

Figure 4.10. The sales report data flow


Configuring the Foreach Loop

The Foreach Loop provides a way to iterate over a collection and assign the values in the collection to variables. In this case, there is only one row in the recordset and so the loop only iterates once. You'll assign the values from each column in the row to a variable. Later, you'll use the variables to build the very rudimentary report. You'll do this in two steps. First, you'll configure the enumerator so it can find the recordset. Then, you'll configure the variable mappings so it can correctly place the column values from the row into the variables.

Configuring the Enumerator

In this step, you tell the Foreach ADO Enumerator what variable is holding the recordset. At package execution time, the Foreach ADO Enumerator finds the variable, retrieves the recordset, and iterates through it.

  1. Select the Control Flow designer by clicking the Control Flow tab.

  2. Double-click on the Foreach Loop.

  3. In the left pane of the Foreach Loop Editor, select the Collection node.

  4. In the Enumerator property in the upper-right grid, select the Foreach ADO Enumerator.

  5. For the ADO Object Source Variable selection, select the User::Results variable. That's where you stored the recordset object.

  6. Ensure that the Rows in the First Table option is selected, which is the default.

Figure 4.11 shows the resulting Foreach Loop Editor with the Foreach ADO Enumerator properly configured.

Figure 4.11. The Foreach ADO Enumerator iterates over rows in a recordset


Configuring the Variable Mappings

The variable mappings are how Foreach Enumerators know where to place the values they extract from collections. In this case, the mappings are between the variables in the package and the columns in the recordset.

  1. In the left pane of the Foreach Loop Editor, select the Variable Mappings node.

  2. Click on the first row in the grid on the right and select User::SubTotal2004 as the variable.

  3. Ensure that zero is the index (which is the default).

  4. Click on the second row and select User::AvgSales2004.

  5. Set the index to 1. The index is the zero-based column index.

  6. Click on the third row and select User::TopSales2004.

  7. Set the index to 2.

  8. Figure 4.12 shows the resulting Variable Mappings dialog box.

Figure 4.12. Use the Variable Mappings node to map values from collections into variables


Configuring the Send Mail Task

Now, it's time to send mail with the numbers you've generated. Notice that the Send Report (Send Mail Task) has a red x icon. That's an indication that there were errors when the task was checked and that it needs to be configured. Figure 4.13 shows the error that's causing the problem.

Figure 4.13. The Send Mail Task must have a valid SMTP connection manager


You haven't set up an SMTP connection for the task or set up the actual message that you'll send. You'll do that in the following section.

The SMTP Connection Manager and Other Properties

The SMTP Task, like other components, uses a connection manager for making connections to server resources. The SMTP Task uses an SMTP Connection Manager, which you'll need to set up with the SMTP server name.

  1. Double-click on the Send Mail Task to open the Send Mail Task Editor.

  2. In the left pane, select the Mail node.

  3. In the SmtpConnection property, select New Connection.

  4. In the SMTP Connection Manager Editor dialog box, type the name of the SMTP server in the SMTP Server text box. For example, it could be smtp.yourdomain.com. Click the OK button.

  5. In the right pane of the Send Mail Task Editor, enter the appropriate addresses for the From and To properties.

  6. Enter a subject for the Subject property.

  7. Ensure that the MessageSourceType property lists Direct Input (which is the default). This tells the task that you'll directly enter the message into the MessageSource property, which you'll add in just a minute.

Figure 4.14 shows the resulting SMTP Connection Manager Editor and Send Mail Task Editor.

Figure 4.14. Specify the mail properties in the Mail node of the SMTP Task


The Message Source

The message source is the actual content of the email. You generate the message based on the results of the data flow as stored in the variables you populated with the Foreach Loop. To do this, you use a feature called property expressions. Property expressions are a way to associate an expression with a property. In this case, you'll build an expression that will generate the text for the message body.

  1. With the Send Mail Task Editor still open, click on the Expressions node in the left pane.

  2. On the right, there is a grid with one element called Expressions.

  3. Click on the Expressions entry, and then click on the button with the ellipsis.

  4. The Property Expressions Editor dialog box opens, which is shown in Figure 4.15.

    Figure 4.15. Use the Property Expressions Editor to manage expressions

  5. In the Property column, choose MessageSource. This tells the designer that the expression you're about to add should be associated with the MessageSource property.

  6. Click on the ellipsis button in the Expression column.

  7. The Expression Builder appears.

    In the Expression field, type in the following expression:

    "Here are the sales numbers for this year " + "Average Per Salesperson      " + (DT_WSTR, 20) @[SAMPLES::AvgSales2004] + " " + "Total Sales                  " + (DT_WSTR, 20) @[SAMPLES::SubTotal2004] + " " + "Top Salesperson Sales        " + (DT_WSTR, 20)@[SAMPLES::TopSales2004] 

  8. Click the Evaluate Expression button. It should correctly evaluate with zero for the sales numbers.

    Figure 4.16 shows the resulting Expression Builder.

    Figure 4.16. Use the Expression Builder to build property expressions

  9. Click the OK button three times to return to the designer.

You should now be ready to run the package. Click the little green triangle that looks like a Play button on the main toolbar, or you can simply press the F5 button. The package should execute in a few seconds and send email to the email address you've specified in the To field of the SMTP Task with the sales summary data. Figure 4.17 shows the designer Progress tab after the package has successfully completed. The Progress tab shows the progress of the package as it is executing.

Figure 4.17. Use the Progress tab to view the package as it executes


When you are finished running the package, click the Stop button, the square blue button next to the Play button, to reset the designer to design mode.



Microsoft SQL Server 2005 Integration Services
Microsoft SQL Server 2005 Integration Services
ISBN: 0672327813
EAN: 2147483647
Year: 2006
Pages: 200
Authors: Kirk Haselden

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net