Excel


Excel is the favorite "database" software of many people without database expertise. We were amazed when an acquaintance of one of the authors confessed that she used Excel to do everything, even write letters. SQL Server Integration Services has Data Flow source and destination components made just for Excel. You can be sure that these components will be used in many SSIS packages, because data is often imported from Excel files into a SQL Server database or exported into Excel for many high-level tasks such as sales forecasting. Because Excel is so easy to work with, it is common to find inconsistencies in the data. For example, while possible to implement, it is less likely for an Excel workbook to have lookup lists or data type enforcement in place. It's often possible for the person entering data to type a note in a cell where a date should go. Of course, cleansing the data is part of the ETL process, but it may be even more of a challenge when importing from Excel.

In this section, you'll look at both exporting to and importing from Excel as the AdventureWorks staff performs their annual inventory.

Exporting to Excel

The easiest way to export data from SQL 2005 to an Excel file is to use the SQL Server Import and Export Wizard. Since using the wizard is covered in Chapter 2, you will use BIDS to create SSIS packages to import and export Excel data. The first example shows how to create a package that exports a worksheet the AdventureWorks inventory staff will use to record the physical inventory counts.

Create a new Integration Services Project in BIDS. Drag a Data Flow task from the Toolbox to the Control Flow design area and then switch to the Data Flow tab. Add an OLE DB Source and an Excel Destination. Drag the Data Flow path (green arrow) from the OLE DB Source to the Excel Destination.

It is always good practice to use meaningful names when designing your SSIS packages, so change the name of the OLE DB Source to Products. Name the Excel Destination Inventory Worksheet. You can name the components by using the Properties window or by directly clicking into the name of the actual component. The source and destination components in the Data Flow design area should resemble Figure 8-1.

image from book
Figure 8-1

Create a Connection Manager pointing to the AdventureWorks database as described in Chapter 3. Create a second Connection Manager pointing to an Excel Workbook file. Right-click in the Connection Managers area and select New Connection. Choose Excel from the list of Types and click Add. The Excel Connection Manager window opens, where you can enter a file path to a new or existing Excel file. Select the appropriate version of Excel and make sure that First Row has Column Names is checked (see Figure 8-2). Click OK to dismiss the dialog box. You can rename the Excel Connection Manager in the Properties window. Name it Worksheet Destination.

image from book
Figure 8-2

Double-click the Products component to bring up the OLE DB Source Editor. Make sure that Connection Manager is selected on the left. Choose the AdventureWorks Connection Manager for the OLE DB Connection Manager property. The Data access mode should be set to SQL Command. In this case, you will write a query to specify which data to export:

 SELECT ProductID, LocationID, Shelf, Bin,   Null as PhysicalCount FROM Production.ProductInventory ORDER by LocationID, Shelf, Bin 

If you select Columns in the left pane, you have the opportunity to deselect some of the columns or change the name of the output columns (see Figure 8-3). Click OK to accept the configuration.

image from book
Figure 8-3

Double-click the Inventory Worksheet component to bring up the Excel Destination Editor. With Connection Manager tab selected, make sure that Worksheet Destination is set in the OLE DB Connection Manager property.

The Data Access Mode should be set to Table or View (more about this later). Next to Name of the Excel sheet, click New and a dialog box with a Create Table statement will pop up. You can modify the table name or the column properties if necessary in this dialog box (see Figure 8-4). Click OK to create a new worksheet with the appropriate column headings in the Excel file. Make sure that Name of the Excel Sheet is set to Inventory Worksheet.

image from book
Figure 8-4

You must click Mappings on the left to set the mappings between the source and destination. Each one of the Available Input Columns should match up exactly with an Available Output Column (see Figure 8-5). Click OK to accept the Inventory Worksheet settings.

image from book
Figure 8-5

Run the package to export the product list. The fields selected in the Production.Inventory table will be exported to the Excel file, and your inventory crew members can each use a copy of this file to record their counts.

In this example, you started with a blank Excel workbook file. But what would happen if the file already had a worksheet set up with column headings that did not match up perfectly to the input columns? You can manually map input columns to output columns. Figure 8-6 shows the mappings when the Production.Product table is exported to an Excel spreadsheet with destination columns that are named differently than the source and an extra column that doesn't exist in the source.

image from book
Figure 8-6

Alternatively, you could write a query in the source or the destination component to control how the fields match up. To write a query to select the correct columns from the spreadsheet, change the Data Access Mode from Table or View to SQL Command in the Excel Destination component. A SQL command text box will replace the Name of the Excel Sheet text box. At this point, you can either type a command into the SQL command text box or click Build Query for assistance writing the query.

Importing from Excel

Assume that the AdventureWorks inventory crew divided up the assignments according to the product location. As each assignment is completed, a partially filled-out worksheet file is returned to you. In this example, you'll create a package to import the data from each worksheet that is received and update the Production.ProductInventory table with the physical counts.

In order to follow along with the example, manually update the PhysicalCount column of a few of the rows in the spreadsheet that you created in the previous section. Look up and use the actual counts found in the AdventureWorks Production.ProductInventory table for some of the rows. In SQL Server Management Studio, click New Query and run the following query. It will give you the first 10 product ID numbers and quantities (see Figure 8-7).

image from book
Figure 8-7

 USE AdventureWorks GO SELECT TOP 10 ProductID, Quantity FROM Production.ProductInventory ORDER BY LocationID, Shelf, Bin 

Make sure to update some of the rows in the spreadsheet with the exact count found in the AdventureWorks database and make up a count for the others (see Figure 8-8). Save and close the Excel file.

image from book
Figure 8-8

Open BIDS and create a new Integration Services Project. Drag a Data Flow task to the Control Flow design pane. Open the Data Flow tab and add an Excel Source and an OLE DB Destination component. Rename the Excel Source to Inventory Worksheet. Rename the OLE DB Destination to Inventory Import. Drag the Data Flow Path from the Inventory Worksheet component to the Inventory Import component.

Note

The OLE DB Destination sometimes works better than the SQL Server Destination component for importing data from non-SQL Server sources! When using the SQL Server Destination component, you cannot import into integer columns or varchar columns from an Excel spreadsheet and must import into double precision and nvarchar columns. The SQL Server destination component does not support implicit data type conversions and works as expected when moving data from SQL Server to SQL Server.

Create a Connection Manager for the Excel file you have been working with by following the instructions in the previous section. Rename the Excel Connection Manager in the Properties window to Inventory Source. Create a Connection Manager pointing to the AdventureWorks database.

Double-click the Inventory Worksheet component to bring up the Excel Source Editor. For the OLE DB Connection Manager setting, select Inventory Source from the list. If you set the Data Access Mode to Table or view, you will see a list of spreadsheets and named areas under Name of the Excel spreadsheet (see Figure 8-9). The $ character designates a spreadsheet name.

image from book
Figure 8-9

For this example the Data access mode should be set to SQL Command because we only want to import rows with the physical count filled in. Type the following query into the SQL command text box (see Figure 8-10):

 SELECT ProductID, PhysicalCount, LocationID, Shelf, Bin FROM Inventory_Worksheet WHERE PhysicalCount IS NOT NULL 

image from book
Figure 8-10

Double-click the Inventory Import component to bring up the OLE DB Destination Editor. Make sure the AdventureWorks connection is chosen. Under Data Access Mode, choose Table or View. Click the New button next to Name of the Table or the View to open the Create Table dialog box. Change the name of the table to InventoryImport and change the DOUBLE PRECISION columns to INTEGER in the script (see Figure 8-11).

image from book
Figure 8-11

Click OK to create the table. Select Mappings. Each field from the worksheet should match up to a field in the new table. Click OK to accept the configuration.

Move back to the Control Flow tab. Drag an Execute SQL Task from the Toolbox. Double-click it to view the properties. Change the name to Update Inventory. Make sure that the SQLSourceType is Direct Input. Click the ellipsis button next to the SQLStatement property for a dialog box where you can type in this query:

 UPDATE PI SET Quantity = PhysicalCount,   ModifiedDate = getDate() FROM Production.ProductInventory PI INNER JOIN InventoryImport II On II.ProductID = PI.ProductID   and II.LocationID = PI.LocationID   and II.Shelf = PI.Shelf   and II.Bin = PI.Bin 

Select the connection that points to the AdventureWorks database in the Connection property and click OK to accept the configuration. Drag the Precedence Constraint (green arrow) from the Data Flow task to the Update Inventory task.

Add a second Execute SQL task to the Control Flow pane. Change the name to Truncate Import Table. Double-click the component to open the Execute SQL Task Editor. Choose the connection pointing to AdventureWorks in the Connection property. Make sure that the SQLSourceType is set to Direct Input. Type the following in the SQLStatement property:

 TRUNCATE TABLE InventoryImport 

Click OK to accept the configuration. Drag the Precedence Constraint from the Truncate Import Table task to the Data Flow task. The Control Flow design area should now resemble Figure 8-12.

image from book
Figure 8-12

Now you're ready to run the package and see how it works. Once the package has completed, go back to SQL Server Management Studio and rerun the query to see the updated counts (see Figure 8-13):

 Select top 10 ProductID, Quantity From Production.ProductInventory Order by LocationID, Shelf, Bin 

image from book
Figure 8-13

While this is a simple example, it illustrates just how easy it is to import from and export to a non-SQL Server data source. Now you'll move on to Access with its own interesting twists.



Professional SQL Server 2005 Integration Services
Wireless Java : Developing with Java 2, Micro Edition
ISBN: 189311550X
EAN: 2147483647
Year: 2006
Pages: 182

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net