XML Data


SQL 2005 provides many ways to work with XML. The XML Data Source is yet another jewel in the SSIS treasure chest. It enables you to import an XML file directly into relational tables if that is what you need to do. In this example, you will import an RSS (Really Simple Syndication) file from the Web. You may not want to let your manager know how easy this is!

Create a new Integration Services project to get started. Add a Data Flow task to the Control Flow design area. Click the Data Flow tab to view the Data Flow design area. Add an XML Source and name it SQLNews. Double-click the SQLNews component to open the XML Source Editor. Make sure that Connections Manager is selected on the left. Select XML File Location for the Data Access Mode. For the XML location property, type in the following address:

  • http://www.sqlservercentral.com/sscrss.xml

If you click the Browse button, a regular file open dialog box opens. It is not obvious at first that you can use a URL address instead of a file on disk.

The XML file must be defined with an XML Schema Definition (XSD), which describes the elements in the XML file. Some XML files have an in-line XSD, which you can determine by opening the file and looking for xsd tags. There are many resources and tutorials available on the Web if you would like to learn more about XML schemas. If the file you are importing has an in-line schema, make sure that Use Inline Schema is checked. If an XSD file is available, you can enter the path in the XSD location property (see Figure 8-38). In this case, you will create the XSD file by clicking Generate XSD. Once the file is generated, you can open the file with Internet Explorer to view it if you are interested in learning more.

image from book
Figure 8-38

Now that the SQLNews component understands the XML file, click Columns. You will notice a drop-down box next to Output Name listing Channel and Item (see Figure 8-39).

image from book
Figure 8-39

Even though the XML document is one file, it represents two tables with a one-to-many relationship. If you browse to www.sqlservercentral.com/sscrss.xml, you'll see a channel, which describes the source of the information, usually news, and several items, or articles, defined (see Figure 8-40). One note of caution here: If you are importing into tables with primary/foreign key constraints, there is no guarantee that the parent rows will be inserted before the child rows. Be sure to keep that in mind as you design your XML solution.

image from book
Figure 8-40

The properties of the channel and item tags match the columns displayed in the XML Source Editor. At this point, you can choose which fields you are interested in importing and change the output names if required.

Create a new Connection Manager pointing to the AdventureWorks database or to another test database. Add an OLE DB Destination component to the design area and name it Channel. Drag the Data Flow Path (green arrow) from the XML Source to Channel. Because the XML data represents two tables, an Input Output Select box opens (see Figure 8-41). Choose Channel in the Output drop-down and click OK.

image from book
Figure 8-41

Double-click the Channel icon to bring up the OLE DB Destination Editor. Make sure that the OLE DB Connection Manager property is set to point to your sample database. Next to Use Table or View, click New. A window with a table definition will pop up. Click OK to create the table. Click Mappings and then click OK to accept the configuration.

Add a second OLE DB Destination component and name it NewsItem. Drag a Data Flow Path (green arrow) from the XML Source to the NewsItem component. This time, the designer will automatically set up the connection to use the Item table. Double-click the NewsItem destination component and verify that the Connection Manager property is set to the test database. Click the New button next to Name of the Table or the View to see a Create Table statement. Notice that the description field is only 255 characters. Modify the statement, increasing the number of characters to 2000, and click OK to create the table (see Figure 8-42). Click Mappings to view and set the mappings, and then click OK to accept the configuration.

image from book
Figure 8-42

Add one more OLE DB Destination component and name it Errors. Drag a red Data Flow Path from the XML Source to the Errors component. An Input Output Selection dialog box opens. Select Item Error Output (Error Output) in the Output option (see Figure 8-43). Click OK.

image from book
Figure 8-43

The Configure Error Output dialog box will then open. In the Truncation property of the Description row, change the value to Redirect row (see Figure 8-44) and click OK.

image from book
Figure 8-44

Double-click Errors and make sure it is pointing to the test database. Click New next to Name of the Table or the View and OK to create an Errors table. Click Mappings to accept the mappings and then click OK to save the configuration. The Data Flow design area should now resemble Figure 8-45.

image from book
Figure 8-45

Run the package. If it completed successfully, some of the rows will be added to the NewsItem table. Any row with a description over 2000 characters long will end up in the Errors table.



Professional SQL Server 2005 Integration Services
Wireless Java : Developing with Java 2, Micro Edition
ISBN: 189311550X
EAN: 2147483647
Year: 2006
Pages: 182

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net