Oracle


Since SSIS made its appearance, some people who would never consider using a Microsoft database are purchasing SQL Server licenses just for the SSIS functionality. Many Oracle IT customers have indicated that they want to continue to use Oracle as their source or target data, and use SSIS just as an integration platform. SSIS can help Oracle users in three ways: help migrate from Oracle, help load data into Oracle, and help synchronize data between database systems, no matter the type.

Before using SSIS to connect to Oracle, you must first install the Oracle client network components on the machine running SSIS, which can be found on your Oracle CD or online at www.oracle.com. After these components are installed, you will need to edit the tnsnames.ora file, which is typically in the c:\oracle\ora92\network\admin directory in a default installation. This file will have a set of lines for each server you want to connect to. For example, if you had a server called FullyQualifiedServerName, your entry may look like this:

 MyServer = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.10.10.10)(PORT = 1521)) ) (CONNECT_DATA = (SERVICE_NAME = MyServer) ) )

In SSIS, you would then use MyServer property for your server name, much like an alias in SQL Server. This file can be edited in any editing tool like Notepad, and it adds a slightly new layer of complexity for someone who’s more familiar with SQL Server. Don’t forget to also install the Oracle tools on any server that will be running the package (such as your QA or production server).

Reading from Oracle

You have many different options when you choose to read data out of Oracle. Each has its pros and cons. First, it comes down to a choice between connection providers: the Oracle OLE DB Provider for Oracle, the Microsoft OLE DB Provider for Oracle, or the ADO.NET Provider for Oracle (from Microsoft). There’s no silver bullet, though, that says to always use one driver over another. Really, this will depend on the type of problem you’re trying to solve.

The first question is what adapter you should use to pull data out of Oracle in the data flow. You have the option to use the Microsoft or Oracle provider. In the data flow, you can use the OLE DB Source or a Data Reader Destination. If you choose to use the Microsoft OLE DB Provider for Oracle, an issue with numeric data types may come up occasionally in older releases of Oracle. This issue was resolved when Oracle began using the Integer data type. This issue is where sometimes the provider guesses the precision and scale incorrectly. In this case, you can open the Advanced Editor to fix the issue.

To use the Advanced Editor, right-click the source component in the data flow and select Show Advanced Editor. Next, go to the Input and Output Properties pane and drill into the OLE DB Source Outputimage from bookOutput Columns. Select the column you want to correct and change the Data Type, Precision, and Scale as necessary (see Figure 10-18). Oftentimes, the Precision and Scale properties will be set to 18 and 0, respectively, which will not hold numbers to the right of the decimal.

image from book
Figure 10-18: Changing the Data Type, Precision, and Scale

The Microsoft provider for Oracle also does not read CLOB, BLOB, BFILE, and UROWID data types. If you need to read from a table that uses one of those data types, you must use the Oracle OLE DB Provider for Oracle. From a speed perspective, it is recommended that you use the Oracle OLE DB Provider for Oracle.

There are other providers that you can use to read data out of Oracle. It’s a constantly expanding list, but as of this writing, the following table shows you a list of providers that can be used to read data from Oracle. Only some of these providers can be used from within SSIS, although many of the ones that can’t are planning to add compatibility.

Open table as spreadsheet

Vendor

API

Version

SSIS Natively Compatible

Microsoft

OLE DB

7.3.4 and later

Yes

Microsoft

ADO.NET

8.0 and later

Yes

Oracle Corp

OLE DB

8i and later

Yes

Oracle Corp

ADO.NET

8i and later

Yes

Oracle Corp

ODBC

8i and later

No

Microsoft

ODBC

8i and later

No

Attunity

OLE DB

9i and later

Yes

DataDirect

OLE DB

8i and later

No

DataDirect

ADO.NET

8i and later

No

ETI

Bulk Load

9.0 and later

Yes

Code Page Issues

When you create a new OLE DB Source in the data flow that doesn’t use SQL Server, you may receive a warning that won’t go away. This warning would show the following warning message when you hover over the OLE DB Source and in the logs at execution time:

 Cannot retrieve the column code page info from the OLE DB Provider. If the component supports the "DefaultCodePage" property, the code page from that property will be used. Change the value of the property if the current string code page values are incorrect. If the component does not support the property, the code page from the component's locale ID will be used.

This message is displayed because SSIS is not able to get an answer from the data provider on what code page to use. You can fix this warning by selecting the OLE DB Source and changing the AlwaysUseDefaultCodePage property to True. The warning will immediately go away.

Parameterized Queries

Another issue with pulling data out of Oracle is that it does not handle parameterized queries in the OLE DB Source or in the Execute SQL Task. The workaround for this is to use expressions. At runtime, the expression will either set a variable that contains a query, or the query text, depending on the usage. For the Execute SQL Task, you must use an expression on the SQLStatementSource property in the task. When using the expression with variables, keep in mind that you will have to cast variables of data types that don’t match to a string to match the SQL statement. This can be seen in Figure 10-19. In this example, the goal was to parameterize a query to pass in a variable that was a datetime. To do that, you must use the (DT_STR) casting function and pass the length and code page of the target value. After doing that, a variable of any data type will concatenate into a string as shown in the following query:

 "SELECT * FROM TableName WHERE StartTime > " +(DT_STR, 30, 1252) @[User::Query_StartTime]+""

image from book
Figure 10-19: Casting variables of data types that don’t match to a string to match the SQL statement

For an OLE DB source in the data flow to be dynamic, you must use a variable and set an expression on the variable. To do this, first create a string variable to store the query. Next, select the variable. In the Properties window, change the EvaluateAsExpression property to True. Above that property, set the Expression property to a dynamic query, as you did in the Execute SQL example (see Figure 10-19).

The final step is to configure the OLE DB source in the data flow. In the source, change the Data access mode to SQL command from variable. Then, specify the variable that had the expression on it for the Variable name property, as shown in Figure 10-20.

image from book
Figure 10-20: Specifying the variable that had the expression on it

Writing Data to Oracle

Writing data into Oracle is a tough situation. The predicament that you’re in is that Microsoft has not written a provider that will bulk load data quickly into Oracle (such as the Fast Load option in SQL Server). This is not an investment that you should expect Microsoft to make, since it would be very resource-intensive to test and improve every third-party vendor’s connectivity components, and they may see it as prolonging Oracle’s life span in your environment. To give you an idea on the poor performance, a 50 MB file with one million rows took approximately 50 minutes to load into Oracle using the OLE DB Destination in the data flow on a dual-core laptop. Of course, pulling data out of Oracle was much, much faster.

However, there are some workarounds to this performance problem. One solution is to stage the data into a text file and use sqlldr (Oracle’s Bulk Loader) to load the data. You would call the sqlldr command in the Execute Process Task. Although it is a clunky solution, it does allow for a great restartability opportunity with checkpoints.

The best solution (if the budget allows) at this point is to build an interface into sqlldr using a Script destination or a canned component. One canned component is from Persistent Systems (www.persistentsys.com/products/ssisoracleconn/ssisoracleconn.htm) and they have built an out-of-the-box, optimized Oracle destination. In one test, that same 50-minute load dropped to under a minute with their destination or the sqlldr command line tool.



Expert SQL Server 2005 Integration Services
Expert SQL Server 2005 Integration Services (Programmer to Programmer)
ISBN: 0470134119
EAN: 2147483647
Year: 2004
Pages: 111

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net