Transactions | Expert SQL Server 2005 Integration Services (Programmer to Programmer)

Checkpoint files are only the first tool in your SSIS arsenal to help with restartability and those middle-of-the-night calls. A more important problem is managing data state. For example, if you were to run a package that loads three tables and the third table was to fail, ideally, you would want all three tables to roll back. Otherwise, you may not know what state your data is in to fix the problem and rerun the package.

Transactions in SSIS allow you to protect your data and envelope any data change that occurs in the package in that protection. Transactions can be either on the entire package, a container, or an individual task. They can be nested just like in TSQL. What is important to note about transactions is that only data-related tasks are protected. If your package were to archive a file, for example, that file would not be un-archived upon a package failure (unless you explicitly created some kind of compensating action to perform such duties).

Let’s take the previous package you created and wrap the package in a transaction. First, let’s disable the checkpoint from the previous example by changing the CheckpointUsage package property to Never and the SaveCheckpoints property to False. Next, delete all the records from the RestartabilityExample table.

With the example now reset back to its original state, you’re ready to enable transactions on the package. In the Properties window for the package, change the TransactionOption at the bottom of the window to Required. By changing this from Supported to Required, you have created a transaction that envelopes the package. Any task or container that has this same property set to Supported will join the transaction. By default, each task is set to join the transaction with the TransactionOption of Supported set.

For you to enable this type of transaction, though, you need the Microsoft Distributed Transaction Coordinator (MSDTC) started on each server that you want to participate in the transaction. So, if you have a server that changes data on two servers and runs on a third server, you would need MSDTC running on all three servers. Transactions can then protect data nearly from any database like SQL Server, Oracle, and UDB, since you have externalized the transaction-handling to another service. The caveat with this, though, is the database must be running on a Windows machine, since MSDTC is a Microsoft service.

Before executing a package that has transactions enabled, you must ensure that MSDTC is started. You can start the MSDTC service in the Services applet, or you can type the following command from a command prompt: NET START MSDTC. If you did not have the service started, you would receive the following error when executing the package:

 Error: The SSIS Runtime has failed to start the distributed transaction due to error 0x8004D01B "The Transaction Manager is not available.". The DTC transaction failed to start. This could occur because the MSDTC Service is not running.

After having the service started, let’s break the package yet again. Select the last task, which is named Insert Record 2, and change the ForceExecutionResult package property to Failure. Execute the package again and you should see the results as shown in Figure 7-11. This time, if you were to view the records in the AdventureWorks database, you should see no new rows. This is because the purging of the table occurred, then the first record was inserted, and when the final task failed, both it and the first two operations (within the same transaction) were rolled back.

image from book
Figure 7-11: Result of package execution

Let’s now try a slight variation of the package. This time, select the Insert Record 1 Task and change the TransactionOption property to NotSupported. Execute the package again, and note that this time it will hang on the second step because there is a table lock currently on the table from the first step (because the task is deleting all the records). If the Execute SQL Task were only reading data out of the table, then you could have changed the IsolationLevel property to ReadUnCommitted and the process could have continued, since it would be allowed to read dirty data. Since this operation is writing to the table that has a lock, you cannot fix this problem without addressing the first task.

To fix this problem, stop the execution and then select the first Package Properties window and change the IsolationLevel property to ReadUncommitted. Rerun the package again, and the package should still fail on the last task, as it did previously. You can now go to Management Studio and see that the first record is now in the table. This is because setting the TransactionOption property to NotSupported tells the task to not participate in the transaction. Run the package again and you’ll see the 1 record in the table twice, showing that the DELETE statement rolled back as well.

As you can see, transactions are an incredible way to protect your data. They ensure that if a problem happens you do not end up in a state with half of your data not loaded and the other half in limbo, leaving you to figure out what to undo. You can also nest transactions inside of other transactions just like TSQL. To do this, you could use sequence containers, as shown in Figure 7-12.

image from book
Figure 7-12: Sequence Containers

Figure 7-12 would represent the following set of TSQL statements:

 BEGIN TRAN DELETE FROM RestartabilityExample INSERT INTO RestartabilityExample Values (1)         BEGIN TRAN                 INSERT INTO RestartabilityExample Values (2)                 INSERT INTO RestartabilityExample Values ('a')         COMMIT COMMIT

This statement will fail on the last INSERT statement, since you’re inserting a character value into an integer column. The results would be that the table would be completely put back to its original state because the inner transaction rolls back the outer transaction. The same thing would occur in the package shown in Figure 7-12. If the container and package had the TransactionOption set to Required and all other tasks were set to Supported, then all the data changes would be rolled back, even though the problem occurred inside the container.

However, if you were to change the container’s TransactionOption to NotSupported, and the package’s TransactionOption to Required, you would see that transaction never rolled back and the records are in the table. This is because any task inside the container would not participate in the transaction, since the tasks inherited the container’s transaction or lack of transaction.

Important

This is probably a good spot to give you a word of warning about using these restartability features in SSIS. When you use transactions and checkpoint files together, it’s important to really plan well prior to development. There is a danger that when you use them in conjunction and don’t plan the package well, the transaction may roll the data back, but then the checkpoint starts on the third or fourth step in the package, skipping over the data that needs to be reloaded. The only way to solve this problem is proper planning and some good whiteboard sessions.