Performance Management


The essence of SSIS performance management is in understanding the mechanics of the processing engine and then making adjustments in the package design to work with these mechanisms. Package design is usually an iterative process that requires several design cycles to discover inefficiencies and to make appropriate design changes. Using the debugging and logging tools, you will learn to work through this process effectively and hone packages into their most simple and efficient form. The following principles define the core components of effective performance management:

  • Identifying execution trees

  • Reading and understanding execution plans

  • Optimization through iterative design

  • Effective use of execution parallelism

  • Maximizing CPU use

  • Performance monitoring

  • Using SSIS log reports

Execution Trees

When a data flow begins from a data source and each time the data flow is split into a separate branch, this creates a new execution tree. An execution tree can be defined as a segment of a data flow, ending in an asynchronous output. The number of execution threads created for a running package is generally calculated by using the following formula:

  • Number of threads = Number of sources + Execution trees

Multiple execution trees can be either advantageous or detrimental. Because each execution tree generally creates new buffers and can spawn a new execution thread, this might result in faster, parallel processing at the possible expense of more system resources. Regardless, this is an important system-monitoring metric. To get an accurate log of the execution trees for a package or container, enable the PipelineExecutionTrees log event, and then create an event handler, using the techniques you learned in Chapter 9, "Detecting and Handling Processing Errors."

Execution trees demonstrate how your package uses buffers and threads. At run time, the Data Pipeline engine breaks down data flow task operations into execution trees. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and thread. When a new buffer is created, additional memory is required to handle the data transformation; however, it is important to note that each new tree also gives you an additional worker thread.

Note that you can display execution trees for your own packages by turning on package logging, enabling logging for the data flow task, and then selecting the PipelineExecutionTree event. You will not see the execution tree until you execute the package. It will display in the Log Events window within BIDS.

Execution Plans

Much like the way the SQL Server database engine prepares a query execution plan before each query runs, Integration Services uses a similar approach for package execution. When a package is first executed, an execution plan is generated for each data flow and cached, to be compiled and used internally to optimize run-time execution against SQL Server database objects. The execution plan is a set of lower-level database engine instructions translated from the settings of the data adapters and transformations in a data flow task. It's a form of pseudocode that is translated further into machine-specific instruction code by the Microsoft.NET Framework common language runtime (CLR). In subsequent executions, the compiled execution plan performs the same operations without the added overhead of retranslating the data flow components into instructions. Likewise, data source adapters that execute Transact-SQL statements, stored procedures, or user-defined functions in a SQL Server database benefit from the stored execution plans associated with these objects.

An execution plan provides further insight into which the data flow execution engine operations are to be performed for a given data flow task. Details are provided on individual threads and operation codes. Source threads are generated for data sources; worker threads are created for transformations and all other work. Set the EngineThreads property of the data flow task to the number of source threads plus the number of execution trees to ensure that the execution engine has enough threads available to use for optimal processing.

To view an SSIS execution plan, add logging to the package or container and then add the PipelineExecutionPlan event for any data flow task you wish to log. The following example is the execution plan logged for a package you will use in a later exercise. Note the metadata for each execution thread and the commands to initiate each component. A SourceThread is created for each data source, and WorkThreads are created for each execution tree in the data flow. Listener objects are defined to coordinate the operations, and buffers are created and managed for each thread. You can see how each transformation is initiated by using various Call commands.

 SourceThread0    Drives: 1    Influences: 5751 5829 6020 6079    Output Work List       CreatePrimeBuffer of type 1 for output ID 11.       SetBufferListener: "WorkThread0" for input ID 5752       CreatePrimeBuffer of type 3 for output ID 12.       CallPrimeOutput on component "OLE DB Source Cross Join" (1)    End Output Work List End SourceThread0 WorkThread0    Drives: 5751    Influences: 5751 5829 6020 6079    Input Work list, input ID 5752 (1 EORs Expected)       CallProcessInput on input ID 5752 on component "Multicast" (5751)         for view type 0       CallProcessInput on input ID 6080 on component "Sort 1" (6079)         for view type 2    End Input Work list for input 5752    Output Work List       CreatePrimeBuffer of type 6 for output ID 6081.       SetBufferListener: "WorkThread1" for input ID 5895       CallPrimeOutput on component "Sort 1" (6079)    End Output Work List End WorkThread0 WorkThread1    Drives: 5829    Influences: 5829 6020    Input Work list, input ID 5895 (1 EORs Expected)       CallProcessInput on input ID 5895 on component "Union All" (5829)         for view type 7    End Input Work list for input 5895    Output Work List       CreatePrimeBuffer of type 4 for output ID 5831.       SetBufferListener: "WorkThread2" for input ID 6033       CallPrimeOutput on component "Union All" (5829)    End Output Work List End WorkThread1 WorkThread2    Drives: 6020    Influences: 6020    Input Work list, input ID 6033 (1 EORs Expected)       CallProcessInput on input ID 6033 on component "OLE DB Destination"         (6020) for view type 5    End Input Work list for input 6033    Output Work List    End Output Work List End WorkThread2 

A comparison of different prototype packages can help you find inefficiencies in the execution plan. This execution plan created four threads: one SourceThread and three WorkThreads. It also created four buffers.

As a matter of practice, you should also use the SQL Server Profiler to view the behavior of these execution plans on the database server. Evidence of an efficient execution plan will include data selection from pre-existing clustered and nonclustered indexes and efficient join and merge operations using presorted records. Less-efficient operations use operations such as table scans and sorting without the aid of a supporting index.

For SQL queries used in the package, use the Show Execution Plan feature in SQL Server Management Studio. For each operation (whether a read, write, sort, compare, group, join, merge, union, aggregation, or calculation, for example), the database engine will consider different physical operations to process these tasks by using prepared database objects such as tables and indexes. It will use statistics stored with a table or index, which describe characteristics of column values (such as uniqueness, density, and selectivity) to make these choices. For each operation, the plan will estimate the cost of physical disk input/output (I/O), memory, CPU time, and number of rows necessary to process.

Understanding how to use and analyze execution plans is a crucial element of database design and programming; however, the details of this topic are beyond the scope of this book. For more information, read the following topics in SQL Server Books Online:

  • "Displaying Graphical Execution Plans" http://msdn2.microsoft.com/en-us/library/ms178071.aspx

  • "Displaying Execution Plans by Using SQL Server Profiler Event Classes" http://msdn2.microsoft.com/en-us/library/ms190233.aspx




Microsoft SQL Server 2005 Integration Services Step by Step
MicrosoftВ® SQL Server(TM) 2005 Integration Services Step by Step
ISBN: 0735624054
EAN: 2147483647
Year: 2007
Pages: 152

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net