Some transformations need partial or complete sets of data to perform their functions. When data in the pipeline is held up by a transformation operation, it blocks the flow of data to subsequent components downstream. As you know, transformation components that depend on each other are synchronous while those that have no direct dependency are asynchronous. Depending on whether the first transformation must process all or part of the data, to pass the data to the next transformation, it will be either a blocking or partially blocking transformation.
Transformations that require all input rows to be read before they can process the data are blocking transformations. These include:
Aggregation.
Sort.
The aggregation transformation must read all records to perform a mathematical algorithm and yield a single calculated value. The sort transformation must read records into a separate buffer in the sorted order, which cannot be completed until the last read has been added to the sort stack. Only after these operations have completed will the results flow down the data pipeline to the next transformations or data sources.
Partially blocking transformations are often used to combine data sets. They tend to have multiple data inputs with a resulting output that might have a greater number, a smaller number, or the same number of records than the total number of input records. The transformation(s) on the receiving end of a partially blocking transaction can work with subsets of the total input stream. Because the number of input records will likely not match the number of output records, these transformations are also called asynchronous transformations. Examples of partially blocking transformation components available in SSIS include the following:
Merge
Merge Join
Pivot
Unpivot
Union All
Term Lookup
Data Mining Query
With partially blocking transformations, the output of the transformation is copied into a new buffer, and a new thread is introduced into the data flow.
This category of transformations processes data one row at a time. Row transformations are typically considered to be partially blocking because rows are available to downstream components as soon as they flow from the transformation component.
Row transformations include the following:
Derived Column
Data Conversion
Copy Column
Character Map
OLE DB Command
Script Command
These types of transformations are called row transformations because they process data in the pipeline one row at a time. These are also known as synchronous transformations because rows flow to and from the pipeline without completely blocking downstream components. A row transformation also reuses buffers rather than requiring data to be copied to new buffers that would require more memory and processing. It can add or modify columns but will not add additional rows to the pipeline.