The Package and Other Objects


Packages are to Integration Services users what the canvas is to an artist or the field to the farmer. The package is the basic starting point for everything you do and the foundation upon which Integration Services solutions are built. The first thing people do when designing a solution in IS is create a package. After you create a package, you can add other objects to it, gradually building it up into a solution. After you've built your package, you can execute it using one of the tools that ship with Integration Services, such as the designer, dtexec, or dtexecui. So, in a way, building packages is like building custom software or an application.

In very basic terms, the package is the object that contains all the other objects. The tasks are the objects that cause things to happen, such as moving a file from here to there or pulling data from one location and putting it in another.

Connection managers describe the location of resources outside the package boundaries. Tasks and other components use connection managers to access those outside resources. There are connection managers for describing the location of a file or how to connect to a database among others.

Variables are objects used for holding data or communicating information from one component to another inside the package. For example, one task might place a date in a variable and another task might use that date for naming a file.

Containers are objects that contain other containers or tasks. They give structure and scope to packages and the objects within them. Containers can be executed like tasks. When executed, containers also execute the objects within them, their children.

Precedence constraints are objects that dictate the order of execution of containers and tasks, given certain constraints, such as the success or failure of the previous task or if a variable has a certain value.

Log providers are the objects responsible for outputting log information to a particular destination. For example, there is a SQL Server log provider and a flat-file log provider. Log providers have names based on the log destination format they support.

Take all these different objects, put them together, set their properties, and you'll have a package that does something like populate a dimension table, deduplicate a customer list, or back up a database.

People use the name package to describe different things, all of them related to the basic building block of Integration Services. A package could be the actual object you create when you first start building a new solution. It could be a disk file that holds the persisted state of the package object, a collection of packages that are all children of one package, or the part of the package that's visible in the designer. Mostly though, when this book uses the term package, it refers to a collection of objects together with the package object that makes up the solution, as shown in Figure 6.1.

Figure 6.1. The package is an object containing a collection of interrelated objects


Tasks

Tasks do the work of packages. They determine a package's behavior, and each task performs a well-defined function. Many stock tasks ship with IS, including the FTP Task for using the FTP protocol to move files to and from remote machines, the SQL Task for executing SQL, a Web Services Task for invoking web methods, and others. Perhaps the most important stock task is the Data Flow Task. Its function is to move and transform high volumes of data at high speed. Tasks are pluggable extensions, which means third parties, not just Microsoft, can easily write custom tasks that can plug in to the IS environment.

Variables

Figure 6.1 illustrates how central and important variables are in packages. You use them to communicate between the different objects because the objects you see in Figure 6.1 are isolated from one another by the Integration Services architecture. One task cannot see another. In other words, one task cannot directly communicate with another, so they must communicate through variables. This separation is designed to isolate the various objects, especially tasks, yielding better security, usability, manageability, and robustness.

Without this isolation, it would be possible for tasks to gain access to other objects and manipulate their properties at any given time. In some cases, this is harmless, but in many cases, it is fatal to the package. Such interaction between objects would also be difficult to discover. If it were possible to write packages that were self-modifying, it would be very difficult to upgrade or migrate them in future versions. For example, suppose you wrote a package in which one task was designed to execute another task. There would be no way for IS to detect that relationship between the two tasks. Further, it would be difficult for the user to discover that relationship as well, especially if the design was buried in the bowels of that task's code.

Variables, on the other hand, are designed to be the communication mechanism between objects. The IS runtime guards them to prevent the corruption of variable values that can occur when multiple components access the same data simultaneously. IS variables are also hardened to avoid some of the other technical issues that occur in an object model that allows simultaneous access from many components. IS variables can contain various types of data, including integers, strings, or even other objects.
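The idea of two isolated tasks communicating only through a guarded variable can be sketched in a few lines of Python. This is a toy model, not the Integration Services object model: the `VariableStore` class, the two task functions, and the variable names `RunDate` and `FileName` are all hypothetical, and the single lock stands in for whatever protection the IS runtime actually applies.

```python
import threading
from datetime import date


class VariableStore:
    """Toy stand-in for package variables (hypothetical, not the IS API)."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()  # serializes reads and writes

    def set(self, name, value):
        with self._lock:
            self._values[name] = value

    def get(self, name):
        with self._lock:
            return self._values[name]


def stamp_date_task(variables):
    # First task: publishes a date. It has no reference to any other task.
    variables.set("RunDate", date(2006, 1, 31))


def name_file_task(variables):
    # Second task: reads the date and uses it to build a filename.
    run_date = variables.get("RunDate")
    variables.set("FileName", f"export_{run_date:%Y%m%d}.txt")


variables = VariableStore()
stamp_date_task(variables)
name_file_task(variables)
print(variables.get("FileName"))  # prints export_20060131.txt
```

Neither task function receives the other as an argument, so the only relationship between them is the shared variable name, which is visible to anyone inspecting the store.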

Connection Managers

Connection managers are objects that provide other components with a link to the world outside of the package. Their name is perhaps a misnomer, because they don't always provide a connection. Some connection managers only provide the name of a file or a web address. Others provide more information than just a connection. For example, the Flat File Connection Manager provides a filename as well as information about the format of the flat file.

Connection managers serve as a barrier between packages and the environment. That barrier is useful in all environments, but is especially useful when the environment changes. For example, when moving a package from one machine to the next, connection managers provide a central location for updating the package to point to new locations on the destination machine. Without connection managers, tasks and other objects would each have to provide properties to the user for specifying where to find resources such as files and web pages. It's much easier to modify the connections than to find all the objects that access those resources and modify each of them directly.

Connection managers are also pluggable extensions. Although there are already more than 15 different types of connection managers at the time of this writing, Microsoft realized that the industry changes and that new data access technologies would emerge over time. Microsoft also realized that they hadn't built a connection manager for every existing type of data access technology and so made it simple to augment the stock offerings with custom connection managers.
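The benefit of keeping resource locations in one place can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `FlatFileConnection` class, the `CustomerFile` name, and the file paths are hypothetical and do not correspond to the real Flat File Connection Manager.

```python
from dataclasses import dataclass


@dataclass
class FlatFileConnection:
    """Hypothetical stand-in for a flat-file connection manager."""
    path: str
    delimiter: str = ","


def describe_source_task(connections):
    # The task refers to the connection by name, never to a hard-coded path.
    return f"reading {connections['CustomerFile'].path}"


def count_fields_task(connections, line):
    # A second task uses the format information from the same connection.
    return len(line.split(connections["CustomerFile"].delimiter))


connections = {"CustomerFile": FlatFileConnection(path="C:/dev/customers.csv")}
print(describe_source_task(connections))  # prints reading C:/dev/customers.csv

# Moving the package to another machine: one update, and every task follows.
connections["CustomerFile"].path = "D:/prod/customers.csv"
print(describe_source_task(connections))  # prints reading D:/prod/customers.csv
print(count_fields_task(connections, "a,b,c"))  # prints 3
```

Both tasks pick up the new location without being edited, which is the point of routing all outside access through a named connection.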

Log Providers

Log providers are an important part of the IS logging infrastructure. Every IS object has access to the logging infrastructure. The problem is that not every user wants their log entries in the same format or sent to the same destination. Some prefer the Extensible Markup Language (XML), others prefer flat files, and still others would rather send their logs to a table in SQL Server. Still others might want to log to a destination that IS doesn't yet support. This is why Microsoft created log providers. Log providers are, yet again, pluggable components that are responsible for taking log entries sent by components in the package and writing them to the particular destination and medium they support. For example, a stock XML log provider writes log entries to an XML file.

Tasks and other components are not aware of the destination to which they send their log entries. They just send their logs to the IS runtime, and the runtime routes the log to the configured log providers. The log providers then take the log entries, convert them to the correct format, and save them based on how they are configured. It's also possible to log to more than one location at a time using two different log providers simultaneously on the same package. And, if you want to log to a location, format, or medium that isn't supported, you can write your own log provider. Log providers use connection managers to describe the destination of their log entries.
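The routing described above, where a task hands entries to the runtime and the runtime fans them out to every configured provider, might be modeled like this in Python. It is only a sketch: the `Runtime`, `TextLogProvider`, and `JsonLogProvider` classes are hypothetical, the providers keep entries in memory instead of writing to a real destination, and JSON is used in place of XML purely to keep the example short.

```python
import json


class TextLogProvider:
    """Hypothetical provider that formats entries as plain text lines."""

    def __init__(self):
        self.lines = []

    def write(self, source, message):
        self.lines.append(f"{source}: {message}")


class JsonLogProvider:
    """Hypothetical provider that formats the same entries as JSON."""

    def __init__(self):
        self.lines = []

    def write(self, source, message):
        self.lines.append(json.dumps({"source": source, "message": message}))


class Runtime:
    """Toy runtime: routes each log entry to every configured provider."""

    def __init__(self, providers):
        self.providers = providers

    def log(self, source, message):
        for provider in self.providers:
            provider.write(source, message)


def copy_file_task(runtime):
    # The task only talks to the runtime; it never sees a provider.
    runtime.log("CopyFile", "started")
    runtime.log("CopyFile", "finished")


text_log, json_log = TextLogProvider(), JsonLogProvider()
copy_file_task(Runtime([text_log, json_log]))
print(text_log.lines)  # prints ['CopyFile: started', 'CopyFile: finished']
print(json_log.lines[0])
```

Adding a new destination means writing another class with a `write` method and adding it to the list; the task code does not change.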

Containers

Containers are like the skeleton of packages. They organize the package into smaller chunks while providing transaction scope, variable scope, execution scope, looping functions, debugging breakpoints, error routing, and logging scope. The package is a container. There are loop containers that allow workflow to execute multiple times. The ForEach Loop allows the workflow to execute once for each item in a collection. The For Loop allows the workflow to execute until an expression evaluates to false. The sequence container helps to better organize packages. Each task has its very own container called the TaskHost that is, for the most part, transparent to the user. There is also an EventHandler container that the runtime executes when a task or other component raises an event.
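As a rough illustration of containers executing their children, the following Python sketch models a sequence, a ForEach-style loop, and a For-style loop. The class names echo the container names above, but the code is a toy under stated assumptions: variables are a plain dictionary, and transactions, scoping, events, and the TaskHost are left out entirely.

```python
class Task:
    """Leaf object: runs a single action against the shared variables."""

    def __init__(self, name, action):
        self.name = name
        self.action = action

    def execute(self, variables):
        self.action(variables)


class Sequence:
    """Container: executing it executes each child in order."""

    def __init__(self, children):
        self.children = children

    def execute(self, variables):
        for child in self.children:
            child.execute(variables)


class ForEachLoop(Sequence):
    """Runs the children once per item, exposing the item as a variable."""

    def __init__(self, items, item_variable, children):
        super().__init__(children)
        self.items = items
        self.item_variable = item_variable

    def execute(self, variables):
        for item in self.items:
            variables[self.item_variable] = item
            super().execute(variables)


class ForLoop(Sequence):
    """Runs the children until the condition evaluates to false."""

    def __init__(self, condition, children):
        super().__init__(children)
        self.condition = condition

    def execute(self, variables):
        while self.condition(variables):
            super().execute(variables)


variables = {"Processed": [], "Counter": 0}
record = Task("Record", lambda v: v["Processed"].append(v["File"]))
bump = Task("Bump", lambda v: v.update(Counter=v["Counter"] + 1))

# The outermost container plays the role of the package.
package = Sequence([
    ForEachLoop(["a.txt", "b.txt"], "File", [record]),
    ForLoop(lambda v: v["Counter"] < 3, [bump]),
])
package.execute(variables)
print(variables["Processed"], variables["Counter"])  # prints ['a.txt', 'b.txt'] 3
```

Because containers and tasks share the same `execute` method, containers nest to any depth and the caller treats a whole subtree like a single task.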

Precedence Constraints

Precedence constraints are the traffic lights of IS. You use them to control which tasks execute, when they execute, and whether a task should execute at all. In the designer, they take the form of an arrow from one task to another. At a minimum, a precedence constraint can be defined to cause a task to execute when another succeeds, fails, or completes. They can also be configured to use complex expressions that reference IS variables to determine if workflow should execute. This flexibility provides an enormous number of options for conditionally branching within the package at execution time. In a way, precedence constraints are like If statements in structured coding that check Boolean expressions to determine if a certain piece of code should execute.
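The comparison to If statements can be made concrete with a small Python sketch. It is a simplified model, not how the IS runtime schedules work: the `Constraint` class, the `run` function, and the task names are hypothetical, a constraint here requires both its outcome and its optional expression to hold, and each constraint is evaluated on its own, so a task with several incoming constraints is not handled.

```python
from dataclasses import dataclass
from typing import Callable, Optional

SUCCESS, FAILURE = "success", "failure"


@dataclass
class Constraint:
    """Hypothetical precedence constraint between two named tasks."""
    source: str
    target: str
    on: str  # "success", "failure", or "completion"
    expression: Optional[Callable[[dict], bool]] = None

    def allows(self, outcome, variables):
        outcome_ok = self.on == "completion" or self.on == outcome
        expr_ok = self.expression is None or self.expression(variables)
        return outcome_ok and expr_ok


def run(tasks, constraints, start, variables):
    """Execute tasks from `start`, following constraints that allow it."""
    executed, pending = [], [start]
    while pending:
        name = pending.pop(0)
        # A task function returns True for success and False for failure.
        outcome = SUCCESS if tasks[name](variables) else FAILURE
        executed.append(name)
        for constraint in constraints:
            if constraint.source == name and constraint.allows(outcome, variables):
                pending.append(constraint.target)
    return executed


def load(variables):
    variables["RowCount"] = 0  # the load succeeds but moves no rows
    return True


tasks = {
    "Load": load,
    "Archive": lambda v: True,
    "Alert": lambda v: True,
    "Report": lambda v: True,
}
constraints = [
    Constraint("Load", "Archive", on="success",
               expression=lambda v: v["RowCount"] > 0),
    Constraint("Load", "Alert", on="failure"),
    Constraint("Load", "Report", on="completion"),
]
print(run(tasks, constraints, "Load", {}))  # prints ['Load', 'Report']
```

Here `Archive` is skipped because the expression on its constraint is false even though `Load` succeeded, `Alert` is skipped because `Load` did not fail, and `Report` runs because its constraint only requires completion.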



Microsoft SQL Server 2005 Integration Services
ISBN: 0672327813
Year: 2006
Pages: 200
Authors: Kirk Haselden
