As mentioned, OLE DB is a specification of a set of COM interfaces for data management. The interfaces are defined so that data providers can implement different levels of support, based on the capabilities of the underlying data store. Because OLE DB is COM-based, it can easily be extended: extensions are implemented as new interfaces. Clients can use the standard COM QueryInterface method to determine whether specific features are supported on a particular machine or by a particular data store. This capability is a substantial improvement over the function-based API defined by ODBC.
Figure 3-2 below shows the high-level OLE DB architecture, which consists of three major pieces: data consumers, service components, and data providers. OLE DB providers are COM components responsible for exposing data from data stores to the outside world. All data is exposed as virtual tables, known as rowsets. Internally, of course, a provider makes calls to the underlying data store using its native data access method or a generic API such as ODBC. Data consumers are COM components that access data through OLE DB providers.

OLE DB service components are COM components that encapsulate a specialized data management function, such as query processing, cursor management, or transaction management. OLE DB is designed so that these service components can be implemented independently of data providers, delivered as stand-alone products, and plugged in as needed. For example, a simple data provider might offer only a way to get all the data from a data source, with no ability to query, sort, or filter it. A service component might implement SQL query processing for any data provider; if a consumer wanted to run a SQL query against data from a simple provider, that service component could be invoked to execute the query.
Figure 3-2. The OLE DB architecture.
The OLE DB object model consists of seven core components, shown in Figure 3-3. These objects are implemented by data providers or service components and used by data consumers.
In the OLE DB object model, an Enumerator object is used to locate a data source. Consumers that aren't customized for a specific data source use an Enumerator to retrieve a list of names of available data sources and subordinate Enumerators. For example, in a file system, each file might correspond to a data source and each subdirectory might correspond to a subordinate Enumerator. The consumer searches the list for a data source to use, moving through the subordinate Enumerators as necessary. Once a data source is selected by name, a Data Source object can be created.
Figure 3-3. The OLE DB object model.
A Data Source object knows how to connect to a type of data store, such as a file or a DBMS. Each OLE DB provider implements a Data Source component class with a unique CLSID. A consumer can either create a specific Data Source directly, by calling CoCreateInstance with the Data Source's CLSID, or use an Enumerator to search for a data source. Although each Data Source class has a unique CLSID, all the classes are required to expose a certain set of OLE DB interfaces so that a consumer can use any available data source in a standard way. Through the Data Source object, consumers specify the name of the data source they want to connect to, along with any authentication information. Once a Data Source object is created, it can be used to reveal the capabilities of the underlying data provider.
Data Sources are factories for Session objects—in other words, you create Session objects using a Data Source object. A Session represents a particular connection to a data source. The primary function of a Session object is to define transaction boundaries. Session objects are also responsible for creating Command and Rowset objects, which are the primary ways to access data through OLE DB. A Data Source object can be associated with multiple Session objects.
If an OLE DB provider supports queries, it must implement Command objects. Command objects are generated by Session objects and are responsible for preparing and executing text commands. Multiple Command objects can be associated with a single Session object. Notice the use of the term "text commands" instead of "SQL commands." OLE DB doesn't care what command language is used. All that matters is whether the Command object understands the command and can translate it into calls to the underlying data provider when the command is executed.
Commands that return data create Rowset objects. Rowset objects can also be created directly by Session objects. A Rowset simply represents tabular data. Rowsets are used extensively by OLE DB. All Rowsets are required to implement a core set of OLE DB interfaces. These interfaces allow consumers to sequentially traverse the rows in the Rowset, get information about Rowset columns, bind Rowset columns to data variables, and get information about the Rowset as a whole. Additional features, such as updating the Rowset or accessing specific rows directly, are supported by implementing additional OLE DB interfaces.
The OLE DB object model also includes an Error object. Error objects can be created by any other OLE DB object. They contain rich error information that cannot be conveyed through the simple HRESULT returned by COM methods. OLE DB Error objects build on a standard error-handling mechanism, IErrorInfo, defined by Automation. OLE DB extends this error-handling mechanism to permit multiple error records to be returned by a single call and to permit providers to return provider-specific error messages.
The OLE DB object model provides a powerful, flexible mechanism for consumers to access any type of data in a uniform way. OLE DB does not take a "least common denominator" approach to UDA. Instead, it defines a rich, component-based model that lets data providers implement as much functionality as they are able to support, from sequential access to simple rowsets all the way up to full DBMS functionality. This gives developers the choice of writing generic data access components that use only the most basic functionality, or writing components optimized for a specific DBMS, in either case working through a single programming model.