More than ever, organizations are tasked with providing data that is held in increasingly diverse formats and locations. Not long ago, most information was held on a mainframe and in various database management systems (DBMSs). Now an organization's important information can also be found in locations such as mail stores, file systems, Web-based text, and graphical files.
As organizations seek to gain maximum advantage from data and information distributed throughout their departments and divisions, they can attack the problem of disparate data sources by putting all the data in a single data store. With this universal storage approach, a single data store holds any and all kinds of data.
Universal storage solves the problem of multiple access methods by allowing only one type of data store. However, writing a data store that can efficiently store and retrieve any type of data is a huge technical challenge. Universal storage also fails to address the terabytes of existing data already stored elsewhere, because the cost of converting that data to the universal store would be enormous. Finally, a universal store is a single point of failure, which adds further risk.
Realistically, the Microsoft Open Database Connectivity (ODBC) approach of providing a common access method seems more feasible than that of universal storage. However, the common access method must encompass all types of data, rather than limiting itself to relational database tables and SQL queries as in ODBC.
Application programming interfaces (APIs) are sets of commands that applications use to request and execute lower-level services performed by a computer's operating system. In this section, we discuss methods of manipulating various data sources through different data access interfaces. Each database vendor provides a vendor-specific API to ease database access. Non-DBMS data can be accessed through data-specific APIs, such as the Active Directory Service Interfaces (ADSI) for Microsoft Windows NT directory services, the Messaging API (MAPI) for mail data, and file system APIs. By using a native access method for each data store, developers can use the full power of each store. However, this approach requires developers to have a detailed understanding of the API functions associated with each data store. If developers must maintain access to several data stores, and consequently must learn all the data access methods involved, organizational costs for training alone can become quite high.
Instead of using native data access methods, developers can choose to use a generic, vendor-neutral API such as the ODBC interface. Using this type of interface is advantageous in that developers need to learn only one API to access a wide range of DBMSs. Applications then can simultaneously access data from multiple DBMSs.
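The benefit of learning only one API can be sketched in a few lines. The example below is an illustration only: it uses Python's standard DB-API as a stand-in for a vendor-neutral interface such as ODBC, and it is not ODBC itself. The helper names (fetch_rows, make_store) and the sample tables are hypothetical. Both stores are SQLite databases simply because SQLite ships with Python; with a different DB-API driver, only the connection step would be expected to change.

```python
import sqlite3


def fetch_rows(conn, sql):
    """Run a query through the generic cursor interface and return rows as dicts.

    Nothing here is specific to one DBMS: only connection, cursor,
    execute, description, and fetchall are used.
    """
    cur = conn.cursor()
    cur.execute(sql)
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, row)) for row in cur.fetchall()]
    cur.close()
    return rows


def make_store(table, column, values):
    """Hypothetical helper: build a small in-memory store standing in for one DBMS."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({column} TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (?)", [(v,) for v in values])
    conn.commit()
    return conn


# Two independent stores, open at the same time, queried with the same code.
orders_db = make_store("orders", "item", ["widget", "gear"])
staff_db = make_store("staff", "name", ["Ana"])

orders = fetch_rows(orders_db, "SELECT item FROM orders ORDER BY item")
staff = fetch_rows(staff_db, "SELECT name FROM staff")
```

The application code in fetch_rows never names a vendor, which is the property the generic API approach is after.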
Microsoft Universal Data Access (UDA) is designed to provide high-performance access to any type of data (structured or unstructured, relational or non-relational) stored anywhere in an enterprise. UDA defines a set of COM interfaces that put the concept of generic data access into practice. UDA is based on OLE DB, a set of COM interfaces for building database components. OLE DB allows data stores to expose their native functionality without making non-relational data appear relational. OLE DB also provides a way for generic service components, such as specialized query processors, to augment the features of simpler data providers. Because OLE DB is optimized for efficient data access rather than ease of use, UDA also defines an application-level programming interface called Microsoft ActiveX Data Objects (ADO). ADO exposes dual interfaces so that it can be used easily from scripting languages as well as from C++, Microsoft Visual Basic, and other development tools. We discuss ADO more thoroughly in the ADO section later in this chapter.
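The division of labor among data providers, service components, and consumers can be sketched conceptually. The following is not the OLE DB or ADO API: every class and function name (RowProvider, SqlTableProvider, DirectoryProvider, FilterService, column) is hypothetical, and the example assumes only that each store can hand back rows through one shared interface while keeping its own native columns.

```python
import os
import sqlite3
import tempfile
from typing import Callable, Dict, Iterator, List, Protocol

Row = Dict[str, object]


class RowProvider(Protocol):
    """The one interface every store exposes (a stand-in for a common access method)."""

    def rows(self) -> Iterator[Row]: ...


class SqlTableProvider:
    """Provider for a relational store: exposes one table as rows."""

    def __init__(self, conn: sqlite3.Connection, table: str) -> None:
        self.conn, self.table = conn, table

    def rows(self) -> Iterator[Row]:
        cur = self.conn.execute(f"SELECT * FROM {self.table}")
        names = [d[0] for d in cur.description]
        for record in cur:
            yield dict(zip(names, record))


class DirectoryProvider:
    """Provider for a non-relational store: files keep their native attributes."""

    def __init__(self, path: str) -> None:
        self.path = path

    def rows(self) -> Iterator[Row]:
        for entry in sorted(os.scandir(self.path), key=lambda e: e.name):
            if entry.is_file():
                yield {"name": entry.name, "size": entry.stat().st_size}


class FilterService:
    """Generic service component: adds filtering to any simpler provider."""

    def __init__(self, source: RowProvider, predicate: Callable[[Row], bool]) -> None:
        self.source, self.predicate = source, predicate

    def rows(self) -> Iterator[Row]:
        return (r for r in self.source.rows() if self.predicate(r))


def column(provider: RowProvider, name: str) -> List[object]:
    """Consumer code: works against any provider without knowing the store type."""
    return [r[name] for r in provider.rows()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [("Ana", "Lima"), ("Bo", "Oslo")])

with tempfile.TemporaryDirectory() as tmp:
    for fname, text in [("a.txt", "hello"), ("b.txt", "")]:
        with open(os.path.join(tmp, fname), "w") as fh:
            fh.write(text)
    customer_names = column(SqlTableProvider(conn, "customers"), "name")
    files = DirectoryProvider(tmp)
    non_empty_files = column(FilterService(files, lambda r: r["size"] > 0), "name")
```

The directory provider does not know how to filter, so the service component supplies that feature on top of it, while the consumer function stays the same for both stores.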
UDA is a platform, application, and tools initiative that defines and delivers both standards and technologies tailored to providing enterprise data access. It is a key element in the Microsoft foundation for application development. In addition, UDA provides high-performance access to a variety of information resources, including relational and non-relational data, and an easy-to-use programming interface that is tool-independent and language-independent.
UDA doesn't require expensive and time-consuming movement of data into a single data store, nor does it require commitment to a single vendor's products. UDA features broad industry support and works with all major established database products. UDA has its origins in standard interfaces, such as ODBC, Remote Data Objects (RDO), and Data Access Objects (DAO), but it significantly extends the functionality of these well-known and well-tested technologies.
Microsoft Data Access Components (MDAC) provides a UDA implementation that includes ADO as well as an OLE DB provider for ODBC. This provider enables ADO to access any database that has an ODBC driver, which in effect means all major database platforms. OLE DB providers are also available for other types of stores, such as the Microsoft Exchange mail store, Windows NT Directory Services, and the Microsoft Windows file system (through Microsoft Index Server). As shown in Figure 9.1, developers can write applications for existing or new data and for structured or unstructured data, using ADO as the single data access mechanism, regardless of the data's location.
Figure 9.1 Microsoft's UDA design