11.4 The Virtual Federation



Leaving the data in place and retrieving it on demand has become easier with the advent of the Web, which facilitates access to data stored on multiple autonomous computing systems. Connected by a network, these systems can be presented to users as one integrated database. This virtual database presents data through user views that, from the user's perspective, look exactly like views of data in a centralized database. The views are mapped to underlying tables or objects that may be stored in any of the databases in the federation. The federated database system processes queries defined in terms of these views, retrieving data as needed from the other systems in the federation and delivering the results as if all the data were local. Almost every Web page on a large site is now assembled automatically from multiple sources: click a button, and data may be retrieved from a federation of remote databases and servers. Users may not be aware of it, but this happens constantly.

Federated databases operate on a similar principle, except that each data resource is defined by a database schema or view, which gives the user much more power to access and manipulate the data. Users interact with a front-end system that presents information on what data is available and accepts ad hoc queries. The front-end system decomposes each query into subqueries that can be handled by the various database servers in the federation, ships each subquery to the appropriate server, and then assembles the results and delivers them to the user.
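The decompose, ship, and assemble cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two in-memory SQLite databases stand in for autonomous servers in a federation, and the table names, sample rows, and the `customer_totals` function are hypothetical, not taken from any particular federated product.

```python
import sqlite3

def make_server(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one federation member."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn

# Hypothetical back-end servers, each holding one table.
customers_db = make_server(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
orders_db = make_server(
    "CREATE TABLE orders (customer_id INTEGER, amount REAL)",
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 40.0), (1, 10.0), (2, 25.0)],
)

def customer_totals():
    """Front-end view: total order amount per customer name."""
    # Subquery 1 goes to the customers server.
    names = dict(customers_db.execute("SELECT id, name FROM customers"))
    # Subquery 2 goes to the orders server; the aggregation runs there.
    totals = orders_db.execute(
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    ).fetchall()
    # The front end joins the partial results and delivers one answer.
    return sorted((names[cid], total) for cid, total in totals if cid in names)

result = customer_totals()
print(result)
```

The caller sees a single result set and never learns that the names and the amounts came from different databases. A real front end would derive the subqueries from the user's query rather than hard-code them.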

A federated database system will deliver acceptable performance for most applications only if it includes a fully developed cost-based optimizer that is aware of the distribution and heterogeneity of the back-end servers. Some older systems that appear to deliver federated database management, such as the GRASP system, really provide only gateways to other databases. GRASP, for example, could access all of the major national credit bureaus of its day (Equifax, Trans Union, and TRW, now Experian) and convert their different codes and content into a single easy-to-interpret format, but it could not handle real-time queries that accessed these data at multiple locations. It ran batches at night.
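To make the idea of a distribution-aware, cost-based decision concrete, the following sketch chooses which of two remote tables to ship across the network before a join. It is a deliberately simplified model, not the optimizer of any real product: the `RemoteTable` class, the server names, the row counts, and the throughput figures are all assumptions made for illustration, and a real optimizer would also weigh CPU cost, indexes, and each server's capabilities.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RemoteTable:
    name: str
    server: str
    est_rows: int         # estimated cardinality (assumed to come from catalog statistics)
    row_bytes: int        # average row width in bytes
    bytes_per_sec: float  # assumed network throughput from this table's server

    def transfer_cost(self) -> float:
        """Estimated seconds needed to ship the whole table over the network."""
        return self.est_rows * self.row_bytes / self.bytes_per_sec

def plan_join(left: RemoteTable, right: RemoteTable) -> dict:
    """Ship the cheaper table to the other table's server and join there."""
    if left.transfer_cost() <= right.transfer_cost():
        moved, stay = left, right
    else:
        moved, stay = right, left
    return {
        "ship": moved.name,
        "join_at": stay.server,
        "est_seconds": moved.transfer_cost(),
    }

# Hypothetical tables on two heterogeneous servers.
accounts = RemoteTable("accounts", "db2_host", 1_000_000, 200, 1e7)
watchlist = RemoteTable("watchlist", "informix_host", 5_000, 100, 1e6)
plan = plan_join(accounts, watchlist)
print(plan)
```

Even with a slower link, moving the small watchlist table is far cheaper than moving a million account rows. A gateway that forwards queries without such estimates cannot make this kind of choice.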

Both the servers and the data models in the federation may be heterogeneous. The federation can include different database engines residing on different operating systems and hardware platforms. For example, a federation may include DB2 databases on Sun Microsystems and IBM servers, Informix databases on Hewlett-Packard servers, Microsoft SQL Server on a Unisys ES7000, and a Teradata database on an NCR Worldmark 5250. Database servers can be added to and deleted from the federation over time. It is a fully modular system for accessing diverse data sources.

In terms of data models, the front-end system may present users with a relational view while retrieving data from database servers that are object-oriented, hierarchical, and so on. Different users or applications may be presented with different views of the data suited to particular uses. Multiple copies of the front end can be distributed throughout the network. Front-end systems may maintain snapshots, materialized views, or summaries of data from various servers. Such snapshots have several advantages: they accelerate query performance, they can remain available even when the underlying data is not, and they provide a stable picture of a dynamic phenomenon for data mining analysis and reporting. To access the output of such a federated system and make it available, a Web of services would be used.
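The availability advantage of snapshots can be illustrated with a small cache that a front end might keep for one back-end query. This is a minimal sketch under assumptions: the `SnapshotCache` class, the `fetch_case_counts` function, and the sample data are hypothetical, and an unreachable server is simulated by raising `ConnectionError`.

```python
import time

class SnapshotCache:
    """Keep the last successful result of a back-end query and serve it when possible."""

    def __init__(self, fetch, max_age_seconds):
        self._fetch = fetch            # callable that queries the back-end server
        self._max_age = max_age_seconds
        self._data = None
        self._taken_at = None

    def read(self, now=None):
        """Return (data, time the snapshot was taken)."""
        now = time.time() if now is None else now
        fresh = self._taken_at is not None and now - self._taken_at < self._max_age
        if not fresh:
            try:
                self._data = self._fetch()
                self._taken_at = now
            except ConnectionError:
                # Server unavailable: fall back to the stale snapshot if one exists.
                if self._taken_at is None:
                    raise
        return self._data, self._taken_at

# Simulated back-end server that can be switched off.
server = {"up": True}

def fetch_case_counts():
    if not server["up"]:
        raise ConnectionError("back-end server unreachable")
    return {"open_cases": 12}

cache = SnapshotCache(fetch_case_counts, max_age_seconds=60)
first = cache.read(now=0)     # snapshot taken from the live server
server["up"] = False
second = cache.read(now=120)  # snapshot is stale and the server is down
print(first, second)
```

The second read still returns the data captured at time 0. Returning the snapshot time along with the data lets a report state how old its picture of the underlying system is.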

Until recently, the centralized approach represented the more attainable of two limited choices for building intelligent data warehouse support systems: a massive initial investment of time and money to build and load a centralized database, or a complicated patching together of existing systems with all the associated integration and maintenance problems. A virtual federated database, however, allows the data to be left where it is. Users can still access the data they need through an integration layer built on heterogeneous data sources and database views. To link these sources and give users access to entity validation, a Web service architecture would be used.




Investigative Data Mining for Security and Criminal Detection
ISBN: 0750676132
Year: 2005
Pages: 232
Authors: Jesus Mena
