Integration with an ETL Tool




When your BusinessObjects implementation is part of an overall data warehouse initiative, you may be able to further minimize universe maintenance by integrating metadata from an ETL tool. An ETL tool has three primary goals:

  • Extract: Get the information out of the source system.

  • Transform: Cleanse the data and make it consistent regardless of where the information originated.

  • Load: Load the data into data marts or data warehouses in a star or snowflake schema.

As ETL tools have matured, these three components have become easier, more graphical, and more robust. Many tools now include business and process intelligence to extract data from leading systems such as SAP, PeopleSoft, J.D. Edwards, or Lawson and to build corresponding data marts that focus on specific processes within an organization. What was once a nuts-and-bolts process of writing code to facilitate extracts is now a complete set of tools to create interrelated procedures and to build business-oriented, dimensional views of the data. With this evolution, ETL tools contain more metadata: information about where the data originates, how it is transformed, business names, and uses. The BusinessObjects universe also contains metadata in the form of objects (meaningful business names), object descriptions (help text, calculations, and potentially, source system information), and joins (relationships between tables). In an effort to ensure one consistent business definition, ETL and BI vendors have worked to enable metadata to be shared between the two platforms.

In order to better understand how this information is shared, refer to the definitions in Table 14-2.

Table 14-2: Glossary of Terms for Integrating Metadata Between ETL and BI Tools

Term | Explanation
CWM | The Common Warehouse Metamodel (CWM) is a specification that describes metadata interchange among data warehousing, business intelligence, knowledge management, and portal technologies.
XMI | XML Metadata Interchange. It allows tools to share metadata information through the XML file format. Also referred to as CWMI.
XML | The Extensible Markup Language is a file format for sharing data.
Metadata | Information about the data, including from which source system or ERP system it originates, how a number is calculated, transformation logic, and business terminology.
API | Application Programming Interface.
CWMI | XML-based file for exchanging metadata. Also referred to as XMI.

The degree to which you use the full functionality of the ETL tool will affect how useful it is to integrate that tool with BusinessObjects. An initial build of the universe is fairly easy and straightforward. However, keeping the universe current as underlying physical tables change, and tracking where the data originated, can add a significant amount of maintenance. Recall from the earlier discussion of linked universes how they help reduce maintenance and ensure consistent business definitions across universes that use the same objects. Theoretically, sharing metadata from an ETL tool takes this consistency and leverage to a still higher level. In Chapter 6, you looked at how strategy files use SQL to read data dictionary information to build the initial classes and objects. With ETL tools, a BusinessObjects bridge reads ETL metadata to build the universe. ETL metadata contains information above and beyond what standard data dictionary tables contain and can be used in the following ways in the universe:

  • The source system / ERP TABLE.COLUMN name is displayed in the object description to show where the data originated.

  • Business names established in the ETL tools become the universe object names, thus ensuring consistent terminology across multiple BI tools and multiple universes.

  • As additional tables or columns become available in the data warehouse, their definitions can be imported into the BusinessObjects universes.

  • Primary and foreign keys that form star and snowflake schemas build the joins.
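The four uses listed above can be pictured with a minimal Python sketch. It is an illustration only, not the bridge's actual implementation: the `EtlColumn` and `UniverseObject` record layouts, the `build_objects` and `build_joins` helpers, and the sample table and column names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EtlColumn:
    """One column as described in (hypothetical) ETL metadata."""
    table: str          # warehouse table name
    column: str         # warehouse column name
    business_name: str  # business name set in the ETL tool, may be empty
    source: str         # source system / ERP TABLE.COLUMN


@dataclass
class UniverseObject:
    """Simplified stand-in for a universe object."""
    class_name: str
    name: str
    select: str
    description: str


def build_objects(columns):
    """Turn ETL column metadata into simplified universe objects."""
    objects = []
    for col in columns:
        objects.append(UniverseObject(
            class_name=col.table,
            # The ETL business name becomes the object name when present.
            name=col.business_name or col.column,
            select=f"{col.table}.{col.column}",
            # The source TABLE.COLUMN is recorded in the description.
            description=f"Source: {col.source}",
        ))
    return objects


def build_joins(foreign_keys):
    """Build join expressions from (fk_table, fk_col, pk_table, pk_col) tuples."""
    return [f"{ft}.{fc} = {pt}.{pc}" for ft, fc, pt, pc in foreign_keys]
```

Calling `build_objects` again with a longer column list corresponds to the third bullet: new warehouse columns simply yield additional objects.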

With a bridge to the metadata, you can extract information from an ETL tool to build the universe. Business Objects offers two vendor-specific bridges and released a new Universal Metadata Bridge in Q1 2003.

Informatica and IBM Metadata Bridges

Business Objects offers vendor-specific bridges to Informatica’s PowerMart and IBM’s Data Warehouse Manager for DB2. The BusinessObjects Metadata Bridge accesses metadata within each ETL tool’s repository. For Informatica, the Metadata Bridge communicates with the Informatica repository through an API, as shown in Figure 14-5. With DB2, IBM publishes its metadata in an Information Catalog that BI tools access through ODBC. The Metadata Bridge then allows a designer to use this information to build a new universe. In version 5.0 of the Informatica bridge, the universe object IDs (refer to Chapter 8, “Warning: Object IDs”) were not preserved when the bridge was used to update a universe; this has been corrected in version 5.1. The IBM bridge does not allow universe updates and can only be used for the initial universe build.

Figure 14-5: BusinessObjects Metadata Bridges allow metadata in two leading ETL tools to be shared with the BusinessObjects universe.

Universal Metadata Bridge

While the preceding approach works, it puts the burden on Business Objects to develop and maintain a bridge for each ETL tool. Meanwhile, the Common Warehouse Metamodel (CWM) has been gaining industry acceptance. CWM uses a number of standards to determine how metadata can be exchanged between different tools. CWMI, or XMI, specifies how metadata can be exchanged via XML. In Q1 2003, Business Objects released the Universal Metadata Bridge, which reads XMI formats. As more ETL vendors support CWM and XMI, designers can use the Universal Metadata Bridge to read metadata from ETL-generated XMI files and build universes. Following Business Objects’ acquisition of Acta, Data Integrator (the renamed and updated Acta ETL tool) will be the first ETL tool to leverage the Universal Metadata Bridge. Figure 14-6 shows how the new bridge works.

Figure 14-6: BusinessObjects Universal Metadata Bridge enables designers to build universes from any tool that supports XMI.
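To give a feel for what reading metadata from an XML file involves, here is a minimal Python sketch using the standard library parser. The XML layout in `SAMPLE` is a deliberately simplified, hypothetical stand-in: real XMI files use namespaces and the element names defined by the CWM specification. The `read_columns` helper is likewise an assumption for illustration, not part of the bridge.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified metadata file. Real XMI is considerably richer.
SAMPLE = """<Model>
  <Table name="SALES_FACT">
    <Column name="AMOUNT" description="Net sales amount"/>
    <Column name="CUST_ID" description="Customer key"/>
  </Table>
</Model>"""


def read_columns(xml_text):
    """Return a dict mapping TABLE.COLUMN to its description."""
    root = ET.fromstring(xml_text)
    columns = {}
    for table in root.iter("Table"):
        for column in table.iter("Column"):
            key = f"{table.get('name')}.{column.get('name')}"
            # A missing description attribute becomes an empty string.
            columns[key] = column.get("description", "")
    return columns
```

The resulting dictionary, keyed by TABLE.COLUMN, is the kind of structure a bridge could then map onto universe classes and objects.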

The Universal Metadata Bridge has several features that the Informatica- and IBM-specific bridges lack, such as batch updates and improved update handling. Previously, a designer could update a universe only interactively. With the new bridge, universe updates can be run in batch mode on a scheduled basis. In addition, designers could previously specify only whether or not to update object names and descriptions. With the Universal Metadata Bridge, a designer can specify how to update them, either to replace the descriptions or to combine existing object descriptions with new object descriptions from the ETL metadata. The bridge also allows you to compare the metadata from the ETL tool with the universe before updating the universe.
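The two description-update choices and the compare-before-update step can be sketched in Python as follows. This is a rough illustration under stated assumptions: the `merge_description` and `compare_metadata` functions, the mode names "replace" and "combine", and the dictionary layout keyed by TABLE.COLUMN are all hypothetical and do not reflect the bridge's real interface.

```python
def merge_description(existing, incoming, mode="replace"):
    """Return the new object description under the chosen update mode."""
    if mode == "replace":
        return incoming
    if mode == "combine":
        if existing == incoming:
            return existing  # avoid repeating identical text
        # Keep the existing text and append the ETL text, skipping empties.
        return "\n".join(part for part in (existing, incoming) if part)
    raise ValueError(f"unknown mode: {mode}")


def compare_metadata(universe, etl):
    """Report differences between two dicts of TABLE.COLUMN -> description."""
    return {
        "added": sorted(etl.keys() - universe.keys()),
        "missing_from_etl": sorted(universe.keys() - etl.keys()),
        "changed": sorted(
            key for key in etl.keys() & universe.keys()
            if etl[key] != universe[key]
        ),
    }
```

Reviewing the output of a comparison like `compare_metadata` first lets a designer see what an update would touch before any object is changed.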


With the Universal Metadata Bridge, BusinessObjects creates a reference file to specify how each TABLE.COLUMN in the ETL import file is associated with a universe class and object. As a designer updates the universe with a new XMI file, the Bridge compares information from the XMI file with information in the reference file to correctly update existing objects. This helps preserve the unique OBJECT_IDS and ensure that user reports continue to work correctly.
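The role of the reference file can be sketched in Python. The sketch assumes the reference file is simply a mapping from TABLE.COLUMN to an object ID. The `apply_update` function and the dictionary layouts are hypothetical, and the real file format is not described here.

```python
def apply_update(universe, reference, incoming, next_id):
    """Update a universe in place while keeping existing object IDs.

    universe:  dict of object_id -> {"class", "name", "description"}
    reference: dict of "TABLE.COLUMN" -> object_id (stand-in for the reference file)
    incoming:  dict of "TABLE.COLUMN" -> {"class", "name", "description"}
    next_id:   first unused object ID
    Returns the next unused object ID after the update.
    """
    for key, meta in incoming.items():
        if key in reference:
            # Known column: change the object's properties, keep its ID.
            universe[reference[key]].update(meta)
        else:
            # New column: create a new object and record it in the reference.
            universe[next_id] = dict(meta)
            reference[key] = next_id
            next_id += 1
    return next_id
```

Because the lookup is by TABLE.COLUMN rather than by object name, an object renamed in the ETL metadata keeps its ID, which is what lets existing reports continue to resolve it.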

For companies that use Informatica as an ETL tool, designers can choose between the Informatica Bridge, which communicates with the Informatica repository via the API, and the new Universal Metadata Bridge, which works with an XMI file. With the former bridge, users can import the Business Name from the Informatica repository to be used as the object name in the universe. Unfortunately, the Business Name is not included in Informatica’s XMI file because it is not part of the standard CWM specification. So while designers would benefit from the Universal Metadata Bridge's batch scheduling and improved update handling, the lack of a Business Name will limit its usefulness.

Not surprisingly, the integration between Business Objects’ Data Integrator and the Universal Metadata Bridge is tighter and does include the Business Name. Here, Data Integrator has its own XML format, whereas Informatica uses the CWM format.






Business Objects(c) The Complete Reference