Meta Data Silos



Data administrators have tried to inventory, define, and organize meta data since the early 1980s. Most data administrators used generic data dictionary products (meta data repositories used to be called data dictionaries); only a few tried to design and build their own. Some of the generic data dictionary products were rather sophisticated and expandable, and they could store most of the required meta data components. However, a multitude of problems was associated with these early efforts.

  • Populating these early data dictionaries required a manual effort, which was time consuming and tedious, as all manual efforts are.

  • The lack of technical skills on the part of most data administrators prevented them from expanding the data dictionary products with custom features to make them more useful.

  • The reporting capabilities of the early data dictionary products were less than desirable. Some products did not even have application programming interface (API) capabilities that would allow data administrators to generate customized reports.

  • The immature technologies used in most early data dictionaries (which were mainframe products) did not provide automated interfaces, easy-to-use graphical user interface (GUI) displays, or context-sensitive help functions.

  • The lack of standards (or the lack of enforcement of standards) created an insurmountable burden for the data administrators who had to resolve conflicting and inconsistent data names, data definitions, and data domains.

  • A lack of management appreciation for the value of meta data made it a low priority in most organizations. Business managers and business executives, as well as some information technology (IT) managers, viewed meta data as systems documentation, which they considered important but could live without.

  • Cross-organizational initiatives did not exist; there were only departmental initiatives, usually spearheaded by data administrators in IT. Therefore, many business managers and business executives did not understand the value of the effort and did not buy into it. The popularity of data warehouse initiatives in the 1990s helped raise the understanding of the value of cross-organizational initiatives.

Because of these problems, data administration efforts to manage meta data were only marginally successful in the past. On many projects, these efforts were even considered project obstructions because of the extra time it took to define and capture the meta data when the technicians were eager to rush into coding. IT managers and business managers often asked, "Why aren't we coding yet?" Obviously, they perceived writing programs as the only productive project development activity.

Sources of Meta Data

It was not until the advent of cross-organizational BI initiatives and the associated plethora of BI tools that meta data started to receive its proper recognition. People began to realize that these BI tools, with their own sets of meta data in their own proprietary dictionary databases, were creating the all-too-familiar problems of redundancy and inconsistency, only this time with meta data. Knowledge workers, business analysts, managers, and technicians were getting very frustrated with the tangled web of meta data silos (Figure 10.1).

Figure 10.1. Meta Data Silos


Meta data cannot be avoided, especially technical meta data, because database management systems (DBMSs) and most tools do not function without it. It is their "language." For example, meta data instructs the DBMS what type of database structures to create, tells the ETL tool what data to transform, and lets the OLAP tool know how to aggregate and summarize the data. Different meta data components are stored in different tools, and none of the tools (except a meta data repository) is designed to store all the other meta data components from all the other tools. For example:

  • CASE tools store the business meta data for the logical data model components and the technical meta data for the physical data model (logical database design) components.

  • DBMS dictionaries store the technical meta data for the database structure, such as databases, tables, columns, indices, and so on.

  • ETL tools store the technical meta data about source-to-target data mappings and the transformation specifications, which are used by these tools to perform their ETL processes.

  • Data-cleansing tools store the business meta data for data domains and for business rules that allow these tools to identify data quality problems. They also store the cleansing specifications, which are used by these tools to perform their data-cleansing functions.

  • OLAP tools store the technical meta data of the tables and columns in the BI target databases, the report definitions, and the algorithms for deriving, aggregating, summarizing, and in other ways manipulating BI data.

  • Data mining tools store the technical meta data about the various analytical models and the algorithms for the data mining operations.
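The DBMS dictionary item in the list above can be seen directly in almost any database engine. The following is a minimal sketch, assuming SQLite purely for illustration (the text names no specific DBMS); the table, column, and index names are invented for the example. It creates a small structure and then reads the technical meta data back from the engine's own catalog.

```python
import sqlite3


def describe_table(conn, table):
    """Return (column name, declared type) pairs from SQLite's catalog."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]


conn = sqlite3.connect(":memory:")

# Hypothetical BI target table and index, invented for this sketch.
conn.execute(
    "CREATE TABLE customer ("
    "cust_id INTEGER PRIMARY KEY, cust_name TEXT, region_cd TEXT)"
)
conn.execute("CREATE INDEX idx_customer_region ON customer (region_cd)")

# Technical meta data about the columns of one table.
columns = describe_table(conn, "customer")

# sqlite_master is SQLite's catalog of tables and indices.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

print(columns)
print(objects)
```

Other DBMS products expose the same kind of information through their own catalog views, each in its own format, which is one reason the meta data ends up isolated inside each tool.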

As with vendors of other relatively new software and middleware product lines, the vendors of meta data repository products are competing for dominance, which slows down standardization of the product line. As a result, organizations end up with a tangled web of disparate and distributed meta data scattered across the proprietary dictionaries of their tools. To manage this situation, they now have to extract, merge, and accurately integrate meta data from these tool dictionaries, which can be as much of a challenge as extracting, merging, and accurately integrating business data from the operational systems.
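The extract, merge, and integrate step described above can be sketched in a few lines. This is an illustration only, not a description of how any repository product works: the three dictionaries, the data element name, and the attribute values are all hypothetical, and real tool dictionaries would first need tool-specific extraction. The sketch merges the per-tool records and flags every attribute on which the tools disagree.

```python
from collections import defaultdict

# Hypothetical extracts from three tool dictionaries (all values invented).
case_tool = {
    "customer_name": {
        "type": "VARCHAR(40)",
        "definition": "Legal name of the customer",
    }
}
dbms_catalog = {"customer_name": {"type": "CHAR(30)"}}
etl_tool = {"customer_name": {"type": "VARCHAR(40)", "source": "CRM.CUST.NM"}}


def merge_meta_data(sources):
    """Merge per-tool meta data and collect attributes whose values conflict.

    sources maps a tool name to {data element: {attribute: value}}.
    Returns (merged, conflicts), where merged[element][attribute] maps each
    tool to the value it holds, and conflicts lists the disagreements.
    """
    merged = defaultdict(lambda: defaultdict(dict))
    for tool, dictionary in sources.items():
        for element, attributes in dictionary.items():
            for attribute, value in attributes.items():
                merged[element][attribute][tool] = value

    conflicts = {}
    for element, attributes in merged.items():
        for attribute, by_tool in attributes.items():
            # More than one distinct value means the tools are inconsistent.
            if len(set(by_tool.values())) > 1:
                conflicts[(element, attribute)] = dict(by_tool)
    return merged, conflicts


merged, conflicts = merge_meta_data(
    {"case_tool": case_tool, "dbms": dbms_catalog, "etl_tool": etl_tool}
)
for (element, attribute), by_tool in conflicts.items():
    print(f"{element}.{attribute} is inconsistent: {by_tool}")
```

Detecting the conflict is the easy part. Deciding which tool holds the correct value still requires agreed standards, which is the same difficulty the early data dictionary efforts ran into.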



Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications
ISBN: 0201784203
Year: 2003
Pages: 202
