Meta Data Silos

Data administrators have tried to inventory, define, and organize meta data since the early 1980s. Most data administrators used generic data dictionary products (meta data repositories used to be called data dictionaries); only a few tried to design and build their own. Some of the generic data dictionary products were rather sophisticated and expandable, and they could store most of the required meta data components. However, there were a multitude of problems associated with these early efforts.
Because of these problems, data administration efforts to manage meta data were only marginally successful in the past. On many projects, these efforts were even considered project obstructions because of the extra time it took to define and capture the meta data when the technicians were eager to rush into coding. IT managers and business managers often asked, "Why aren't we coding yet?" Obviously, they perceived writing programs as the only productive project development activity.

Sources of Meta Data

It was not until the advent of cross-organizational BI initiatives and the associated plethora of BI tools that meta data started to receive its proper recognition. People began to realize that these BI tools, with their own sets of meta data in their own proprietary dictionary databases, were creating the all-too-familiar problems of redundancy and inconsistency, only this time with meta data. Knowledge workers, business analysts, managers, and technicians were getting very frustrated with the tangled web of meta data silos (Figure 10.1).

Figure 10.1. Meta Data Silos
Meta data cannot be avoided, especially technical meta data, because database management systems (DBMSs) and most tools do not function without it. It is their "language." For example, meta data instructs the DBMS what type of database structures to create, tells the ETL tool what data to transform, and lets the OLAP tool know how to aggregate and summarize the data. Different meta data components are stored in different tools, and none of the tools (except a meta data repository) is designed to store all the other meta data components from all the other tools. For example:
As with vendors of other relatively new software and middleware product lines, the vendors of meta data repository products are competing for dominance, which slows down standardization of the product line. As a result, organizations end up with a tangled web of disparate and distributed meta data scattered across the proprietary dictionaries of their tools. To manage this situation, they now have to extract, merge, and accurately integrate meta data from these tool dictionaries, which can be as much of a challenge as extracting, merging, and accurately integrating business data from the operational systems.
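To make the integration problem concrete, the following is a minimal sketch of merging meta data extracted from several tool dictionaries and flagging the inconsistencies. Everything in it is an assumption made for illustration: the tool names (`dbms_catalog`, `etl_tool`, `olap_tool`), the data elements, the attribute names, and the functions `integrate` and `find_conflicts` are hypothetical and do not correspond to any particular product.

```python
# Hypothetical extracts from three proprietary tool dictionaries.
# Each tool describes only the meta data components it needs.
TOOL_DICTIONARIES = {
    "dbms_catalog": {
        "customer_id": {"data_type": "INTEGER"},
        "order_total": {"data_type": "DECIMAL(10,2)"},
    },
    "etl_tool": {
        "customer_id": {"data_type": "VARCHAR(10)", "source": "CRM.CUST_NO"},
    },
    "olap_tool": {
        "order_total": {"data_type": "DECIMAL(10,2)", "aggregation": "SUM"},
    },
}


def integrate(dictionaries):
    """Merge per-tool meta data into one view keyed by data element.

    The result maps element -> attribute -> {tool: value}, so the
    origin of every value stays visible after the merge.
    """
    merged = {}
    for tool, elements in dictionaries.items():
        for element, attributes in elements.items():
            for attribute, value in attributes.items():
                merged.setdefault(element, {}).setdefault(attribute, {})[tool] = value
    return merged


def find_conflicts(merged):
    """Return (element, attribute, values_by_tool) for every attribute
    on which two or more tools disagree."""
    conflicts = []
    for element, attributes in merged.items():
        for attribute, by_tool in attributes.items():
            if len(set(by_tool.values())) > 1:
                conflicts.append((element, attribute, by_tool))
    return conflicts
```

Calling `find_conflicts(integrate(TOOL_DICTIONARIES))` on this sample data reports that the two tools describing `customer_id` disagree on its `data_type`. In a real environment the extraction step is the hard part, since each tool stores its meta data in its own proprietary format; the sketch assumes that step has already produced plain dictionaries.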