Metadata Standards

While projects to build an enterprise repository are often less than successful, the effort continues because there are at least four good reasons to have a standard, shared repository for metadata.

  • When tools can exchange metadata, you can reuse existing metadata to help define each new step in the implementation process. Column names and descriptions captured in the data model design step can be used to build the relational tables, reused to populate the OLAP engine, and then used again to help populate the front-end tools metadata layer, for example.

  • If the metadata is in a standard form, your investment in defining objects and processes is protectedyou are not locked in to a particular tool. For example, Reporting Services Report Definition Language (RDL) is a language standard that describes a report. If all your reports are stored in RDL, you could potentially switch to a new front-end tool and still be able to use your existing report library. (This benefit is not particularly popular with tool vendors .)

  • A central repository gives you a single, common understanding of the contents and structure of the data warehouseit is the best documentation you could have. And, to the extent that this shared repository holds the official, active metadata for each tool, you know it is the current version and not a copy that may be out of date.

  • An integrated metadata repository allows you to more easily assess changes through impact and lineage analysis. The two concepts are essentially like looking down the same pipe from either end. In impact analysis , you want to know what downstream elements will be affected as a result of a potential change, like dropping a table. In lineage analysis , you want to know how a certain element came into beingwhat sources and transformations it went through to get to where it is.

Youd like to have a standard repository for metadataat least in the DW/BI environment. The tools would write their metadata to the repository during the design phase and read it back in during the execution phase. In theory, any tool can write to and read from the standard model.

As it turns out, there is a standard framework for data warehouse-oriented metadata called the Common Warehouse Metamodel (CWM) that was first published by the Object Management Group (OMG) in 2001. The CWM is an object-based attempt to provide a metadata store for the various elements of a data warehouse. As we understand it, the overall standard is composed of three parts :

  • The Common Warehouse Metamodel (CWM) is a domain model (or metamodel, as OMG calls it) of the data warehouse. It has over a dozen subdomains (or sub-metamodels), including ones to describe relational elements, transformations, OLAP, and front-end components .

  • The Meta Object Facility (MOF) is essentially the definition language used to define a specific instance of the CWM.

  • XML Metadata Interchange (XMI) provides the structure to support the interchange of metadata with a standard format based on XML.

We say as we understand it, because the CWM standards set is long, complex, and difficult to navigate for the average data warehouse professional. The sheer volume of information is daunting with over 1,200 pages of documentation, plus the XML document for XMI, and the interface definition language (IDL) for the MOF. Beyond that, the standard is managed by an object-oriented organization and is based on CORBA and UML. Theres a lot to unravel before one can begin to understand it. That said, the metadata manager on your team should spend some time figuring out the CWM and how it plays in your vendors responses to the metadata problem.

While the CWM was a significant attempt to help solve the problem of metadata proliferation , the current status of the CWM is unclear. Vendor adoption of standards is always slow because each individual tool vendor has a vested interest in keeping its metadata proprietary. In fairness, some vendors believe the CWM is not flexible or comprehensive enough to support their metadata needs. Even so, many ETL and front-end tool vendors claim some level of support for CWM, but what this means is uncertain . For example, it is possible for a tool to write its metadata out to extended, custom structures in the CWMstructures that are essentially unreadable to other tools. This obviously does not help in the exchange or reuse of metadata, even if it does allow the vendor to claim CWM compliance. Further uncertainty comes from the fact that as of the time of this writing, there has been very little recent activity from the OMG on the CWM standard. There are no white papers on the topic on their web site. Version 1.1 of the standard came out in March of 2003, and the most recent press and articles listed on the main CWM page are from the year 2000. While the concept is valuable , it is too soon to celebrate the success of the CWM.

image from book
IN THEIR OWN WORDS

Perhaps its best to let the OMG describe the CWM to help clarify its purpose and contents:

  • Since every data management and analysis tool requires different metadata and a different metadata model (known as a metamodel) to solve the data warehouse metadata problem, it is simply not possible to have a single metadata repository that implements a single metamodel for all the metadata in an organization. Instead, what is needed is a standard for interchange of warehouse metadata.

    The CWM is a response to these needs. It provides a framework for representing metadata about data sources, data targets, transformations, and analysis, and the processes and operations that create and manage warehouse data and provide lineage information about its use.

    • Source: Common Warehouse Metamodel (CWM) Specification Version 1.1, Volume 1, March 2003

      http://www.omg.org/docs/formal/03-03-02.pdf

  • The Common Warehouse Metamodel (CWM) is a specification that describes metadata interchange among data warehousing, business intelligence, knowledge management and portal technologies. The OMG Meta-Object Facility (MOF) bridges the gap between dissimilar meta-models by providing a common basis for meta-models. If two different meta-models are both MOF-conformant, then models based on them can reside in the same repository.

    • Source: the Data Warehouse, CWM(tm), And MOF(tm) Resource Page http://www.omg.org/technology/cwm/

image from book
 
image from book
METADATA, META DATA, OR META-DATA?

Unfortunately, the state of metadata standards in our industry is reflected in the number of different ways we present the term metadata. The OMG uses a single word, metadata ; the Data Warehousing Institute uses two words, meta data ; and some publications have recently started using a hyphenated version, meta-data . The fact that we have no standard for the word metadata cannot bode well for the actual standard itself.

image from book
 

In any case, all this talk about standards is definitely a future-oriented discussion for us because SQL Server 2005 does not support the CWM standard.



Microsoft Data Warehouse Toolkit. With SQL Server 2005 and the Microsoft Business Intelligence Toolset
The MicrosoftВ Data Warehouse Toolkit: With SQL ServerВ 2005 and the MicrosoftВ Business Intelligence Toolset
ISBN: B000YIVXC2
EAN: N/A
Year: 2006
Pages: 125

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net