Top-Down Logical Data Modeling


The most effective technique for discovering and documenting the single cross-organizationally integrated and reconciled view of business data is entity-relationship (E-R) modeling, also known as logical data modeling. A popular approach to E-R modeling in the early 1980s was to model all the data for the entire organization all at once. While this approach was a worthwhile architectural endeavor, it did not yield better systems because the process was not integrated with the systems development lifecycle. A more effective approach is to incorporate E-R modeling into every project and then merge the project-specific logical data models into one consolidated enterprise data model over time.

Project-Specific Logical Data Model

E-R modeling is based on normalization rules, which are applied during top-down data modeling as well as during bottom-up source data analysis. Using normalization rules along with other data administration principles assures that each data element within the scope of the BI project is uniquely identified, correctly named, and properly defined and that its domain is validated for all business people who will be accessing the data. Thus, the normalized project-specific logical data model yields a formal representation of the data exactly as it exists in the real world, without redundancy and without ambiguity.

This formal representation of data follows another normalization rule: process independence. Therefore, by definition, a logical data model, which is based on normalization rules, is also process independent. Process independence means that the structure and content of the logical data model are not influenced by any type of database, access path, design, program, tool, or hardware, as shown by the X markings in Figure 5.2.

Figure 5.2. Process Independence of Logical Data Models

Because of its process independence, a logical data model is a business view, not a database view and not an application view. Therefore, a unique piece of data, which exists only once in the real business world, also exists only once in a logical data model even though it may be physically stored in multiple source files or multiple BI target databases.

Enterprise Logical Data Model

It is the responsibility of an enterprise architecture group, or of data administration if the organization does not have an enterprise architecture group, to merge the project-specific logical data models into an integrated and standardized enterprise logical data model, as illustrated in Figure 5.3.

Figure 5.3. Creating an Enterprise Logical Data Model

This enterprise logical data model, also known as the enterprise information architecture, is not constructed all at once, nor is it a prerequisite for BI projects to have a completed one. Instead, the enterprise logical data model evolves over time and may never be completed. It does not need to be completed because the objective of this process is not to produce a finished model but to discover and resolve data discrepancies among different views and implementations of the same data.
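The incremental merge described above can be illustrated with a minimal Python sketch. It is not a method prescribed by the text: the `ElementSpec` record, the `merge_models` function, and the sample sales and billing models are all hypothetical, and a real data administration tool would compare far more than a definition, a type, and a length. The point it shows is that merging a project model adds new data elements and surfaces conflicting ones as discrepancies instead of silently overwriting them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementSpec:
    """Hypothetical, simplified description of one data element."""
    definition: str
    data_type: str
    length: int


def merge_models(enterprise, project_name, project_model):
    """Merge one project-specific model into the enterprise model.

    New elements are added. An element that already exists with a
    different spec is NOT overwritten; it is returned as a discrepancy
    so that the data owners can resolve it.
    """
    discrepancies = []
    for name, spec in project_model.items():
        existing = enterprise.get(name)
        if existing is None:
            enterprise[name] = spec
        elif existing != spec:
            discrepancies.append((name, project_name, existing, spec))
    return discrepancies


# Illustrative sample data only (assumed, not taken from the text).
enterprise_model = {}
sales_model = {
    "Customer Name": ElementSpec("Legal name of a customer.", "character", 60),
    "Account Open Date": ElementSpec("Date the account was opened.", "date", 10),
}
billing_model = {
    "Customer Name": ElementSpec("Legal name of a customer.", "character", 40),
    "Invoice Amount": ElementSpec("Total amount billed on an invoice.", "decimal", 11),
}

found = merge_models(enterprise_model, "sales", sales_model)
found += merge_models(enterprise_model, "billing", billing_model)

for name, project, existing, incoming in found:
    print(f"{name}: the {project} model specifies length {incoming.length}, "
          f"the enterprise model specifies length {existing.length}")
```

In this sketch the enterprise model simply grows with each project and is never declared finished; the list of discrepancies is the useful output.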

These data discrepancies exist en masse among stovepipe operational systems and are the root causes of an organization's inability to provide integrated and consistent cross-organizational information to its business people. The discovery of these discrepancies should be embraced and celebrated by the BI project team, and especially by the business people, because poor-quality data is finally being addressed and resolved. Gaining control over the existing data chaos is, after all, one major function of any BI decision-support initiative.

If organizations followed business analysis best practices by developing logical data models for all their operational applications and merging them (over time) into an enterprise logical data model, the BI decision-support development effort could be significantly reduced. This would enable BI project teams to deliver reliable decision-support information to the business people faster. In other words, the BI project teams could deliver the "quick hits" that everyone wants, and deliver them with quality.

Logical Data Modeling Participants

Logical data modeling sessions are typically facilitated and led by a data administrator who has a solid business background. If the data administrator does not have a good understanding of the business, a subject matter expert must assist him or her in this task.

The business representative and the subject matter expert assigned to the BI project are active participants during the modeling sessions. If the data is being extracted from several different operational systems, multiple data owners may have to participate on the BI project because each operational system may be under the governance of a different owner. Data owners are those business individuals who have authority to establish business rules and set business policies for those pieces of data originated by their departments. When data discrepancies are discovered, it is the data owners' responsibility to sort out the various business views and to approve the legitimate usage of their data. This data reconciliation process is and should be a business function, not an information technology (IT) function, although the data administrators, who usually work for IT, facilitate the discovery process.

Systems analysts, developers, and database administrators should also be available to participate in some of the modeling sessions on an as-needed basis. These IT technicians maintain the organization's applications and data structures, and they often know more than anyone else about the data: how and where it is stored, how it is processed, and ultimately how it is used by the business people. In addition, these technicians often have in-depth knowledge of the accuracy of the data, how it relates to other data, the history of its use, and how the content and meaning of the data have changed over time. It is important to obtain a commitment to the BI project from these IT resources since they are often busy "fighting fires" and working on enhancements to the operational systems.

Standardized Business Meta Data

A logical data model, representing a single cross-organizational business view of the data, is composed of an E-R diagram and supporting business meta data. Business meta data includes information about business data objects, their data elements, and the relationships among them. Business meta data as well as technical meta data, which is added during the design and construction stages, ensure data consistency and enhance the understanding and interpretation of the data in the BI decision-support environment. A common subset of business meta data components as they apply to data (as opposed to processes) appears in Figure 5.4.

  • A data name, an official label developed from a formal data-naming taxonomy, should be composed of a prime word, a class word, and qualifiers. Each data name uniquely identifies one piece of data within the logical data model. No synonyms and no homonyms should exist.

  • A data definition is a one- or two-sentence description of a data object or a data element, similar to a definition in a language dictionary. If a data object has many subtypes, each subtype should have its own unique data definition. A data definition explains the meaning of the data object or data element. It does not include who created the object, when it was last updated, what system originates it, what values it contains, and so on. That information is stored in other meta data components (e.g., data ownership, data content).

  • A data relationship is a business association among data occurrences in a business activity. Every data relationship is based on business rules and business policies for the associated data occurrences under each business activity.

  • A data identifier uniquely identifies an occurrence of a data object. A data identifier should be known to the business people. It should also be "minimal," which means it should be as short as possible (composed of just enough data elements to make it unique). In addition, a data identifier should be nonintelligent, with no embedded logic. For example, account numbers 0765587654 and 0765563927, where 0765 is an embedded branch number, would be poor data identifiers.

    A logical data identifier is not the same thing as a primary key in a database. Although a data identifier can be used as a primary key, it is often replaced by a surrogate ("made-up") key during database design.

  • Data type describes the structure of a data element, categorizing the type of values (character, number, decimal, date) allowed to be stored in it.

  • Data length specifies the size of a data element for its particular data type. For example, a decimal data element can be an amount field with two digits after the decimal point or a rate field with three digits after the decimal point.

  • Data content (domain) identifies the actual allowable values for a data element specific to its data type and data length. A domain may be expressed as a range of values, a list of allowable values, a generic business rule, or a dependency rule between two or more data elements.

  • A data rule is a constraint on a data object or a data element. A data constraint can also apply to a data relationship. A data constraint can be in the form of a business rule or a dependency rule between data objects or data elements, for example, "The ceiling interest rate must be higher than the floor interest rate."

  • Data policy governs the content and behavior of a data object or a data element. It is usually expressed as an organizational policy or a government regulation. For example, "Patients on Medicare must be at least 65 years old."

  • Data ownership identifies the persons who have the authority to establish and approve the business meta data for the data objects and data elements under their control.
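To make the point about nonintelligent data identifiers concrete, here is a small Python sketch. It reuses the two account numbers from the example above; the sequential `account_id`, the `new_account` helper, and the second branch number `0812` are assumptions made for illustration, not part of the text. The sketch contrasts an identifier that hides a branch number in its first four digits with one that carries no meaning and records the branch as an ordinary data element.

```python
import itertools

# Account numbers from the example in the text: 0765 is an embedded branch number.
intelligent_ids = ["0765587654", "0765563927"]


def embedded_branch(account_number):
    """Return the branch number hidden inside an 'intelligent' identifier."""
    return account_number[:4]


# Nonintelligent alternative (assumed design): the identifier is a meaningless
# sequence number, and the branch is stored as a separate data element.
_next_id = itertools.count(1)


def new_account(branch_number):
    """Create an account record whose identifier has no embedded logic."""
    return {"account_id": next(_next_id), "branch_number": branch_number}


account = new_account("0765")
# Moving the account to another (hypothetical) branch changes one data
# element and leaves the identifier untouched.
account["branch_number"] = "0812"

print(embedded_branch(intelligent_ids[0]))
print(account)
```

With the embedded form, the same move would either change the account number or leave it stating a branch the account no longer belongs to, which is why such identifiers are considered poor.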

Figure 5.4. Data-Specific Business Meta Data Components

Although logical data models are extremely stable, some of these business meta data components (such as data content, data rules, data policy, data ownership) occasionally change for legitimate reasons. It is important to track these changes in the meta data repository.
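As a rough illustration of how several of these components fit together, the following Python sketch records a data name, definition, type, length, domain, and owner for two rate elements and checks the data rule quoted earlier. Everything about the representation is an assumption: the `DataElement` class, the three-decimal scale (borrowed from the rate-field example), the 0 to 100 domain, and the owner name are hypothetical and do not describe any particular meta data repository.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass
class DataElement:
    """Hypothetical record holding a few business meta data components."""
    name: str                           # data name
    definition: str                     # data definition
    data_type: str                      # data type
    scale: int                          # data length: digits after the decimal point
    domain: Callable[[Decimal], bool]   # data content (domain)
    owner: str                          # data ownership

    def accepts(self, value: Decimal) -> bool:
        """Check a value against the data length and the domain."""
        decimal_digits = max(0, -value.as_tuple().exponent)
        return decimal_digits <= self.scale and self.domain(value)


def make_rate(name, definition):
    """Build a rate element: three decimal digits, assumed domain of 0 to 100."""
    return DataElement(
        name=name,
        definition=definition,
        data_type="decimal",
        scale=3,
        domain=lambda v: Decimal("0") <= v <= Decimal("100"),
        owner="Lending department",  # hypothetical data owner
    )


floor_rate = make_rate("Floor Interest Rate", "Lowest rate an account may be charged.")
ceiling_rate = make_rate("Ceiling Interest Rate", "Highest rate an account may be charged.")


def ceiling_above_floor(ceiling_value, floor_value):
    """Data rule from the text: the ceiling rate must be higher than the floor rate."""
    return ceiling_value > floor_value


print(floor_rate.accepts(Decimal("2.125")))    # fits the length and the domain
print(floor_rate.accepts(Decimal("2.1255")))   # one decimal digit too many
print(ceiling_above_floor(Decimal("9.500"), Decimal("2.125")))
```

Keeping the domain and the rule as separate checks mirrors the distinction in the list: the domain constrains a single data element, while the data rule expresses a dependency between two of them.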



Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications
ISBN: 0201784203
Year: 2003
Pages: 202
