Data Analysis Activities


The activities for data analysis do not need to be performed linearly. Figure 5.8 indicates which activities can be performed concurrently. The list below briefly describes the activities associated with Step 5, Data Analysis.

  1. Analyze the external data sources.

    In addition to requiring internal operational source data, many BI applications need data from external sources. Merging external data with internal data presents its own set of challenges. External data is often dirty and incomplete, and it usually does not follow the same format or key structure as internal data. Identify and resolve these differences during this step.

  2. Refine the logical data model.

    A high-level, project-specific logical data model should have been created during one of the previous steps. In addition, some or all of the internal and external data may have been modeled on other projects and may already be part of the enterprise logical data model. In that case, extract the representative portion of the enterprise logical data model and expand it with the new data objects, new data relationships, and new data elements. If the required data has not been previously modeled, create a new logical data model for the scope of this BI project. It should include all internal as well as external data elements.

  3. Analyze the source data quality.

    At the same time that the logical data model is created or expanded, the quality of the internal and external source files and source databases must be analyzed in detail. It is quite common that existing operational data does not conform to the stated business rules and business policies. Many data elements are used for multiple purposes or are simply left blank. Identify all these discrepancies and incorporate them into the logical data model.

  4. Expand the enterprise logical data model.

    Once the project-specific logical data model is relatively stable, merge it back into the enterprise logical data model. During this merge process, additional data discrepancies or inconsistencies may be identified. Those will be sent back to the BI project for resolution.

  5. Resolve data discrepancies.

    Occasionally, data discrepancies discovered during data analysis involve business representatives from other projects. In that case, summon those business representatives as well as the data owners to work out their differences. Either they will discover a new legitimate subtype of a data object or a new data element, which must be modeled as such, or they will have to resolve and standardize the inconsistencies.

  6. Write the data-cleansing specifications.

    Once all data problems are identified and modeled, write the specifications for how to cleanse the data. These specifications should be in plain English so they can be validated by the data owner and by business people who will use the data.
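
As an illustration of activity 3, the following is a minimal sketch of how source data quality might be profiled before the cleansing specifications are written. It is not taken from the book. The sample rows, the column names (customer_id, status, birth_date), the valid status domain, and the helper functions is_blank and profile_column are all hypothetical, chosen only to show blank values and business-rule violations being counted.

```python
from collections import Counter

# Hypothetical sample rows standing in for an operational source file.
SOURCE_ROWS = [
    {"customer_id": "C001", "status": "A", "birth_date": "1970-03-12"},
    {"customer_id": "C002", "status": "", "birth_date": "1985-11-02"},
    {"customer_id": "C003", "status": "X", "birth_date": "9999-99-99"},
    {"customer_id": "C004", "status": "I", "birth_date": ""},
]

# Hypothetical business rule: the only valid status codes.
VALID_STATUS = {"A", "I"}


def is_blank(value):
    """Treat None, empty strings, and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def profile_column(rows, column, valid_values=None):
    """Count blanks, distinct populated values, and domain violations."""
    values = [row.get(column) for row in rows]
    populated = [v for v in values if not is_blank(v)]
    violations = Counter()
    if valid_values is not None:
        # A populated value outside the stated domain violates the rule.
        violations = Counter(v for v in populated if v not in valid_values)
    return {
        "column": column,
        "rows": len(values),
        "blank": len(values) - len(populated),
        "distinct": len(set(populated)),
        "violations": violations,
    }


status_profile = profile_column(SOURCE_ROWS, "status", VALID_STATUS)
birth_profile = profile_column(SOURCE_ROWS, "birth_date")

for profile in (status_profile, birth_profile):
    print(profile)
```

Counts like these give the data owners something concrete to review. Each blank or out-of-domain value found this way is a candidate discrepancy to record against the logical data model and, later, a candidate rule for the data-cleansing specifications.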

Figure 5.8. Data Analysis Activities




Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications
ISBN: 0201784203
Year: 2003
Pages: 202
