Database Design Activities


The activities for database design do not need to be performed linearly. Figure 8.4 indicates which activities can be performed concurrently. The list below briefly describes the activities associated with Step 8, Database Design.

  1. Review the data access requirements

    The database administrator must review the data access and analysis requirements ( reports , queries), which were analyzed and finalized during Step 6, Application Prototyping. He or she also has to review the prototyping results with the application lead developer to help determine the most appropriate design schema for the BI target databases.

  2. Determine the aggregation and summarization requirements.

    Before committing to the final design schema for the BI target databases, the database administrator needs to finalize the data aggregation and summarization requirements with the business representative and the application lead developer. Pay close attention to aggregation and summarization explosion and to data explosion in general. Business people often ask for data "just in case" they will need it some day, and then they rarely use it, if ever.

  3. Design the BI target databases.

    The widespread claims that all BI applications are only about multidimensional analysis and multidimensional reporting are not true! For example, some financial analysts (statisticians) reporting to the CFO or the CEO will emphatically state their requirements similar to this: "I need to be able to ask any question of any detailed data in any way. Don't try to box me into any predetermined reporting patterns. I have none!" These analysts need total ad hoc flexibility against historical detailed data and are always willing to give up performance, even if it means that their queries will run for hours or overnight. Although these types of analysts are definitely in the minority, they do exist, and you must take their data access requirements into consideration. Therefore, while the designs of most of your BI target databases will be based on a multidimensional schema, some will be based on an entity-relationship schema. Database designs are documented as physical data models.

    The data access requirements and the data aggregation and summarization requirements will determine the most appropriate database design. If there are obvious reporting patterns or if the requirements ask for slice and dice analysis capabilities, then the most appropriate database design is a multidimensional one. If there are no reporting requirements and if the business analysts insist that they need ad hoc access to their detail data, then the most appropriate design is the entity-relationship design, which is more normalized with few or no aggregations or summaries.

    These are not the only two design schemas applicable for BI target databases. For some types of access and analysis requirements, a hybrid design may be the most appropriate.

  4. Design the physical database structures.

    Clustering, partitioning, indexing, and appropriately placing the datasets are the four most important characteristics of physical database design. The database administrator should cluster the most frequently used tables in order to reduce the disk arm movement. He or she must also determine where to place the datasets and how to partition tables across multiple disks. Finally, he or she has to select an index strategy.

  5. Build the BI target databases.

    The physical databases are built when the data definition language (DDL) is run against the DBMS. The database administrator uses the DDL to describe the database structures (e.g., storage groups, database partitions) to the DBMS.

    Database security is established when the data control language (DCL) is run against the DBMS. In standard relational databases, security is imposed at the table or view level. Because of the dimensional nature of BI target databases, the capability to drill down into detail data, sometimes across databases, presents an often-overlooked security risk.

    Grant database authority either to individuals or to groups into which individuals have been assigned. Managing security on an individual level can quickly become a maintenance nightmare, which is why most organizations prefer to set up group identifiers (group IDs). Each group ID is granted some form of create, read, update, delete (CRUD) access to the tables. An audit trail can then show which specific " user ID" under which group ID accessed the database. If there is a breach of security, the "infiltrator" can often be located through this audit trail.

  6. Develop database maintenance procedures.

    Once the database goes into production, it will be important to set aside time for taking database backups or reorganizing fragmented tables. Therefore, establish procedures to address database maintenance functions.

  7. Prepare to monitor and tune the database designs.

    Once the BI application is implemented, the BI target databases have to be monitored and tuned . The best database design does not guarantee continued good performance, partly because tables become fragmented and partly because actual usage of the BI target databases changes over time. Monitor performance of queries at runtime with a performance-monitoring utility that has diagnostic capabilities. It does not help to know that performance has degraded without knowing the causes. Diagnosing performance problems is usually much harder than discovering them.

  8. Prepare to monitor and tune the query designs.

    Since performance is such a challenge on BI applications, you must explore all tricks of the trade to address this problem. Parallel query execution is one of those tricks that could boost query performance.

Figure 8.4. Database Design Activities

graphics/08fig04.gif



Business Intelligence Roadmap
Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications
ISBN: 0201784203
EAN: 2147483647
Year: 2003
Pages: 202

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net