Because of the differences in intent and purpose between operational systems and BI applications, different database design techniques have been devised for BI target databases. These highly denormalized designs store aggregated and summarized data in a multidimensional fashion. Logical database designs are documented as physical data models with technical meta data. Aggregation and summarization are probably the most significant contributors to good BI application performance. If most business analysts need to see their data summarized, these totals should be precalculated and stored for quick retrieval. It is important to discuss the level of granularity with the business representative, as well as with other business analysts who will be using the BI target databases, since they will expect the database design to allow them to drill down to a certain level of detail. Multidimensional database designs support the quick retrieval of a wide range of data. Two popular multidimensional design techniques are the star schema and the snowflake schema, both described below. The Star SchemaIn a star schema, data is represented as an array of precalculated values, called facts, around which analysis is performed. These precalculated facts represent atomic operational data values that have been presummarized by certain dimensions, such as customer, product, and time. A dimension in a star schema is similar to an entity in a logical data model: it is a business object about which data is collected for business purposes. The star schema mirrors the view of a business query. As the name implies, the star schema has a single object in the middle, called the fact table, which is connected in a radial fashion to a number of objects, called dimension tables. Figure 8.2 presents an example of a star schema. Figure 8.2. Star Schema A star schema has two, and only two, levels: the fact table and a series of single-level dimension tables. Fact tables have the following characteristics:
Most multidimensional DBMSs effectively deal with the optimization of large multi-table JOINs. One method for determining whether the DBMS is resolving the query efficiently is to look at the optimized plan for the query. For example:
In either case, verify that your DBMS is executing multidimensional queries in the most efficient manner since your performance depends on it. The star schema is the most popular database design schema for BI applications for a variety of reasons.
The preceding questions are typical drill-down questions (asking for more detailed data) and typical roll-up questions (asking for more summarized data). The Snowflake SchemaA snowflake schema is a variation of a star schema, except in a snowflake the points of the star radiate into more points, as shown in Figure 8.3. Figure 8.3. Snowflake Schema In snowflake schemas, the levels of the hierarchies in the dimension tables are normalized, thereby increasing the number of tables. Table 8.2 lists the advantages and disadvantages of snowflake schemas. Table 8.2. Advantages and Disadvantages of Snowflake Schemas
![]() |