This chapter looked at some of the key Business Intelligence concepts and techniques. We showed that OLTP systems typically have a normalized database structure optimized for updates rather than queries. Trying to provide access to this data is difficult because the complex schemas of OLTP databases makes them difficult for end users to work with, even when simplifying views are provided; furthermore, there are usually multiple data sources in an organization that need to be accessed together.
We propose building a data warehouse relational database with a design and an operational approach optimized for queries. Data from all the source systems is loaded into the warehouse through a process of extraction, transformation, and loading (ETL). The data warehouse uses a dimensional model, where related numeric measures are grouped into fact tables, and descriptive attributes are grouped into dimension tables that can be used to analyze the facts.
A separate OLAP database that stores and presents information in a multidimensional format is built on top of the data warehouse. An OLAP cube includes precalculated summaries called aggregates that are created when the data is loaded from the data warehouse and that can radically improve the response times for many queries.
We also looked at some of the key concepts in dimensional modeling. A hierarchy is a set of attributes grouped together to provide a drilldown path for users. In a snowflake dimension design, the dimension is stored as several separate related tables, and we often recommend taking this approach when it will improve the performance or maintainability of the data-loading process. We recommend using surrogate keys for every dimension table, which are generated integer values that have no meaning outside the data warehouse.
We also covered some of the potential pitfalls of BI projects. Some of the key areas to focus on are making sure the business is involved and using an iterative approach that actually delivers value along the way. We recommend that BI project teams pay careful attention to issues of data quality.