What Is OLAP?


To understand how OLAP data and Analysis Services work, you need to understand a little bit of what happens behind the scenes, as well as some of the terminology used to describe portions of the data and the summarizations that are generated in OLAP.

To help introduce the principles of OLAP, a small sample database is provided in the form of an Excel workbook, named CarSales.xls, in the Chap07 folder. This sample database lists the sales of 1000 cars over two years’ time for a regional specialty automotive dealership. The first 10 records of the database are shown in Figure 7-1.

click to expand
Figure 7-1: First 10 records of the sample car sales OLAP database.

As you look at the data, notice that each record, or row, contains a lot of data. The more data, or facts, available, the more specific the business questions are that you can ask of the data. In OLAP terms, this group of records is called a fact table because it contains the detailed business facts you’ll use in meaningful business analysis.

Also notice that the data is separated into its most basic components. For example, the date is separated into day, month, year, and quarter, unlike in many databases, in which a date is presented as a single field. Using our example, in a non-OLAP database, the car information might be stored in one field for the car series and type. In the fact table, the car series and type are stored in two separate fields. Information is stored in this detailed fashion so that an OLAP software application can return data summarizations for a wide range of potential business questions in the least amount of time. With data separated into its most basic parts, the OLAP software application can quickly interchange these parts depending on the data analysis task or business question at hand.

The third thing to notice in the sample database is that the data can be organized by relating fields to each other in a hierarchical fashion. For instance

  • The Day, Month, Year, and Quarter fields can be organized into a Time category.

  • The Customer Sales Region and Customer State fields can be organized into a Sales Geography category.

  • The Sales Manager ID and Salesperson ID fields can be organized into a Sales Staff category.

  • The Car Series, Car Type, and Color fields can be organized into a Car Information category.

These categories are known as dimensions in OLAP terminology. Dimensions, the basic organizational unit of OLAP, are typically presented in terms of dates, geographical areas, product and service offerings, employee reporting structures, and other organizational hierarchies. For instance, the Customer State field could be split further into cities, counties, or sales districts.

You can classify the data in each dimension by using groupings called levels. Levels describe groupings from the most summarized (highest) to the most detailed (lowest) level of data in a dimension. For example, the Time dimension contains Year, Quarter, Month, and Day levels because you can look at summarizations by year, then by quarter, then by month, and finally by day. In this example, other levels include the following:

  • The Region level contains states.

  • The Sales Manager ID level contains salespeople’s IDs—one or more salespeople report to a particular sales manager.

  • The Car Series level contains car types, which in turn contains car colors.

The individual collections of unique values contained in a level are known as members. A dimension that does not contain other levels, such as the Payment Type dimension, has one level with the same name as the dimension. Every member has a parent level, and every level contains child levels, child members, or both. Members at the bottom-most level of a dimension are sometimes referred to as leaf members. In our sample database, note the following:

  • The Year level has 2 child members—2000 and 2001—as well as a child level (Quarter).

  • The Quarter level has 4 child members (1, 2, 3, and 4) as well as a child level (Month).

  • The Month level has 12 child members (the months numbered 1 through 12) as well as a child level (Day).

  • The Day level contains 31 child members, or leaf members (1 through 31).

  • The Customer Sales Region level contains only 1 member in this case (West) and a child level (Customer State).

  • The Customer State level contains 3 leaf members (CA, OR, and WA).

  • The Sales Manager ID level contains 4 members (100, 101, 102, and 103) as well as a child level (Salesperson ID).

  • The Salesperson ID level contains 10 leaf members, numbered 1 through 10.

  • The Car Series level contains 2 members (Standard and Deluxe) and the Car Type child level.

  • The Car Type level contains 5 members (Coupe, Minivan, Sedan, Sport Utility, and Station Wagon), as well as a child level (Color).

  • The Color level contains 5 leaf members (Black, Blue, Green, Red, and White).

  • The Payment Type level contains 4 leaf members (36 Month Lease, 48 Monthly Payments, 60 Monthly Payments, and Cash).

Where does this leave the fields with numerical measurements, such as the Price field? These fields are called measures because, in a sense, we are measuring data in terms of how much money or another numerical measurement. In this sample database, we do not consider the sales manager IDs or salesperson IDs measures because you would never ask a business question such as what is the sum of all of the salesperson IDs? Instead you might ask a business question such as what is the sum of all car sales for the salesperson with the ID of 9? Every OLAP database must contain at least one measure. An OLAP database’s dimensions, levels, members, measures, and summarizations are stored in what OLAP calls a cube.

To review, the sample database contains the following OLAP components:

  • The Time dimension, which contains the Year, Quarter, Month, and Day levels

  • The Sales Geography dimension, which contains the Customer Sales Region and Customer State levels

  • The Sales Staff dimension, which contains the Sales Manager ID and Salesperson ID levels

  • The Car Information dimension, which contains the Car Series, Car Type, and Color levels

  • The Payment Type dimension, which contains a single Payment Type level

  • The Price measure, which is the only measure in the sample database

For a graphical approximation of this sample database, see Figure 7-2.


Figure 7-2: Visual representation of the sample car sales OLAP database.

In the next section, you will learn how to put these concepts to use in analyzing OLAP data.




Accessing and Analyzing Data With Microsoft Excel
Accessing and Analyzing Data with Microsoft Excel (Bpg-Other)
ISBN: 073561895X
EAN: 2147483647
Year: 2006
Pages: 137
Authors: Paul Cornell

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net