After this lesson, you will be able to:
- Describe and build dimensions
- Build data cubes using the Cube wizard and the Cube Editor
Estimated lesson time: 45 minutes
Dimensions are the basis of cubes. They allow you to define how you want to view data. A dimension can be a single criterion by which we view data for example, by employee. Dimensions frequently represent a hierarchical view of the information in your fact table. For example,
Data cubes are based on dimensions. Using dimensions allows you to perform drill-down analysis, where you start at the highest level of the data and drill down to progressively lower, more detailed data.
For example,
Shared dimensions are created independently of any particular cube. They are
For example, the above Parent Company\Company\Division\ Department\Employee dimension would be a good candidate for a shared dimension. Those criteria and that hierarchy are likely to be used in many analyses of company data. Making them available to all cubes encourages standardization and helps makes comparisons across cubes meaningful.
Private dimensions are created within a cube. They are
Use private dimensions when defining a way of looking at data that only makes sense in a given subject area. For example, the shipper dimension would be meaningful in a cube tracking customer activity, but it has little relevance in a cube analyzing employee sales figures.
The Dimension wizard provides a graphical user interface (GUI) to simplify the process of creating dimensions. If invoked independently, it can be used to create shared dimensions. If you are using the Cube wizard, the Dimension wizard can be called to create private dimensions.
You can create calculated members by using cube data with an expression. This means that values are calculated on demand and do not have to be stored in the cube.
You should consider creating dimensions with calculated members if you want to:
For example, you can create a calculated member named Profit by subtracting the value of a member named Costs from the value of a member named Sales.
For example, you can create a calculated member named Marks that converts dollars to marks by multiplying an existing dollar measure by a conversion rate. Marks can then be displayed to users in a separate row or column.
Because values are calculated in memory, and only the calculated member definitions are stored, values in Marks are displayed to users but are not stored as cube data.
For example, an account number is a nonadditive column. You can use the COUNT function with the nonadditive column to generate results that can then be used as a measure.
TIP
OLAP Services provides an extensive function library to use with calculated members. It also supports Microsoft Visual Basic for Applications Expression Services, Microsoft Excel worksheets, and user-defined function libraries that are registered with the service.
You can use the Calculated Member Builder in the Cube Editor to define calculated members. In the Calculated Member Builder, you need to specify the following things:
For example, if you are creating a member that calculates profit from a sales measure and a cost measure, then the calculated profit measure becomes a member of Measures.
For example, you could create four calculated members that calculate whether a date in a time dimension falls in Spring, Summer, Fall, or Winter. A member of the year level, such as 1997, 1998, or 1999, could be the parent member of these calculated members.
Figure 9.5 shows the Calculated Member Builder.
Figure 9.5 The Calculated Member Builder
As we discussed earlier, aggregations are precalculated data summaries that enable a faster response time to queries. By storing precalculated values, the server does not have to retrieve all the base data and perform a sum or other numerical calculation. Using aggregations is the basis for rapid response from an OLAP system.
Cubes are the method of storing aggregations in an OLAP system. Dimensions are the means of querying the cube. The aggregations are stored at the intersections of the dimensions. Each intersection (called a cell) stores an aggregate value.
Adding more aggregations generally enhances performance. However, you should consider the following factors:
Data explosion can be a serious problem in a data warehouse. Figure 9.6 depicts four data cells: Boston Hardware, Boston Software, New York Hardware, and New York Software.
Figure 9.6 Granular data warehouse data
Figure 9.7 illustrates the data explosion that occurs by adding aggregations for higher levels of the dimensions from Figure 9.6. Five additional cells are added: Northeast Hardware, Northeast Software, Boston Computer Products, New York Computer Products, and Northeast Computer Products. Each additional dimension level would greatly increase the number of values to be calculated. Adding extra dimensions would increase the number of aggregations even more dramatically. As the number of aggregations grows, the amount of disk space used grows. The data explosion ratio depends on the number of dimensions, the levels of the hierarchies, and the parent/child ratios between levels in the hierarchy.
Figure 9.7 Data with aggregations showing explosion
OLAP Services includes the Storage Design wizard that calculates the performance gain from building a certain percentage of aggregations. The Storage Design wizard, shown in Figure 9.8, handles data explosion by finding the "80/20" rule in the data. In other words, it is common for 20 percent of all possible pre-aggregations to provide 80 percent of performance gain. The wizard also analyzes level counts for each dimension and parent/child ratios for each level.
Figure 9.8 The OLAP Services Storage Design wizard
A cube consists of measures (or quantitative data, such as sales or costs) and dimensions (or descriptive data, such as geographical regions, time, or customer demographics). In this exercise, you will use the Dimension wizard to build three dimensions for your cube: Time, Product, and Customer.
NOTE
When defining some levels, you may receive warning messages that the child level has fewer members than the parent level. This is not an error; it is a warning because most child levels have more members than their parent level. If you know that in your cube, the child level should have fewer members than its parent, then you can click Yes to continue. This warning is expected when you define the Region level in the Customer dimension. The Region column stores the state or province for each country but some countries in the Northwind_Mart database do not have Region attributes. Click Yes when this occurs.
In this exercise, you will use the Cube wizard to build the cube that will be used for sales analysis. Using the Cube wizard, you will combine the dimensions that you defined in the previous exercise with a new dimension that you will create. The new dimension contains several measures (the quantitative data derived from the columns in the Sales_Fact fact table).
You decide that you need another dimension that gives you data about employees. You can easily build this dimension right in the Cube Editor.
NOTE
Dimensions built in the Cube Editor are private dimensions; that is, they can be used only with the current cube on which you are working and cannot be shared with other cubes.
At this time, you decide you need another measure that is calculated from other measures. You can easily build a calculated member in the Cube Editor.
In the Calculated Member Builder, you will build the calculated member from the measures that have already been added to the cube.
[Measures].[Line Item Total]-[Measures].[Line Item Discount]
OLAP Services allows you to calculate aggregations, which greatly improves the efficiency and response time of queries. In the following exercise, we see how to use the Storage Design wizard to specify the way that OLAP Services stores these aggregations and to optimize query processing performance for your cube.
You are instructing OLAP Services to give a performance boost of up to 80 percent, regardless of how much disk space this requires. Administrators can use this tuning ability to balance the need to improve query performance against the disk space required to store aggregation data.
You can watch the Performance vs. Size graph on the right of the screen as OLAP Services designs the aggregations. Here you can clearly see how increasing performance gain requires additional disk space.
The Process window appears in which you can see your cube being processed. When processing is complete, a message appears, stating that processing has completed successfully.
Click Tables, and notice all the new tables with the Sales_Sales1 prefix listed in the right pane. If you do not see the tables, press F5 to refresh the information. These tables are the OLAP aggregates that are stored in the relational Northwind_Mart database because you chose ROLAP as the storage mechanism for the cube data. We discussed OLAP Services storage options in Chapter 5 and we will get into much greater detail in Lesson 4, later in this chapter.
The Cube Browser appears, displaying a grid made up of one dimension and the measures of your cube. The additional dimensions appear in the top of the browser.
This lesson described private and shared dimensions and how they are impacted by aggregations You must understand the choices that you make when creating dimensions and how they relate to your hardware requirements. You also learned how to build and edit data cubes using both the Cube wizard and the Cube Editor.