Adding and Enhancing Dimensions | Professional SQL Server Analysis Services 2005 with MDX (Programmer to Programmer)

Dimensions are an integral part of a cube. In this section, you learn about specific properties that affect a dimension's behavior within a cube as well as special types of relationships that can be defined between a dimension and a measure group. These features allow you to address special business requirements and thereby enhance overall analytical capabilities.

When you create a dimension within an Analysis Services 2005 database, you are actually creating a database dimension, which can be shared across multiple cubes within the same database or used multiple times within a single cube. Each instance of a database dimension within a cube is called a cube dimension. Right-click within the Dimensions pane of the Cube Structure as shown in Figure 9-7 to add a new cube dimension.

image from book
Figure 9-7

For each cube dimension, you can selectively exclude certain hierarchies or attributes by modifying properties. In the Dimensions pane on the Cube Structure tab, select the hierarchy or the attribute to be hidden and then change the applicable property in the Properties window to False. If you want to hide a hierarchy, use the Visible property. To hide an attribute, use the AttributeHierarchyVisible property.

Another important property of a cube dimension is AllMemberAggregationUsage. This property is associated with the cube dimension object itself. Changing the value of this property affects how Analysis Services builds aggregations to improve query performance. You learn about other cube dimension properties for hierarchies and attributes in Chapter 13.

Most changes to database dimensions, such as the addition or deletion of attributes or hierarchies as well as changes to most properties, are automatically reflected in the corresponding cube dimensions. However, certain changes, such as renaming a database dimension, will not result in a similar change to the cube dimension. In such circumstances you could either delete the existing cube dimension and then re-add the database dimension within the cube, or simply rename the cube dimension.

As soon as you add a cube dimension, Analysis Services 2005 attempts to detect relationships based on the DSV and to create appropriate relationships between the newly added dimension and the existing measure groups in the cube. You should switch to the Dimension Usage tab of the Cube Designer to verify relationships between dimensions and measure groups were detected correctly.

The most common type of relationship between a dimension and a measure group is a regular relationship, but there are several other types that could be defined: regular, referenced, fact, many-to-many, data mining, and no relationship. You reviewed regular and referenced relationships in depth in Chapter 6. In this chapter, you learned how to use the IgnoreUnrelatedDimensions property to determine whether you see values or nulls when there is no relationship between a measure group and a dimension. The following sections discuss the remaining three relationship types.

Fact Dimension

A fact relationship is a relationship that exists between a dimension and a measure group that are both based on the same relational table. In the Adventure Works DW sample project, the Internet Sales Order Details dimension and the Internet Sales measure group retrieve data from the FactInternetSales relational table. Fact dimensions are typically created to support detail-level reporting or scenarios in which the database does not have a well-structured star or snowflake schema but contains all information in a single table. Figure 9-8 shows the fact relationship defined between the measure group Internet Sales and the dimension Internet Sales Order Details.

image from book
Figure 9-8

To define a fact relationship, switch to the Dimension Usage tab, and then click the cell that intersects the measure group and the dimension. When you select the Fact relationship type, the Define Relationship dialog automatically assigns the key attribute of the dimension as the granularity attribute. You can define a fact relationship only when the dimension and measure group are based on the same table; a validation in the Define Relationship dialog enforces this requirement. Otherwise, a fact relationship is very similar to a regular relationship. For example, browsing the dimension with this measure group, whether in a cube browser or with your own custom MDX, will look similar to browsing data in a regular relationship. If that's the case why do we need a fact relationship type? Two reasons:

Specific optimizations are done by Analysis Services during drill-through (discussed later in this chapter) when a fact relationship is defined between a measure group and a ROLAP dimension.
Certain client tools can present data from this relationship in a way that makes it easier for users to interpret the data during analysis.

Many-to-Many Dimension

Analysis Services 2005 supports another new relationship type called many-to-many. You were introduced to many-to-many dimensions during the discussion of measure expressions. You can recognize a many-to-many relationship when a single fact row is associated with multiple members in the dimension table. Figure 9-9 shows an example of a many-to-many relationship that exists in the Adventure Works DW sample database.

image from book
Figure 9-9

As you learned in the previous section, the Internet Sales measure group is related to the Internet Sales Order Details dimension through a fact relationship. Figure 9-9 adds new relationships — the Sales Reasons measure group is related to the Sales Reasons dimension through a fact relationship and to the Internet Sales Order Details dimension through a regular relationship. In other words, each line item in a sales order (in the Internet Sales measure group as a single fact row) can have one or more sales reasons (in the Sales Reasons dimension). An intermediate dimension, Internet Sales Order Details, joins the two measure groups, one of which is an intermediate fact table. This intermediate fact table, Sales Reasons (the measure group) joins the intermediate dimension to the many-to-many dimension, Sales Reasons (the dimension). When Analysis Services 2005 aggregates the values for each many-to-many dimension member, it aggregates the values in the measure group based on the set of distinct regular dimension members related to the current many-to-many dimension member. As a result, data is aggregated to each level exactly once. For example, consider a sales order that has two different sales reasons A and B. If you request Internet Sales measure values for that sales order by Sales Reasons — specifically members A, B, and All — you will see that the measure values are aggregated to the All member exactly once because there is only one distinct sales order related to the All member. Similarly, there is only one distinct sales order related to A and to B, so all three members will display the same values in this example.

Many-to-many relationships are common in data warehouses, and now, you have the ability to model and analyze the data from many-to-many dimensions. You can use a many-to-many relationship to perform currency conversion as you saw previously in this chapter when learning about measure expressions. You can also perform weighted average calculations using the many-to-many relationship in combination with a measure expression. Many-to-many relationships can be modeled for any schema that contains at least one common dimension between the regular measure group and the intermediate measure group. If there are multiple common dimensions between the measure groups, Analysis Services 2005 aggregates values for each distinct combination of members in those dimensions related to the current many-to-many dimension member.

On the Dimension Usage tab, click the cell that intersects the measure group and the dimension for which you want to define a many-to-many relationship. In the Adventure Works sample cube, take a look at the many-to-many relationship between the Sales Reason dimension and the Internet Sales measure group. You can see the relationship requires the intermediate measure group Sales Reasons, as shown in Figure 9-10.

image from book
Figure 9-10

With this relationship defined, you can browse the many-to-many dimension Sales Reason along with the Internet Sales measure group, as shown in Figure 9-11.

image from book
Figure 9-11

The sum of the Internet Order Quantity values for the various Sales Reasons grouped with Sales Reason Type Other is 56,395 which is greater than the Total shown for Other, which is 51,314. Because a many-to-many relationship is defined between the Internet Sales measure group, which includes Internet Order Quantity, and the Sales Reason dimension, the aggregated measure values correctly use the distinct members to avoid double-counting

Data Mining Dimension

The technical definition of data mining is the process of automatic or semi-automatic discovery of hidden patterns in a data set that are not intuitive. Several data mining algorithms are available to discover different kinds of patterns. Some data mining algorithms predict future values based on the patterns detected. For example, you can first use data mining to classify customers of a retail store as Platinum, Gold, Silver, and Bronze members based on selected attributes, such as income, number of children, and so on. You can use data mining to automatically classify new store customers based on the patterns discovered for existing customers. Additionally, the retail store could decide to boost sales by providing coupons to targeted customers based on the buying patterns of existing customers. Analysis Services 2005 supports several data mining algorithms which you learn in detail in Chapter 14.

In Analysis Services 2005, the UDM is tightly integrated with the data mining features. You can, for example, create a data mining model not only from a relational data source, but also from an existing cube. When a data mining model is created from a cube, you can also create a data mining dimension from the mining model. You can then add this new dimension to the source cube to form a new cube with which you can perform analysis of cube data right alongside the data mining results.

Figure 9-12 shows the relationship definition for a data mining relationship. The target dimension, Clustered Customers, is a dimension that was derived from a data mining model. The source dimension, Customers, is the dimension from which the data mining model was originally created.

image from book
Figure 9-12

If you open the Mined Customers cube in the Adventure Works DW sample project and switch to the Dimension Usage tab, you can see the data mining relationship defined between the Cluster Customers dimension and Internet Sales measure group. Open the Cube Browser to view the breakdown of Internet sales based on the Clustered Customers dimension as shown in Figure 9-13. This dimension represents the data mining model's classification of all members in the Customers dimension into 10 different clusters. To see the characteristics of each cluster, you need to review the mining model. By combining data mining results with cube data, you can make specific business decisions. For example, using the sales information shown in Figure 9-13 in combination with the characteristics of clusters defined by the data mining model, you could decide to boost sales by developing promotions or other incentives for a specific set of customers.

image from book
Figure 9-13

Role-Playing Dimensions

A role-playing dimension is a database dimension that acts as multiple dimensions within a cube. Instead of requiring to create two database dimensions that serve different purposes, but depend on the same data source table(s), a single database dimension can be used to create separate cube dimensions. For example, if you have a geography dimension as a database dimension, you can add it to a cube as Customer Geography and Employee Geography cube dimensions. Similarly, you can have one Time dimension called Date, and then you can add Ship Date and Received Date as cube dimensions. In these examples, Geography and Date dimensions are role-playing dimensions because they can each serve different roles within the same cube.

Figure 9-14 shows how the Date dimension is used as a role-playing dimension in the Adventure Works cube of the Adventure Works DW sample project. The Date dimension plays the role of three date dimensions: Date, Ship Date, and Delivery Date. When a dimension plays multiple roles in a single measure group, the fact table for the measure group contains one foreign key column for each role, each of which must have a relationship to a single dimension table defined in the DSV.

image from book
Figure 9-14