Fundamental Concepts | Professional SQL Server Analysis Services 2005 with MDX (Programmer to Programmer)

The cube is the foundation of a multidimensional database. Each cube typically contains more than two dimensions. The Adventure Works cube in the sample database contains 21 dimensions. The Adventure Works sample project and relational database need to be selected explicitly during installation from Documents, Samples, and Sample Databases. Using BIDS open the sample Adventure Works project from Program Files\Microsoft SQL Server\90\Tools\Samples\AdventureWorks Analysis Services Project\Enterprise and deploy it to your Analysis Services instance. If you open the Adventure Works cube in BIDS you can see the measures and dimensions that make up the Adventure Works cube in the Cube Structure tab as shown in Figure 3-1.

image from book
Figure 3-1

The Measures object is basically a special dimension of the cube which is a collection of measures. Measures are quantitative entities which are used for analysis. Each measure is part of an entity called a measure group. Measure Groups are collections of related measures and each measure can only be part of a single measure group. Often you will want to have one measure group for each fact table in your data warehouse. Measure groups are primarily used for navigational purposes by design tools or client tools to have better readability or ease of use for end users. They are never used in MDX queries while querying measures. However, they can be used in certain MDX functions which, by the way, you will see in this chapter and in Chapter 7. By default Analysis Services generates a measure group for every fact table, so you don't have to worry about changing the measure group's design. If you want to, of course, you can.

In Figure 3-1 you can see the twenty-one dimensions that are part of the Adventure Works cube. To understand the fundamental concepts of MDX through illustrations we will use three of the dimensions Product, Customer, and Date. Each dimension has one or more hierarchies and each hierarchy contains one or more levels. You learn about dimensions, hierarchies and levels in detail in Chapter 5. We will use the hierarchies Calendar, Product Line and Country from the dimensions Date, Product and Customer respectively to understand fundamental concepts in MDX. Figure 3-2 shows a section of the Adventure Works cube using the three hierarchies Calendar, Product Line and Country. The Calendar hierarchy of the Date dimension contains five levels: Calendar Year, Calendar Semester, Calendar Quarter, Month and Date. For illustration purposes only the top three levels of the Calendar hierarchy are used in Figure 3-2. The Product Line and Country are attribute hierarchies and have two levels, the All level and the Country level. Note that Figure 3-2 does not show the All level.

image from book
Figure 3-2

Members

Each hierarchy of a dimension contains one or more items that are referred to as members. Each member corresponds to one or more occurrences of that value in the underlying dimension table. Figure 3-3 shows the members of the Calendar hierarchy in the Date dimension. In the Calendar hierarchy the items 2004, H1 CY 2004, H2 CY 2004, Q1 CY 2004, Q2 CY 2004, Q3 CY 2004, and Q4 CY 2004 are the members. You can see that the items at each level together form the collection of the members of the hierarchy. You can also query the members of a specific level. For example Q1 CY 2004, Q2 CY 2004, Q3 CY 2004, and Q4 CY 2004 are members of the level Calendar Quarter.

image from book
Figure 3-3

In MDX each member of a hierarchy is represented by a unique name. The unique name helps to identify specific members. The unique name for a member is dependent upon properties of a dimension within a cube such as MemberUniqueNameStyle and HierarchyUniqueNameStyle. The algorithm determining the unique name of a member is not discussed in this book. You can access members of a dimension by using the name path (using the name of the member) or the key path (using the key of the member). Using the default properties in BIDS to creating your cubes and dimensions you can access a member in a dimension with its dimension name, hierarchy name and level name. For example, member Q1 CY 2004 in the Calendar hierarchy is represented as

     [Date].[Calendar].[Calendar Quarter].[Q1 CY 2004]

The brackets [ and ] are used to enclose the names of the dimension, hierarchy, levels, and members. It is not necessary that these names be enclosed within the square brackets every time, but whenever you have a name that is separated by a space, or has a number in it, or if the name is an MDX keyword, brackets must be used. In the preceding expression the dimension name Date is an MDX keyword and hence must be enclosed within brackets.

The following three representations are also valid for the member Q1 CY 2004.

     [Date].[Calendar].[Q1 CY 2004]                                   (1)     [Date].[Calendar].[CY 2004].[H1 CY 2004].[Q1 CY 2004]            (2)     [Date].[Calendar].[Calendar Quarter].&[2004]&[1]                 (3)

In the first representation the member is represented in the format Dimension.Hierarchy.Member name. You can use this format so long as there are no two members of the same name. For example, if quarter 1 in each year is called Q1 then you cannot use the above format; you would need to qualify using the level name in the MDX expression. If you do use the above format it will always retrieve Q1 for the first year in the hierarchy. In the second format you can see the navigational path for the member clearly since you see all the members in the path. So far the formats for accessing members you have seen are all using the name of the members. The final representation uses the key path where you can see the key of members in a path is represented by &[membername]. When you use the key path the members are always preceded with the & symbol.

Another example is the member Australia of hierarchy Country in the Customer dimension, which would be specified as

     [Customer].[Country].Australia

Notice that there are no square brackets in the expression for the member Australia. This is due to the fact that Australia is one word and no numbers are involved. In general, you can use the following format for accessing a member.

     [DimensionName].[HierarchyName].[LevelName].[MemberName]

The above format is predominantly used in this chapter as well as the book. If you are developing client tools we recommend you make an effort to understand the unique name algorithm from product documentation based on the properties set for dimensions. In this way, you can use the best name format to access a member.

Cells

In Figure 3-2 you can see three faces of the cube. You can see the front face of the cube has been divided into 16 small squares, and each square holds a number. The number represented within each square is the measure "Internet Sales Amount" of the AdventureWorksDW cube. If you view the remaining visible faces of the cube you will realize that each square you analyzed in the front face of the cube is actually a small cube. The top-right corner square of the front face contains the value 1134; you will notice that the same number is represented on the other sides as well. This smaller cube is referred to as a cell.

The cell is an entity from which you can retrieve data that is pertinent to an intersection of the dimension members. The number of cells within a cube depends on the number of hierarchies within each dimension of a cube and the number of members in each hierarchy. As you can imagine, cells hold the data values of all measures in a cube. If the data value for a measure within a cell is not available, the corresponding measure value is a Null value.

If you are familiar with three-dimensional coordinate geometry you are aware of the three axes X, Y, and Z. Each point in three-dimensional coordinate geometry is represented by an X, Y, and Z coordinate value. Similarly, each cell within a cube is represented by dimension members. In the illustration shown in Figure 3-4, you can see the three dimensions: Product, Customer, and Date. Assume that each of these dimensions had exactly one hierarchy that is illustrated in the figure, namely, Product Line, Country, and Calendar Time. From the figure you can see that the Product Line has 4 members, the Calendar Time has 4 members (considering only the quarters), and Country has 6 members. Therefore the number of cells is equal to 4*4*6 = 96 cells.

image from book
Figure 3-4

Now that you have learned what a cell is, you need to understand how to retrieve data from it. Assume you want to retrieve the data shown by the shaded area in the cube. The Sales amount represented in this cell is 966. This cell corresponds to the intersection of Product=Mountain, Date=Quarter2, and Customer=Australia. The value 966 corresponds to the total Sales of Mountain tires in Quarter2 bought by Customers residing in Australia. To retrieve data from the cube you need to send an MDX query to Analysis Services. What you need to do is retrieve the "Sales Amount" from the Cube based on the condition that uniquely identifies the cell that contains value 966. That MDX query is

     SELECT Measures.[Internet Sales Amount] on COLUMNS     FROM [Adventure Works]     WHERE ( [Date].[Calendar].[Calendar Quarter].&[2004]&[1],         [Product].[Product Line].[Mountain],         [Customer].[Country].[Australia])

You can see from this query that you are selecting the Measures.[Internet Sales Amount] value from the Adventure Works cube based on a specific condition mentioned in the WHERE section of the MDX query. This condition specified in the WHERE clause uniquely identifies the cell. All you have done in the condition is to list the members (which you learned about in the previous section) that uniquely identify the cell, separated by commas. An MDX expression like this that uniquely identifies a cell is called a tuple.

Tuples

As you saw in the previous section, a tuple uniquely identifies a cell or a section of a cube. A tuple is represented by one member from each dimension, separated by a comma, and is enclosed within the parentheses characters, ( and ). A tuple does not necessarily have to explicitly contain members from all the dimensions in the cube. Some examples of tuples based on the Adventure Works cube are

([Customer].[Country].[Australia])
([Date].[Calendar].[2004].[H1 CY 2004].[Q1 CY 2004], [Customer].[Country].[Australia])
([Date].[Calendar].[2004].[H1 CY 2004].[Q1 CY 2004], [Product].[Product Line].[Mountain], [Customer].[Country].[Australia])

In the preceding examples, Tuples 1 and 2 do not contain members from all the dimensions in the cube. Therefore they represent sections of the cube. A section of the cube represented by a tuple is called a slice since you are slicing the cube to form a section (slice) based on certain dimension members.

When you refer to tuple ([Customer].[Country].[Australia]) you actually refer to the sixteen cells that correspond to the country Australia for the example shown in Figure 3-4 (the cells in the front face of the cube). Therefore when you retrieve the data held by the cell pointed to by this tuple you are actually retrieving the Internet Sales Amount of all the customers in Australia. The Internet Sales Amount value for the tuple ([Customer].[Country].[Australia]) is an aggregate of the cells encompassed in the front face of the cube. The MDX query to retrieve data represented by this tuple is

     SELECT Measures.[Internet Sales Amount] on COLUMNS     FROM [Adventure Works]     WHERE ([Customer].[Country].[Australia])

The result of this query is $9,061,000.58

The order of the members used to represent a Tuple does not matter. What this means is that the following tuples

([Date].[Calendar].[2005].[H1 CY 2004].[Q1 CY 2004], [Product].[Product Line].[Mountain], [Customer].[Country].[Australia])
([Product].[Product Line].[Mountain], [Customer].[Country].[Australia], ([Date].[Calendar].[2005].[H1 CY 2004].[Q1 CY 2004])
([Customer].[Country].[Australia], [Date].[Calendar].[2005].[H1CY 2005].[Q1CY 2005], [Product].[Product Line].[Mountain] )

are all equal and uniquely identify just one cell. Because a Tuple uniquely identifies a cell, it cannot contain more than one member from each dimension.

A tuple represented by a single member is called a simple tuple and does not have to be enclosed within the parentheses. ([Customer].[Country].[Australia]) is a simple tuple and can be referred to as [Customer].[Country].[Australia] or simply Customer.Country.Australia. When there is more than one member from each dimension, the tuple needs to be represented within the parentheses characters. A collection of tuples forms a new object called sets, which is frequently used in MDX queries and expressions.

Sets

A set is a collection of tuples that are defined using the exact same dimensions, both in type and number. A set is typically specified with the curly brace characters { and }. The following examples illustrate sets.

Example 1

The Tuples (Customer.Country.Australia) and (Customer.Country.Canada) are resolved to the exact same dimension. A collection of these two Tuples is a valid Set and is specified as

     {(Customer.Country.Australia), (Customer.Country.Canada)}

Example 2

The tuples (Customer.Country.Australia, [Product].[Product Line].[Mountain]) and (Product.Country .Canada, [Date].[Calendar].[2004].[H1 CY 2004].[Q1 CY 2004]) cannot be combined to form a set. Even though they are formed by two dimensions, the dimensions used to resolve the tuple are different. Both tuples have the Customer dimension but the second dimensions are different.

Example 3

Each of the following tuples has the three dimensions Date, Product, and Customer.

     1.    ([Date].[Calendar].[2004].[H1 CY 2004].[Q1 CY 2004], [Product].[Product Line].[Mountain], [Customer].[Country].[Australia]),     2.    ([Product].[Product Line].[Mountain], [Customer].[Country].[Australia], ([Date].[Calendar].[2002].[H1 CY 2002].[Q1 CY 2002])     3.    ([Customer].[Country].[Australia], Date].[Calendar].[2003].[H1 CY 2003].[Q1 CY 2003], [Product].[Product Line].[Mountain] )

The members in the date dimension of the three tuples above are different and therefore these tuples refer to different cells. As per the definition of a set, a collection of these tuples is a valid set and is shown here

     { ([Date].[Calendar Time].[2004].[H1 CY 2004].[Q1 CY 2004], [Product].[Product Line].[Mountain], [Customer].[Country].[Australia]), ([Product].[Product Line].[Mountain], [Customer].[Country].[Australia], ([Date].[Calendar].[2003].[H1 CY 2003].[Q1 CY 2003]),([Customer].[Country].[Australia], [Date].[Calendar].[2002].[H1 CY 2002].[Q1 CY 2002], [Product].[Product Line].[Mountain] )}

A set can contain zero, one, or more tuples. A set with zero tuples is referred to as an empty set. An empty set is represented as

{ }

A set can contain duplicate tuples. An example of such a set is

     {Customer.Country.Australia, Customer.Country.Canada, Customer.Country.Austrailia}

This set contains two instances of the tuple Customer.Country.Australia. Because a member from a dimension by itself forms a tuple, it can be used in MDX queries. If there is a tuple that is specified by only one dimension, you do not need the parentheses to specify the cell. Similarly, when there is a single tuple specified in the query you do not need curly braces. When an MDX query is being executed, implicitly this tuple is converted to a set in the query rather than representing the object as a tuple.

Now that you have learned the key concepts that will help you to understand MDX better, the following section dives right into the MDX query syntax, the operators used in an MDX query or an MDX expression.