Removing Empty Cells | Professional SQL Server Analysis Services 2005 with MDX (Programmer to Programmer)

The number of cells in a cube is the uniquely identifiable space within the cube. It is the product of the number of members in each attribute of each dimension. This is referred to as the cube space. As you can imagine, the entire cube space can be quite large. Of the entire cube space, the cells which constitute the product of attribute hierarchies of each dimension can potentially have fact data. However, in a typical cube most of these cells will not have fact data. For example, take a simple cube that contains dimensions product, time, and store. Assume the fact table contains IDs for dimensions, product time, and the sales amount. The product dimension table typically contains columns pertaining to the dimension, such as product name, product category, product weight, product color, and discount. The store dimension table would contain information about the store such as city, state, country, and number of employees. The time dimension might contain day, month, and year. As the owner of existing stores you might be interested in looking at the sales of various products in stores across various time periods every week, month, or quarter to make a decision on what product lines to enhance to grow your business. The store manager may be interested in identifying the sales of the products along with discounts so that he can stock products that sell the most while having discounts to maximize the profit of the store. Hence, the types of questions that might be requested from your UDM might be different based on the user.

Because the cube space is typically quite large, a vast majority of the cells might be nulls, meaning no data is available. If your queries include attributes from the same dimension such as sales of products that have 10% discount, Analysis Services automatically returns the sale of products that have exactly 10% discount. As you learned earlier in this chapter, Analysis Services uses AUTO EXISTS and eliminates all members in the products hierarchy that do not exist with the 10% discount member in the discount hierarchy. Hence the results you get will not contain null values. However, if you query for data across dimensions, you can end up with several cells that are nulls. Often you are not interested in the cells with null values and you do not want to retrieve them in the result set to begin with. Analysis Services provides several functions and keywords that help you in eliminating null values in your result set.

Assume you want to analyze the Internet Sales amount across various countries for various products. Every product might not be sold in every country and so you can end up with certain country-product combinations that do not have any sales. To eliminate the null values you can use the keyword NON EMPTY on the axis or the MDX functions NONEMPTYCROSSJOIN, NONEMPTY, and FILTER. These are some basic ways to remove empty cells from the result set and you can certainly write many more ways of removing empty cells.

The operator NON EMPTY is used on an axis to remove the members that result in empty cell values. When NON EMPTY is applied then cells with null values are eliminated in the context of member on other axes. You can see this in the following query:

     SELECT [Measures].[Internet Sales Amount] on 0,     NON EMPTY [Customer].[Customer Geography].[Country].members *     [Product].[Product Model Categories].members on 1     FROM [Adventure Works]

You can use the FILTER function as shown in the following query with the condition to eliminate null values for Internet Sales Amount:

     SELECT [Measures].[Internet Sales Amount] on 0,     FILTER ( [Customer].[Customer Geography].[Country].members *     [Product].[Product Model Categories].members, [Measures].[Internet Sales Amount])     on 1     FROM [Adventure Works]

The following two queries use the NONEMPTYCROSSJOIN and NONEMPTY functions, which have similar arguments of taking a set and a filter set and eliminating cells that contain nulls. If you use the NONEMPTY-CROSSJOIN function, you can specify the sets that need to be crossjoined, and in the final result you can specify the sets that need to be included in the result set. In this example, two sets are specified:

     SELECT [Measures].[Internet Sales Amount] on 0,     NONEMPTYCROSSJOIN ([Customer].[Customer Geography].[Country].members,     [Product].[Product Model Categories].members, [Measures].[Internet Sales Amount],2     ) on 1     FROM [Adventure Works]

If you use the NONEMPTY function, the set passed as an argument needs to be a crossjoin of the sets involved as shown in the following query:

     SELECT [Measures].[Internet Sales Amount] on 0,     NONEMPTY ([Customer].[Customer Geography].[Country].members*     [Product].[Product Model Categories].members, [Measures].[Internet Sales Amount])     on 1     FROM [Adventure Works]

When NONEMPTY function is used then the members in the set are filtered based on the argument passed (Measures. [Internet Sales Amount]) in the above example.

To sum up, there are several ways of removing the nulls in your result set so that you can analyze the results that are meaningful. You can choose one of the preceding examples to restrict the null cell values. However, you should be aware that the NONEMPTYCROSSJOIN function is being deprecated because there are certain limitations while using this function with calculated members.