Lesson 1: Data Analysis Concepts | Microsoft SQL Server 7 Data Warehousing Technical Support Training Kit

This lesson introduces you to the common methods for analyzing data in a data warehouse. OLAP Services provides direct analysis tools as well an open architecture that tools vendors can use to support these data analysis methods. OLAP Services enhances data analysis by providing caching to improve performance and local cubes to enable offline data analysis.

After this lesson, you will be able to:

Describe basic data analysis concepts

Describe how to analyze cube data when disconnected from the network

Describe the basic functionality of the PivotTable Services

Estimated lesson time: 20 minutes

Drilling into Data

One of the basic and most frequently used practices for viewing warehouse data viewing data at different levels of granularity is referred to as drilling into the data. The two most common drilling methods are drilling down or up and drilling across or through.

Drilling Down

The drill-down method is used to see additional details for a query. Typically, you drill down the hierarchy of one or more dimension tables.

For a data analysis of performance over time, for example, you might first view the total for the year, then expand the time dimension table to see quarterly data, and then further expand it to see monthly, weekly, or daily data.

You can add additional dimensions for your query. For example, for the query of performance over time, you may also want to compare the performance of geographic regions or sales persons.

Drilling Up

You drill up through data when you show fewer levels in the dimension tables. This view allows you to see a larger scale of information. For example, if your query currently includes daily data, but you are interested in quarterly data, you could contract the time dimension table hierarchy.

Drilling Across

Occasionally, you may want to compare data within two or more of the fundamental business processes. When you do this, you must analyze the data for all the business processes at the same level of granularity. You can analyze business processes at the same level of granularity by drilling across the data joining two or more fact tables (representing the business processes) that link to a common dimension table, such as time or geography.

Be careful to ensure that you always join the fact tables to a common dimension table when drilling across. If you analyze data from more than one fact table at different levels of granularity, you may draw invalid conclusions from the analysis. It is important to remember one of the key principals of OLAP design: Incorrect (or misinterpreted) information is more dangerous than no information.

Drilling Through

Drilling through describes the process of accessing the source OLTP database automatically as the final level of detail in the drill-down analysis. Note that this capability is already present if the data is warehoused at the line-item detail level. If the data in the warehouse is summarized, drilling through requires accessing another database.

Data Mining

Data mining is a complex, model-based analysis method that uses powerful decision-making techniques based on modeling, such as linear regression and decision trees. Data mining is primarily concerned with detecting hidden trends in large amounts of data. Drilling down into data is retrospective, whereas data mining is prospective. Data mining is typically performed using third-party applications and is not discussed further in this book.

Client and Server Caching

OLAP Server caches user queries, metadata, and data. This caching makes it possible to answer new queries considerably faster by calculating answers from cached data rather than retrieving data from the disk.

Client applications connect to the OLAP Server through the client-based PivotTable Service component. PivotTable Service brings the server s multidimensional calculation engine, caching features, and query management directly to the client computer. This client/server data management model optimizes performance and minimizes network traffic.

The PivotTable Service receives metadata and data from the server in response to a query. The PivotTable Service can subsequently use data in the client cache to calculate the answer to queries without sending a new query to the server. For example, if the client cache contains values for the sales in the four quarters of a year and the user later asks for the total sales for the same year, PivotTable Service calculates the answer from the cached data.

Analyzing Data Offline

In this section, we introduce you to the idea of storing cubes locally on a client. This makes it possible to use decision support data without being connected to the Microsoft SQL Server OLAP (online analytical processing) Services Server. This mobile solution enables you to take the data offsite for analysis and presentation.

Analyze Local Cube Subset

You can analyze a subset of an existing cube without being connected to the OLAP Server by creating a local cube and storing the cube data locally in a file.

For example, you could create a local cube containing the data of one sales region to produce reports and compare product sales, client order histories, and individual performance prior to attending a meeting of regional sales representatives.

Implemented through PivotTable Service

The creation of a local cube of data is implemented through PivotTable Service. Excel 2000, other Microsoft Office 2000 applications, and custom applications can use PivotTable Service to retrieve multidimensional data from an OLAP Server for online access to OLAP Server data or from a local cube file for offline data analysis. PivotTable Service functions as a client of OLAP Services.

PivotTable Service:

Offers Data Definition Language (DDL) and other statements that enable you to define new local cubes and populate them with data.

Provides client applications a single component for connecting to OLAP Services and for stand-alone desktop analysis services.

Enables client applications to create a local cube directly from a relational data source.

Functions as a mobile desktop OLAP client that enables users to download from data sources and store the data in multidimensional structures on local computers for offline analysis.

Can work with only a single local cube partition. PivotTable Service does not implement an internal aggregation management system for defining or storing aggregations, so performance is directly related to the amount of data that it manages.

Lesson Summary

Drilling into data and data mining are the common methods for analyzing data in a data warehouse. OLAP Services provides direct-analysis tools as well an open architecture that tools vendors can use to support these data analysis methods. The PivotTable Service in OLAP Services enhances data analysis by providing caching to improve performance and local cubes to enable offline data analysis.