Lesson 2: Data Warehousing Life Cycle | Microsoft SQL Server 7 Data Warehousing Technical Support Training Kit

The goal throughout the rest of this book is to transform the OLTP-oriented Northwind database into a fully functional data warehouse. You will build a data warehouse that will service the needs of other activities performed within the enterprise, like querying and reporting on the data. These activities would not be as efficient if they were pulling directly from a highly normalized system such as those typically found in transaction processing applications.

In this lesson, you will be introduced to the Northwind database and review the life cycle for creating a data warehouse.

After this lesson, you will be able to:

Describe the structure of the Northwind database

Understand the life cycle of data in a warehouse

Identify the components of the Data Warehousing Framework in the context of a data warehousing architecture

Estimated lesson time: 25 minutes

Introduction to the Northwind Database

The Northwind sample database is installed with SQL Server. The database contains the sales data for a fictitious company called Northwind Traders, which imports and exports specialty foods from around the world.

This is a highly normalized relational database. The design complements the activities of a transaction processing system. Northwind Traders has a series of people who use a data entry screen to take orders for the products that they sell. Normalized databases are better suited for these activities because of the many data inserts and modifications.

Exercise 1: Introduction to the Northwind Database

In this exercise, you will review the relationship diagram of the Northwind database to get more information about the design of the database. In the exercise, you will use the Diagram Editor in SQL Server Enterprise Manager to view the diagram; there is a copy of the diagram in Appendix B, Database Schemas. The Northwind database is the starting point for our data warehousing process.

To open the relationship diagram for the Northwind database

From the Start menu, point to Programs, point to Microsoft SQL Server 7.0, and then click Enterprise Manager.

Connect to SQL Server on your computer by clicking the + sign to the left of SQL Server Group and then clicking the + sign to the left of your computer name.

Click the + sign to expand the Databases folder on your server.

Click the + sign to expand the Northwind database so that you can see the different types of objects in the database.

Click the Diagrams folder. Double-click the Relationships diagram in the details pane to open it.

You will see the tables and their relationships in the Diagram Editor. This tool allows you to create tables and their relationships and is very efficient for building and maintaining the structure of your databases.

To gain background information on the Northwind database

Notice the tables used to store the information on an order. These will allow you to answer the following questions for each order:

Who purchased the products ordered?

Who took the orders from the customers?

Who will ship the products ordered?

Notice the tables that reflect information for the products. The Product table contains attributes (table columns) that describe each product. Further related information about the supplier from which a product was purchased and the category to which it belongs is stored in the Suppliers and Categories tables. This design is a prime example of normalizing the structure to separate the data into smaller tables to describe different entities.

A closer look at the diagram will show that each order is actually stored in two tables. The header information, which is stored in the Orders table, explicitly describes the order. The detail information is stored in the Order Details table, which is able to store multiple products for each order. Each detail record is unique because a product and the order identification explicitly define it in the table.

The Data Warehousing Life Cycle

Figure 2.1 shows the parts of the life cycle when a data warehouse is being built. These parts are described in the following topics.

click to view at full size

Figure 2.1 The Data Warehousing Life Cycle

Source OLTP Systems

OLTP systems are the beginning point for data that will end up in a warehouse. In this sense, OLTP systems could be client/server databases, legacy mainframe systems, and stand-alone file-based PC applications. The focus of these systems is the details of business transactions.

Transformation Tools

Data must be collected from disparate sources. In some cases, it must be transformed from one format to another. It must then be moved into the warehouse database.

The Data Warehouse and Data Marts

OLTP data is moved into a denormalized database, which is your data warehouse. The warehouse may then be split into data marts.

Cubes

Cubes are built on top of the denormalized database to build an OLAP system; they give you better response times to queries.

A data warehouse does not require cubes or other OLAP tools to be considered a data warehouse. The warehouse is the centralized, denormalized database that holds enterprise data, and cubes are just structures built on top of the warehouse to speed queries. Cubes are explored in greater detail in Chapter 9: Microsoft SQL Server OLAP Services.

Exercise 2: Browsing an OLAP Cube

In this exercise, you will use OLAP Services to browse a prebuilt multidimensional cube that represents the sales data for a sample company named FoodMart. You will use the Cube browser to filter data and to drill down through the data to see the lower-level detail or drill up to see higher-level data. You should notice the differences between browsing data stored in a transaction processing system and browsing that in a data warehouse. This exercise gives you a first look at the Cube browser; you will work with the OLAP Manager and the Cube browser in detail in later chapters.

To start the OLAP Manager and open the Cube browser

From the Start menu, point to Programs, point to Microsoft SQL Server 7.0, and then in the OLAP Services group, click OLAP Manager.

In the console tree, click the + sign to the left of OLAP Server, then click the + sign to the left of your computer name.

Expand the FoodMart database, and then expand the Cubes folder in the FoodMart database.

Right-click the Sales cube, and then click Browse Data.

The Cube browser appears, displaying a grid made up of one dimension and the measures of your cube. The additional dimensions appear at the top of the browser.

To change a dimension in the grid

In this procedure, you will replace one dimension in the grid with another.

Drag the dimension from the top box and drop it on the dimension to the left of the grid.

Make sure that your pointer looks like this when you drop the dimension:

Select the Product dimension and drag it to the grid, dropping it directly on top of MeasuresLevel. Your pointer should look like this when you drop the Products dimension:

To filter your data by time

Select the drop-down list for the Time dimension.

Expand All Time, then expand 1997, and then click Q1.

You will see that your filtered data now reflects figures for the first quarter of 1997 only.

To drill down through the data

Double-click the column header in the grid that says + Food.

The cube expands to include the Product Department level.

Close the Product Department level by double-clicking the expanded Food column header.

Click Close to close the Cube browser.

Lesson Summary

Data in a warehouse has a life cycle. It originates in other systems, usually OLTP systems. The data is transformed and imported into a data warehouse. Once it is there, it is possible to build cubes, which are precalculated multidimensional views of the data. Cubes provide speed and ease of access to querying tools and therefore to end users.