The MicrosoftВ Data Warehouse Toolkit: With SQL ServerВ 2005 and the MicrosoftВ Business Intelligence Toolset - page 30


In this chapter we described how the Microsoft BI toolset maps to industry standard terminology. We made a compelling argument that you can build your entire business intelligence system using software only from Microsoft. Youre not tied to Microsoft for your entire project, however: the components are open , in the sense that they are linked together by published interfaces. If you wish, you can use a non-Microsoft product for any component of your BI system. Many people will develop a heterogeneous BI systemits particularly common for an enterprise to require the relational database or front end tools to use the corporate standard.

We described the basic recommended architecture of a Microsoft BI/DW system and introduced the studio toolsSQL Server Management Studio and BI Development Studio. We will provide much more information about these tools in upcoming chapters.

Part Two: Developing and Populating the Databases

Chapter List

Chapter 4: Setup and Physical Design
Chapter 5: Designing the ETL System
Chapter 6: Developing the ETL System
Chapter 7: Designing the Analysis Services OLAP Database

Chapter 4: Setup and Physical Design

Where the rubber meets the road


Up to this point, weve been talking about project management, business requirements, logical data models, and system architectures. Until now, you havent needed an instance of SQL Server 2005 installed in your organization. That changes in this chapter, as we discuss issues surrounding the setup of your development, test, and production systems, and get you ready to start the development process.

As you can see in Figure 4.1, the physical design issues addressed in this chapter fall squarely in the middle of the Data Track of the Business Dimensional Lifecycle.

image from book
Figure 4.1: The Business Dimensional Lifecycle

We begin by helping you get a handle on the size of your business intelligence system, so you can make decisions about its basic physical configuration. Will you install all the server software components for your DW/BI system on a single machine or several? Will you use clustering or web farms? Do you need to budget for 64-bit hardware or expensive storage networks? We cant answer these questions for you, but weve provided some guidance that should help you answer them for yourself.

The decisions you make about your production hardware and software configuration should be reflected, as much as economically feasible , in your test or quality assurance system. It may seem wasteful to spend money on test systems, but if youre serious about delivering good-quality service to your existing business users, you need to be serious about testing before you roll system changes into production.

The next section of this chapter discusses some of the issues in moving from the logical data model to the physical data model, particularly for the relational data warehouse database. As we described in Chapter 2, the dimensional logical model should be nearly identical to its corresponding physical model. However, when you actually create the database you need to worry about the placement of files on disk, as well as the creation of some housekeeping tables and columns that might not show up in your logical design. We also discuss exciting design issues such as index creation and key constraints.

One of the important considerations of the physical design of your relational data warehouse database is whether to partition the fact table. Table partitioning is a new feature in SQL Server 2005, and its important for large-scale DW/BI systems. Finally, we discuss the physical design issues around aggregate, staging, and metadata tables.

We suspect many readers will be tempted to skip this chapter. At least one person on your project team should read it, even if you have very extensive experience in managing a SQL Server 2000 DW/BI system.

In this chapter, we address the following specific questions:

  • Early in the design process, how can you determine how large your DW/BI system will be? What are the usage factors that will push you to a larger and more complex configuration?

  • How should you configure your system? How much memory do you need, how many servers, what kind of storage and processors?

  • How do you install the SQL Server software on the development, test, and production servers? What do different members of the DW/BI team need to install on their workstations?

  • How do you convert the logical data model into a physical data model in the relational database? Whats a good initial indexing plan? Should you use relational table partitioning?