Flylib.com

Books Software

 
 
 

Next Steps


Next Steps

Now that we have created a well-structured data warehouse database, we need to look at techniques for loading data from the source systems (covered in Chapter 4) and providing the information to users in a meaningful way (Chapter 5, "Building an Analysis Services Database").

Extending the Manufacturing Solution

We have really only scratched the surface of a full manufacturing data warehouse in this chapter, but remember that our goal is to achieve business value by focusing on delivering solutions that work, not by trying to model the entire business in one project. Based on the business problem described at the beginning of the chapter, we can include some new and interesting subject areas in the next versions of the data warehouse. One valuable area is to include quota or budget numbers so that users can track their performance against their targets. Also, we can look at adding information and calculations to support business objectives such as improving on-time deliveries.

Using the BI Development Studio to Generate the Data Warehouse

We will be using the BI Development Studio extensively in the next couple of chapters to develop the ETL process and to build some Analysis Services cubes. Instead of starting by manually building the data warehouse database as we did in this chapter, we could have started by defining our dimensions and facts at a logical level in the BI Development Studio, and then have generated the corresponding database tables.

This is a useful technique that enables you to use the BI Development Studio as a rapid prototyping tool to develop cubes and dimensions without an underlying data source. However, if you have experience in building data warehouse schemas by hand, you might find this approach somewhat limited in that you don't have complete control over the table structures and data types generated. BI developers who are not comfortable with building databases directly might find it much more intuitive, however, and then an experienced database designer could adjust the tables after they are generated.

To use the BI Development Studio in this way, you can create a new Analysis Services project with no data source, and then when adding dimensions, select the "Build without using a data source" option. After you have added attributes and hierarchies to the dimensions, you can select Generate Relational Schema from the Database menu to create the database.



Summary

In this chapter, we built a data warehouse that consolidates information from ERP systems and other sources. The data warehouse acts as a central repository that can support information- intensive business initiatives with reduced cost and complexity.

In this first phase of development, we focused on the shipments business process to provide sales and profitability analysis. We established the grain of the Shipments fact table and identified the key dimensions and attributes that will be used to analyze the information. The Product dimension has a natural hierarchy with three levels, so we used a snowflake design to make the ETL process easier and to improve performance of Analysis Services dimension loading.

The Customer and Time dimension tables both have multiple relationships with the Shipments fact table, so we added a column to the fact table for each relationship, such as ShippingCustomerKey and BillingCustomerKey. The Shipments fact table also includes several degenerate dimension columns such as InvoiceNumber.

We implemented surrogate keys for every dimension table using SQL Server's IDENTITY columns, and added a clustered index on the business key to improve the performance of key lookups in the ETL process. We used a clustered index on the order date key for the fact table. We created views for all fact and dimension tables and granted read access to these views to Windows groups for direct user access and the Analysis Services service, and set up another Windows group with read and write access to the tables to support the ETL process.

For deployment, we used the Simple recovery model for the database and set up a backup process synchronized with the data load cycle. We also defined a maintenance plan for the database and scheduled it to run automatically.