In this chapter, we built a data warehouse that consolidates information from ERP systems and other sources. The data warehouse acts as a central repository that can support information-intensive business initiatives with reduced cost and complexity.
In this first phase of development, we focused on the shipments business process to provide sales and profitability analysis. We established the grain of the Shipments fact table and identified the key dimensions and attributes that will be used to analyze the information. The Product dimension has a natural hierarchy with three levels, so we used a snowflake design to make the ETL process easier and to improve performance of Analysis Services dimension loading.
The Customer and Time dimension tables both have multiple relationships with the Shipments fact table, so we added a column to the fact table for each relationship, such as ShippingCustomerKey and BillingCustomerKey. The Shipments fact table also includes several degenerate dimension columns such as InvoiceNumber.
We implemented surrogate keys for every dimension table using SQL Server's IDENTITY columns, and added a clustered index on the business key to improve the performance of key lookups in the ETL process. We used a clustered index on the order date key for the fact table. We created views for all fact and dimension tables and granted read access to these views to Windows groups for direct user access and the Analysis Services service, and set up another Windows group with read and write access to the tables to support the ETL process.
For deployment, we used the Simple recovery model for the database and set up a backup process synchronized with the data load cycle. We also defined a maintenance plan for the database and scheduled it to run automatically.