Deploying and operating this solution follows the usual model for Analysis Services, data warehouse, and Integration Services but with some additional steps to take the SCD loading and processing requirements into account. DeploymentOne area that we need to pay particular attention to is the initial load of data for dimension tables. If you initially load all your dimension records with the current system date at that time, the SQL for surrogate key lookups will not work correctly for older fact records. The easiest solution to this is to use an old date, such as 1900/01/01, as the start date for all the dimension records when you initially load the dimension. That way, there will be one dimension record with an appropriate date range for each of the business keys in the fact table. Of course, you need to check that your facts themselves aren't arriving with "special" dates that are even earlieralthough these rarely represent the truth anyway, and so they should be replaced during the ETL process. OperationsThe daily processing for solutions using slowly changing dimensions is usually handled by a series of Integration Services packages to load the new and updated dimension data first followed by the fact data, and then reprocess Analysis Services objects as required. Processing Slowly Changing Dimensions in Analysis ServicesTo handle any changes or additions to the dimension tables, the Analysis Services processing will need to include running a Process Update on the dimensions. This is usually performed using an Analysis Services Processing Task in an Integration Services package that is executed after the new data has been loaded into the data warehouse. Process Update will read all the data in the dimension table and add any new members to the dimension (as well as update any changed attribute values for existing members). Tip: Use the Summary Pane to Select Multiple Dimensions for Processing If you are working on fixing an issue with SCDs' ETL and you need to reprocess several dimensions, you may have tried to select multiple dimensions at once in the free view in SQL Server Management Studio without any success. The trick is to click the Dimensions folder instead, which then shows the list of all dimensions in the Summary pane on the right, and then you can select multiple dimensions simultaneously in the Summary pane and right-click to select Process. Permission to Process ObjectsIn an environment with specialized operations staff who perform daily tasks such as managing the overnight load process, there is often a requirement to create a role that only has permission to process some database objects but has no permissions to do anything else. You can create an Analysis Services role that has the Process permission for either the whole database or specific cubes or dimensions. However, if you want to allow them to use the management tools such as SQL Server Management Studio, you also need to grant them the Read Definition permission, which means that they can see the metadata for all objects such as the data source connection string, which might not be what you want. A better choice is probably to grant them only the Process permission and then create an XMLA script that they can run that does the required job, as described in the following sidebar "Scripting Actions in SQL Server Management Studio."
BackupsIt's important to regularly back up the data warehouse when SCDs are involved because you might not be able to rebuild the data warehouse from the source if the source is only a snapshot. However, even though this means that the data warehouse contains data no longer available anywhere else, you can still use the Simple recovery model rather than the Full recovery model because you are in complete control of when updates are made to the data warehouse, and can arrange your backup strategy to match. For example, you would typically back up the data warehouse as soon as a data load has been completed and checked to make sure that you could restore to that point in time. If you are not completely sure that there are no updates trickling in between batch loads and backups, you must use the Full recovery model; otherwise, you risk losing data. |