Managing the Solution | Practical Business Intelligence with SQL Server 2005

As we have described, the changes you need to make to enable real-time BI mostly relate to Analysis Services partition settings and Integration Services packages. The management of a real-time solution does not differ much from that of standard BI solutions, except that you need to pay careful attention to the complete process of handling new data. (Because there is no overnight processing window to give you time to smooth out any mistakes, users will be aware of any problems right away.)

Operations

Operation of a real-time OLAP database does not differ much from a regular OLAP database. You have the additional task of creating new partitions and merging old partitions.

Backup

In a real-time application, data usually trickles into the relational database on a continuous basis. There is no batch loaded at regular intervals. This means you must use the Full recovery model and back up the transaction log files to protect your database. You only need to back up the partitions that are read-write, which should just be the most recent partition.

Unlike a relational database with partitions, for Analysis Services you need to back up the entire cube each time. You may have expected to get away with only backing up the current partition, but you can't because you cannot select which partitions to back up.

Although you can technically delete the relational tables underlying the partitions after the data has been loaded into Analysis Services, you should not do this. Partitions set to use proactive caching may revert to ROLAP mode depending on the settings and require the relational tables. More important, if you ever need to reprocess or rebuild the partitions, you must have the relational data. So, be sure that the data sources are formally backed up and are accessible to you.

Rolling Partitions

Remember why we created partitions in the first place: so we wouldn't have to reprocess more data than we had to. Therefore, we created a partition for history and another for the current month. We only reprocess the partition for the current month. In a perfect world, June data arrives in June, and July data arrives in July. However, this is not our experience. Data is delayed, or post-dated. It arrives out of sequence. Because of the reality of our business, we cannot simply close off the previous month and open the current month's partition to updates.

Depending on your business rules, you might need to keep monthly partitions open for two or three months, and the current month's partition must be open to all future dates. To maintain your partitions, you create enough "current" partitions to accommodate the latest data that could arrive out of sequence. If that's 90 days, for example, you want three monthly partitions. Each partition will have a definite start and end date specified in the query to restrict the rows. The exception is the last partition, which does not restrict the end date of the fact rows.

When you create a new partition, you will likely retire an earlier partition by merging it with the historical partition.

After merging a number of current partitions into the historical partition, you need to reprocess the historical partition to improve the aggregations and efficiency of their organization. You should do this during a maintenance period due to the resource consumption of the operation. You need to monitor the performance of the server to determine how frequently this needs to be done.