Data Backup and Recovery


After spending several million dollars on your BI decision-support environment, you want to make certain that you will never lose the content of the BI target databases and that you will never be deprived of the analytical capabilities of the BI applications for a long period of time.

There is a school of thought that says, "Don't worry about backing up your BI target databases because that data is derived from other systems; if the data is destroyed, simply rebuild it." This is a careless and expensive attitude when dealing with a very large database (VLDB). Although backing up a database is time-consuming and takes the database offline for several hours, the alternative of reloading years' worth of data into a VLDB will take much longer, if it can be done at all. Not every organization opts to keep all source extract files for years just in case it needs to reprocess them.

It is mandatory to back up the BI target databases on a regular basis, but the sheer size of VLDBs makes this a technological challenge. Many of the hardware platforms on which BI applications reside limit the amount of data that can be backed up on a regular basis. These limitations are due to the slow speed of data transfers between the server and the backup device. Several backup strategies are available to mitigate this problem.

  • Incremental backup: One strategy is to take advantage of the grow-only aspect of BI target databases (no updating of rows) by backing up only the actual changes to a database (new rows) since the last backup rather than the entire database. This incremental ("net change") strategy is feasible even for most daily backups. However, since there are usually multiple databases in the BI decision-support environment, and since the summarized data must stay synchronized with the detail data, no loads or refreshes can occur to any of these databases until the backups of all databases have completed successfully.

  • High-speed mainframe backup: Another possibility is to use the mainframe transfer utilities to pass BI data back to the mainframe for a high-speed backup, which is supported only on the mainframe. Channel connects on the mainframe allow speeds that cannot yet be approached on most midrange servers. This is an expensive solution, but it is a robust one that usually works.

  • Partial backup: Another strategy relies on partitioning the database tables by date to support partial backups. While one partition is being backed up, the other partitions can remain available. Considerations for this strategy are listed below.

    - Databases that support parallelization of backups have a major advantage with this strategy since multiple partitions can be backed up at the same time.

    - If your BI target databases are loaded daily, group multiple days into one partition rather than setting up a new partition for each day. During backup, the data for all days in the partition being backed up would not be available.

    - A big drawback of this strategy is that if the table is partitioned by a date column for backup purposes (which means it is clustered by the date column), it cannot be clustered in any other way for access purposes. This can affect performance when running the reports and queries, unless database parallelism is used.
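The incremental ("net change") strategy described above can be sketched in a few lines. This is only an illustration under stated assumptions, not a production backup routine: the `sales_fact` table, its monotonically increasing `load_id` column, the watermark variable, and the `detail_db`/`summary_db` names are all hypothetical, and in-memory SQLite databases stand in for the BI target database and the backup device.

```python
import sqlite3

# Hypothetical grow-only fact table; load_id increases with every new row.
SCHEMA = ("CREATE TABLE sales_fact "
          "(load_id INTEGER PRIMARY KEY, sale_date TEXT, amount REAL)")

def incremental_backup(source, backup, watermark):
    """Copy only rows added since the last backup; return the new watermark."""
    new_rows = source.execute(
        "SELECT load_id, sale_date, amount FROM sales_fact "
        "WHERE load_id > ? ORDER BY load_id",
        (watermark,),
    ).fetchall()
    backup.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", new_rows)
    backup.commit()
    return new_rows[-1][0] if new_rows else watermark

def loads_allowed(backup_done):
    """Loads may resume only after every database's backup has succeeded."""
    return all(backup_done.values())

source = sqlite3.connect(":memory:")
backup = sqlite3.connect(":memory:")
source.execute(SCHEMA)
backup.execute(SCHEMA)

# Day 1 load, then the first backup (watermark 0: nothing backed up yet).
source.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)",
                   [(1, "2003-01-01", 10.0), (2, "2003-01-01", 20.0)])
watermark = incremental_backup(source, backup, 0)

# Day 2 load adds one row; the next backup copies only that row.
source.execute("INSERT INTO sales_fact VALUES (3, '2003-01-02', 30.0)")
watermark = incremental_backup(source, backup, watermark)

backed_up_count = backup.execute(
    "SELECT COUNT(*) FROM sales_fact").fetchone()[0]

# The summary database has not finished its backup, so loads stay blocked.
status = {"detail_db": True, "summary_db": False}
```

Because rows are never updated, a single high-water mark is enough to identify the net change. The `loads_allowed` check reflects the synchronization constraint: detail and summary databases must all finish their backups before any load or refresh resumes.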
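The partial backup strategy can be illustrated in the same spirit. The sketch below assumes, purely for illustration, that days are grouped into calendar-month partitions and that `backup_partition` is a stand-in for whatever backup utility the DBMS actually provides; neither choice comes from the text above.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date

def partition_key(day):
    """Group all days of a calendar month into one partition, e.g. '2003-01'."""
    return f"{day.year:04d}-{day.month:02d}"

def backup_partition(key, rows):
    """Hypothetical stand-in for a real backup call; returns key and row count."""
    return key, len(rows)

def backup_partitions(partitions, keys, max_workers=4):
    """Back up the selected partitions concurrently; others stay untouched."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda k: backup_partition(k, partitions[k]), keys)
        return dict(results)

# Daily loads land in the partition for their month rather than one per day.
loads = [
    (date(2003, 1, 30), 100.0),
    (date(2003, 1, 31), 50.0),
    (date(2003, 2, 1), 75.0),
]
partitions = {}
for day, amount in loads:
    partitions.setdefault(partition_key(day), []).append((day, amount))

# Only the January partition is backed up; February remains available.
backed_up = backup_partitions(partitions, ["2003-01"])
```

Grouping by month keeps the number of partitions manageable, at the cost noted above: every day in the partition being backed up is unavailable for the duration of the backup. Passing several keys to `backup_partitions` mimics a DBMS that can back up multiple partitions in parallel.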



Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications
ISBN: 0201784203
Year: 2003
Pages: 202