Maintaining data in a data warehouse is an ongoing effort that requires careful planning in order to provide continuous access to data. You need to identify routine maintenance tasks and implement a plan for performing these tasks regularly. SQL Server provides tools for automating many of these tasks.
After this lesson, you will be able to:
- Develop a maintenance plan
- Automate administrative tasks
Estimated lesson time: 20 minutes
Figure 13.1 shows the data stores and databases that are involved in a data warehouse. You need to plan for the ongoing maintenance of each of these parts in order to keep your data warehouse healthy.
Figure 13.1 Data stores and databases of a data warehouse
The first step in creating a maintenance and recovery plan for your data warehouse is to determine how mission-critical is each part of your data warehouse. This information will be combined with cost and downtime information to decide between multiple maintenance and recovery plans. Maintaining the infrastructure of a data warehouse is similar to maintaining the systems and databases that you already know. To make the task easier, you can identify and set up the core maintenance tasks that are necessary to ensure that your data warehouse performs well. These tasks include regular backups and checks for data inconsistencies.
Consider all of the following parts of the data warehouse when you identify the routine maintenance tasks.
This element is the point at which database maintenance starts. It is critical that data consistency checks and frequent database and log backups are maintained for all systems that will be providing data to our data warehouse.
Make sure that you have backup copies of all of the DTS packages, objects, or scripts that you use to transform and scrub data. It is good practice to make new backups whenever these items change and to add them to a scheduled backup to cover the rare times when manual backups are not created.
This is one of the most critical data warehousing elements. Operational data requires frequent valid backups. If the data warehouse data is lost, it can take a long time to gather the data from the OLTP or archive and retransform the data.
If the data warehouse is created first, it is typically easy to repopulate the data mart with data from the data warehouse. If the data mart is developed first, it is important to maintain frequent, valid backups.
OLAP data is derived from the data warehouse or data mart. Comparing the time required to perform a database restore with the time required to rebuild OLAP data will typically show how important it is to keep current backups of your OLAP data.
In general, you should review and perform these maintenance tasks periodically:
Keep a record of all service packs that you install for both Microsoft Windows NT Server and Microsoft SQL Server. Keep records of the network libraries, the security mode, and the system administrator password.
You can schedule many administrative tasks to run automatically. By automating recurring administrative tasks and responses to server events, you can free time to perform administrative tasks that require creativity and lack predictable or programmable responses.
You can schedule the following maintenance tasks to run automatically:
This ensures that database pages contain an equally distributed amount of data and free space. Evenly distributing the free space effectively reserves space for new data, thereby accelerating data inserts. Eliminating free space may allow a read-only database to use cache more efficiently.
Index statistics provide information about the data that is stored in the database for the query optimizer so that it can determine the best way to access data. Although SQL Server periodically updates index statistics automatically, this option can force statistics to update immediately. This is particularly useful if a large amount of data has been added or changed.
You can retain database and log backups for specified periods. This allows you to create a history of backups that you can use when you need to restore the database to a time before the last database backup.
SQL Server provides the Database Maintenance Plan wizard for automating many of these tasks, thereby reducing the work involved in database maintenance. With this wizard, you can run database integrity checks, update database statistics, and perform database backups.
You can specify that the results of maintenance tasks are written as a report to a text file, a Hypertext Markup Language (HTML) file, or the sysdbmaintplan_history tables in the msdb database. You also can send the report by e-mail to an operator.
You can use Data Transformation Services (DTS) to automate additional administration tasks that are related to data movement. When you use DTS to automate administration tasks, you must
To define jobs, operators, and alerts, you can use SQL Server Enterprise Manager, Transact-SQL scripts, or SQL-DMO objects.
In this exercise, you will use the Database Maintenance Plan wizard to create a database maintenance plan for the Northwind_Mart database.
In this procedure, you will create a plan by using the Database Maintenance Plan wizard.
Option | Value | Schedule |
---|---|---|
Update statistics used by query optimizer | 10% | Every three months on the first day of the month at 1:00 A.M. |
Check database integrity | Yes Include indexes | The first Saturday of each month at 11:00 A.M. |
Back up the database as part of the maintenance plan | Yes | The first Saturday of each month at 11:00 A.M. |
Remove files older than (for the database backup) | Three months | NA |
Back up the transaction log as part of the maintenance plan | Yes | Every week on Sunday at 11:00 P.M. |
Remove files older than (for the transaction log backup) | Four weeks | NA |
Write report to a text file in directory | Yes C:\MSSQL7\LOG | NA |
Write history to the msdb.dbo.sysdbmaintplan _history tableon this server | Yes Limit rows in the table to 1,000 rows for this plan | NA |
Plan name | Nwind_Mart_Maintenance | NA |
In this procedure, you will view the database maintenance plan that you created, and then execute a job and review its history.
This starts the Integrity Checks job manually.
A number of routine tasks need to be performed to keep your data warehouse healthy. You must identify these tasks and implement a plan to ensure that they occur regularly and reliably. SQL Server makes it possible to automate most of these tasks. You need to check the reports and logs generated by the automated tasks to ensure that they are running correctly.