One of the most important management tasks for any database is taking backups of the data. It may seem obvious, but you will be surprised how many companies jeopardize their business by taking backups infrequently or by not taking care of their backup tapes.
The problems with backing up a data warehouse are slightly different from backing up a typical production system. Since warehouses usually receive large loads of data, typically overnight, backups have to be scheduled along with this work. Plus, we have the added complication of deciding how to back up the database.
Why is this a problem? Well, normally, a database is backed up in its entirety, but that may not be possible with your data warehouse, especially if it is particularly large (multiple terabyte size). Therefore, incremental backups and backups of one or more tablespaces may be more practical. Careful management of the backup tapes is critical, or you could find yourself unable to rebuild the data warehouse.
Prior to Oracle 8i backups were usually taken using the Operating System Backup Utility by copying all the relevant files. Now you can consider using Recovery Manager (RMAN) to back up your database. RMAN will be discussed in this section.
In this section, we will discuss the various techniques you can use to back up your data warehouse. However, it is important to realize that if your data warehouse is very large, taking a full backup of it may be almost impossible due to time and other constraints. The following types of backups will be discussed:
Full backups
Incremental backups
Tablespace backups
Selecting the database from the Navigator window, right-clicking on the database, and then selecting Backup Management will start the Oracle 9i RMAN Backup Manager. A list of tools is provided that includes both backup and recovery; however, start by selecting Create Backup Configuration.
A backup configuration is a set of defaults used for backup operations. It determines where the backup is to be stored, what the backup medium will be (tape or disk), and whether you want to store information describing the backup in a recovery catalog. The recovery catalog will be described later in section 7.3.2.
You can create as many backup configurations as are required for your system. The purpose of this step is to specify where the backup is to be placed and how it is to be named. Figure 7.19 illustrates the first step in this wizard where the name of the backup configuration is specified.
Figure 7.19: Creating a backup configuration.
From here you now click on the various tabs to set up the options. It is not necessary to visit all of the screens, but you will have to click on the Channels tab, shown in Figure 7.19. This is where you specify the location and name of the backup file.
Once again, you can create as many channels as you require. Think of the channel as where you specify all of the physical output attributes, such as whether the backup is to go to a tape or a disk.
When it comes to naming the file, various options are available so that unique file names can be created. In Figure 7.20, we have chosen %s, which returns the backup set number and %t, which returns the backup set time stamp.
Figure 7.20: Selecting the channels in a backup configuration.
Another useful parameter is %u, which generates an eight-character name comprising the backup set number and the time the backup was taken. There is no browse capability for the directory name; therefore, ensure that it is spelled correctly.
Hint: | The directory for the backup must end in a \ or /, depending on your hardware platform. |
A complete list of all the options can be found in the Oracle 9i Backup and Recovery Manual. If you are backing up frequently, it is important to name your backup files very carefully; otherwise, you could find that you are unable to recognize a backup file.
In this example, the backup will go to disk, but in a data warehouse, it will have to go to tape. To store backups on tape RMAN requires the installation of a media management library, available from a third-party software supplier such as Legato or Veritas. Refer to Oracle's Backup Solutions Partners Program for the current list of certified vendors.
It is extremely important to ensure that the physical labels placed on the tapes match the contents. Too many times we have experienced problems where it is impossible to recreate the database from the backups, because the labels on the tapes do not match the actual tape contents. Sloppy management practices like this could mean that you lose your entire data warehouse.
Whenever you take a full backup of the system, do not discard the previous full backup set, because if there were a problem with the current backup, you would lose your entire data warehouse. Therefore, save as many full backup sets as is possible before you recycle the tapes.
There are several other tabs that can be selected. If the default user for this database cannot connect as SYSDBA, then you must click on the Preferred Credentials tab to specify a different user name and password.
The size of your data warehouse will determine how frequently a full backup is taken. Even if your warehouse is very large, a backup should always be taken periodically.
One of the tools provided in Oracle Enterprise Manager is the Backup Manager, which includes a Backup Wizard, where you can define a pre-defined backup strategy and customize your own strategy.
The first example shown here illustrates how to set up a predefined backup strategy. The first question asked by the wizard, illustrated in Figure 7.21, is how frequently the backup is to be performed and when it is to be run.
Figure 7.21: Backup frequency.
For those readers unfamiliar with creating backup procedures, the wizard advises on when backups should be taken based on the type of applications using this database. You can see that, for a data warehouse, it is suggesting once a week and picks Sunday, which it expects to be a quiet day, suitable for running backups. Of course, you can change the frequency and day.
The next screen presented by the wizard asks at what time the backup will be run. A review of the options is displayed in Figure 7.22. This is where you specify which backup configuration is to be used. The backup configuration was created in the previous section and is used to specify where the backup is to be stored and whether the data is backed up to tape or disk.
Figure 7.22: Backup configuration selection.
Now we have to choose which database will be backed up. In Figure 7.23, we have selected our data warehouse database EASYDW.
Figure 7.23: Selecting the database.
The definition of the backup procedure is almost complete. In Figure 7.24, we can review all of the options that we have selected. To change any of the values, press the Back key to return to the appropriate screen, and make the necessary modifications.
Figure 7.24: Reviewing the backup procedure.
Once the definition of the backup is complete, it will be placed on the console job queue, as illustrated in Figure 7.25, until it is time for it to run.
Figure 7.25: Backup Job on the console job queue.
An alternative to the predefined backup is to create a custom backup strategy.
The first question asked by the wizard is what type of backup you require, as illustrated in Figure 7.26. If Entire Database is selected, you will then have the choice of either backing up the entire database or taking an incremental backup. An incremental backup could be of interest to a data warehouse user, because it allows you to back up only the data that has changed, instead of the entire data warehouse.
Figure 7.26: Type of custom backup.
Whether or not the database is running in archive log mode will determine the types of backup options available. In Figure 7.26, more options are available because archive log mode is enabled. Enabling archive logging means that all changes to the database are copied into the archive log prior to the redo log being reused.
The steps in a custom backup are very similar to those we have seen previously: You select the type of backup, the channel, when it is to be run, and then name it, as shown in Figure 7.27. This is when you have the option of including this job in the job library.
Figure 7.27: Including backup job in library.
The job will now run automatically and is available for future use because it has been stored in the job library.
We mentioned earlier that a type of backup very useful in the data ware-house is the incremental backup, because it allows us only to back up the parts of the database that have changed. The screens presented for an incremental backup are very similar to those for a full backup, except that the screen in Figure 7.28 will appear.
Figure 7.28: Incremental backup level.
Here you must specify the incremental backup level. It is extremely important to select the correct level; otherwise, you may get a full backup by mistake.
To clarify how the backup level is specified, refer to Figure 7.29. A complete backup is taken on Sunday; this is our Level 0. At close of business on Monday, a Level 2 incremental backup is taken, which backs up all the changes since Sunday. On Tuesday, another Level 2 incremental backup is taken, which will contain all the changes to the database since Monday. On Wednesday, a Level 1 incremental backup is taken. Since 1 is greater than 0 but less than 2, all the changes since Sunday will be placed in this backup file. On Thursday, a Level 2 backup is taken, so the most recent backup with a value less than or equal to 2 is the Level 1 backup from Wednesday; therefore, this backup will contain all the changes since Wednesday.
Figure 7.29: Selecting an incremental backup.
Using this approach may, at first, seem complicated, but when automatic recovery is used, the Recovery Manager will decide for itself which complete and incremental backups are needed to recover the database.
Another technique for backing up the database is a tablespace backup. Using this technique, you have the ability to back up a specific tablespace, which, in a data warehouse, could make for a very nice backup strategy.
In Figure 7.30, we see one of the screens from the Backup Wizard asking us which tablespaces we want to back up. In our example, we have selected the two tablespaces that contain the January data.
Figure 7.30: Selecting a tablespace to back up.
Here is a good example of where designing the database for management should be considered during the initial construction process. If we can group our updates to the database by date, and we know that after a given point in time there will be no changes made within this tablespace, then, if we set the tablespace to be read-only and back it up, we can rest assured that we have captured the data in that tablespace. Since it will never change, we won't have to back it up again.
Changing the status of a tablespace to read-only is very simple. You can either enter the SQL command, shown in the SQL Text box in Figure 7.31, or select Storage on the Navigator tree, select the tablespace, and then click on the box at the bottom of the screen marked Read Only. Finally, clicking on the Apply button will make this a read-only tablespace. You can change it back to a read/write tablespace at any time.
Figure 7.31: Making a tablespace read-only.
Of course, it makes sense to periodically take full backups, just in case there are any problems with the backup files you have taken. Taking only tablespace backups is okay, but it still makes it possible for somebody to accidentally overwrite the tape containing all of the data for a given month.
When a database is being built, designers are constantly aware of the size of the database. However, that information seems to get lost when people start thinking about backing up the database. Everyone tends to say: Well it's 300 GB, so it will take x minutes to back up. What people forget to add is that the backup file will need y gigabytes of space.
In the examples shown here, we have used disks to store our backups. For a real data warehouse, however, you will be storing them on tape; so don't forget how many tapes you will require for the backup strategy you will be implementing. It has been known for sites to actually run out of tapes! Our Easy Shopping database is small and occupies approximately 1.33 GB on an NT system. In Figure 7.32, we can see the different sizes for the various backup files.
Figure 7.32: Backup file sizes.
A full backup occupies 527 MB, which is about 25 percent of the full size of the database. Therefore, you can see that you can save a significant amount of storage space by using the backup provided by Oracle 9i. If a backup were taken using the Operating System backup utility, which may or may not compress the file, then 1.33 GB of storage would be needed, instead of 527 MB.
The even smaller files are the incremental and tablespace backups, and here we can see the big advantage of using this utility.
Admittedly, only a few changes were made to the database. But you can see that even for the incremental backups, the size of the file ranged from 3.85 to 10.8 MB, which is a big difference from the requirements for a full backup.
By taking incremental backups, you can make backups of your ware-house more frequently and not have to worry about when you are going to find the time to back up the entire warehouse.
One final reminder on this topic: During the testing of the data ware-house, don't forget to test your backup procedures and obtain some estimates of possible run times for backups. Then you can discuss your requirements with the operations department, so that your management tasks can be included with all of the other work that has to be done.
Information describing your RMAN backups can be stored in the target database's control file or in a recovery catalog. The recovery catalog should always be placed in a database different from the data warehouse. It can be stored in the same database as your OEM repository. Using a recovery catalog is a safer alternative than storing it in the control file alone, since the information is stored separately from the database. It also allows you to store information about all of your backups in a central place.
When creating a backup configuration using OEM, there is a tab to specify whether a recovery catalog is to be used. In the example shown in Figure 7.33, we have stated that the recovery catalog is to be used. It can be found in the database orcl.us.oracle.com, and we must supply a user name and password to access that information.
Figure 7.33: Recovery catalog.
As mentioned previously, the recovery catalog should always be placed in a different database from the one being backed up. We have used the same database for illustration purposes only. Also, don't forget that the recovery catalog must also be backed up regularly, because it is stored inside a database.
A recovery catalog should be stored in its own tablespace. In our example, the OEM_REPOSITORY tablespace is used. A new user, who will own the recovery catalog schema in the recovery catalog database, should be created. In the following example we have created a user called rman_user and granted it the appropriate privileges.
SQL> CREATE USER rman_user IDENTIFIED BY rman_user TEMPORARY TABLESPACE temp DEFAULT TABLESPACE oem_repository; SQL> GRANT recovery_catalog_owner TO rman_user; SQL> GRANT connect,resource TO rman_user;
The next step is to actually create the recovery catalog, which is achieved by running the RMAN utility. As soon as the RMAN prompt appears, as illustrated in the following code, connect to the database that will hold the catalog. Since you are using the syntax CONNECT CATALOG, an error message will appear saying that the recovery catalog is not installed. Ignore this message. Then issue the CREATE CATALOG command. Once complete, you are now ready to use the catalog.
rman Recovery Manager: Release 9.2.0.1.0 - Production Copyright <c> 1995, 2002, Oracle Corporation. All rights reserved. RMAN> CONNECT CATALOG rman_user/rman_user; Connected to recovery catalog database Recovery catalog is not installed RMAN> CREATE CATALOG TABLESPACE oem_repository; Recovery catalog created
Before a database can be included in the recovery catalog, it must be registered. Otherwise, you may see the message, Rman-20001 target database not found in recovery catalog. Do not be alarmed by the message.
To register the target database, you need to connect to both the database where the recovery catalog is held and the target database, which is the database to be backed up, in the same session, as shown in the following example.
RMAN> CONNECT TARGET easydw/easdydw@orcl; Connected to target database: ORCL <DBID=9915458574> RMAN> CONNECT CATALOG rman_user/rman_user@orcl; Connected to recovery catalog database
When you register the target database, RMAN populates the catalog tables with information from the control file. The database ID, database name, tablespaces, data files, redo logs, and archive logs are examples of information stored in the catalog. The output shows a full resynch, indicating the catalog tables are updated with the latest information in the control file.
RMAN> REGISTER DATABASE; Database registered in recovery catalog Starting full resynch of recovery catalog Full resynch complete
Once the database has been registered in the recovery catalog, the catalog can then be used to report information about the backups that have been taken.
To see what databases have been registered in the recovery catalog use the LIST INCARNATION command. In the following example, one database has been registered. If there had been multiple databases registered in the catalog, they would be listed here.
RMAN> LIST INCARNATION; List incarnation; List of Database Incarnations DB Key Inc Key DB Name DB ID CUR Reset SCN Reset Time ------- ------- -------- ------------ --- ---------- ---------- 1 2 ORCL 9915458574 YES 190578 05-JUN-02
Once inside the RMAN utility, you can do many things in addition to backing up the database. To obtain information about the tablespaces in the database to be backed up, use the REPORT SCHEMA command, as shown in Figure 7.34.
Figure 7.34: RMAN schema report showing the target database.
The LIST command can be used to report various backup information. The example in Figure 7.35 shows the LIST BACKUP command, which shows the datafiles backed up as part of a full backup taken on July 7, and stored in the file located at C:\DW_BACKUP\EASYDW_466605312_2.
Figure 7.35: RMAN lists the contents of a backup.
Another useful command is REPORT. You can see if there are any orphan backups, obsolete backups, and backups that need more than a certain number of days of archive logs. Obsolete backups are no longer needed, either because they are redundant or because they are not needed for recovery within a recovery window. Orphan backups are no longer usable to restore the database, because they belong to a prior incarnation of the database. Figure 7.36 shows a report of files whose recovery needs more than two days of archived logs. Once again, there are many options available with the REPORT command, and it is suggested that you consult the Oracle 9i Backup and Recovery Manual for detailed information.
Figure 7.36: RMAN report files needing backup.
Although this has been a brief introduction to the recovery catalog, hopefully you can begin to see some of the benefits of using it, especially being able to see which backups exist. Using this information, you will see later how automatic recovery is possible, although, for our data warehouse, we may prefer to do it manually.
Restoring a database from a backup is a task that most DBAs probably fear. They are always concerned that the backup may fail, leaving them with no database. Unfortunately, unless you test every backup, you can never be sure that a backup file will actually work. If you can, it is a good idea to periodically restore your backup files onto a test system so that you know that the files and your procedures are good.
Hint: | If you are restoring during a serious database problem, try to restore to another location so that if the restore operation fails, you still have the original database. |
One of the first problems you encounter when restoring a database is knowing what backup files to use. If you have been using the recovery catalog, then it is very simple, since the recovery catalog will automatically figure it out; otherwise, you will have to check your records to determine the correct backup file.
Don't forget that restoring a database could require a number of full and incremental backup files. This is when the RMAN Recovery Wizard is extremely useful, especially if you want to recover to the latest point in time.
The Recovery Wizard can be started in a number of ways:
By selecting it from Tools on the toolbar
By using the right mouse button when selecting database from the Navigator window
By checking on the database tool icon
Irrespective of the method chosen, the Recovery Wizard will start, the screen in Figure 7.37 will appear, and you will be asked for the type of restoration that is required.
Figure 7.37: Selecting type of restore.
In Figure 7.37, we have the option of restoring the entire database, or restoring specific tablespaces, data files, or archive logs. Recovery can be to a specific point in time, such as to Monday at 11:30 a.m. Usually this type of recovery is unnecessary. If a specific problem corrupted the database, however, then you may want to restore to just prior to the job running against the database. In this example we have chosen to restore a tablespace.
After selecting which tablespaces should be restored, it will ask where the files are to be restored to, as illustrated in Figure 7.38. Depending on the reason for the restore, you may want them to go to another location, which can be specified here. The other advantage of this screen is that it provides an opportunity to check which files are going to be restored.
Figure 7.38: Location of data files.
The next step illustrates the reason why you should be familiar with your restoration procedure, because the recovery configuration screen will appear, as shown in Figure 7.39. The wizard automatically selects a configuration. Since backup files are the ones that exist, full backup is selected.
Figure 7.39: Recovery configuration.
Using this configuration, the wizard knows where the backup files are located and is given the information to access the recovery catalog.
You will now have to respond to other screens to check what is to be restored. Finally, a job to perform the restore will be submitted to the Enterprise Manager job queue. You can now monitor the job from the console.
When the job is finished, you can check the restore by viewing the log from the job, which is available from the Job sections of the console under Job History. If you double-click on the job of interest, the various stages of the job are displayed. If you then double-click on the line that says completed, the log of the job will be displayed, as shown in Figure 7.40.
Figure 7.40: Recovery log.
In this example we can see only a portion of the log, but note that because we were using the recovery catalog, the utility automatically worked out which backup files were required. Although we have illustrated how to recover a tablespace, the procedure is almost the same for a data file or the entire database.
Hopefully, this section has given you some idea of how to back up and recover an Oracle 9i data warehouse. This is such an extensive subject that it is highly recommended that you read the Oracle 9i Release 2 Backup and Recovery Manual for detailed information on functionality and many more ideas on how to design, create, and run backup and recovery operations.