As previously mentioned, the ideal backup strategy is to do a full backup of all the filesystems frequently. This way, should you need to restore a single file or the whole system, you need to access only the latest backup tape set and go from there. Unfortunately, daily full backups are possible only for small systems where you have enough low system usage time (a backup "window") to create a complete backup. It also becomes expensive to continue purchasing new tapes (because none of the existing tapes can be reused during the archive-retaining time period). For most systems, a combination of full and incremental backups coupled with a tape rotation scheme, where a given set of tapes is reused, is the best option.

NOTE Always schedule your system backup to take place during a time of little or no user activity, for two reasons. First, backup procedures take up system resources such as CPU cycles and put a high demand on hard disk access. This can degrade system performance, and in extreme cases the backup process can consume enough resources to cause a temporary denial of service. Second, when users are on the system, there will always be open files, which are not backed up (unless your backup software has an "open file agent" that handles them). Therefore, to back up as many changed files as possible, you should shut down any applications that keep files open constantly, such as inventory database programs, while your backup job runs, and you should also restrict user access to files. (Also see the "Database Backups: Cold or Hot?" section later in this chapter.)

The main drawback to incremental backups appears when you need to perform a full system restore. You need to first restore the last full backup and then apply all the incremental backups from that point onward. Therefore, you save some time during the backup process, but the restore phase takes a little longer.
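The restore order just described (last full backup first, then every later incremental) can be sketched in a few lines of Python. This is only an illustration: the `Backup` record, its field names, and the tape labels are hypothetical and do not come from any particular backup product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Backup:
    """Hypothetical catalog entry for one backup run."""
    taken_on: date
    kind: str  # "full" or "incremental"
    tape: str  # illustrative tape label


def restore_chain(backups: list[Backup]) -> list[Backup]:
    """Return the backups needed for a full restore, in the order to apply them."""
    ordered = sorted(backups, key=lambda b: b.taken_on)
    # Locate the most recent full backup in the catalog.
    full_positions = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup available to restore from")
    # The full backup comes first, followed by every incremental taken after it.
    return ordered[full_positions[-1]:]


catalog = [
    Backup(date(2024, 1, 7), "full", "week1-full"),
    Backup(date(2024, 1, 8), "incremental", "week1-mon"),
    Backup(date(2024, 1, 14), "full", "week2-full"),
    Backup(date(2024, 1, 15), "incremental", "week2-mon"),
    Backup(date(2024, 1, 16), "incremental", "week2-tue"),
]
print([b.tape for b in restore_chain(catalog)])
```

Every incremental after the last full backup must be readable for the restore to succeed, which is why a longer chain makes the restore both slower and more fragile.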
In the case of a partial restore, you can work directly from the incremental backups, but you may have to scan through a number of media to locate the one where the desired data is stored.

There are two types of incremental backups: those that back up files changed since the last complete backup (often referred to as differential backups) and those that back up files changed since the last incremental backup. Assume that you have set up a backup schedule as listed in Table 10.1.
If you need to restore a file lost on Thursday, you need to access only one tape: either the differential tape created on Wednesday (if the file was changed during the current week) or the full backup tape created on Sunday (if the file was not changed during the current week). To fully restore the system, you need only two tapes: the full backup tape and the latest differential tape. Under this schedule, the backup time gets longer as the week progresses because more and more files need to be backed up. However, it makes restoring files simple. This example is a simplification of the Grandfather-Father-Son rotation method.

NOTE The main drawback of differential backups is that, as the week progresses, you have more and more changed files to back up because you are backing up all files changed since the last full backup. Therefore, it is likely that by Friday, your backup will take twice as long as it did on Monday.

Table 10.2 shows a different backup schedule. This one does a full backup at the beginning of the month, a weekly incremental on Mondays, and a daily incremental for the rest of the week.
Using this schedule, restoring files is a little more complicated than it was in the previous example. For instance, to restore a file you lost, you need to do the following:
The advantage of this sample schedule is that it takes less time per day for the backups because it backs up only those files changed since the previous workday. The downside is that a little more work is required to restore a file.

The preceding two examples do not take into account the multiple tape sets that would be necessary to go back to data from the previous week or month. The Grandfather-Father-Son and Tower of Hanoi rotation systems described in the following sections, on the other hand, use multiple tape sets. These two rotation methods are among the most often used by backup software.

Grandfather-Father-Son Rotation Method

The Grandfather-Father-Son rotation scheme (GFS for short) uses three "generations" of tapes (hence the name), as illustrated in Table 10.3. It uses a total of 21 tapes. Of these, 4 are daily tape sets labeled Monday, Tuesday, Wednesday, and Thursday. Another 5 are weekly tape sets labeled Friday1, Friday2, Friday3, Friday4, and Friday5; Friday5 is used only in months that have five Fridays. The remaining 12 tapes, labeled January, February, and so on through December, act as monthly tapes.
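As a rough illustration of the tape labels listed above, the following Python sketch builds the 21-tape inventory and picks a label for a given date. Several things here are assumptions rather than part of the scheme as described: FridayN is taken to mean the Nth Friday of the month, weekends are assumed to have no backup, and because this text does not say on which day the monthly backup runs, the caller supplies that decision through the hypothetical `monthly_backup_due` flag.

```python
from datetime import date
from typing import Optional

DAILY = ["Monday", "Tuesday", "Wednesday", "Thursday"]
WEEKLY = [f"Friday{n}" for n in range(1, 6)]
MONTHLY = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
ALL_TAPES = DAILY + WEEKLY + MONTHLY  # 4 + 5 + 12 = 21 tapes


def gfs_tape(day: date, monthly_backup_due: bool = False) -> Optional[str]:
    """Pick the GFS tape label for a date.

    monthly_backup_due is a hypothetical flag: the caller decides when the
    monthly full backup runs, and it then takes precedence over other tapes.
    """
    if monthly_backup_due:
        return MONTHLY[day.month - 1]
    weekday = day.weekday()  # 0 = Monday ... 6 = Sunday
    if weekday <= 3:
        return DAILY[weekday]
    if weekday == 4:
        # Assumption: FridayN is the Nth Friday of the calendar month.
        return f"Friday{(day.day - 1) // 7 + 1}"
    return None  # assumption: no backup runs on weekends


print(len(ALL_TAPES))
print(gfs_tape(date(2024, 3, 4)), gfs_tape(date(2024, 3, 29)))
```

In this sketch, 29 March 2024 is the fifth Friday of its month, so it maps to the Friday5 tape unless the monthly backup is flagged as due that day.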
This rotation scheme recycles the daily tapes the following week (the "sons" because they have the shortest life span), the weekly backup tapes after five weeks (the "fathers"), and the monthly tapes the following year (the "grandfathers").

NOTE The monthly tapes are full backups, whereas the daily and weekly tapes are incrementals. As to which type of incremental backup (weekly or daily) you use, the choice is up to you. However, you should base your decision on these factors: how large a backup window you have, the amount of data to back up, and the throughput of your backup device.

CAUTION The daily tapes get the most use; therefore, they are most prone to failure. Check these tapes regularly for wear and tear before using them.

Tower of Hanoi Rotation Method

The Tower of Hanoi rotation scheme is named after the mathematical game of the same name. The rotation scheme is sometimes referred to as the ABACABA rotation method, based on the frequency with which tapes are rotated. Five or more tapes are needed in this implementation. To simplify the discussion, five tapes labeled A, B, C, D, and E are used.

NOTE The French mathematician Edouard Lucas invented the Tower of Hanoi game, sometimes referred to as the Tower of Brahma or the End of the World Puzzle, in 1883.

The basic idea is that each of the five tapes is used at a different rotation interval. For example, tape A is used every other day; tape B, every fourth day; tape C, every eighth day; and tapes D and E, every sixteenth day. Typically, tapes A, B, and C are incremental backups, and tapes D and E are full backups. Table 10.4 shows the rotation pattern.
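The intervals just described can be generated programmatically. The Python sketch below is not a reproduction of Table 10.4; it assumes the common "ruler sequence" formulation of the scheme, with days numbered from 1 and the last tape taking every slot that a deeper level would otherwise occupy. The function name `hanoi_tape` is illustrative only.

```python
def hanoi_tape(day: int, tapes: str = "ABCDE") -> str:
    """Return the tape to use on a given day (days are numbered from 1)."""
    if day < 1:
        raise ValueError("day numbers start at 1")
    # Count the trailing zero bits of the day number: 0 for odd days,
    # 1 for days divisible by 2 but not 4, 2 for 4 but not 8, and so on.
    level = (day & -day).bit_length() - 1
    # The last tape absorbs all deeper levels, so the last two tapes
    # end up sharing the same rotation interval.
    return tapes[min(level, len(tapes) - 1)]


print("".join(hanoi_tape(d) for d in range(1, 17)))  # prints ABACABADABACABAE
```

Under these assumptions, tape A lands on every odd day, B on days 2, 6, 10, and so on, C on days 4, 12, 20, tape D on days 8, 24, 40, and tape E on days 16, 32, 48, which matches the intervals given in the text.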
Notice that the pattern recycles itself every 31 days (one month), with the use of either tape D or E between the cycles. If you use fewer than five tapes, the cycle repeats itself every 15 days, which doesn't "map" nicely to the requirement of monthly backups. In the case where five tapes are used (as in the example presented here), tapes D and E are alternated in their usage within the cycle, so they are used once every 16 days. This difference is shown in bold in Table 10.4.

Some Tips and Tricks

Having chosen a backup media rotation scheme does not mean you now have a viable backup strategy. You also need to decide what to back up, how often to back up, and how best to keep track of and safeguard your backup tapes. The following are some points to consider:
NOTE A root filesystem generally contains everything needed to support a functional Linux system, such as the following:
Before we discuss the actual backup tools, there is one more topic to consider as part of your backup strategy: how best to back up a database application, such as Oracle or MySQL.

Database Backups: Cold or Hot?

When you are backing up files belonging to a database application, such as Oracle or MySQL, or applications that constantly keep certain files open, such as Lotus Notes, you need to give the task more thought than you would when backing up typical documents, such as OpenOffice files.

There are two methods of performing a backup on a database: cold and hot. In a cold backup, an application is taken offline, which means there's no user access to the data, and the data is backed up; this is the way backups are normally done. In a hot backup, on the other hand, the application remains online, and user access is retained while the backup is performed.

A cold backup is usually the optimal solution for those applications that can tolerate multiple hours of downtime to perform the backup. Some applications that used to be backed up cold have now grown so large that the backup cannot be completed during the allotted time window. If a cold backup is still desired, one approach is to stop the application, take a point-in-time "snapshot" of the data, and bring the application back online within a matter of minutes (depending on the size of the data files involved). The snapshot can then be mounted back onto the application server, or mounted directly to the backup server, and backed up. Total downtime for the application in such a case is the time required to stop the application, perform the snapshot, and then restart the application.

NOTE To take a snapshot of an application's database, either you need an application that provides this feature, or you need to obtain additional software and/or hardware.

TIP It is possible to create a snapshot device that is an alias for an existing Logical Volume (LV).
The snapshot device, which can be accessed read-only, contains a point-in-time image of the LV; in other words, while applications continue to change the data on the LV, this logical device contains the unchanging image of the LV at the time when the snapshot was created. This makes it possible for you to do a consistent backup without shutting anything down or using any special software. This method is independent of any software because it happens in the LV Manager abstraction layer. SUSE has included a Logical Volume Manager since SUSE LINUX 6.3. For details on performing backups using LVM snapshots, see http://www.tldp.org/HOWTO/LVM-HOWTO/index.html.

If you want to perform a hot backup on an application that has constantly open files, the application must have a hot backup feature, and the backup software needs hot backup support for the specific application. Generally speaking, in hot backup mode, instead of writing to the live data, the application queues up the updates in a special file so the backup software can get a complete backup of the database. The special file is backed up next. After this is done, the application is then allowed to apply the queued-up changes to the database, thus bringing everything up to date.

Therefore, to decide whether you should perform a cold or hot backup of your database application files, you need to take the following factors into consideration:
If you have a small downtime window but sufficient disk space, the Logical Volume snapshot feature may be an option.
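To make the snapshot approach more concrete, the following Python sketch assembles the sequence of commands that an LVM snapshot backup typically involves: create the snapshot, mount it read-only, archive it, unmount it, and remove the snapshot. It only builds and prints the commands; nothing is executed. The volume group, logical volume, snapshot size, mount point, and archive path are all hypothetical placeholders, and the choice of `tar` as the archiver is an assumption. The snapshot must be sized to hold the changes written to the original volume while the snapshot exists, and the mount point is assumed to exist already.

```python
import shlex


def snapshot_backup_commands(
    vg: str, lv: str, snap_size: str, mount_point: str, archive: str
) -> list[list[str]]:
    """Build (but do not run) the commands for an LVM snapshot backup."""
    snap_name = f"{lv}_snap"  # hypothetical naming convention
    origin = f"/dev/{vg}/{lv}"
    snapshot = f"/dev/{vg}/{snap_name}"
    return [
        # Create a point-in-time snapshot of the logical volume.
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap_name, origin],
        # Mount the snapshot read-only so the backup cannot alter it.
        ["mount", "-o", "ro", snapshot, mount_point],
        # Archive the frozen image while the live volume keeps changing.
        ["tar", "-czf", archive, "-C", mount_point, "."],
        # Clean up: unmount and discard the snapshot.
        ["umount", mount_point],
        ["lvremove", "--force", snapshot],
    ]


for cmd in snapshot_backup_commands("vg0", "dblv", "1G", "/mnt/dbsnap", "/backup/db.tar.gz"):
    print(shlex.join(cmd))
```

If the application is stopped only for the first command and restarted immediately afterward, the downtime is limited to the time it takes to create the snapshot, as described earlier for cold backups.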