11.6 Network Backup Systems

So far, we've considered only backups and restores of disks on a local computer system. However, many organizations need to take a more unified and comprehensive approach to their total backup needs. We will consider various available solutions for this problem in this section.

11.6.1 Remote Backups and Restores

The simplest way to move beyond the single-system backup view is to consider remote backups and restores. It is very common to want to perform a backup over the network. The reasons are varied: your system may not have a tape drive at all (not all systems come with one by default anymore), there may be a better (faster, higher capacity) tape drive on another system, and so on.

Most versions of dump and restore can perform network-based operations (Tru64 requires you to use the separate rdump and rrestore commands). This is accomplished by specifying a device name of the form host:local_device as an argument to the -f option. The hostname may also optionally be preceded by a username and at-sign; for example, -f chavez@hamlet:/dev/rmt1 performs the operation on device /dev/rmt1 on host hamlet as user chavez.

This capability uses the same network services as the rsh and rcp commands. Remote backup facilities depend on the daemon /usr/sbin/rmt (which is often linked to /etc/rmt).^[18] To be allowed access on the remote system, there needs to be a .rhosts file in its root directory, containing at least the name of the (local) host from which the data will come. This file must be owned by root, and its mode must not allow any access by group or other users (for example, 400). This mechanism has the usual negative security implications of .rhosts-based access (see Section 7.6).

^[18] On a few older systems, you'll need to create the link yourself.

Some versions of the tar command can also use the rmt remote tape facility.

The HP-UX fbackup and frestore utilities accept remote tape drives as arguments to the normal -f option. For example:

# fbackup -0u -f backuphost:/dev/rmt/1m -i /chem

11.6.2 The Amanda Facility

Amanda is the Advanced Maryland Automated Network Disk Archiver. It was developed at the University of Maryland (James da Silva was the initial author). The project's home page is http://www.amanda.org, where it can be obtained free of charge. This section provides an overview of Amanda. Consult Chapter 4 of Unix Backup and Recovery for a very detailed discussion of all of Amanda's features (this chapter is also available on the Amanda home page).

11.6.2.1 About Amanda

Amanda allows backups from a network of clients to be sent to a single designated backup server. The package operates by functioning as a wrapper around native backup software like GNU tar and dump. It can also back up files from Windows clients via the Samba facility (smbtar). It has a number of nice features:

It uses its own network protocols and thus does not suffer from the security problems inherent in the rmt approach.
It supports many common tape and other backup devices (including stackers and jukeboxes).
It can perform full and incremental backups and decide the backup level automatically based on specified configuration parameters.
It can take advantage of hardware compression features, or it can compress archives prior to writing them to tape (or other media) when the former is not available. Software compression may be performed either by the main server or by the client system.
It provides excellent protection against accidental media overwriting.
It can use holding disks as intermediate storage for backup archives to maximize tape write performance and to ensure that data is backed up in spite of tape errors (allowing the backup set to be written to backup media at a later time).
It can use Kerberos-based authentication in addition to providing its own authentication scheme. Kerberos encryption can also be used to protect the data as it is transmitted across the network.

At present, Amanda does have a couple of annoying limitations:

It cannot split a backup archive across multiple tapes. When it encounters an end-of-tape mark while saving a backup archive, it begins writing the archive from the beginning on the next tape.
It cannot produce individual backup archives larger than a single tape. This is a consequence of the first limitation.
Only a single backup server is supported.

11.6.2.2 How Amanda works

Amanda uses a combination of full and incremental backups to save all of the data for which it is responsible, using the smallest possible daily backup set that can do so. Its scheme first computes the total amount of data to be backed up. It uses this total, along with a couple of parameters defined by the system administrator, to figure out what to do in the current run. These are the key parameters:

The number of runs in a backup cycle: At a rate of one Amanda run per day, this corresponds to the desired number of days between full backups.
The percentage of data that changes between Amanda runs: In the single run per day case, this is the percentage of the data that changes each day.

Amanda's overall strategy is twofold: to complete a full backup of the data within each cycle and to be sure that all changed data has been backed up between full dumps. The traditional method of doing this is to perform a full backup followed by incrementals on the days between full backups. Amanda operates differently.

Each run (night), Amanda performs a full backup of part of the data, specifically, the fraction that is required to back up the entire data set in the course of a complete backup cycle. For example, if the cycle is 7 days long (with one run per day), 1/7 of the data must be backed up each day to complete a full backup in 7 days. In addition to this "partial" full backup, Amanda also performs incremental backups for all data that has changed since its own last full backup.

Figure 11-1 illustrates an Amanda backup cycle lasting 4 days, in which 15% of the data changes from day to day. The box at the top of the figure stands for the complete set of data for which Amanda is responsible; we have divided it into four segments to represent the part of the data that gets a full backup at the same time.

Figure 11-1. The Amanda backup scheme

The contents of the nightly backups are shown at the bottom of the figure. The first three days represent a start-up period. On the first night, the first quarter of the data is fully backed up. On the second night, the second quarter is fully backed up, and the 15% of the data from the previous night that changed during day 2 is also saved. On day 3, the third quarter of the total data is fully backed up, as well as the changed 15% of day 2's backup. In addition, 15% of the portion backed up on the first night is written for each of the intervening nights since its full backup: in other words, 30% of that quarter of the total data.

By day 4, the normal schedule is in force. Each night, one quarter of the total data is backed up in full, and incrementals are performed for each of the other quarters as appropriate to the time that has passed since their last full backup.

This example uses only first-level incremental backups. In actual practice, Amanda uses multiple levels of incremental backups to minimize backup storage requirements.

To restore files from an Amanda backup, you may need one complete cycle of media.

Let's now consider a numeric example. Suppose we have 100 GB of data that we need to back up. Table 11-3 illustrates four Amanda backup schedules based on differing cycle lengths and per-day change percentages.

Table 11-3. Sample Amanda backup sizes (total data=100 GB)
                     3-day cycle,  5-day cycle,  7-day cycle,  7-day cycle,
                     10% change    10% change    10% change    15% change
Full portion         33.3          20.0          14.3          14.3
1st previous day     3.4           2.0           1.4           2.2
2nd previous day     6.8           4.0           2.8           4.4
3rd previous day                   6.0           4.2           6.6
4th previous day                   8.0           5.6           8.8
5th previous day                                 7.0           11.0
6th previous day                                 8.4           13.2
Daily size (GB)      43.5          40.0          43.7          60.5

The table columns illustrate the data that would comprise each daily backup, breaking it down by the full backup portion and the incremental data from each previous full backup within the cycle.
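The way a column of Table 11-3 is built up can be sketched with a short shell function. This is only an illustration of the simplified model used in the table (first-level incrementals and a constant change rate); the function name amanda_breakdown and its argument order are invented for this example and are not part of Amanda.

```shell
#!/bin/sh
# amanda_breakdown T P N  (hypothetical helper, not an Amanda command)
#   T = total data (GB), P = per-run change rate as a decimal, N = runs per cycle
# Prints the full-backup portion, the incremental for each earlier portion,
# and the resulting daily total.
amanda_breakdown() {
    awk -v T="$1" -v p="$2" -v n="$3" 'BEGIN {
        full = T / n                      # portion that gets a full backup this run
        total = full
        printf "full portion: %.1f\n", full
        for (k = 1; k < n; k++) {         # portion fully backed up k runs ago
            incr = full * p * k           # k runs of accumulated changes
            total += incr
            printf "previous day %d: %.1f\n", k, incr
        }
        printf "daily size: %.1f\n", total
    }'
}

# The 5-day, 10% column of the table (total data = 100 GB):
amanda_breakdown 100 0.10 5
```

For other cycle lengths, the output may differ slightly from the table, which rounds its intermediate values.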

Note that Amanda computes what should be backed up every time it is run, so it is not as static as the preceding examples suggest, but the examples nevertheless provide a general picture of how the facility operates.

In the next section, we consider how the backup size depends on the backup cycle more formally, including some expressions that can be used to decide on an appropriate backup cycle for specific conditions.

NOTE

You can use the find command to help estimate the daily change rate:

$ find dir -newer /var/adm/yesterday -ls | \
    awk '{sum+=$7}; END {print "diff =",sum}'

Repeat the command as needed to cover all the data to be backed up. Use touch to update the time for the file /var/adm/yesterday after all the find commands are run.

Then, divide this value by the total used space (e.g., taken from df output). Repeat the process for several days or weeks to determine an average rate.
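The procedure in this note can be wrapped in two small shell functions. This is a sketch under a few assumptions: the function names changed_bytes and change_fraction are invented here, find is assumed to support -ls with the size in the seventh column (as in the command above), and only regular files are counted so that directory entries do not inflate the sum.

```shell
#!/bin/sh
# changed_bytes DIR STAMP  (hypothetical helper)
# Total size in bytes of regular files under DIR modified more recently than STAMP.
changed_bytes() {
    find "$1" -type f -newer "$2" -ls |
        awk '{ sum += $7 } END { printf "%.0f\n", sum }'
}

# change_fraction CHANGED TOTAL  (hypothetical helper)
# CHANGED as a decimal fraction of TOTAL; both must be in the same units.
change_fraction() {
    awk -v c="$1" -v t="$2" 'BEGIN { printf "%.3f\n", c / t }'
}

# Typical use (paths are examples only):
#   c=$(changed_bytes /chem /var/adm/yesterday)
#   change_fraction "$c" "$total_used_bytes"
#   touch /var/adm/yesterday
```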

11.6.2.3 Doing the math

Next, we consider some expressions that can be used to compute starting parameters for Amanda (which can be fine-tuned over time, based on actual use). If this sort of mathematical analysis is of no interest to you, just skip this section.

We will use the following variables:

T = total amount of data

p = percentage change between runs (in decimal form: e.g. 12%=0.12)

n = number of runs in a complete cycle (often days)

S = amount of data that must be backed up every run (day)

F = fraction of the total data that must be backed up every run (day): S/T

To compute the per-run amount of data that must be backed up, use this expression for S:

$$S = \frac{T}{n} + \frac{T\,p\,(n-1)}{2}$$

For example, 70 GB of data that changes by 10% per day using a one-week backup cycle requires that 31 GB be backed up every night (70/7 + 70 x 0.1 x 6/2 = 10 + 42/2 = 10 + 21 = 31). If 31 GB is larger than the maximum capacity that you have in the available time, you'll need to adjust the other parameters (see below).
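The arithmetic in this example (the full portion T/n plus the accumulated incrementals T x p x (n-1)/2) is easy to script. The following is a minimal sketch; the function name amanda_run_size is invented for illustration.

```shell
#!/bin/sh
# amanda_run_size T P N  (hypothetical helper)
#   T = total data, P = per-run change rate as a decimal, N = runs per cycle
# Prints S, the amount of data to back up per run, in the same units as T.
amanda_run_size() {
    awk -v T="$1" -v p="$2" -v n="$3" \
        'BEGIN { printf "%.1f\n", T / n + T * p * (n - 1) / 2 }'
}

amanda_run_size 70 0.10 7    # the example above; prints 31.0
```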

Alternatively, if you have a fixed amount of backup capacity per run, you can figure out the required cycle length. Refer to the discussion of capacity planning earlier in this chapter for information on determining how much capacity you have.

To compute n for a given nightly capacity, use this expression:

$$n = \frac{x \pm \sqrt{x^2 - 2p}}{p}$$

where

$$x = \frac{p}{2} + \frac{S}{T}$$

We have introduced the variable x to make the expression for n simpler. Suppose that you have a nightly backup capacity of 40 GB for the same scenario (70 GB total data, changing at 10% per day). Then x = 0.1/2 + 40/70 = 0.05 + 0.57 = 0.62. We can now compute $n = (0.62 \pm \sqrt{0.62^2 - 0.20})/0.1 \approx 6.2 \pm 4.3$.

This calculation yields solutions of 2 and 11 (rounding to integers). We can either do full backups of about half the data every night or use a much longer 11-day cycle and still be able to get the backups all done. Note that these values take maximum advantage of the available capacity.

Now suppose that you have a nightly backup capacity of only 20 GB for the same scenario (70 GB total data, changing at 10% per day). Then x = 0.1/2 + 20/70 = 0.05 + 0.29 = 0.34. We can now compute $n = (0.34 \pm \sqrt{0.34^2 - 0.20})/0.1$. The square-root term is now imaginary (since 0.12-0.20 is negative), indicating that this proposed configuration will not work in practice.^[19] The available capacity is simply too small.

^[19] Mathematically, there are no real solutions to the underlying quadratic equation.
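Both capacity scenarios can be checked with a few lines of shell. The sketch below solves the quadratic (p/2)n^2 - xn + 1 = 0, with x = p/2 + S/T, that follows from the per-run size S = T/n + Tp(n-1)/2; the function name amanda_cycle_lengths is invented for this example.

```shell
#!/bin/sh
# amanda_cycle_lengths T P S  (hypothetical helper)
#   T = total data, P = per-run change rate as a decimal, S = per-run capacity
# Prints the two cycle lengths that exactly use capacity S,
# or "no solution" when the capacity is too small for any cycle length.
amanda_cycle_lengths() {
    awk -v T="$1" -v p="$2" -v S="$3" 'BEGIN {
        x = p / 2 + S / T
        d = x * x - 2 * p                 # term under the square root
        if (d < 0) { print "no solution"; exit }
        r = sqrt(d)
        printf "%.1f %.1f\n", (x - r) / p, (x + r) / p
    }'
}

amanda_cycle_lengths 70 0.10 40    # prints 1.9 10.5
amanda_cycle_lengths 70 0.10 20    # prints no solution
```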

In general, you can compute the minimum per-run capacity for a given per-run percentage change (p) with this expression (in which F is the minimum fraction of the total data that must be backed up):

$$F = \sqrt{2p} - \frac{p}{2}$$

F indicates the fraction of the data that must be backed up each run in order for the system to succeed. So, in our case of a 10% change rate, $F = \sqrt{0.2} - 0.05 \approx 0.40$. Note that this expression is independent of T (the total backup data); whenever the data changes by about 10% per run, you must be able to back up at least 40% of the total data every run for success. In our case, this corresponds to a minimum nightly capacity of 0.4 x 70 = 28 GB.

Alternatively, you can compute the run cycle n that is required to minimize F (and thus S) for a given value of p with this expression:^[20]

$$n = \sqrt{\frac{2}{p}}$$

^[20] Mathematically, the value of n where $\partial F / \partial n = 0$. In this specific example, the region around the minimum is quite flat.

In our case, the cycle period which minimizes the amount of data to be backed up is $n = \sqrt{2/0.1} = \sqrt{20} \approx 4.5$. Again, this value is independent of the amount of data. In our case, when the data is changing by 10% per day, a cycle time of about 5 days will minimize the amount of data that must be backed up every night. This is the most efficient cycle length with the minimum nightly backup capacity.
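As a quick check, the optimum can be computed directly from the change rate. This sketch assumes the model used throughout this section, in which the per-run fraction is F = 1/n + p(n-1)/2, so that the minimum lies at n = sqrt(2/p) with F = sqrt(2p) - p/2; the function name amanda_optimum is invented for this example.

```shell
#!/bin/sh
# amanda_optimum P  (hypothetical helper)
#   P = per-run change rate as a decimal
# Prints the cycle length that minimizes the per-run backup size, followed by
# the corresponding minimum fraction of the total data per run.
amanda_optimum() {
    awk -v p="$1" 'BEGIN { printf "%.2f %.3f\n", sqrt(2 / p), sqrt(2 * p) - p / 2 }'
}

amanda_optimum 0.10    # prints 4.47 0.397
```

Because the minimum is so flat, cycles of 4 and 5 days both give a per-run fraction of 0.40 at a 10% change rate.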

Thus, both the optimal cycle length and the minimum per-run fraction of data to back up are determined only by the rate at which the data is changing, and the actual per-run backup size for a given amount of total backup data can be easily computed from them. Having an accurate estimate for p is therefore vital to rational planning.
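For readers who want to see where these expressions come from, the following derivation is a sketch based on the simplified model used in this section: every run contains a full backup of T/n plus first-level incrementals for the other n-1 portions. It is a reconstruction from that model, not something taken from the Amanda documentation.

```latex
\begin{gather*}
F = \frac{S}{T} = \frac{1}{n} + \frac{p}{n}\sum_{k=1}^{n-1} k
  = \frac{1}{n} + \frac{p\,(n-1)}{2} \\[4pt]
\text{Multiplying by } n:\quad
\frac{p}{2}\,n^{2} - x\,n + 1 = 0, \qquad x = \frac{p}{2} + \frac{S}{T} \\[4pt]
n = \frac{x \pm \sqrt{x^{2} - 2p}}{p}
  \qquad (\text{real only if } x^{2} \ge 2p) \\[4pt]
\frac{\partial F}{\partial n} = -\frac{1}{n^{2}} + \frac{p}{2} = 0
  \;\Longrightarrow\; n = \sqrt{\frac{2}{p}}, \qquad
F_{\min} = \sqrt{2p} - \frac{p}{2}
\end{gather*}
```

The condition for real solutions, x^2 >= 2p, is the same statement as F >= sqrt(2p) - p/2, which is why a capacity below that fraction cannot work for any cycle length.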

This discussion ignores compression in analyzing backup procedures. If your tape drive can compress data, or if you decide to compress it with software before writing it to tape, you will need to take the expected compression factor into account in your computations.

11.6.2.4 Configuring Amanda

Building and installing Amanda is generally straightforward, and the process is well-documented, so we will not consider it here.

The Amanda system includes the following components:

Client programs, of which amandad is the most important. This daemon communicates with the Amanda server during backup runs, calling other client programs as appropriate: selfcheck (verify local Amanda configuration), sendsize (estimate backup size), sendbackup (perform backup operations), and amcheck (verify Amanda setup). These programs are part of the Amanda client system; on the Amanda server, these programs are found with the package's other helper programs, in /usr/local/lib/amanda or /usr/lib/amanda.
Server programs to perform the various phases of the actual backup operations. The amdump program is the one that initiates an Amanda run, and it is usually run periodically from cron. It controls a number of other programs, including planner (determine what to back up), driver (interface to device), dumper (communicate with client amandad processes), taper (write data to media), and amreport (prepare report for an Amanda run).
Administrative utilities to perform related tasks. They include amcheck (verify Amanda configuration is valid and the facility is ready to run), amlabel (prepare media for use with Amanda), amcleanup (clean up after an aborted run or system crash), amflush (force data from the holding area to backup media), and amadmin (perform various administrative functions).
Configuration files that specify Amanda operations, such as what to back up and how often to do so, as well as the locations and characteristics of the tape device. These files are amanda.conf and disklist, and they reside in a subdirectory of the main Amanda directory (canonically, this location is /usr/local/etc/amanda, but it can be /etc/amanda when the package is preinstalled). A typical name is Daily. Each subdirectory corresponds to an Amanda "configuration," a distinct set of settings and options referred to by the directory name.
The amrestore utility, which can be used to restore data from Amanda backups. In addition, the amrecover utility supports interactive file restoration. It relies on a couple of daemons to do its job: amindexd and amidxtaped.

11.6.2.4.1 Setting up an Amanda client

Once you have installed the Amanda software on a client system, there are a few additional steps to take. First, you must add entries to the /etc/inetd.conf and /etc/services files to enable support for the Amanda network services:

/etc/services:
amanda    10080/udp

/etc/inetd.conf:
amanda  dgram   udp   wait   amanda   /path/amandad   amandad

The Amanda daemon runs as user amanda in this example; you should use whatever username you specified when you installed the Amanda software.

In addition, you'll need to ensure that all the data that you want to be backed up is readable by the Amanda user and group. Similarly, the file /etc/dumpdates must exist and be writeable by the Amanda group.

Finally, you must set up the authorization scheme that amandad will use. This is usually selected at compile time. You may use normal .rhosts-based authentication, Kerberos authentication (see below) or a separate .amandahosts (the default mechanism). The .amandahosts file is similar to a .rhosts file, but it applies only to the Amanda facility and so does not carry the same level of risk. Consult the Amanda documentation for full information about authentication options.

11.6.2.4.2 Selecting an Amanda server

Selecting an appropriate system as the Amanda server is crucial to good performance. You should keep the following items in mind:

The system should have the best tape drives (or other backup devices) possible.
The system should have sufficient network bandwidth for the estimated data flow.
The system should have sufficient disk space for the holding area. A good size is at least twice the size of the largest per-run dump size.
If the server will be performing software compression on the data, a fast CPU is necessary.
Large amounts of memory will have little effect on backup performance, so there is no reason to overconfigure the system with memory.

11.6.2.4.3 Setting up the Amanda server

There are several steps necessary to configure the Amanda server once the software is installed. First of all, you must add entries to the same network configuration files as those for Amanda clients:

/etc/services:
amanda        10080/udp
amandaidx     10082/tcp
amidxtape     10083/tcp

/etc/inetd.conf:
amandaidx  stream  tcp  nowait  amanda /path/amindexd   amindexd
amidxtape  stream  tcp  nowait  amanda /path/amidxtaped amidxtaped

Next, you must configure Amanda by creating the required configuration files. Create a new subdirectory under etc/amanda in the top-level Amanda directory (i.e., /usr/local or /), if necessary. We will use Daily as our example. Then, create and modify amanda.conf and disklist configuration files in this subdirectory (the Amanda package contains example files that can be used as a starting point).

We will begin with amanda.conf, examining an annotated sample file in groups of related entries.

The initial entries in the file typically specify information about the local site and locations of important files:

org "ahania.com"                # Organization name for reports.
mailto "amanda-rep"             # Mail reports to this user.
dumpuser "amanda"               # Amanda user account.
printer "tlabels"               # Printer for tape labels.
logdir "/var/log/amanda"        # Put log files here.
indexdir "/var/adm/amindex"     # Store backup set index data here.

The next few entries specify the basic parameters for the backup procedure:

# fundamental parameters
dumpcycle 7 days     # Length of the backup cycle (default=10 days).
runspercycle 5       # Amanda runs per cycle (if < 1/day).
# network-related resource settings
netusage 400 kps     # Maximum network bandwidth (default=300).
inparallel 20        # Max. simultaneous backups (default=10).
ctimeout 120         # Client timeout period (default=30 seconds).
# incremental level bump parameters
bumpsize 20 mb       # Min. savings for level 2 incrs. (default=10).
bumpdays 1           # Required # days at each level (default=2).
bumpmult 2           # Multiply bumpsize by this for each higher
                     # incremental level (default=1.5).

The incremental bump level parameters specify when Amanda should increase the incremental backup level in order to make the backup set size smaller. Using these settings, Amanda will switch from level 1 incrementals to level 2 incrementals whenever it will save at least 20 MB of space. The multiplication factor has the effect of requiring additional savings to move to each higher incremental level. The threshold for each level is this factor times the saving required for the previous level, i.e., 40 MB for levels 2 to 3, 80 MB for levels 3 to 4, and so on. This strategy is designed to ensure that the added complexity of multiple levels of incremental backups also brings significant savings in the size of the backup set.
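The progression of thresholds described here can be tabulated for any pair of settings. The sketch below simply applies the rule as stated in the text (each threshold is bumpmult times the previous one, starting from bumpsize); the function name bump_thresholds is invented for illustration and is not an Amanda utility.

```shell
#!/bin/sh
# bump_thresholds BUMPSIZE BUMPMULT MAXLEVEL  (hypothetical helper)
# Lists the minimum savings (MB) required to move from each incremental
# level to the next, up to MAXLEVEL.
bump_thresholds() {
    awk -v size="$1" -v mult="$2" -v max="$3" 'BEGIN {
        for (lvl = 1; lvl < max; lvl++) {
            printf "level %d -> %d: %g MB\n", lvl, lvl + 1, size
            size *= mult                  # next level needs mult times the savings
        }
    }'
}

# Settings from the sample amanda.conf entries above (bumpsize 20 mb, bumpmult 2):
bump_thresholds 20 2 4
```

With these settings, the output lists 20, 40, and 80 MB for the first three transitions, matching the values given in the text.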

These next entries specify information about the tape drive and media to use:

# number of tapes in use
tapecycle 25                     # Set to at least # tapes required for one full
                                 # cycle plus a few spares (default=15).
labelstr "DAILY[0-9][0-9]*"      # Format of the tape labels (regular expression).
tapedev "/dev/rmt/0"
tapetype "DLT"
#changerdev "/dev/whatever"
#tpchanger "script-path"         # Script to change to next tape (supplied).
#runtapes 4                      # Maximum number of tapes per run.

The first two entries specify the number of tapes in use and the pattern used by their electronic labels. Note that tapes must be prepared with amlabel prior to use (discussed below).

The next two entries specify the location of the tape drive and its type. The final three entries are used with tape changers and are commented out in this example. Only one of tapedev and tpchanger should be used.

Tape types are defined elsewhere in the configuration file with stanzas like this:

define tapetype DLT {
    comment "DLT with 10 GB tapes"
    length 12500 mb     # Tape capacity (takes compression into account).
    speed 1536 kps      # Drive speed.
    lbl-templ "file"    # PostScript template file for printed labels.
}

The example configuration file includes many defined tape types. The length and speed parameters are used only for estimation purposes (e.g., how many tapes will be required). When performing the actual data transfer to tape, Amanda will keep writing until it encounters an end-of-tape mark.

The following entry and holdingdisk stanza define a disk holding area:

# When media is unavailable, save this % of holding space
# for degraded-mode incremental backups.
reserve 50                # Default is 100%.
holdingdisk amhold0 {     # Name is amhold0.
   comment "Primary holding disk"
   directory "/scratch/amanda"
   # amount of space to use (+) or save (-); 0=use all (default)
   use -2 Gb              # Always leave this much space.
}

More than one holding disk may be defined.

The final task to be done in the configuration file is to define various dump types: generalized backup actions having specific characteristics (but independent of the data to be backed up). Here is an example for the normal backup type (you can choose any names you like):

define dumptype normal {
   comment "Ordinary backup"
   holdingdisk yes     # Use a holding disk.
   index yes           # Maintain index info on contents.
   program "DUMP"      # Backup command.
   priority medium     # Specify backup relative priority.
   # use 24-hour clock without punctuation
   starttime 2000      # Don't begin backup before this time (8 P.M. here).
}

This dump type uses a holding disk, creates an index of the backup set contents for interactive restoration, and uses the dump program to perform the actual backup. It runs at medium priority compared to other backups (the possibilities are low (0), medium (1), high (2), or an arbitrary integer, with higher numbers meaning the backup will be performed sooner). Backups using this method will not begin before 8 P.M. regardless of when the amdump command is issued.

Amanda provides several pre-defined dump types in the example amanda.conf file which can be used or customized as desired.

Here are some other parameters that are useful in dump type definitions:

program "GNUTAR"           # Use the GNU tar program for backups.
                           # This is also the value to use for Samba backups.
exclude ".exclude"         # GNU tar exclusion file (located in top-level
                           # of the filesystem to be backed up).
compress server "fast"     # Use software compression on server using the
                           # fastest compression method. Other keywords are
                           # "client" and "best".
auth "krb4"                # Use Kerberos 4 user authentication.
kencrypt yes               # Encrypt transmitted data.
ignore yes                 # Do not run this backup type.

Amanda's disklist configuration file specifies the actual filesystems to be backed up. Here are some sample entries:

# host     partition      dumptype    spindle
hamlet      sd1a           normal        -1
hamlet      sd2a           normal        -1
dalton      /chem          srv_comp      -1
leda        //leda/e       samba         -1   # Win2K system
astarte     /data1         normal         1
astarte     /data2         normal         1
astarte     /home          normal         2   # on a different spindle

The columns in this file hold the hostname, disk partition (specified by file in /dev, full special file name, or mount point), the dump type, and a spindle parameter. The latter serves to control which backups can be done at the same time on a host. A value of -1 says to ignore this parameter. Other values define backup groups within a host, normally corresponding to physical disks; Amanda will not run two backups from the same group at the same time. For example, on host astarte, the /data1 and /data2 filesystems will never be backed up simultaneously, while /home may be backed up in parallel with either of them if Amanda so wishes.

There are a few final steps that are needed to complete the Amanda server setup:

Prepare media with the amlabel command. For example, the following command will prepare a tape labeled "DAILY05" for use with the Amanda configuration named Daily:
```
$ amlabel Daily DAILY05
```
Similarly, the following command will prepare the tape in slot 5 of the associated tape device as "CHEM101" for use with the Chem configuration:
```
$ amlabel Chem CHEM101 slot 5
```
Use the amcheck command to check and verify the Amanda configuration.
Create a cron job for the Amanda user to run the amdump command on a regular basis (e.g., nightly). This command takes the desired configuration as its argument.
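As an illustration of the last two steps, crontab entries for the Amanda user might look like the following. The schedule and the /usr/local/sbin installation path are assumptions made for this sketch; substitute the locations and times appropriate to your site. The amcheck -m option mails a report only if problems are found, which leaves time to load the right tape before the nightly run.

```
# Weekday afternoon: verify the Daily configuration and mail any problems.
0 16 * * 1-5   /usr/local/sbin/amcheck -m Daily
# Early the following morning: perform the Amanda run for the Daily configuration.
45 0 * * 2-6   /usr/local/sbin/amdump Daily
```

Five runs per week is consistent with the dumpcycle and runspercycle settings in the sample amanda.conf entries shown earlier.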

Amanda expects the proper tape to be in the tape drive when the backup process begins. You can determine the next tape needed for the Daily configuration by running the following command:

# amadmin Daily tape

The Amanda system will need some ongoing administration, including tuning and cleanup. The latter is accomplished via the amflush and amcleanup commands. amflush is used to force the data in the holding disk to backup media, and it is typically required after a media failure occurs during an Amanda run. In such cases, the backup data is still written to the holding disk. The amcleanup command needs to be run after an Amanda run aborts or after a system crash.

Finally, you can temporarily disable an Amanda configuration by creating a file named hold in the corresponding subdirectory. While this file exists, the Amanda system will pause. This can be used to keep the configuration information intact in the event of a hardware failure on the backup device or a device being temporarily needed for another task.

11.6.2.5 Amanda reports and logs

The Amanda system produces a report for each backup run and sends it by electronic mail to the user specified in the amanda.conf configuration file. The reports are quite detailed and contain the following sections:

The dump date and time and estimated media requirements:

These dumps were to tape DAILY05.
Tonight's dumps should go onto one tape: DAILY05.

A summary of errors and other aberrations encountered during the run:
```
FAILURE AND STRANGE DUMP SUMMARY:
  dalton.ahania.com /chem lev 0 FAILED [request ... timed out.]
```
Host dalton was down so the backup failed.

Statistics about the run, including data sizes and write rates (output has been shortened):

STATISTICS:
                          Total      Full      Daily
                       --------  --------   --------
Dump Time (hrs:min)        2:48      2:21       0:27
Output Size (meg)        9344.3    7221.1     2123.2
Original Size (meg)      9344.3    7221.1     2123.2
Avg Compressed Size (%)     --        --         --
Tape Used (%)              93.4      72.2       21.2
Filesystems Dumped           10         2          8
Avg Dump Rate (k/s)      1032.1    1322.7      398.1
Avg Tp Write Rate (k/s)  1234.6    1556.2     1123.8

Additional information about some of the errors/aberrations, when available.

Informative messages from the various subprograms called by amdump:

NOTES:
   planner: Adding new disk hamlet.ahania.com:/sda2
   taper: tape DAILY05 9568563 kb fm 1 [OK]

A summary table listing the data that was backed up and related information:

DUMP SUMMARY:
                        DUMPER STATS              TAPER STATS
HOST   DISK  L ORIG-KB OUT-KB COMP% MMM:SS  KB/s MMM:SS  KB/s
-------------------------------------------------------------
hamlet sd1a  1   28255  28255   --    2:36 180.3  0:21 1321.1
hamlet sd2a  0  466523 466523   --   36:51 211.1  5:33 1400.8
dalton /chem 1  FAILED---------------------------------------
ada    /home 1   39781  39781   --    5:16 125.7  0:29 1356.7
...

You should examine the reports regularly, especially the sections related to errors and performance.

Amanda also produces log files for each run, amdump.n, and log.date.n, located in the designated log file directory. These are more verbose versions of the email report, and they can be helpful in tracking some sorts of problems.

11.6.2.6 Restoring files from an Amanda backup

Amanda provides the interactive amrecover utility for restoring files from Amanda backups. It requires that backup sets be indexed (using the index yes setting) and that the two indexing daemons mentioned previously be enabled. The utility must be run as root from the appropriate client system.

Here is a sample session:

# amrecover Daily
AMRECOVER Version 2.4.2. Contacting server on depot.ahania.com ...
...
Setting restore date to today (2001-08-12)
200 Working date set to 2001-08-14.
200 Config set to Daily.
200 Dump host set to astarte.ahania.com.
$CWD '/home/chavez/data' is on disk '/home' mounted at '/home'.
200 Disk set to /home.
amrecover> cd chavez/data
/home/chavez/data
amrecover> add jetfuel.jpg
Added /chavez/data/jetfuel.jpg
amrecover> extract
Extracting files using tape drive /dev/rmt0 on host depot...
The following tapes are needed: DAILY02
Restoring files into directory /home
Continue? [Y/n]: y
Load tape DAILY02 now
Continue? [Y/n]: y
warning: ./chavez: File exists
Warning: ./chavez/data: File exists
Set owner/mode for '.'? [yn]: n
amrecover> quit

In this case, the amrecover command is very similar to the standard restore command in its interactive mode.

The amrestore command can also be used to restore data from an Amanda backup. It is designed to restore entire images from Amanda tapes. See its manual page or the discussion in Unix Backup and Recovery for details on its use.

11.6.3 Commercial Backup Packages

There are several excellent commercial backup facilities available. An up-to-date list of current packages can be obtained from http://www.storagemountain.com. We won't consider any particular package here but, rather, briefly summarize the important features of a general-purpose backup package, which can potentially serve as criteria for comparing and evaluating any products your site is considering.

You should expect the following features from a high-end commercial backup software package suitable for medium-sized and larger networks:

The ability to define backup sets as arbitrary lists of files that can be saved and reloaded into the utility as needed.
A capability for defining and saving the characteristics and data comprising standard backup operations.
A facility for exclusion lists, allowing you to create, save, and load lists of files and directories to exclude from a backup operation (including wildcard specifications).
An automated backup scheduling facility accessed and controlled from within the backup utility itself.
The ability to specify default settings for backup and restore operations.
The ability to back up all important file types (e.g., device files, sparse files) and attributes (e.g., access control lists).
The ability to back up open files or to skip them entirely without pausing (at your option).
The ability to define and initiate remote backup and restore operations.
Support for multiple backup servers.
Support for high-end backup devices, such as stackers, jukeboxes, libraries and silos.
Support for tape RAID devices, in which multiple physical tapes are combined into a single high-performance logical unit via parallel write operations.
Support for non-tape backup devices, such as removable disks.
The capability to perform multiple operations to distinct tape devices simultaneously.
Support for multiplexed backup operations in which multiple data streams are backed up to a single tape device at the same time.
Support for clients running all of the operating systems in use at your site.
Compatibility with the standard backup utilities, which may be important to some sites (so that saved files can be restored to any system).
Facilities for automatic archiving of inactive files to alternate online storage devices (for example, jukeboxes of optical disks) to conserve disk space and reduce backup requirements.
Inclusion of some kind of database manager so that you (and the backup software) can perform queries to find the media needed to restore files.

See Chapter 5 of Unix Backup and Recovery for an extended discussion of commercial backup package features.