Hack 39 Automate Remote Backups



Make remote backups automatic and effortless.

One day, the IDE controller on my web server died, leaving the files on my hard disk hopelessly corrupted. I faced what I had known in the back of my mind all along: I had not been making regular remote backups of my server, and the local backups were of no use to me now that the drive was corrupted.

The reason, of course, was that remote backups weren't automatic and effortless. This was no one's fault but my own, but I was frustrated enough to write a tool that would make automated remote snapshots so easy that I would never have to worry about them again. Enter rsnapshot.

4.6.1 Installing and Configuring rsnapshot

Installation on FreeBSD is a simple matter of:

# cd /usr/ports/sysutils/rsnapshot
# make install

I didn't include the clean target here, as I'd like to keep the work subdirectory, which includes some useful scripts.

If you're not using FreeBSD, see the original HOWTO at the project web site for detailed instructions on installing from source.


The install process neither creates nor installs the config file. This means that there is absolutely no possibility of accidentally overwriting a previously existing config file during an upgrade. Instead, copy the example configuration file and make changes to the copy:

# cp /usr/local/etc/rsnapshot.conf.default /usr/local/etc/rsnapshot.conf

The rsnapshot.conf config file is well commented, and much of it should be fairly self-explanatory. For a full reference of all the various options, please consult man rsnapshot.

rsnapshot stores the filesystem snapshots in the directory named by the snapshot_root parameter, /.snapshots/ in the example configuration. This directory is referred to as the snapshot root, and it must live on a filesystem with plenty of free disk space.

Note that fields are separated by tabs, not spaces. This makes it easier to specify file paths with spaces in them.
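Fields separated by spaces are easy to miss by eye. As a rough sanity check, the sketch below lists every setting line that contains no tab at all. The find_untabbed name is a hypothetical helper, not something rsnapshot ships, and it cannot catch a line that mixes tabs and spaces between fields.

```shell
#!/bin/sh
# find_untabbed: hypothetical helper, not part of rsnapshot.
# Prints "line-number: text" for each setting line that contains no tab,
# which usually means its fields were separated with spaces by mistake.
# Limitation: a line with one tab and one space separator is not flagged.
find_untabbed() {
    awk '
        /^[ \t]*#/ { next }    # skip comment lines
        /^[ \t]*$/ { next }    # skip blank lines
        index($0, "\t") == 0 { printf "%d: %s\n", NR, $0 }
    ' "$1"
}
```

Running find_untabbed /usr/local/etc/rsnapshot.conf should print nothing for a correctly tabbed file.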


4.6.1.1 Specifying backup intervals

rsnapshot has no idea how often you want to take snapshots. In order to specify how much data to save, you need to tell rsnapshot which intervals to keep, and how many of each.

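The interval lines only say how many snapshots of each kind to keep; the schedule itself comes from cron, as described later. The example configuration assumes a snapshot every four hours, six times a day (the hourly intervals), and a second set of snapshots taken once a day and kept for a week (seven days):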

interval    hourly  6
interval    daily   7

Note that the hourly interval is specified first. This is very important, as the first interval line is assumed to be the smallest unit of time, with each additional line getting successively bigger. Thus, if you add a yearly interval, it should go at the bottom, and if you add a minutes interval, it should go before the hourly interval. It's also worth noting that the snapshots are pulled up from the smallest interval to the largest. In this example, the daily snapshots are pulled from the oldest hourly snapshot, not directly from the main filesystem.

The backup section tells rsnapshot which files you actually want to back up:

backup      /etc/      localhost/etc/

In this example, backup is the backup point, /etc/ is the full path to the directory we want to take snapshots of, and localhost/etc/ is a subdirectory inside the snapshot root where the snapshots are stored. If you are taking snapshots of several machines on one dedicated backup server, it's a good idea to use hostnames as directories to keep track of which files came from which server.

In addition to full paths on the local filesystem, you can also back up remote systems using rsync over ssh. If you have ssh enabled (via the cmd_ssh parameter), specify a path similar to this:

backup      backup@example.com:/etc/     example.com/etc/

This behaves fundamentally the same way as specifying local pathnames, but you must take a few extra things into account:

  • The ssh daemon must be running on example.com.

  • You must have access to the specified account on the remote machine (in this case, the backup user on example.com). See [Hack #38] for instructions on setting this up.

  • You must have key-based logins enabled for the specified user at example.com, without passphrases.

  • This backup occurs over the network, so it may be slower. Since this uses rsync, this is most noticeable during the first backup. Depending on how much your data changes, subsequent backups should go much faster.
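Before trusting cron with a remote backup, it can help to confirm that the login really works without a prompt. The sketch below relies on the OpenSSH BatchMode option, which makes ssh fail rather than ask for a password or passphrase. The check_key_login name and the SSH_CMD override variable are assumptions introduced for this illustration.

```shell
#!/bin/sh
# check_key_login: hypothetical helper, not part of rsnapshot.
# BatchMode=yes makes ssh fail instead of prompting, so a zero exit status
# means an unattended, key-based login works for this account.
# SSH_CMD is an assumption added here so the ssh binary can be overridden.
check_key_login() {
    "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

Run it as the same user that will run rsnapshot, for example check_key_login backup@example.com, and look at the exit status.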

One thing you can do to mitigate the potential damage from a backup server breach is to create alternate users on the client machines with their UIDs and GIDs set to 0, but with a more restrictive shell, such as scponly [Hack #63].


4.6.1.2 Preparing for script automation

With the backup_script parameter, the second column is the full path to an executable backup script, and the third column is the path, relative to the snapshot root, where you want to store the script's output. For example:

backup_script      /usr/local/bin/backup_pgsql.sh     localhost/postgres/

You can find the backup_pgsql.sh example script in the utils/ directory of the source distribution. Alternatively, if you didn't include the clean target when you installed the FreeBSD port, the file will be located in /usr/ports/sysutils/rsnapshot/work/rsnapshot-1.0.9/utils.


Your backup script only needs to dump its output into its current working directory. It can create as many files and directories as necessary, but it should not put its files in any predetermined path. This is because rsnapshot creates a temp directory, changes to that directory, runs the backup script, and then syncs the contents of the temp directory to the local path you specified in the third column. A typical backup script might look like this:

#!/bin/sh
/usr/bin/mysqldump -uroot mydatabase > mydatabase.sql
/bin/chmod 644 mydatabase.sql

There are a couple of example scripts in the utils/ directory of the rsnapshot source distribution to give you more ideas.
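To make the temp-directory flow concrete, here is a simplified imitation of it. This is only an illustration of the sequence described above, not rsnapshot's real implementation: the run_backup_script name is hypothetical, and cp stands in for the sync step.

```shell
#!/bin/sh
# run_backup_script: an illustration of the backup_script flow, NOT
# rsnapshot's real implementation. Both arguments are absolute paths.
#   $1  executable backup script
#   $2  destination directory (stands in for the third config column)
run_backup_script() {
    script=$1
    dest=$2
    tmpdir=$(mktemp -d) || return 1
    # Run the script inside the scratch directory; its output lands there.
    ( cd "$tmpdir" && "$script" ) || { rm -rf "$tmpdir"; return 1; }
    # Copy the scratch directory's contents to the destination (cp is a
    # stand-in for the sync step), then clean up.
    mkdir -p "$dest" && cp -Rp "$tmpdir"/. "$dest"/
    status=$?
    rm -rf "$tmpdir"
    return $status
}
```

Because the script runs with the scratch directory as its working directory, a script that writes to a fixed absolute path would bypass this flow entirely, which is why relative output paths matter.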

Remember that backup scripts will be invoked as the user running rsnapshot. Make sure your backup scripts are not writable by anyone else.


4.6.1.3 Testing your config file

After making your changes, verify that the config file is syntactically valid and that all the supporting programs are where you think they are:

# rsnapshot configtest

If all is well, the output should say Syntax OK. If there's a problem, it should tell you exactly what it is.

The final step to test your configuration is to run rsnapshot with the -t flag, for test mode. This will print out a verbose list of the things it will do, without actually doing them. For example, to simulate an hourly backup:

# rsnapshot -t hourly

4.6.1.4 Scheduling rsnapshot

Now that you have your config file set up, it's time to schedule rsnapshot to run from cron. Add the following lines to root's crontab:

0 */4 * * *       /usr/local/bin/rsnapshot hourly
30 23 * * *       /usr/local/bin/rsnapshot daily
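The hour field */4 in the first crontab line is what produces six hourly runs per day, matching the six hourly snapshots kept. As a small worked example, the hypothetical hours_for_step helper below lists the hours that such a step value matches.

```shell
#!/bin/sh
# hours_for_step: hypothetical helper that lists the hours matched by a
# cron hour field of the form */STEP (for example */4).
hours_for_step() {
    step=$1
    hour=0
    out=""
    while [ "$hour" -lt 24 ]; do
        out="$out${out:+ }$hour"
        hour=$((hour + step))
    done
    echo "$out"
}

hours_for_step 4    # prints: 0 4 8 12 16 20
```

If you change the step, adjust the number of hourly snapshots kept so that the set still covers a full day.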

4.6.2 The Snapshot Storage Scheme

All backups are stored within a configurable snapshot root directory. In the beginning it will be empty. rsnapshot creates subdirectories for the various defined intervals. After a week, the directory should look something like this:

# ls -l /.snapshots/
drwxr-xr-x    7 root     root         4096 Dec 28 00:00 daily.0
drwxr-xr-x    7 root     root         4096 Dec 27 00:00 daily.1
drwxr-xr-x    7 root     root         4096 Dec 26 00:00 daily.2
drwxr-xr-x    7 root     root         4096 Dec 25 00:00 daily.3
drwxr-xr-x    7 root     root         4096 Dec 24 00:00 daily.4
drwxr-xr-x    7 root     root         4096 Dec 23 00:00 daily.5
drwxr-xr-x    7 root     root         4096 Dec 22 00:00 daily.6
drwxr-xr-x    7 root     root         4096 Dec 29 00:00 hourly.0
drwxr-xr-x    7 root     root         4096 Dec 28 20:00 hourly.1
drwxr-xr-x    7 root     root         4096 Dec 28 16:00 hourly.2
drwxr-xr-x    7 root     root         4096 Dec 28 12:00 hourly.3
drwxr-xr-x    7 root     root         4096 Dec 28 08:00 hourly.4
drwxr-xr-x    7 root     root         4096 Dec 28 04:00 hourly.5

Each of these directories contains a full backup of that point in time. The destination directory paths you specified as the backup and backup_script parameters are placed directly under these directories. In the example:

backup          /etc/           localhost/etc/

the /etc/ directory will initially back up into /.snapshots/hourly.0/localhost/etc/.

Each subsequent time rsnapshot is run with the hourly command, it will rotate the hourly.X directories, "copying" the contents of the hourly.0 directory (using hard links) into hourly.1.

When rsnapshot daily runs, it will rotate all the daily.X directories, then copy the contents of hourly.5 into daily.0.
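The rotation can be pictured with a few shell commands. The sketch below is a simplified imitation, not rsnapshot's actual code: rotate_interval is a hypothetical name, it assumes at least two snapshots are kept, and it assumes a cp that supports the -l (hard link) option, as GNU cp and current FreeBSD cp do.

```shell
#!/bin/sh
# rotate_interval: a simplified illustration of interval rotation, not
# rsnapshot's actual code. Assumes keep >= 2 and a cp that supports -l.
#   $1  snapshot root   $2  interval name   $3  number of snapshots kept
rotate_interval() {
    root=$1
    name=$2
    keep=$3
    i=$((keep - 1))
    # Discard the oldest snapshot so that its slot can be reused.
    rm -rf "$root/$name.$i"
    # Shift the older snapshots up by one: .N-2 -> .N-1, ..., .1 -> .2
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -d "$root/$name.$prev" ]; then
            mv "$root/$name.$prev" "$root/$name.$i"
        fi
        i=$prev
    done
    # "Copy" .0 into .1 as hard links, so unchanged files use no extra space.
    if [ -d "$root/$name.0" ]; then
        cp -Rpl "$root/$name.0" "$root/$name.1"
    fi
}
```

After rotate_interval /.snapshots hourly 6, hourly.0 would still hold the latest files, ready to be updated from the live filesystem.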

hourly.0 will always contain the most recent snapshot, and daily.6 will always contain a snapshot from a week ago. Unless the files change between snapshots, the full backups are really just multiple hard links to the same files. This is how rsnapshot uses space so efficiently. If the file changes at any point, the next backup will unlink the hard link in hourly.0, replacing it with a brand new file. This will now use twice the disk space it did before, but it is still considerably less space than 13 full, unique copies would occupy.
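You can watch the hard-link behavior in a scratch directory without touching real snapshots. The following sketch uses only ln, rm, and ls; the directory names mimic the layout above, and inode_of and the motd file are hypothetical names chosen for the demonstration.

```shell
#!/bin/sh
# inode_of: print the inode number of a file (works on BSD and Linux).
inode_of() {
    ls -i "$1" | awk '{ print $1 }'
}

demo=$(mktemp -d)
mkdir "$demo/hourly.0" "$demo/hourly.1"

# An unchanged file: both snapshots are names for the same inode.
echo "version 1" > "$demo/hourly.1/motd"
ln "$demo/hourly.1/motd" "$demo/hourly.0/motd"
echo "unchanged: $(inode_of "$demo/hourly.0/motd") $(inode_of "$demo/hourly.1/motd")"

# A changed file: remove the link in hourly.0 and write a new file there.
# hourly.1 keeps the old contents under the old inode.
rm "$demo/hourly.0/motd"
echo "version 2" > "$demo/hourly.0/motd"
echo "changed:   $(inode_of "$demo/hourly.0/motd") $(inode_of "$demo/hourly.1/motd")"
```

The first line of output shows the same inode number twice; the second shows two different numbers, which is the point at which the file starts to occupy additional disk space. The scratch directory is left in place in $demo for inspection.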

Remember, if you are using different intervals than the ones in this example, the first interval listed is the one that gets updates directly from the main filesystem. All subsequently listed intervals pull from the previous snapshots.

4.6.3 Accessing Snapshots

When rsnapshot first runs, it will create the configured snapshot_root directory. It assigns this directory the permissions 0700 since the snapshots will probably contain files owned by all sorts of users on your system.

The simplest but least flexible solution is to disallow access to the snapshot root altogether. The root user will still have access, of course, and will be the only one who can pull backups. This may or may not be desirable, depending on your situation. For a small setup, this may be sufficient.

If users need to be able to pull their own backups, you will need to do a little extra work up front. The best option seems to be creating a container directory for the snapshot root with 0700 permissions, giving the snapshot root directory 0755 permissions, and mounting the snapshot root for the users as read-only using NFS or Samba.

Let's explore how to do this using NFS on a single machine. First, set the snapshot_root variable in rsnapshot.conf:

snapshot_root       /usr/.private/.snapshots/

Then, create the container directory, the real snapshot root, and a read-only mount point:

# mkdir /usr/.private/
# mkdir /usr/.private/.snapshots/
# mkdir /.snapshots/

Set the proper permissions on these new directories:

# chmod 0700 /usr/.private/
# chmod 0755 /usr/.private/.snapshots/
# chmod 0755 /.snapshots/
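Since the whole scheme depends on the container being closed and the snapshot root being readable, a quick check of the two modes may be worthwhile. The sketch below parses ls output rather than calling stat, whose options differ between BSD and Linux; mode_of and check_layout are hypothetical helper names.

```shell
#!/bin/sh
# mode_of: print the symbolic mode string of a directory, e.g. drwx------
# Only the first 10 characters are kept, dropping any ACL or flag marker.
mode_of() {
    ls -ld "$1" | awk '{ print substr($1, 1, 10) }'
}

# check_layout: hypothetical helper; $1 is the container directory and
# $2 is the snapshot root inside it.
check_layout() {
    [ "$(mode_of "$1")" = "drwx------" ] || { echo "$1 should be 0700"; return 1; }
    [ "$(mode_of "$2")" = "drwxr-xr-x" ] || { echo "$2 should be 0755"; return 1; }
    echo "permissions OK"
}
```

As root, check_layout /usr/.private /usr/.private/.snapshots would report whether the two directories match the layout described here.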

In /etc/exports, add /usr/.private/.snapshots/ as a read-only NFS export:

/usr/.private/.snapshots/  127.0.0.1(ro)

If your version of NFS supports it, include the no_root_squash option. Place it inside the parentheses after ro, separated by a comma rather than a space, so that the entry reads 127.0.0.1(ro,no_root_squash). This option allows the root user to see all the files within the read-only export.


In /etc/fstab, mount /usr/.private/.snapshots/ read-only under /.snapshots/:

localhost:/usr/.private/.snapshots/   /.snapshots/   nfs    ro   0 0

Restart your NFS daemon and mount the read-only snapshot root:

# /etc/rc.d/nfsd restart
# mount /.snapshots/

To test this, try adding a file as the superuser:

# touch /.snapshots/testfile

This should fail with insufficient permissions. This is what you want. It means that your users won't be able to mess with the snapshots either.

Users who wish to recover old files can go into the /.snapshots directory, select the interval they want, and browse through the filesystem until they find the files they are looking for. NFS will prevent them from making modifications, but they can copy anything that they had permission to read in the first place.
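Recovering a file is then an ordinary copy out of the chosen snapshot. As a minimal sketch, the hypothetical restore_file helper below builds the path from the snapshot directory, the backup destination used in rsnapshot.conf, and the file name, then copies the file while preserving its timestamps and mode.

```shell
#!/bin/sh
# restore_file: hypothetical helper, not part of rsnapshot.
#   $1  snapshot directory, e.g. /.snapshots/daily.2
#   $2  backup destination from rsnapshot.conf, e.g. localhost/etc
#   $3  file to fetch, relative to that destination
#   $4  directory to copy the file into
restore_file() {
    src="$1/$2/$3"
    [ -f "$src" ] || { echo "not in snapshot: $src" >&2; return 1; }
    cp -p "$src" "$4/"
}
```

For example, restore_file /.snapshots/daily.2 localhost/etc hosts "$HOME" would fetch the copy of /etc/hosts from two days ago, assuming that snapshot exists and you had permission to read the file.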

4.6.4 See Also

  • man rsnapshot

  • The original rsnapshot HOWTO (http://www.rsnapshot.org/rsnapshot-HOWTO.html)



BSD Hacks
ISBN: 0596006799
Year: 2006
Pages: 160
Authors: Lavigne