Make remote backups automatic and effortless.
One day, the IDE controller on my web server died, leaving the files on my hard disk hopelessly corrupted. I faced what I had known in the back of my mind all along: I had not been making regular remote backups of my server, and the local backups were of no use to me now that the drive was corrupted.
The reason for this, of course, is that doing remote backups wasn't automatic and effortless. Admittedly, this was no one's fault but my own, but my frustration was great enough that I decided to write a tool that would make automated remote snapshots so easy that I would never have to worry about them again. Enter rsnapshot.
4.6.1 Installing and Configuring rsnapshot
Installation on FreeBSD is a simple matter of:
# cd /usr/ports/sysutils/rsnapshot
# make install
I didn't include the clean target here, as I'd like to keep the work subdirectory, which includes some useful scripts.
The install process neither creates nor installs the config file. This means that there is absolutely no possibility of accidentally overwriting a previously existing config file during an upgrade. Instead, copy the example configuration file and make changes to the copy:
# cp /usr/local/etc/rsnapshot.conf.default /usr/local/etc/rsnapshot.conf
The rsnapshot.conf config file is well commented, and much of it should be fairly self-explanatory. For a full reference of all the various options, please consult man rsnapshot.
rsnapshot uses the /.snapshots/ directory to hold the filesystem snapshots. This is referred to as the snapshot root. This must point to a filesystem where you have lots of free disk space.
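Before settling on a snapshot root, it can help to check how much room the target filesystem actually has. The following is a minimal sketch and not part of rsnapshot: the free_kb helper is a hypothetical name, and it inspects / only because that is where /.snapshots/ lives in this example.

```sh
#!/bin/sh
# Hypothetical helper (not part of rsnapshot): print the free space, in
# kilobytes, on the filesystem that holds the given directory.
free_kb() {
    # -P forces the portable one-line-per-filesystem layout; -k reports KB.
    # Column 4 of that layout is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

avail=$(free_kb /)
echo "Free space on the filesystem holding /: ${avail} KB"
```

Point the helper at whichever directory you plan to use as the snapshot root.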
4.6.1.1 Specifying backup intervals
rsnapshot has no idea how often you want to take snapshots. In order to specify how much data to save, you need to tell rsnapshot which intervals to keep, and how many of each.
The default configuration assumes that a snapshot is taken every four hours, or six times a day (these are the hourly intervals), and it keeps the six most recent. It also keeps a second set of snapshots, taken once a day and stored for a week (seven days):
interval hourly 6
interval daily 7
Note that the hourly interval is specified first. This is very important, as the first interval line is assumed to be the smallest unit of time, with each additional line getting successively bigger. Thus, if you add a yearly interval, it should go at the bottom, and if you add a minutes interval, it should go before the hourly interval. It's also worth noting that the snapshots are pulled up from the smallest interval to the largest. In this example, the daily snapshots are pulled from the oldest hourly snapshot, not directly from the main filesystem.
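Since the order of the interval lines matters, it can be handy to list them exactly as rsnapshot will read them. This is only a sketch: it writes the two example lines to a scratch file rather than to the real rsnapshot.conf, and it uses literal tabs between the fields because rsnapshot.conf expects tab-separated columns.

```sh
#!/bin/sh
# Sketch: write the example interval lines to a scratch file (a stand-in for
# rsnapshot.conf) and print the interval names in file order.
conf=$(mktemp "${TMPDIR:-/tmp}/rsnap-conf.XXXXXX")

# \t emits a literal tab between the columns.
printf 'interval\thourly\t6\n' >> "$conf"
printf 'interval\tdaily\t7\n'  >> "$conf"

# Collect the second column of every interval line, separated by spaces.
order=$(awk -F'\t' '$1 == "interval" { printf "%s%s", sep, $2; sep = " " }
                    END { print "" }' "$conf")

echo "Intervals, smallest first: $order"
rm -f "$conf"
```

The first name printed is the interval that is synced directly from the filesystem; every later one is fed from the interval before it.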
The backup section tells rsnapshot which files you actually want to back up:
backup /etc/ localhost/etc/
In this example, backup marks the line as a backup point, /etc/ is the full path to the directory we want to take snapshots of, and localhost/etc/ is a subdirectory inside the snapshot root where the snapshots are stored. If you are taking snapshots of several machines on one dedicated backup server, it's a good idea to use hostnames as directories to keep track of which files came from which server.
In addition to full paths on the local filesystem, you can also back up remote systems using rsync over ssh. If you have ssh enabled (via the cmd_ssh parameter), specify a path similar to this:
backup root@example.com:/etc/ example.com/etc/
This behaves fundamentally the same way as specifying local pathnames, but you must take a few extra things into account:
4.6.1.2 Preparing for script automation
With the backup_script parameter, the second column is the full path to an executable backup script, and the third column is the local path in which you want to store the script's output. For example:
backup_script /usr/local/bin/backup_pgsql.sh localhost/postgres/
Your backup script only needs to dump its output into its current working directory. It can create as many files and directories as necessary, but it should not put its files in any predetermined path. This is because rsnapshot creates a temp directory, changes to that directory, runs the backup script, and then syncs the contents of the temp directory to the local path you specified in the third column. A typical backup script might look like this:
#!/bin/sh
/usr/bin/mysqldump -uroot mydatabase > mydatabase.sql
/bin/chmod 644 mydatabase.sql
There are a couple of example scripts in the utils/ directory of the rsnapshot source distribution to give you more ideas.
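To see why a backup script should write only to its current directory, here is a rough simulation of the temp-directory flow described above. It is not rsnapshot's actual code: the scratch paths and the stand-in backup.sh are hypothetical, and a plain cp takes the place of the sync step.

```sh
#!/bin/sh
# Simulation only: mimic "create temp dir, cd into it, run the script, sync".
work=$(mktemp -d "${TMPDIR:-/tmp}/rsnap-demo.XXXXXX")
dest="$work/snapshot/localhost/postgres"   # stand-in for the third column
mkdir -p "$dest" "$work/tmp"

# Stand-in backup script: it writes only into its current working directory.
cat > "$work/backup.sh" <<'EOF'
#!/bin/sh
echo "-- pretend database dump" > mydatabase.sql
chmod 644 mydatabase.sql
EOF

# 1. Change to the temp directory and 2. run the script there.
( cd "$work/tmp" && sh "$work/backup.sh" )

# 3. Copy whatever the script produced into the destination path.
cp -Rp "$work/tmp/." "$dest/"

result=$(ls "$dest")
echo "Files now in the destination: $result"
rm -rf "$work"
```

A script that wrote to a fixed path such as /var/backups/ would leave the temp directory empty, and nothing would reach the snapshot.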
4.6.1.3 Testing your config file
After making your changes, verify that the config file is syntactically valid and that all the supporting programs are where you think they are:
# rsnapshot configtest
If all is well, the output should say Syntax OK. If there's a problem, it should tell you exactly what it is.
The final step to test your configuration is to run rsnapshot with the -t flag, for test mode. This will print out a verbose list of the things it will do, without actually doing them. For example, to simulate an hourly backup:
# rsnapshot -t hourly
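The two checks can be chained so that the dry run happens only when the syntax check passes. The wrapper below is a hypothetical convenience, not something shipped with rsnapshot; it assumes rsnapshot is in your PATH unless you set the RSNAPSHOT variable, which is a name invented for this sketch.

```sh
#!/bin/sh
# Hypothetical wrapper: run configtest first, then a test-mode run for the
# given interval. RSNAPSHOT can be overridden to point at another binary.
RSNAPSHOT=${RSNAPSHOT:-rsnapshot}

preflight() {
    interval=$1
    "$RSNAPSHOT" configtest || { echo "configtest failed" >&2; return 1; }
    "$RSNAPSHOT" -t "$interval"
}

if command -v "$RSNAPSHOT" >/dev/null 2>&1; then
    preflight hourly
else
    echo "rsnapshot not found in PATH; skipping preflight" >&2
fi
```

Both commands only inspect the configuration, so the wrapper is safe to run at any time.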
4.6.1.4 Scheduling rsnapshot
Now that you have your config file set up, it's time to schedule rsnapshot to run from cron. Add the following lines to root's crontab:
0 */4 * * * /usr/local/bin/rsnapshot hourly
30 23 * * * /usr/local/bin/rsnapshot daily
4.6.2 The Snapshot Storage Scheme
All backups are stored within a configurable snapshot root directory. In the beginning it will be empty. rsnapshot creates subdirectories for the various defined intervals. After a week, the directory should look something like this:
# ls -l /.snapshots/
drwxr-xr-x 7 root root 4096 Dec 28 00:00 daily.0
drwxr-xr-x 7 root root 4096 Dec 27 00:00 daily.1
drwxr-xr-x 7 root root 4096 Dec 26 00:00 daily.2
drwxr-xr-x 7 root root 4096 Dec 25 00:00 daily.3
drwxr-xr-x 7 root root 4096 Dec 24 00:00 daily.4
drwxr-xr-x 7 root root 4096 Dec 23 00:00 daily.5
drwxr-xr-x 7 root root 4096 Dec 22 00:00 daily.6
drwxr-xr-x 7 root root 4096 Dec 29 00:00 hourly.0
drwxr-xr-x 7 root root 4096 Dec 28 20:00 hourly.1
drwxr-xr-x 7 root root 4096 Dec 28 16:00 hourly.2
drwxr-xr-x 7 root root 4096 Dec 28 12:00 hourly.3
drwxr-xr-x 7 root root 4096 Dec 28 08:00 hourly.4
drwxr-xr-x 7 root root 4096 Dec 28 04:00 hourly.5
Each of these directories contains a full backup of that point in time. The destination directory paths you specified as the backup and backup_script parameters are placed directly under these directories. In the example:
backup /etc/ localhost/etc/
the /etc/ directory will initially back up into /.snapshots/hourly.0/localhost/etc/.
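The mapping from a backup line to its place on disk is just the snapshot root, the newest directory of the first interval, and the destination column joined together. As a small illustration (the snapshot_path function is a hypothetical helper, not an rsnapshot command):

```sh
#!/bin/sh
# Hypothetical helper: show where the newest snapshot of a backup point lands.
#   $1 = snapshot root, $2 = first interval name, $3 = destination column
snapshot_path() {
    root=${1%/}      # drop one trailing slash, if present
    dest=${3#/}      # drop one leading slash, if present
    printf '%s/%s.0/%s\n' "$root" "$2" "$dest"
}

snapshot_path /.snapshots/ hourly localhost/etc/
# prints /.snapshots/hourly.0/localhost/etc/
```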
Each subsequent time rsnapshot is run with the hourly command, it will rotate the hourly.X directories, "copying" the contents of the hourly.0 directory (using hard links) into hourly.1.
When rsnapshot daily runs, it will rotate all the daily.X directories, then copy the contents of hourly.5 into daily.0.
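The hourly rotation can be modeled in a few lines of shell. This is a toy version under loud assumptions: it works in a scratch directory, handles only flat files rather than whole directory trees, and leaves out the sync from the live filesystem that the real tool performs afterward.

```sh
#!/bin/sh
# Toy model of the hourly rotation, keeping six snapshots. Not rsnapshot's
# code: flat files only, and no sync from the live filesystem.
root=$(mktemp -d "${TMPDIR:-/tmp}/rsnap-rot.XXXXXX")
keep=6

mkdir "$root/hourly.0"
echo "v1" > "$root/hourly.0/hosts"

rotate_hourly() {
    # Discard the oldest snapshot, if there is one.
    rm -rf "$root/hourly.$((keep - 1))"

    # Shift hourly.N to hourly.N+1, starting with the oldest.
    i=$((keep - 2))
    while [ "$i" -ge 1 ]; do
        if [ -d "$root/hourly.$i" ]; then
            mv "$root/hourly.$i" "$root/hourly.$((i + 1))"
        fi
        i=$((i - 1))
    done

    # "Copy" hourly.0 into a fresh hourly.1 using hard links.
    mkdir "$root/hourly.1"
    for f in "$root/hourly.0"/*; do
        ln "$f" "$root/hourly.1/"
    done
}

rotate_hourly
rotate_hourly

set -- "$root"/hourly.*
count=$#
echo "Snapshots after two rotations: $count"
rm -rf "$root"
```

Each call pushes the existing snapshots one slot down, so the count grows by one per run until it reaches the configured limit.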
hourly.0 will always contain the most recent snapshot, and daily.6 will always contain a snapshot from a week ago. Unless the files change between snapshots, the full backups are really just multiple hard links to the same files. This is how rsnapshot uses space so efficiently. If the file changes at any point, the next backup will unlink the hard link in hourly.0, replacing it with a brand new file. This will now use twice the disk space it did before, but it is still considerably less space than 13 full, unique copies would occupy.
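You can watch the hard-link behavior with nothing more than ln and ls -i. The sketch below uses a scratch directory and a made-up file named motd; it is an illustration of the filesystem mechanics, not a transcript of what rsnapshot runs.

```sh
#!/bin/sh
# Illustration: two snapshot directories sharing one file via a hard link,
# and what happens when the file changes.
d=$(mktemp -d "${TMPDIR:-/tmp}/rsnap-link.XXXXXX")
mkdir "$d/hourly.0" "$d/hourly.1"

echo "original" > "$d/hourly.1/motd"
ln "$d/hourly.1/motd" "$d/hourly.0/motd"   # unchanged file: a second name only

# Print the inode number of a file.
inode() { ls -i "$1" | awk '{ print $1 }'; }

before_new=$(inode "$d/hourly.0/motd")
before_old=$(inode "$d/hourly.1/motd")
echo "Before the change: $before_new and $before_old"

# The file changed: unlink it in hourly.0 and write a brand new file.
rm "$d/hourly.0/motd"
echo "changed" > "$d/hourly.0/motd"

after_new=$(inode "$d/hourly.0/motd")
old_content=$(cat "$d/hourly.1/motd")
echo "After the change: $after_new and $before_old"
echo "hourly.1 still holds: $old_content"
rm -rf "$d"
```

Matching inode numbers mean the two names share one copy of the data; once the numbers differ, each snapshot holds its own copy.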
Remember, if you are using different intervals than the ones in this example, the first interval listed is the one that gets updates directly from the main filesystem. All subsequently listed intervals pull from the previous snapshots.
4.6.3 Accessing Snapshots
When rsnapshot first runs, it will create the configured snapshot_root directory. It assigns this directory the permissions 0700 since the snapshots will probably contain files owned by all sorts of users on your system.
The simplest but least flexible solution is to disallow access to the snapshot root altogether. The root user will still have access, of course, and will be the only one who can pull backups. This may or may not be desirable, depending on your situation. For a small setup, this may be sufficient.
If users need to be able to pull their own backups, you will need to do a little extra work up front. The best option seems to be creating a container directory for the snapshot root with 0700 permissions, giving the snapshot root directory 0755 permissions, and mounting the snapshot root for the users as read-only using NFS or Samba.
Let's explore how to do this using NFS on a single machine. First, set the snapshot_root variable in rsnapshot.conf:
snapshot_root /usr/.private/.snapshots/
Then, create the container directory, the real snapshot root, and a read-only mount point:
# mkdir /usr/.private/
# mkdir /usr/.private/.snapshots/
# mkdir /.snapshots/
Set the proper permissions on these new directories:
# chmod 0700 /usr/.private/
# chmod 0755 /usr/.private/.snapshots/
# chmod 0755 /.snapshots/
In /etc/exports, add /usr/.private/.snapshots/ as a read-only NFS export:
In /etc/fstab, mount /usr/.private/.snapshots/ read-only under /.snapshots/:
localhost:/usr/.private/.snapshots/ /.snapshots/ nfs ro 0 0
Restart your NFS daemon and mount the read-only snapshot root:
# /etc/rc.d/nfsd restart
# mount /.snapshots/
To test this, try adding a file as the superuser:
# touch /.snapshots/testfile
This should fail with insufficient permissions. This is what you want. It means that your users won't be able to mess with the snapshots either.
Users who wish to recover old files can go into the /.snapshots directory, select the interval they want, and browse through the filesystem until they find the files they are looking for. NFS will prevent them from making modifications, but they can copy anything that they had permission to read in the first place.
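Browsing every interval by hand gets tedious, so a user might wrap the search in a small function. The list_versions helper below is hypothetical and makes no assumptions beyond the directory layout shown earlier; the demo builds a scratch tree so it can run anywhere.

```sh
#!/bin/sh
# Hypothetical helper: list every snapshot copy of one file.
#   $1 = snapshot root, $2 = path of the file relative to a snapshot directory
list_versions() {
    root=${1%/}
    for snap in "$root"/*/; do
        if [ -e "$snap$2" ]; then
            ls -l "$snap$2"
        fi
    done
}

# Demo on a scratch tree: the file exists in two of three snapshots.
demo=$(mktemp -d "${TMPDIR:-/tmp}/rsnap-find.XXXXXX")
mkdir -p "$demo/hourly.0/localhost/etc" \
         "$demo/hourly.1/localhost/etc" \
         "$demo/daily.0/localhost/etc"
echo "new" > "$demo/hourly.0/localhost/etc/hosts"
echo "old" > "$demo/daily.0/localhost/etc/hosts"

list_versions "$demo" localhost/etc/hosts
found=$(list_versions "$demo" localhost/etc/hosts | wc -l | tr -d ' ')
echo "Copies found: $found"
rm -rf "$demo"
```

On the setup described here, the equivalent call would be list_versions /.snapshots localhost/etc/hosts, and the modification times in the listing show which copy to restore with cp.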
4.6.4 See Also