2.1. I'm Afraid of Losing Data
Backups are fundamental in the life of any Linux geek. You need to know at least how to back up the files in your home directory. And if you administer systems, you need to know how to keep the data on those systems backed up. In many cases, you'll need real-time backups, as discussed in the next annoyance.
There is a wide variety of excellent backup tools, many of which require (and have) dedicated books covering their use. Some of them work in the GUI, including the Archive Manager described in the last annoyance in this chapter.
But despite the concern for the end user shown in the introduction to this chapter, backups are primarily a function for administrators (in other words, you as the Linux geek). Therefore, I focus on command-line tools in this annoyance.
If your users are interested in backing up their own home directories, instructions for creating a zipped archive are available in the last annoyance in this chapter.
Specifically, I focus on a couple of the simpler tools for backups: the rsync command, which can synchronize data with remote locations, and the Secure Shell (ssh) command, which can encrypt data that travels over networks.
I assume that some of you may not be completely familiar with rsync and ssh. Therefore, I start by describing the basics of each of these utilities and then show how you can use these commands together to securely back up your home directory on a remote system. Next, I'll demonstrate how to configure SSH connections without passwords, and finally I'll explain how to configure the backup as a cron job. These details may be basic for some Linux geeks, but rsync is so important that I'll take some time to describe the features we'll find useful in this annoyance.
2.1.1. The rsync Command
The rsync command has a lot of powerful and advanced features, some under the hood. When you use the command to synchronize files and/or directories, the first run might be a bit slow. But the next time you synchronize, all it needs to do is transfer the bits that have changed in each directory or file since last time. In other words, if you've updated a large database file, rsync transfers just the differences between the old and new versions. For all of our distributions, the associated package is also named rsync.
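A minimal local sketch of this behavior, assuming rsync is installed; the scratch paths and the big.db filename are made up for illustration. Because rsync copies whole files by default when both paths are local, the sketch passes --no-whole-file to force the delta algorithm.

```shell
#!/bin/sh
# Local demonstration of rsync's delta transfer. Everything lives in a
# scratch directory; the file name big.db is hypothetical.
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"

# Create a 5 MB file of random data and copy it once.
dd if=/dev/urandom of="$work/src/big.db" bs=1024 count=5120 2>/dev/null
rsync -a "$work/src/" "$work/dest/"

# Append a small change, then synchronize again.
# --no-whole-file forces the delta algorithm for a local copy;
# --stats prints a summary of what was transferred.
echo "one more record" >> "$work/src/big.db"
stats=$(rsync -a --no-whole-file --stats "$work/src/" "$work/dest/")
printf '%s\n' "$stats" | grep -E 'Literal data|Matched data'

# Remove the scratch directory.
rm -rf "$work"
```

In the summary, "Matched data" counts bytes reused from the existing copy and "Literal data" counts bytes that actually had to be sent, so after a small append nearly all of the file should show up as matched.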
This command also supports archive mode, which allows you to preserve just about everything associated with each of your files and directories, including:
- User and group ownership, file times, and permissions
- Symbolic links
- Device files
- Recursive synchronization through all subdirectories
It also supports synchronization through remote connection tools, including the Secure Shell. Some of rsync's key options are described in Table 2-1.
Table 2-1. rsync command options
| Option | Description |
| -a | Archive mode, functionally equivalent to -rlptgoD |
| -D | Preserves device files |
| -e | Supports transfer over a remote shell command such as ssh |
| -g | Preserves group ownership |
| -H | Copies hard-linked files |
| -l | Copies symlinks, also known as soft-linked files |
| -o | Preserves user ownership |
| -p | Retains permissions |
| -r | Runs recursively into subdirectories |
| -v | Verbose mode; -vv is extra verbose mode |
| -z | Compresses data |
| --rsh=/path | Specifies full path to the remote shell command of your choice; similar to -e |
If you prefer to use the Secure Shell (ssh) to rsync data remotely, you can use the -e ssh option, or you can change the default rsync behavior to use the Secure Shell. To do so, set the RSYNC_RSH environment variable with the following command:
export RSYNC_RSH=ssh
You can set this variable as the default for all users by adding it to /etc/profile (or for SUSE, to /etc/profile.local); alternatively, you can add it to an individual user's ~/.bashrc or ~/.bash_profile, assuming that you use the default bash shell.
The -a option combines the functionality of many other useful options, so your rsync command does not have to be complex. For example, the following command copies my home directory to /tmp/michael:
rsync -aHvz /home/michael /tmp/
If you're not familiar with rsync, try a similar command on your own system. Examine the results in the target directory. In this case, I find the files from my home directory in /tmp/michael. If the command were slightly different, with a trailing forward slash on the source directory, I'd find my files directly in the /tmp directory:
rsync -aHvz /home/michael/ /tmp/
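The difference is easy to try safely in a scratch directory. The following is a minimal sketch, assuming rsync is installed; the temporary paths and notes.txt merely stand in for a real home directory.

```shell
#!/bin/sh
# Scratch directories stand in for /home/michael and /tmp (hypothetical).
work=$(mktemp -d)
mkdir -p "$work/home/michael" "$work/dest1" "$work/dest2"
echo "notes" > "$work/home/michael/notes.txt"

# No trailing slash on the source: the directory itself is copied.
rsync -aH "$work/home/michael" "$work/dest1/"

# Trailing slash on the source: only the directory's contents are copied.
rsync -aH "$work/home/michael/" "$work/dest2/"

ls "$work/dest1"   # lists: michael
ls "$work/dest2"   # lists: notes.txt
```

Remove the scratch directory with rm -rf "$work" when you have finished looking around.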
Naturally, backups aren't very useful unless you can copy data to a different hard drive. A backup to a remote computer (in this case, via the Secure Shell) is one preferred method.
2.1.2. The Secure Shell Command
The basic use of the Secure Shell command, ssh, is simple and suffices for most uses with rsync. I address only those essentials of ssh required to facilitate rsync backups here; for more information, see the "Configure SSH" annoyance in Chapter 11. Once the service is installed and started (and any current firewalls are properly configured), it's easy to log in to a Secure Shell server. From any client, just run the following command:
ssh username@remotepc
The username must be valid on the remote PC; substitute the hostname, FQDN, or IP address for remotepc.
2.1.3. Backing Up Your Home Directory
Now, I'll show you how you can combine the ssh and rsync commands to back up your home directory from a remote computer. For example, on my Red Hat Enterprise Linux (RHEL) 4 computer, I've backed up my wife's home directory from our SUSE workstation with the following command:
rsync -aHvz -e ssh donna@suse1:/home/donna /home/michael/
This command prompts you for the user's password. If it's the first time you've connected with SSH, you're prompted to confirm the connection. If you get an error message related to the known_hosts list, edit or delete the local .ssh/known_hosts file.
Once the transfer is complete, I can find a mirror image of the files from Donna's home directory in the local /home/michael/donna directory.
In this case, I don't want to use the trailing slash when referring to Donna's home directory. If I use a trailing slash (i.e., /home/donna/), the individual contents of her directory are transferred instead of the directory as a unit. This means her default shell, browser, profile, KDE, GNOME settings, and more will overwrite those in my home directory! The trailing slash makes an enormous difference, which you can understand by running some experiments with and without it.
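One low-risk way to run such experiments is rsync's -n (--dry-run) option, which lists what would be transferred without writing anything. The sketch below wraps it in a helper; the name preview_sync is hypothetical, and the host and paths in the comments are the ones used in this section.

```shell
#!/bin/sh
# preview_sync is a hypothetical helper. -n (--dry-run) makes rsync list
# what it would transfer without writing anything to the destination.
preview_sync() {
    rsync -aHvzn -e ssh "$1" "$2"
}

# Compare the two listings before running the real command:
#   preview_sync donna@suse1:/home/donna  /home/michael/
#   preview_sync donna@suse1:/home/donna/ /home/michael/
```

Without the trailing slash, every path in the listing starts with donna/; with it, the files appear at the top level, which is the sign that they would land directly in the target directory.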
2.1.4. Configuring SSH Without Passwords
While you can back up home directories with passwords, that's not enough if you want to automate the process. If you want to configure a cron job for daily backups, a naive solution would be to store passwords in clear text in a cron job in the /etc/cron.daily directory, but that's generally a bad idea. The better alternative is to configure SSH to connect between computers without passwords using public-key exchange.
For more information on how this is done, see "My Other Computer Has No Monitor" in Chapter 11. When done properly, password-free access is limited to specific computers or a specific command such as rsync. For the purpose of this annoyance, I assume that you've created and configured the keys that you need to log in without a password.
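For orientation only, here is a minimal sketch of what that key setup might look like with the standard OpenSSH tools. It is not the procedure from Chapter 11: the helper name make_backup_key and the key path are hypothetical, and -t ed25519 assumes a reasonably current OpenSSH (older releases would need -t rsa).

```shell
#!/bin/sh
# make_backup_key is a hypothetical helper; the key path is an assumption.
# -N "" creates the key without a passphrase so that cron can use it
# unattended; -C only sets a comment that labels the key.
make_backup_key() {
    ssh-keygen -q -t ed25519 -N "" -C "rsync backup" -f "$1"
}

# Typical use, run once on the machine that pulls the backup:
#   make_backup_key "$HOME/.ssh/backup_key"
#   ssh-copy-id -i "$HOME/.ssh/backup_key.pub" donna@suse1
# Confirm that no password prompt remains (BatchMode fails instead of asking):
#   ssh -i "$HOME/.ssh/backup_key" -o BatchMode=yes donna@suse1 true
```

With a dedicated key like this, the rsync command would name it explicitly, for example with -e "ssh -i /home/michael/.ssh/backup_key". A key without a passphrase is only as safe as the file that holds it, which is why restricting it to specific hosts or commands matters.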
2.1.5. Creating a cron Job for Your Backups
Now you can create a cron job for your backups. Assuming you want a daily backup, you'll need to create an appropriate script in the /etc/cron.daily/ directory; each script there runs regularly, thanks to the background cron process. Individual users can create their own cron jobs with the crontab -e command. If you need a model for cron jobs, see your /etc/crontab configuration file, and sample cron jobs in the /etc/cron.daily directory.
Generally, cron jobs collect commands run in a regular shell. So start by opening the filename of your choice under /etc/cron.daily/ with a text editor and add the following directive, which tells Linux to expect regular bash shell commands:
#!/bin/sh
While not absolutely necessary, it is good practice to start every script by declaring the shell associated with your scripts. As there is normally no Bourne shell installed, the Linux distributions I know actually link /bin/sh to the default /bin/bash (the Bourne Again Shell).
To minimize security risks, cron jobs do not inherit the PATH from any user. You can either define a PATH explicitly before your shell commands, or just specify the full path to each command.
Now we can insert the command shown previously, which backed up Donna's home directory:
/usr/bin/rsync -aHvz -e /usr/bin/ssh donna@suse1:/home/donna /home/michael/
If you've set the RSYNC_RSH=ssh environment variable within the script (cron jobs do not read login files such as /etc/profile or ~/.bashrc), the required command is simpler:
/usr/bin/rsync -aHvz donna@suse1:/home/donna /home/michael/
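Putting the pieces of this section together, the following sketch writes a candidate cron script to the current directory so you can review it before installing it. The filename backup-donna is hypothetical, and the script takes the "define a PATH" route instead of full command paths. It also drops -v, a deliberate departure from the command above, so that the job stays quiet unless rsync reports a problem.

```shell
#!/bin/sh
# Write a sketch of the daily backup script to the current directory for
# review. The filename backup-donna is hypothetical.
cat > backup-donna <<'EOF'
#!/bin/sh
# cron supplies only a minimal environment, so define PATH explicitly.
PATH=/usr/bin:/bin
export PATH

# Same transfer as in the text, minus -v so the job is quiet on success.
rsync -aHz -e ssh donna@suse1:/home/donna /home/michael/
EOF
chmod 755 backup-donna

# As root, after reviewing the file:
#   cp backup-donna /etc/cron.daily/
```

The name deliberately has no extension, because some run-parts implementations (Debian's, for one) skip files in /etc/cron.daily whose names contain a dot.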
Naturally, you may want to expand this backup method to cover all files on a workstation. To do so, you'll need root permissions and therefore will need to create public and private authentication keys in the /root/.ssh/ directories of both the workstation and the server. You can then create a cron job to back up all files on the workstation, using the techniques described in this section.
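A whole-system backup usually needs to skip pseudo-filesystems and scratch space. The sketch below shows one way to do that with rsync's --exclude option; the helper name backup_system, the exclude list, and the host and target path in the comment are all assumptions to adapt, not settings taken from this section.

```shell
#!/bin/sh
# backup_system is a hypothetical helper. The exclude list is an
# assumption; adjust it to the filesystems on your own workstation.
# The source must end in a slash so that the leading "/" in each
# pattern is anchored at the top of the transfer.
backup_system() {
    rsync -aHz -e ssh \
        --exclude='/proc/*' \
        --exclude='/sys/*' \
        --exclude='/tmp/*' \
        "$1" "$2"
}

# Example, run as root on the backup server (hostname and target path
# are placeholders):
#   backup_system root@workstation:/ /backup/workstation/
```

Excluding /proc/* instead of /proc keeps the empty directory in the backup, so the mount point already exists if you ever restore the tree.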