2.1. I'm Afraid of Losing Data

Backups are fundamental in the life of any Linux geek. At a minimum, you need to know how to back up the files in your home directory. And if you administer systems, you need to know how to keep the data on those systems backed up. In many cases, you'll need real-time backups, as discussed in the next annoyance.

There are a wide variety of excellent backup tools, many of which require (and have) dedicated books covering their use. Some of them work in the GUI, including the Archive Manager described in the last annoyance in this chapter.

But despite the concern for the end user shown in the introduction to this chapter, backups are primarily a function for administrators; in other words, you as the Linux geek. Therefore, I focus on command-line tools in this annoyance.

If your users are interested in backing up their own home directories, instructions for creating a zipped archive are available in the last annoyance in this chapter.

Specifically, I focus on a couple of the simpler tools for backups: the rsync command, which can synchronize data with remote locations, and the Secure Shell (ssh) command, which can encrypt data that travels over networks.

I assume that some of you may not be completely familiar with rsync and ssh. Therefore, I start by describing the basics of each of these utilities and then show how you can use these commands together to securely back up your home directory on a remote system. Next, I'll demonstrate how to configure SSH connections without passwords, and finally I'll explain how to configure the backup as a cron job. These details may be basic for some Linux geeks, but rsync is so important that I'll take some time to describe the features we'll find useful in this annoyance.

2.1.1. The rsync Command

The rsync command has a lot of powerful and advanced features, some under the hood. When you use the command to synchronize files and/or directories, it might be a bit slow to begin with. But the next time you synchronize, all it needs to do is transfer the bits that have changed in each directory or file since last time. In other words, if you've updated a large database file, rsync transfers just the differences between the old and new versions. For all of our distributions, the associated package is also named rsync.

This command also supports archive mode, which allows you to preserve just about everything associated with each of your files and directories, including:

  • User and group ownership, file times, and permissions

  • Symbolic links

  • Device files

  • Recursive synchronization through all subdirectories

It also supports synchronization through remote connection tools, including the Secure Shell. Some of rsync's key options are described in Table 2-1.

Table 2-1. rsync command options

Option           Description
-a               Archive mode, functionally equivalent to -rlptgoD
-D               Preserves device files
-e               Supports transfer over a remote shell command such as ssh
-g               Preserves group ownership
-H               Copies hard-linked files
-l               Copies symlinks, also known as soft-linked files
-o               Preserves user ownership
-p               Retains permissions
-r               Runs recursively into subdirectories
-v               Verbose mode; -vv is extra verbose mode
-z               Compresses data
--rsh=COMMAND    Specifies full path to the remote shell command of your choice; similar to -e

If you prefer to use the Secure Shell (ssh) to rsync data remotely, you can use the -e ssh option, or you can change the default rsync behavior to use the Secure Shell. To do so, set the RSYNC_RSH environment variable with the following command:

 export RSYNC_RSH=ssh 

You can set this variable as the default for all users by adding it to /etc/profile (or for SUSE, to /etc/profile.local); alternatively, you can add it to an individual user's ~/.bashrc or ~/.bash_profile, assuming that you use the default bash shell.

The -a option combines the functionality of many other useful options, so your rsync command does not have to be complex. For example, the following command copies my home directory to /tmp/michael:

 rsync -aHvz /home/michael /tmp/ 

If you're not familiar with rsync, try a similar command on your own system. Examine the results in the target directory. In this case, I find the files from my home directory in /tmp/michael. If the command were slightly different, with a trailing forward slash on the source directory, I'd find my files directly in the /tmp directory:

 rsync -aHvz /home/michael/ /tmp/ 

Naturally, backups aren't very useful unless you can copy data to a different hard drive. A backup to a remote computer (in this case, via the Secure Shell) is one preferred method.

2.1.2. The Secure Shell Command

The basic use of the Secure Shell command, ssh, is simple and suffices for most uses with rsync. I address only those essentials of ssh required to facilitate rsync backups here; for more information, see the "Configure SSH" annoyance in Chapter 11. Once the service is installed and started (and any current firewalls are properly configured), it's easy to log in to a Secure Shell server. From any client, just run the following command:

 ssh username@remotepc 

The username must be valid on the remote PC; substitute the hostname, FQDN, or IP address for remotepc.

2.1.3. Backing Up Your Home Directory

Now, I'll show you how you can combine the ssh and rsync commands to back up your home directory from a remote computer. For example, on my Red Hat Enterprise Linux (RHEL) 4 computer, I've backed up my wife's home directory from our SUSE workstation with the following command:

 rsync -aHvz -e ssh donna@suse1:/home/donna /home/michael/ 

This command prompts you for the user's password. If it's the first time you've connected with SSH, you're prompted to confirm the connection. If you get an error message related to the known_hosts list, edit or delete the local .ssh/known_hosts file.
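Rather than deleting the whole file, you can usually remove the stale entry for a single host with ssh-keygen -R. The sketch below assumes OpenSSH's ssh-keygen and practices on a scratch known_hosts file holding a made-up entry for suse1; on a real system you would leave out -f so that the command works on your own ~/.ssh/known_hosts.

```shell
work=$(mktemp -d)

# Build a scratch known_hosts file with a throwaway key recorded for suse1.
ssh-keygen -q -t ed25519 -N "" -f "$work/fake_host_key"
echo "suse1 $(cut -d' ' -f1,2 "$work/fake_host_key.pub")" > "$work/known_hosts"

# Remove every key recorded for suse1 from that file.
ssh-keygen -R suse1 -f "$work/known_hosts"
```

The next ssh connection to that host then asks you to confirm its key again, just as on a first connection.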

Once the transfer is complete, I can find a mirror image of the files from Donna's home directory on the local /home/michael/donna directory.

In this case, I don't want to use the trailing slash when referring to Donna's home directory. If I use a trailing slash (i.e., /home/donna/), the individual contents of her directory are transferred instead of the directory as a unit. This means her default shell, browser, profile, KDE, GNOME settings, and more will overwrite those in my home directory! The trailing slash makes an enormous difference, which you can understand by running some experiments with and without it.

2.1.4. Configuring SSH Without Passwords

While you can back up home directories with passwords, that's not enough if you want to automate the process. If you want to configure a cron job for daily backups, a naive solution would be to store passwords in clear text in a cron job in the /etc/cron.daily directory, but that's generally a bad idea. The better alternative is to configure SSH to connect between computers without passwords using public-key exchange.

For more information on how this is done, see "My Other Computer Has No Monitor" in Chapter 11. When done properly, password-free access is limited to specific computers or a specific command such as rsync. For the purpose of this annoyance, I assume that you've created and configured the keys that you need to log in without a password.
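As a rough reminder of what that setup involves, the sketch below creates a passphrase-free key pair and prints a restricted authorized_keys line for it. It assumes OpenSSH's ssh-keygen; the key name backup_key, the scratch directory, and the client name backupserver.example.com are placeholders rather than values from this annoyance.

```shell
# Scratch location for the demonstration; a real key would live in ~/.ssh/.
keydir=$(mktemp -d)
keyfile="$keydir/backup_key"      # hypothetical key name

# Create a key pair with an empty passphrase so cron can use it unattended.
ssh-keygen -q -t ed25519 -N "" -C "backup key" -f "$keyfile"

# Print the line to append to ~/.ssh/authorized_keys on the remote account.
# from= limits which client may use the key (the hostname is a placeholder).
echo "from=\"backupserver.example.com\" $(cat "$keyfile.pub")"
```

Once the printed line is in the remote account's authorized_keys file, rsync can use the key through an option such as -e "ssh -i /path/to/backup_key".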

2.1.5. Creating a cron Job for Your Backups

Now you can create a cron job for your backups. Assuming you want a daily backup, you'll need to create an appropriate script in the /etc/cron.daily/ directory; each script there runs regularly, thanks to the background cron process. Individual users can create their own cron jobs with the crontab -e command. If you need a model for cron jobs, see your /etc/crontab configuration file, and sample cron jobs in the /etc/cron.daily directory.

Generally, cron jobs collect commands that run in a regular shell. So start by opening the filename of your choice under /etc/cron.daily/ with a text editor and add the following directive, which tells Linux to expect regular bash shell commands:

 #!/bin/sh 

While not absolutely necessary, it is good practice to start every script by declaring the shell that should run it. As there is normally no Bourne shell installed, the Linux distributions I know actually link /bin/sh to the default /bin/bash (the Bourne Again Shell).

To minimize security risks, cron jobs do not inherit the PATH from any user. You can either define a PATH explicitly before your shell commands, or just specify the full path to each command.

Now we can insert the command shown previously, which backed up Donna's home directory.

 /usr/bin/rsync -aHvz -e /usr/bin/ssh donna@suse1:/home/donna /home/michael/ 

If you've already set the RSYNC_RSH=ssh environment variable, the required command is simpler:

 /usr/bin/rsync -aHvz donna@suse1:/home/donna /home/michael/ 

Naturally, you may want to expand this backup method to cover all files on a workstation. To do so, you'll need root permissions and therefore will need to create public and private authentication keys in the workstation and server's /root/.ssh/ directories. You can then create a cron job to back up all files on the workstation, using the techniques described in this section.

Linux Annoyances for Geeks: Getting the Most Flexible System in the World Just the Way You Want It
ISBN: 0596008015
Year: 2004
Pages: 144
Authors: Michael Jang