11.1. Too Many Computers to Update over the Internet
If all you administer is one or two Linux computers, updates are a straightforward process. All you need to do is configure updates from the most appropriate mirror on the Internet. If desired, you can automate downloads and installations of updates using a cron job. For more information on how to configure updates from yum- and apt-based mirrors, see Chapter 8.
However, when you administer a large number of Linux computers, the updates can easily overload standard high-speed Internet connections. For example, if you're downloading updates to the OpenOffice.org suite, you could be downloading hundreds of megabytes of packages. If you're downloading these packages on 100 computers simultaneously, that may be too much for your Internet connection, especially when other jobs are pending.
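To put rough numbers on this, here is a back-of-the-envelope sketch. The 300 MB update size and the 10 Mbit/s link speed are illustrative assumptions, not measurements; substitute your own figures.

```shell
#!/bin/sh
# Rough estimate of how long simultaneous updates would tie up a link.
# All three values are illustrative assumptions.
hosts=100          # computers downloading at once
update_mb=300      # size of the update set per computer, in megabytes
link_mbit=10       # Internet connection speed, in megabits per second

total_mb=$((hosts * update_mb))
# 8 bits per byte; ignores protocol overhead and any other traffic
seconds=$((total_mb * 8 / link_mbit))
hours=$((seconds / 3600))

echo "Total download: ${total_mb} MB"
echo "Time on a saturated ${link_mbit} Mbit/s link: at least ${hours} hours"
```

With a local mirror, the same update set crosses the Internet connection once instead of 100 times.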
In this annoyance, I'll show you how you can create a local mirror of your favorite update server. You can then share the appropriate directory and configure your updates locally.
Where possible, I'll show you how you can limit what you mirror to updates. For example, Fedora Linux includes dedicated update directories. Most downloads are associated with updates, so it's appropriate to limit what you mirror to such packages.
One other approach is to download just the packages and create the repository metadata yourself. For example, the createrepo command reads the headers from each RPM and builds a database that helps the yum command find the dependencies associated with every package.
I assume you have the hard disk space you need on your mirror server. Repositories can be very demanding with respect to disk space: if you're synchronizing repositories for multiple architectures and distributions, the downloaded mirrors can easily take up hundreds of gigabytes.
11.1.1. Available Mirror Tools
There are a number of ways to download the files associated with a mirror. The most common approach is based on the rsync command. With rsync, you can synchronize your mirrors as needed, downloading only those packages, or parts of packages, that are new or have otherwise changed. I'll show you how you can use rsync in this annoyance.
There are a number of other tools available. Naturally, you can use any FTP client to download mirrors to local directories. Commands such as wget and curl do an excellent job with large downloads. If you're working with an apt repository, the apt-mirror project provides another excellent alternative (http://freshmeat.net/projects/apt-mirror/).
11.1.2. Basic Steps
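To create your mirror, you can take these steps, which I'll detail in the following subsections:
1. Find the best update mirror.
2. Make room for the updates.
3. Synchronize the mirror.
4. Make your mirror work with your update system.
5. Test a local update.
6. Automate the synchronization process.
7. Connect your local workstations.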
11.1.3. Find the Best Update Mirror
The best update mirror may not be the one that is physically closest to your network. Some mirrors have faster connections to the Internet. Others have less traffic. Some mirror administrators may discourage full mirror downloads or even limit the number of simultaneous connections. And many public mirrors don't support rsync connections.
Our selected distributions have "official" lists of update mirrors. More may be available. If a mirror includes a Fedora repository, it may also include a SUSE repository. For example, while the University of Mississippi is not (currently) on the official list of mirrors for SUSE Linux, updates are available from its server at http://mirror.phy.olemiss.edu/mirror/suse/suse/. Here's where to find the "official" list of mirrors for our selected distributions:
To see if a mirror works with the rsync protocol, run the rsync command with the URL in question. For example, if you want to check the mirror specified in the Debian Mirror List from the University of Southern California, run the following command (and don't forget the double colon at the end):
When I ran this command, I saw a long list of directories, clearly associated with various Linux distributions, including SUSE, Fedora, and others. If there is no rsync server at your desired site, the rsync command will time out, or you'll have to press Ctrl-C to return to the command line.
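If you'd rather not wait for a stalled connection, a small wrapper can give up after a fixed interval. This is only a sketch: mirror.example.org is a placeholder hostname, the 15-second limit is arbitrary, and the --contimeout switch assumes rsync 3.0 or later.

```shell
#!/bin/sh
# Sketch: list the modules offered by an rsync server, giving up after
# a fixed wait instead of hanging until you press Ctrl-C.
# mirror.example.org is a placeholder hostname, not a real mirror.

list_rsync_modules() {
    host=$1
    # --contimeout limits how long rsync waits to connect to the daemon;
    # the double colon asks the server for its module list.
    if rsync --contimeout=15 "${host}::"; then
        echo "${host}: rsync server found"
    else
        echo "${host}: no rsync response" >&2
        return 1
    fi
}

# Example (contacts the network, so it is left commented out):
# list_rsync_modules mirror.example.org
```

Running the function against each candidate from a mirror list is one way to weed out servers that don't offer rsync at all.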
Finding the best update mirror is somewhat subjective. Yes, you could go by objective measures, such as the time required for the download. But conditions change. Internet traffic can slow down in certain geographic areas. Servers do go down. Some trial and error may be required.
11.1.4. Make Room for the Updates
Updates can consume gigabytes of space. The choices you make can make a significant difference in the space you need. Key factors include:
You may want to create a dedicated partition for your update repositories. That way, you can be sure that the space required by the repository does not crowd out the rest of your system.
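Whether or not you use a dedicated partition, it can help to check free space before each synchronization. The following is a minimal sketch; the function names and the 20 GB threshold are my own illustrative choices.

```shell
#!/bin/sh
# Sketch: refuse to start a sync unless the mirror filesystem has room.
# The 20 GB threshold in the example below is an arbitrary value.

free_kb() {
    # df -P prints one POSIX-format line per filesystem; with -k,
    # column 4 is the available space in 1 KB blocks.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

has_room() {
    dir=$1
    needed_kb=$2
    [ "$(free_kb "$dir")" -ge "$needed_kb" ]
}

# Example: require 20 GB free under the Fedora mirror directory
# has_room /var/www/html/yum $((20 * 1024 * 1024)) || echo "Not enough space" >&2
```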
11.1.5. Synchronize the Mirror
Along with perhaps most of the world of Linux, I like the rsync command. With appropriate switches, it's easy to use this command to copy the files and directories that you want. Once you've set up a mirror, you can use the rsync command as needed to keep your local mirror up-to-date.
The rsync command is straightforward; I use it to back up the home directory from my laptop computer with the following command:
rsync -a -e ssh email@example.com:/home/michael/* /backup
In the following subsections, I illustrate some simple examples of how you can create your own rsync mirror on our selected distributions. This assumes you're using an appropriate directory, possibly configured on a separate disk or partition.
11.1.5.1. Synchronizing a Fedora mirror
For this exercise, assume you want to synchronize your local update mirror with the one available from kernel.org. The entry in the list of Fedora mirrors is a little deceiving. When you see the following:
You'll need to run the following command to confirm that rsync works on that server, as well as to view the available directories (don't forget the trailing forward slash):
When I ran this command, I saw the result shown here:
MOTD: Welcome to the Linux Kernel Archive.
MOTD:
MOTD: Due to U.S. Exports Regulations, all cryptographic software on this
MOTD: site is subject to the following legal notice:
MOTD:
MOTD: This site includes publicly available encryption source code
MOTD: which, together with object code resulting from the compiling of
MOTD: publicly available source code, may be exported from the United
MOTD: States under License Exception "TSU" pursuant to 15 C.F.R. Section
MOTD: 740.13(e).
MOTD:
MOTD: This legal notice applies to cryptographic software only.
MOTD: Please see the Bureau of Industry and Security,
MOTD: http://www.bis.doc.gov/ for more information about current
MOTD: U.S. regulations.
MOTD:
drwxr-xr-x 4096 2005/06/09 09:40:43 .
drwxr-xr-x 4096 2004/03/01 08:39:30 1
drwxr-xr-x 4096 2004/05/14 04:18:24 2
drwxr-xr-x 4096 2004/11/03 15:00:14 3
drwxr-xr-x 4096 2005/06/09 09:41:47 4
drwxrwsr-x 4096 2005/12/16 23:49:44 development
drwxr-xr-x 4096 2005/11/22 06:14:23 test
drwxrwsr-x 4096 2005/06/07 08:29:19 updates
[michael@FedoraCore4 rhn]$
Naturally, Fedora Core production releases (which should also be available on the installation CDs/DVDs) are associated with the numbered directories. But the focus in this annoyance is on updates, which is the last directory listed on the server. Hopefully, this directory includes updates divided by Fedora Core releases.
To make sure this server includes the updates I need, I ran the following command:
I continued the process until I confirmed that this server included the update RPMs that I wanted to mirror. I wanted to create an Apache-based repository, so I mirrored the RPMs to the /var/www/html/yum/Fedora/Core/updates/4/i386 directory.
Then, to synchronize the local and remote update directories, I ran the following command:
rsync -a mirrors.kernel.org::fedora/core/updates/4/i386/. \
  /var/www/html/yum/Fedora/Core/updates/4/i386
11.1.5.2. Synchronizing a SUSE mirror
Because the SUSE list of mirrors doesn't specify which are rsync servers, some trial and error is required. For this exercise, I attempted to synchronize my local update mirror with that available from the University of Utah. The listing that I saw in the SUSE mirror list as of this writing was:
I tried the following command, which led to an error message:
rsync suse.cs.utah.edu::pub/
@ERROR: Unknown module 'pub'
rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(359)
So I tried the top-level directory and found the SUSE repositories at the top of the list:
rsync suse.cs.utah.edu::
suse      The full /pub/suse directory from ftp.suse.com.
people    The full /pub/people directory from ftp.suse.com.
projects  The full /pub/projects directory from ftp.suse.com.
And, with a little browsing, as described in the previous section, I found the SUSE update directories with the following command:
I wanted to download updates associated with SUSE 10.0 to the following directory:
I could run the following command to synchronize all updates from the update directory at the University of Utah (the -v uses verbose mode, and the -z compresses the transferred data):
rsync -avz suse.cs.utah.edu::suse/i386/update/10.0/. \
  /var/lib/YaST2/you/mnt/i386/update/10.0/
But that might transfer more than you need. If you explore a bit further, you'll find source packages as well as packages built for 64-bit and PPC CPU systems. If you have only 32-bit workstations, you don't need all this extra data. You can use the --exclude switch to avoid transferring these packages:
rsync -avz --exclude='*.src.rpm' --exclude='*.ppc' --exclude='*x86_64*' \
  suse.cs.utah.edu::suse/i386/update/10.0/. \
  /var/lib/YaST2/you/mnt/i386/update/10.0/
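Before committing to a long transfer, you may want to preview what the exclusions actually leave in place. The sketch below wraps the same source, destination, and patterns in a function (the function name is my own) and adds the -n (--dry-run) switch.

```shell
#!/bin/sh
# Sketch: preview a filtered sync before transferring anything.
# Source, destination, and patterns are the ones used in the text above.

SRC='suse.cs.utah.edu::suse/i386/update/10.0/.'
DEST='/var/lib/YaST2/you/mnt/i386/update/10.0/'

preview_sync() {
    # -n (--dry-run) lists what would be copied without copying it.
    # The patterns are quoted so the local shell passes them to rsync
    # unchanged instead of expanding them against the current directory.
    rsync -avzn \
        --exclude='*.src.rpm' \
        --exclude='*.ppc' \
        --exclude='*x86_64*' \
        "$SRC" "$DEST"
}

# preview_sync    # contacts the remote server; drop -n to transfer for real
```

If unwanted packages still show up in the preview, adjust the patterns and run it again before starting the real transfer.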
11.1.5.3. Synchronizing a Debian mirror
Debian mirrors are somewhat different. Besides the different package format, Debian mirrors do not include any separate update servers. Therefore, if you want to mirror a Debian update server, you'll have to mirror all the packages on the server (except any that you specifically exclude).
Because the Debian list of mirrors does not specify rsync servers, some trial and error may be required. For this exercise, I wanted to synchronize my local update mirror with that available from the University of California at Berkeley. The listing that I saw from this mirror was:
rsync linux.csua.berkeley.edu::
debian
debian-non-US
debian-cd
In other words, this revealed the directories associated with Debian CDs as well as non-U.S. packages. For now, I assume that you want to mirror the regular Debian repositories. I found them with the following command:
But as you can see from the output shown below, there are a number of directories full of packages that you may not need, unless you want to include the installers, as well as the binary packages associated with the full Debian range of architectures:
drwxr-sr-x 4096 2005/06/04 10:20:54 .
drwxr-sr-x 4096 2005/12/17 00:33:29 binary-alpha
drwxr-sr-x 4096 2005/12/17 00:39:50 binary-arm
drwxr-sr-x 4096 2005/12/17 00:48:56 binary-hppa
drwxr-sr-x 4096 2005/12/17 00:55:50 binary-i386
drwxr-sr-x 4096 2005/12/17 01:01:22 binary-ia64
drwxr-sr-x 4096 2005/12/17 01:07:29 binary-m68k
drwxr-sr-x 4096 2005/12/17 01:15:06 binary-mips
drwxr-sr-x 4096 2005/12/17 01:23:07 binary-mipsel
drwxr-sr-x 4096 2005/12/17 01:29:11 binary-powerpc
drwxr-sr-x 4096 2005/12/17 01:35:33 binary-s390
drwxr-sr-x 4096 2005/12/17 01:41:44 binary-sparc
drwxr-sr-x 4096 2004/01/04 11:47:29 debian-installer
drwxr-sr-x 4096 2005/03/24 00:22:16 installer-alpha
drwxr-sr-x 4096 2005/03/24 00:22:16 installer-arm
drwxr-sr-x 4096 2005/03/24 00:22:17 installer-hppa
drwxr-sr-x 4096 2005/03/24 00:22:17 installer-i386
drwxr-sr-x 4096 2005/03/24 00:22:17 installer-ia64
drwxr-sr-x 4096 2005/03/24 00:22:17 installer-m68k
drwxr-sr-x 4096 2005/03/24 00:22:17 installer-mips
drwxr-sr-x 4096 2005/03/24 00:22:17 installer-mipsel
drwxr-sr-x 4096 2005/03/24 00:22:17 installer-powerpc
drwxr-sr-x 4096 2005/03/24 00:22:17 installer-s390
drwxr-sr-x 4096 2005/03/24 00:22:17 installer-sparc
drwxr-sr-x 4096 2005/12/17 01:45:08 source
drwxr-sr-x 4096 2005/06/04 11:40:37 upgrade-kernel
To download just the directories that you need, you can go into the appropriate subdirectory, or you can make extensive use of the --exclude switch. Debian recommends the latter. For example, if all of your workstations include Intel Itanium CPUs, you can run a command that excludes all files and directories not associated with the IA64 architecture. Debian recommends that you include the --recursive, --times, --links, --hard-links, and --delete switches, too. The basic steps to creating your mirror are:
If I wanted to limit the downloads to the ia64 directory, I would include the following switches:
rsync -avz --recursive --times --links --hard-links --delete \
  --exclude binary-alpha/ --exclude '*_alpha.deb' \
  --exclude binary-arm/ --exclude '*_arm.deb' \
  --exclude binary-hppa/ --exclude '*_hppa.deb' \
  --exclude binary-i386/ --exclude '*_i386.deb' \
  --exclude binary-m68k/ --exclude '*_m68k.deb' \
  --exclude binary-mips/ --exclude '*_mips.deb' \
  --exclude binary-mipsel/ --exclude '*_mipsel.deb' \
  --exclude binary-powerpc/ --exclude '*_powerpc.deb' \
  --exclude binary-s390/ --exclude '*_s390.deb' \
  --exclude binary-sparc/ --exclude '*_sparc.deb'
But things are beginning to get complicated. Debian provides a script that can help. All you'll need to do before running the script is to specify a few directives, including the rsync server, directory, and architectures to exclude. To see the script, navigate to http://www.debian.org/mirror/anonftpsync. For additional discussion of this rsync script, see http://www.debian.org/mirror/ftpmirror.
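If you only need the exclusions and not the full Debian script, a short loop can generate them for you. The sketch below is my own illustration, not part of the Debian script: it prints one pattern per line for every architecture except the one you keep, in a form suitable for rsync's --exclude-from switch. The architecture names are taken from the directory listing shown earlier.

```shell
#!/bin/sh
# Sketch: build the exclude patterns for every architecture except
# the one you want to keep. The architecture names come from the
# directory listing shown earlier in this section.

ARCHES="alpha arm hppa i386 ia64 m68k mips mipsel powerpc s390 sparc"

exclude_patterns() {
    keep=$1
    for arch in $ARCHES; do
        if [ "$arch" != "$keep" ]; then
            # One pattern for the index directory, one for the packages.
            printf '%s\n' "binary-${arch}/" "*_${arch}.deb"
        fi
    done
}

# Example: save the patterns, then hand the file to rsync.
# SRC and DEST stand for your chosen mirror and local directory.
# exclude_patterns ia64 > debian-excludes.txt
# rsync -avz --hard-links --delete --exclude-from=debian-excludes.txt SRC DEST
```

Keeping the patterns in a file also makes it easier to review them, or to add the installer directories later, without retyping a long command line.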
11.1.6. Making Your Mirror Work with Your Update System
Now that you have a local mirror of Linux updates, you'll need to make sure it's usable through your update system. For our selected distributions, I'm assuming that you're using yum for Fedora, apt for Debian, or YaST for SUSE Linux. This step involves creating the database that your packaging system consults on each host to know what it's already updated and to stay in sync.
I also assume that you've shared the update directory using a standard sharing service, such as FTP, HTTP, or NFS. I've described the basic methods associated with yum and apt updates in Chapter 8. If you're connecting to a shared NFS directory, substitute file:/// (with three forward slashes) for http:// or ftp://.
Generally, when you use rsync to copy and synchronize to local mirrors, you've also downloaded the directories that support the apt or yum databases.
11.1.6.1. Creating apt repository database files
If you're using apt for updates, such as for Debian Linux, you may already have the key database files: Packages.gz for regular binary packages and Sources.gz for source packages. Based on the Debian mirror described earlier, you can find these files in the following directories:
If you need to create your own versions of these database files, navigate to the directory with the binary packages and run the following command:
dpkg-scanpackages . /dev/null | gzip -9c > Packages.gz
And for the database of source packages, navigate to the directory with those packages and run the following command:
dpkg-scansources . /dev/null | gzip -9c > Sources.gz
For more information on this process, see http://www.interq.or.jp/libra/oohara/apt-gettable/.
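If you regenerate the index regularly, it may be worth wrapping the command in a function. The following is a sketch under my own assumptions: the function name and the example path are hypothetical, and the index is written to a temporary file first so that clients never read a half-written Packages.gz.

```shell
#!/bin/sh
# Sketch: regenerate Packages.gz in a directory of .deb files.
# The new index is written under a temporary name and then renamed,
# so clients never see a partially written file.

rebuild_packages_index() {
    pkgdir=$1
    (
        cd "$pkgdir" || exit 1
        dpkg-scanpackages . /dev/null | gzip -9c > Packages.gz.new &&
            mv Packages.gz.new Packages.gz
    )
}

# Example; the path is a placeholder for your own package directory:
# rebuild_packages_index /var/www/html/debian/local
```

The same pattern should work for source packages by substituting dpkg-scansources and Sources.gz.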
11.1.6.2. Creating yum repository database files
There are two ways to create a yum repository database. Through Fedora Core 3, the standard was the yum-arch command, which is included in the yum RPM. Since that time, the standard has become the createrepo command, based on a package of the same name. For the older Fedora distributions (as well as the rebuild distributions of Red Hat Enterprise Linux 3 and 4, which use yum for updates), you can create your own yum repository database by navigating to the package directory and running the following command:
As yum "digests" the package headers, it collects them in a headers/ subdirectory.
For later Fedora distributions, assuming the packages are in the directory described earlier for Fedora updates, you'd run the following command:
This command creates an XML database in the repodata/ subdirectory. If your mirror process already copied either of these directories, you don't need to create it.
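As a rough sketch of that last point, the function below runs createrepo only when the mirror did not already supply a repodata/ directory. The function name is my own, and the path is the Fedora update directory used earlier in this annoyance.

```shell
#!/bin/sh
# Sketch: build yum metadata only when the mirror did not supply it.
# REPO is the Fedora update directory used earlier in this section.

REPO=/var/www/html/yum/Fedora/Core/updates/4/i386

ensure_repodata() {
    repo=$1
    if [ -d "$repo/repodata" ]; then
        echo "repodata/ already present in $repo; nothing to do"
    else
        # createrepo writes its XML metadata to $repo/repodata/
        createrepo "$repo"
    fi
}

# ensure_repodata "$REPO"
```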
11.1.7. Test a Local Update
Now you'll want to test a local update. I described some of the update systems in Chapter 8. To summarize, for any of our three distributions, you'll need to make some configuration changes to point the package manager to the update server you created on your local network:
Once you change the appropriate configuration file, you can test updates from the local server that you created.
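For a yum client, the change might look something like the following sketch. The hostname mirror.example.com, the repository ID local-updates, and the function name are placeholders of my own; the path assumes the Apache directory used for the Fedora mirror earlier.

```shell
#!/bin/sh
# Sketch: write a yum repository file that points at a local mirror.
# mirror.example.com and the repository id "local-updates" are
# placeholders; the path matches the Apache directory used earlier.

write_local_repo() {
    target=$1
    printf '%s\n' \
        '[local-updates]' \
        'name=Fedora Core 4 updates (local mirror)' \
        'baseurl=http://mirror.example.com/yum/Fedora/Core/updates/4/i386/' \
        'enabled=1' \
        'gpgcheck=1' > "$target"
}

# On a Fedora client, as root:
# write_local_repo /etc/yum.repos.d/local-updates.repo
# yum check-update
```

If yum check-update lists packages from the local repository, the client is reading your mirror rather than a server on the Internet.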
11.1.8. Automate the Synchronization Process
When you're satisfied that the local update server meets your needs, you'll want to automate the synchronization process. To do so, insert the rsync command(s) that you used in a cron job file. If you had to create yum or apt database files, you'll want to add those commands described earlier to the cron job.
Even after the first time you create a mirror, the downloads for updates can be extensive. For example, updates to the OpenOffice.org suite alone can occupy several hundred megabytes.
Therefore, you'll want to schedule the cron job for a time when few or no other jobs are running. And that depends on the schedule of other cron jobs, as well as any other jobs (such as database processing) that may happen during off-hours.
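Putting the pieces together, a nightly job might look like the sketch below. The script name, the 3:30 a.m. schedule, and the function name are hypothetical; the source and destination are the Fedora examples from this annoyance.

```shell
#!/bin/sh
# Sketch of a nightly sync script, saved for example as
# /usr/local/sbin/sync-fedora-updates (a hypothetical name).
# Source and destination are the Fedora examples from this section.

SRC='mirrors.kernel.org::fedora/core/updates/4/i386/.'
DEST='/var/www/html/yum/Fedora/Core/updates/4/i386'

sync_updates() {
    # --delete removes local files that are no longer on the remote
    # mirror, which keeps disk use from growing without bound.
    rsync -a --delete "$SRC" "$DEST" || return 1
    # Rebuild the yum metadata only if the mirror did not provide it.
    [ -d "$DEST/repodata" ] || createrepo "$DEST"
}

# Uncomment the call when you install this as a cron script:
# sync_updates

# Example crontab entry (run at 3:30 a.m. every day):
# 30 3 * * * /usr/local/sbin/sync-fedora-updates
```

Because --delete also removes a locally generated repodata/ directory, the metadata is rebuilt after every run whenever the remote mirror doesn't supply it.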
11.1.9. Connecting Local Workstations
Once you've tested your local mirror, and then configured regular updates to that mirror, you're ready to connect your local workstations to it. You'll need to modify the same files as described earlier in the "Test a Local Update" section.
If you want to configure automatic updates on your workstations from your local repositories, you'll need to configure cron jobs on each host.
Some distributions support GUI configuration of automated updates; SUSE supports it directly via YaST, which saves the schedule to /etc/cron.d/yast2-online-update.
If you've installed the latest version of yum on Fedora Core, there's a cron job already configured in /etc/cron.daily/yum.cron. To let it run, you'll need to activate the yum service in the /etc/init.d directory.
Creating an update script is a straightforward process, with the following general steps: