11.1. Too Many Computers to Update over the Internet

If all you administer is one or two Linux computers, updates are a straightforward process. All you need to do is configure updates from the most appropriate mirror on the Internet. If desired, you can automate downloads and installations of updates using a cron job. For more information on how to configure updates from yum and apt-based mirrors, see Chapter 8.

However, when you administer a large number of Linux computers, the updates can easily overload standard high-speed Internet connections. For example, if you're downloading updates to the OpenOffice.org suite, you could be downloading hundreds of megabytes of packages. If you're downloading these packages on 100 computers simultaneously, that may be too much for your Internet connection, especially when other jobs are pending.
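As a rough, back-of-the-envelope sketch of that problem, the following shell function estimates how long a set of identical downloads would keep a link busy. The figures used in the example call (100 clients, 300 MB per client, a 10 Mbit/s connection) are illustrative assumptions, not measurements, and the calculation ignores protocol overhead.

```shell
# Rough estimate of how long N clients would occupy an Internet link
# if each one downloaded the same updates directly.
# All figures passed in are assumptions supplied by the caller.

# estimate_minutes CLIENTS MB_PER_CLIENT LINK_MBIT_PER_SEC
estimate_minutes() {
    clients=$1
    mb_each=$2
    link_mbit=$3
    total_mbit=$(( clients * mb_each * 8 ))   # megabytes -> megabits
    seconds=$(( total_mbit / link_mbit ))     # ideal case, no overhead
    echo $(( seconds / 60 ))
}

# 100 clients, 300 MB each, 10 Mbit/s link
estimate_minutes 100 300 10
```

With those assumed numbers, the result is 400 minutes of a fully saturated link. With a local mirror, the same 300 MB crosses the Internet connection only once.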

In this annoyance, I'll show you how you can create a local mirror of your favorite update server. You can then share the appropriate directory and configure your updates locally.

Where possible, I'll show you how you can limit what you mirror to updates. For example, Fedora Linux includes dedicated update directories. Most downloads are associated with updates, so it's appropriate to limit what you mirror to such packages.

One other approach is to download just the packages and create the repository systems yourself. For example, the createrepo command strips the headers from each RPM and configures a database that helps the yum command find the dependencies associated with every package.

I assume you have the hard disk space you need on your mirror server. Repositories can be very demanding with respect to disk space; be aware, if you're synchronizing repositories for multiple architectures and distributions, that downloaded mirrors can easily take up hundreds of gigabytes of space.

11.1.1. Available Mirror Tools

There are a number of ways to download the files associated with a mirror. The most common standard is based on the rsync command. With rsync, you can synchronize your mirrors as needed, downloading only those parts of those packages that are new or have otherwise changed. I'll show you how you can use rsync in this annoyance.

There are a number of other tools available. Naturally, you can use any FTP client to download mirrors to local directories. Commands such as wget and curl do an excellent job with large downloads. If you're working with an apt repository, the apt-mirror project provides another excellent alternative (http://freshmeat.net/projects/apt-mirror/).

11.1.2. Basic Steps

To create your mirror, you can take these steps, which I'll detail in the following subsections:

  1. Find an appropriate update mirror, specifically the one that gives you the best performance for individual updates. Some trial and error may be required. While the best update mirror is usually geographically close to you, that may not always be the case.

  2. Make room for the updates. Several gigabytes may be required, especially if you're making room for updates for multiple distributions and/or versions. You may even consider using a dedicated partition or drive.

  3. Synchronize the mirror locally. The first time you download a mirror, you may be downloading gigabytes of data.

  4. If required, make your local mirror usable through your preferred update system.

  5. Test a local update after you've downloaded a mirror to make sure it works.

  6. Automate the synchronization process.

  7. Point your clients to the local mirror.

11.1.3. Find the Best Update Mirror

The best update mirror may not be the one that is physically closest to your network. Some mirrors have faster connections to the Internet. Others have less traffic. Some mirror administrators may discourage full mirror downloads or even limit the number of simultaneous connections. And many public mirrors don't support rsync connections.

Red Hat Enterprise Linux Updates

As updates for Red Hat Enterprise Linux (RHEL) are closely controlled, there are no authorized public mirrors available. However, if you've paid for RHEL subscriptions for enough workstations or desktops, Red Hat may allow you to configure a proxy server or a satellite server to distribute updates from your local network. These servers ensure that updates are applied only to subscribed systems.

Alternatively, you could use the yum package to create your own update repositories. The so-called "rebuild" distributions, such as CentOS (http://www.caosity.org) and WhiteBox Linux (http://www.whiteboxlinux.org), use yum to power their updates. You can use their yum package and associated configurations on your subscribed RHEL systems, and update your other RHEL systems from that repository, which saves you the trouble of learning how to configure the RHEL Proxy Server or Satellite Servers. (Naturally, non-RHEL packages such as yum and createrepo are not supported by Red Hat.)


Our selected distributions have "official" lists of update mirrors. More may be available. If a mirror includes a Fedora repository, it may also include a SUSE repository. For example, while the University of Mississippi is not (currently) on the official list of mirrors for SUSE Linux, updates are available from its server at http://mirror.phy.olemiss.edu/mirror/suse/suse/. Here's where to find the "official" list of mirrors for our selected distributions:


Fedora Core Linux

http://fedora.redhat.com/download/mirrors.html includes a list of mirrors accessible through the rsync protocol; don't limit yourself to those specified, as others may also work with rsync.


SUSE Linux

Official mirrors of the open source SUSE distribution can be found at http://en.opensuse.org/Mirrors_Released_Version. Trial and error is required to find rsync-capable mirrors.


Debian Linux

Official Debian mirrors can be found at http://www.debian.org/mirror/list. Many support a limited number of architectures. Trial and error is required to find rsync-capable mirrors.

To see if a mirror works with the rsync protocol, run the rsync command with the URL in question. For example, if you want to check the mirror specified in the Debian Mirror List from the University of Southern California, run the following command (and don't forget the double colon at the end):

 rsync mirrors.usc.edu:: 

When I ran this command, I saw a long list of directories, clearly associated with various Linux distributions, including SUSE, Fedora, and others. If there is no rsync server at your desired site, the rsync command will time out, or you'll have to press Ctrl-C to return to the command line.
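If you have several candidates to check, a small loop can save some typing. This is only a sketch: the function names are made up for illustration, and the --contimeout option, which keeps rsync from waiting indefinitely for a daemon that isn't there, assumes rsync 3.0 or later.

```shell
# Probe candidate mirrors for an rsync daemon.

# has_rsync HOST: succeeds if HOST answers a module-list request.
has_rsync() {
    # --contimeout limits how long rsync waits for the daemon connection
    rsync --contimeout=10 "${1}::" >/dev/null 2>&1
}

# probe_mirrors HOST...: print one status line per host.
probe_mirrors() {
    for host in "$@"; do
        if has_rsync "$host"; then
            echo "${host}: rsync available"
        else
            echo "${host}: no rsync response"
        fi
    done
}

# Example, using hostnames mentioned in this section:
# probe_mirrors mirrors.usc.edu mirrors.kernel.org suse.cs.utah.edu
```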

Finding the best update mirror is somewhat subjective. Yes, you could go by objective measures, such as the time required for the download. But conditions change. Internet traffic can slow down in certain geographic areas. Servers do go down. Some trial and error may be required.

Fedora had implemented an "apt-get mirror select" for apt-based repositories. But Fedora is moving away from the apt commands, and Red Hat developers are working on plug-ins for yum that function in the same way.


11.1.4. Make Room for the Updates

Updates can consume gigabytes of space. The choices you make can make a significant difference in the space you need. Key factors include:


Architectures

Every architecture that you maintain locally can multiply the space you need. For example, if you're rolling out both 64-bit and 32-bit workstations, you'll need at least double the space.


Distributions

If you're maintaining mirrors for more than one distribution, your space requirements increase accordingly.


Distribution Versions

If you're maintaining mirrors for more than one version of a distribution (such as for Fedora Core 4 and 5), your space requirements can multiply.


Installation Files

Many administrators find it convenient to include a copy of the installation trees in the update repository partition. This increases the space required by the size of the installation CDs/DVDs.

You may want to create a dedicated partition for your update repositories. That way, you can be sure that the space required by the repository does not crowd out the rest of your system.
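Before a large synchronization, it can be worth confirming that the target filesystem has room. The sketch below wraps df for that purpose; the function names and the 20 GB threshold in the example are assumptions chosen for illustration.

```shell
# Check that a mirror directory has enough free space before synchronizing.

# free_kb DIR: print the free space, in kilobytes, on DIR's filesystem.
free_kb() {
    # -P forces the portable one-line-per-filesystem format; -k reports KB
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# enough_space DIR NEEDED_KB: succeeds if DIR has at least NEEDED_KB free.
enough_space() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

# Example: require 20 GB (an arbitrary figure) under /var/www/html
# enough_space /var/www/html $(( 20 * 1024 * 1024 )) \
#     || echo "not enough room for the mirror" >&2
```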

If you're configuring mirrors for 64-bit Linux RPM-based distributions, focus on yum. The apt tools currently have trouble with repositories that mix 32-bit and 64-bit packages, as is currently required for a number of applications. I know of no similar problems for Debian distributions.


11.1.5. Synchronize the Mirror

Along with perhaps most of the world of Linux, I like the rsync command. With appropriate switches, it's easy to use this command to copy the files and directories that you want. Once you've set up a mirror, you can use the rsync command as needed to keep your local mirror up-to-date.

The rsync command is straightforward; I use it to back up the home directory from my laptop computer with the following command:

 rsync -a -e ssh michael@laptop.example.com:/home/michael/* /backup 

If you've set the environment variable RSYNC_RSH=ssh, you don't need the -e ssh option. For more information on rsync, see the "I'm Afraid of Losing Data" annoyance in Chapter 2.


In the following subsections, I illustrate some simple examples of how you can create your own rsync mirror on our selected distributions. This assumes you're using an appropriate directory, possibly configured on a separate disk or partition.

11.1.5.1. Synchronizing a Fedora mirror

For this exercise, assume you want to synchronize your local update mirror with the one available from kernel.org. The entry in the list of Fedora mirrors is a little deceiving. When you see the following:

 rsync://mirrors.kernel.org/fedora/core/ 

you'll need to run the following command to confirm that rsync works on that server, and to view the available directories (don't forget the trailing forward slash):

 rsync mirrors.kernel.org::fedora/core/ 

When I ran this command, I saw the result shown here:

 MOTD:   Welcome to the Linux Kernel Archive.
 MOTD:
 MOTD:   Due to U.S. Exports Regulations, all cryptographic software on this
 MOTD:   site is subject to the following legal notice:
 MOTD:
 MOTD:   This site includes publicly available encryption source code
 MOTD:   which, together with object code resulting from the compiling of
 MOTD:   publicly available source code, may be exported from the United
 MOTD:   States under License Exception "TSU" pursuant to 15 C.F.R. Section
 MOTD:   740.13(e).
 MOTD:
 MOTD:   This legal notice applies to cryptographic software only.
 MOTD:   Please see the Bureau of Industry and Security,
 MOTD:   http://www.bis.doc.gov/ for more information about current
 MOTD:   U.S. regulations.
 MOTD:
 drwxr-xr-x        4096 2005/06/09 09:40:43 .
 drwxr-xr-x        4096 2004/03/01 08:39:30 1
 drwxr-xr-x        4096 2004/05/14 04:18:24 2
 drwxr-xr-x        4096 2004/11/03 15:00:14 3
 drwxr-xr-x        4096 2005/06/09 09:41:47 4
 drwxrwsr-x        4096 2005/12/16 23:49:44 development
 drwxr-xr-x        4096 2005/11/22 06:14:23 test
 drwxrwsr-x        4096 2005/06/07 08:29:19 updates

Naturally, Fedora Core production releases (which should also be available on the installation CDs/DVDs) are associated with the numbered directories. But the focus in this annoyance is on updates, which is the last directory listed on the server. Hopefully, this directory includes updates divided by Fedora Core releases.

To make sure this server includes the updates I need, I ran the following command:

 rsync mirrors.kernel.org::fedora/core/updates/ 

I continued the process until I confirmed that this server included the update RPMs that I wanted to mirror. I wanted to create an Apache-based repository, so I mirrored the RPMs to the /var/www/html/yum/Fedora/Core/updates/4/i386 directory.

By default, the DocumentRoot in the standard Fedora Apache configuration points to the /var/www/html directory; if I configure a local Apache server, clients can reach the mirror through the yum/Fedora/Core/updates/4/ subdirectory.


Then, to synchronize the local and remote update directories, I ran the following command:

 rsync -a mirrors.kernel.org::fedora/core/updates/4/i386/. \
     /var/www/html/yum/Fedora/Core/updates/4/i386

11.1.5.2. Synchronizing a SUSE mirror

Because the SUSE list of mirrors doesn't specify which are rsync servers, some trial and error is required. For this exercise, I attempted to synchronize my local update mirror with that available from the University of Utah. The listing that I saw in the SUSE mirror list as of this writing was:

 suse.cs.utah.edu/pub/ 

I tried the following command, which led to an error message:

 rsync suse.cs.utah.edu::pub/
 @ERROR: Unknown module 'pub'
 rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
 rsync error: error in rsync protocol data stream (code 12) at io.c(359)

So I tried the top-level directory and found the SUSE repositories at the top of the list:

 rsync suse.cs.utah.edu::
 suse             The full /pub/suse directory from ftp.suse.com.
 people           The full /pub/people directory from ftp.suse.com.
 projects         The full /pub/projects directory from ftp.suse.com.

And, with a little browsing, as described in the previous section, I found the SUSE update directories with the following command:

 rsync suse.cs.utah.edu::suse/i386/update/10.0/ 

I wanted to download updates associated with SUSE 10.0 to the following directory:

 /var/lib/YaST2/you/mnt/i386/update/10.0/ 

I could run the following command to synchronize all updates from the update directory at the University of Utah (the -v uses verbose mode, and the -z compresses the transferred data):

 rsync -avz suse.cs.utah.edu::suse/i386/update/10.0/. \
     /var/lib/YaST2/you/mnt/i386/update/10.0/

But that might transfer more than you need. If you explore a bit further, you'll find source packages as well as packages built for 64-bit and PPC CPU systems. If you have only 32-bit workstations, you don't need all this extra data. You can use the --exclude switch to avoid transferring these packages:

 rsync -avz --exclude='*.src.rpm' --exclude='*.ppc' --exclude='*x86_64*' \
     suse.cs.utah.edu::suse/i386/update/10.0/. \
     /var/lib/YaST2/you/mnt/i386/update/10.0/
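As the list of exclusions grows, it may be easier to keep the patterns in a file and pass it with rsync's --exclude-from switch. The sketch below uses the three patterns from the command above; the function names and the file location in the example are hypothetical. Adding --dry-run on a first pass lists what would be transferred without copying anything.

```shell
# Keep the exclusion patterns in a file and allow a dry run first.

# write_excludes FILE: store one rsync pattern per line.
write_excludes() {
    cat > "$1" <<'EOF'
*.src.rpm
*.ppc
*x86_64*
EOF
}

# sync_suse_updates EXCLUDE_FILE [EXTRA_RSYNC_OPTIONS...]
sync_suse_updates() {
    excludes=$1
    shift
    rsync -avz "$@" --exclude-from="$excludes" \
        suse.cs.utah.edu::suse/i386/update/10.0/. \
        /var/lib/YaST2/you/mnt/i386/update/10.0/
}

# Example (the file location is hypothetical):
# write_excludes /etc/mirror-excludes.txt
# sync_suse_updates /etc/mirror-excludes.txt --dry-run   # list only
# sync_suse_updates /etc/mirror-excludes.txt             # transfer
```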

11.1.5.3. Synchronizing a Debian mirror

Debian mirrors are somewhat different. Besides the different package format, Debian mirrors do not include any separate update directories. Therefore, if you want to mirror a Debian update server, you'll have to download all the packages on the server (except any that you specifically exclude).

Because the Debian list of mirrors does not specify rsync servers, some trial and error may be required. For this exercise, I wanted to synchronize my local update mirror with that available from the University of California at Berkeley. The listing that I saw from this mirror was:

 rsync linux.csua.berkeley.edu::
 debian
 debian-non-US
 debian-cd

In other words, this revealed the directories associated with Debian CDs as well as non-U.S. packages. For now, I assume that you want to mirror the regular Debian repositories. I found them with the following command:

 rsync linux.csua.berkeley.edu::debian/dists/Debian3.1r0/main/ 

But as you can see from the output shown below, there are a number of directories full of packages that you may not need, unless you want to include the installers, as well as the binary packages associated with the full Debian range of architectures:

 drwxr-sr-x        4096 2005/06/04 10:20:54 .
 drwxr-sr-x        4096 2005/12/17 00:33:29 binary-alpha
 drwxr-sr-x        4096 2005/12/17 00:39:50 binary-arm
 drwxr-sr-x        4096 2005/12/17 00:48:56 binary-hppa
 drwxr-sr-x        4096 2005/12/17 00:55:50 binary-i386
 drwxr-sr-x        4096 2005/12/17 01:01:22 binary-ia64
 drwxr-sr-x        4096 2005/12/17 01:07:29 binary-m68k
 drwxr-sr-x        4096 2005/12/17 01:15:06 binary-mips
 drwxr-sr-x        4096 2005/12/17 01:23:07 binary-mipsel
 drwxr-sr-x        4096 2005/12/17 01:29:11 binary-powerpc
 drwxr-sr-x        4096 2005/12/17 01:35:33 binary-s390
 drwxr-sr-x        4096 2005/12/17 01:41:44 binary-sparc
 drwxr-sr-x        4096 2004/01/04 11:47:29 debian-installer
 drwxr-sr-x        4096 2005/03/24 00:22:16 installer-alpha
 drwxr-sr-x        4096 2005/03/24 00:22:16 installer-arm
 drwxr-sr-x        4096 2005/03/24 00:22:17 installer-hppa
 drwxr-sr-x        4096 2005/03/24 00:22:17 installer-i386
 drwxr-sr-x        4096 2005/03/24 00:22:17 installer-ia64
 drwxr-sr-x        4096 2005/03/24 00:22:17 installer-m68k
 drwxr-sr-x        4096 2005/03/24 00:22:17 installer-mips
 drwxr-sr-x        4096 2005/03/24 00:22:17 installer-mipsel
 drwxr-sr-x        4096 2005/03/24 00:22:17 installer-powerpc
 drwxr-sr-x        4096 2005/03/24 00:22:17 installer-s390
 drwxr-sr-x        4096 2005/03/24 00:22:17 installer-sparc
 drwxr-sr-x        4096 2005/12/17 01:45:08 source
 drwxr-sr-x        4096 2005/06/04 11:40:37 upgrade-kernel

To download just the directories that you need, you can go into the appropriate subdirectory, or you can make extensive use of the --exclude switch. Debian recommends the latter. For example, if all of your workstations include Intel Itanium CPUs, you can run a command that excludes all files and directories not associated with the IA64 architecture. Debian recommends that you include the --recursive, --times, --links, --hard-links, and --delete switches, too. In order, these switches do the following:

  • Recursively download and synchronize files from all subdirectories

  • Preserve the date and time associated with each file

  • Re-create any existing symlinks

  • Include any hard-linked files

  • Delete any files that no longer exist on the mirror

If I wanted to limit the downloads to the ia64 directory, I would include the following switches:

 rsync -avz --recursive --times --links --hard-links --delete \
     --exclude binary-alpha/ --exclude '*_alpha.deb' \
     --exclude binary-arm/ --exclude '*_arm.deb' \
     --exclude binary-hppa/ --exclude '*_hppa.deb' \
     --exclude binary-i386/ --exclude '*_i386.deb' \
     --exclude binary-m68k/ --exclude '*_m68k.deb' \
     --exclude binary-mips/ --exclude '*_mips.deb' \
     --exclude binary-mipsel/ --exclude '*_mipsel.deb' \
     --exclude binary-powerpc/ --exclude '*_powerpc.deb' \
     --exclude binary-s390/ --exclude '*_s390.deb' \
     --exclude binary-sparc/ --exclude '*_sparc.deb'
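Rather than typing each pair of switches by hand, you could generate them from a list of architectures. The following is a minimal sketch of that idea; the variable and function names are invented for illustration, and the architecture list is simply the set of binary-* directories shown in the listing above.

```shell
# Generate --exclude switches for every Debian architecture except one.

# Architectures taken from the binary-* directories listed earlier.
ALL_ARCHES="alpha arm hppa i386 ia64 m68k mips mipsel powerpc s390 sparc"

# exclude_switches KEEP_ARCH: print one --exclude switch per line.
exclude_switches() {
    keep=$1
    for arch in $ALL_ARCHES; do
        [ "$arch" = "$keep" ] && continue
        echo "--exclude=binary-${arch}/"
        echo "--exclude=*_${arch}.deb"
    done
}

# Print the switches needed to keep only ia64
exclude_switches ia64

# Usage sketch; SOURCE and DEST are placeholders, and set -f stops the
# shell from expanding the * patterns when the output is word-split:
# set -f
# rsync -avz --hard-links --delete $(exclude_switches ia64) SOURCE DEST
# set +f
```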

But things are beginning to get complicated. Debian provides a script that can help. All you'll need to do before running the script is to specify a few directives, including the rsync server, directory, and architectures to exclude. To see the script, navigate to http://www.debian.org/mirror/anonftpsync. For additional discussion of this rsync script, see http://www.debian.org/mirror/ftpmirror.

11.1.6. Making Your Mirror Work with Your Update System

Now that you have a local mirror of Linux updates, you'll need to make sure it's usable through your update system. For our selected distributions, I'm assuming that you're using yum for Fedora, apt for Debian, or YaST for SUSE Linux. This step involves creating the database that the package manager on each host consults to learn which packages the repository offers, so that it can stay in sync.

I also assume that you've shared the update directory using a standard sharing service, such as FTP, HTTP, or NFS. I've described the basic methods associated with yum and apt updates in Chapter 8. If you're connecting to a shared NFS directory, substitute file:/// (with three forward slashes) for http:// or ftp://.

Generally, when you use rsync to copy and synchronize to local mirrors, you've also downloaded the directories that support the apt or yum databases.

11.1.6.1. Creating apt repository database files

If you're using apt for updates, such as for Debian Linux, you may already have the key database files: Packages.gz for regular binary packages and Sources.gz for source packages. Based on the Debian mirror described earlier, you can find these files in the following directories:

 linux.csua.berkeley.edu/debian/dists/Debian3.1r0/main/binary-i386/
 linux.csua.berkeley.edu/debian/dists/Debian3.1r0/main/source/

If you need to create your own versions of these database files, navigate to the directory with the binary packages and run the following command:

 dpkg-scanpackages . /dev/null | gzip -9c > Packages.gz 

And for the database of source packages, navigate to the directory with those packages and run the following command:

 dpkg-scansources . /dev/null | gzip -9c > Sources.gz 

For more information on this process, see http://www.interq.or.jp/libra/oohara/apt-gettable/.

11.1.6.2. Creating yum repository database files

There are two ways to create a yum repository database. Through Fedora Core 3, the standard was the yum-arch command, which is included in the yum RPM. Since that time, the standard has become the createrepo command, based on a package of the same name. For the older Fedora distributions (as well as the rebuild distributions of Red Hat Enterprise Linux 3 and 4, which use yum for updates), you can create your own yum repository database by navigating to the package directory and running the following command:

 yum-arch . 

As yum "digests" the package headers, it collects them in a headers/ subdirectory.

For later Fedora distributions, assuming the packages are in the directory described earlier for Fedora updates, you'd run the following command:

 createrepo /var/www/html/yum/Fedora/Core/updates/4/i386 

This command creates an XML database in the repodata/ subdirectory. If your mirror process already copied either of these directories, you don't need to create it.
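That check is easy to script. The sketch below runs createrepo only when the mirrored directory holds neither a repodata/ nor a headers/ subdirectory; the function names are hypothetical, and it assumes the createrepo package is installed.

```shell
# Decide whether a mirrored directory still needs yum metadata.

# has_yum_metadata DIR: succeeds if DIR already holds repodata/ (createrepo)
# or headers/ (yum-arch).
has_yum_metadata() {
    [ -d "$1/repodata" ] || [ -d "$1/headers" ]
}

# ensure_yum_metadata DIR: run createrepo only when nothing was mirrored.
ensure_yum_metadata() {
    if has_yum_metadata "$1"; then
        echo "metadata already present in $1"
    else
        createrepo "$1"
    fi
}

# Example:
# ensure_yum_metadata /var/www/html/yum/Fedora/Core/updates/4/i386
```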

11.1.7. Test a Local Update

Now you'll want to test a local update. I described some of the update systems in Chapter 8. To summarize, for any of our three distributions, you'll need to make some configuration changes to point the package manager to the update server you created on your local network:


Updating yum for Fedora

If you're updating yum for Fedora, you'll want to update the appropriate configuration files in the /etc/yum.repos.d directory. If your local mirror consists of Fedora updates, the file is fedora-updates.repo. For example, if you've shared the directory described in the previous section via NFS and have mounted the appropriate directory, you would substitute the following for the default baseurl directive:

 baseurl=file:///var/www/html/yum/Fedora/Core/updates/4/i386/ 


Updating YaST for SUSE

If you're updating YaST for SUSE Linux, you'll need to point the update server to the shared local directory. In the appropriate YaST menu, you can configure a connection to any of several servers, including FTP, HTTP, or NFS servers from the local network. For example, if I've created an FTP server that points to the SUSE repository directory created earlier, I'd select FTP, cite the name of the server, and point to the following directory on that server: /var/lib/YaST2/you/mnt/i386/update/10.0/


Updating apt for Debian

If you're updating apt for Debian Linux, you'll want to update the appropriate URLs configured in /etc/apt/sources.list. For example, if you've mirrored a repository for Debian Sarge and created an HTTP server on your local network, on a computer named debianrep, in the web server's /repo subdirectory, you'd add the following line to each client's sources.list file:

 deb http://debianrep/repo sarge main 

Once you change the appropriate configuration file, you can test updates from the local server that you created.
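For the yum case, a complete repository stanza might look like the one written by the sketch below. The repository ID (updates-local), the name line, and the function name are assumptions made for illustration; only the baseurl comes from the example above. Keep whatever gpgkey line your distribution's original file already contains.

```shell
# Write a yum repository stanza that points at a local mirror.

# write_repo FILE BASEURL
write_repo() {
    cat > "$1" <<EOF
[updates-local]
name=Fedora Core updates (local mirror)
baseurl=$2
enabled=1
gpgcheck=1
EOF
}

# Example for an NFS-mounted or local directory:
# write_repo /etc/yum.repos.d/fedora-updates.repo \
#     file:///var/www/html/yum/Fedora/Core/updates/4/i386/
```

After writing the file, running yum check-update on the client is a low-risk way to confirm that the repository can be read, since it lists pending updates without installing them.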

11.1.8. Automate the Synchronization Process

When you're satisfied that the local update server meets your needs, you'll want to automate the synchronization process. To do so, insert the rsync command(s) that you used in a cron job file. If you had to create yum or apt database files, you'll want to add those commands described earlier to the cron job.

Even after the first time you create a mirror, the downloads for updates can be extensive. For example, updates to the OpenOffice.org suite alone can occupy several hundred megabytes.

Therefore, you'll want to schedule the cron job for a time when few or no other jobs are running. And that depends on the schedule of other cron jobs, as well as any other jobs (such as database processing) that may happen during off-hours.
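A minimal sketch of such a job, assuming the Fedora mirror described earlier, might look like this. The function name, the script path, and the 3:15 a.m. schedule in the comment are hypothetical; if your mirror already copies the repodata/ directory, the createrepo step can be dropped.

```shell
# Sketch of a nightly mirror job: synchronize, then rebuild yum metadata.
# Paths are taken from the Fedora example; adjust them to your own layout.

MIRROR_SRC="mirrors.kernel.org::fedora/core/updates/4/i386/."
MIRROR_DIR="/var/www/html/yum/Fedora/Core/updates/4/i386"

# sync_mirror SRC DIR: copy new files, then refresh the repository database.
# The database is rebuilt only if the transfer succeeded.
sync_mirror() {
    rsync -a "$1" "$2" || return 1
    createrepo "$2"
}

# In the script itself, the last line would be:
# sync_mirror "$MIRROR_SRC" "$MIRROR_DIR"
#
# A crontab entry such as the following (hypothetical script path) would
# run the job at 3:15 every morning:
# 15 3 * * * /usr/local/sbin/sync-mirror.sh
```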

11.1.9. Connecting Local Workstations

Once you've tested your local mirror, and then configured regular updates to that mirror, you're ready to connect your local workstations to it. You'll need to modify the same files as described earlier in the "Test a Local Update" section.

If you want to configure automatic updates on your workstations from your local repositories, you'll need to configure cron jobs on each host.

Remember, updates always carry some degree of risk. But when you update the system with the local repository, you're testing at least some of the updates. You have to decide if you want to do more testing or allow automatic updates to the production systems on your network. You can always create a script to log in to and update each of the production systems when you're ready.
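Such a script can be quite short. The following sketch loops over a list of hosts and runs an update command on each through ssh; the function name and hostnames are hypothetical, and it assumes key-based root logins are already configured.

```shell
# Log in to each production host in turn and apply updates.

# update_hosts UPDATE_COMMAND HOST...
update_hosts() {
    update_cmd=$1
    shift
    for host in "$@"; do
        echo "== $host"
        # -n keeps ssh from reading stdin; BatchMode avoids password prompts
        ssh -n -o BatchMode=yes "root@$host" "$update_cmd" \
            || echo "update failed on $host" >&2
    done
}

# Examples (hypothetical hostnames):
# update_hosts "yum -y update" ws1.example.com ws2.example.com
# update_hosts "apt-get update && apt-get -y dist-upgrade" deb1.example.com
```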


Some distributions support GUI configuration of automated updates; SUSE supports it directly via YaST (which is saved to /etc/cron.d/yast2-online-update).

If you've installed the latest version of yum on Fedora Core, there's a cron job already configured in /etc/cron.daily/yum.cron. To let it run, you'll need to activate the yum service in the /etc/init.d directory.

Creating an update script is a straightforward process, with the following general steps:

  1. Create a cron job in the appropriate directory. If you want a weekly update, add it to /etc/cron.weekly.

  2. Make sure the script checks for the latest version of the update-management command. For example, if you're updating with apt, make sure it's up-to-date with the following command:

     apt-get install apt 

    I use apt-get install and not apt-get upgrade, so I don't have to worry about pending updates to other packages. If the package is already installed, it is automatically upgraded.

  3. If you're running apt, you'll need to make sure the local cache of packages is up-to-date:

     apt-get update 

  4. Finally, apply the update command that you need, such as the following:

     apt-get dist-upgrade 
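Put together, the steps above might be sketched as the following script for an apt-based client. The file name in the comment, the function name, and the APT_GET variable are hypothetical. The -y switch is added because cron jobs cannot answer prompts. Note that this sketch refreshes the package lists before upgrading apt itself, so that the newest apt version is visible; that is a different order from the numbered steps.

```shell
# Sketch of /etc/cron.weekly/local-update (hypothetical file name) for an
# apt-based client.

# Allow the command to be overridden, for example when testing.
APT_GET=${APT_GET:-apt-get}

# run_update: stop at the first step that fails.
run_update() {
    $APT_GET update || return 1            # refresh the package lists
    $APT_GET -y install apt || return 1    # bring apt itself up to date
    $APT_GET -y dist-upgrade               # apply the remaining updates
}

# In the real cron script, the last line would simply be:
# run_update
```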



Linux Annoyances for Geeks
Linux Annoyances for Geeks: Getting the Most Flexible System in the World Just the Way You Want It
ISBN: 0596008015
Year: 2004
Pages: 144
Authors: Michael Jang
