15.6 Monitoring and Managing Disk Space Usage


This section looks at the tools available to monitor and track disk space usage. It then goes on to discuss ways of approaching a perennial administrative challenge: getting users to reduce their disk use.

15.6.1 Where Did It All Go?

The df -k command produces a report that describes all the filesystems, their total capacities, and the amount of free space available on each one (reporting sizes in KB). Here is the output from a Linux system:

File system    Kbytes   used     avail     capacity     Mounted on
/dev/sd0a      7608     6369     478       93%          /
/dev/sd0g      49155    45224    0         102%         /corp

This output reports the status of two filesystems: /dev/sd0a, the root disk, and /dev/sd0g, the disk mounted at /corp (containing all files and subdirectories underneath /corp). Each line of the report shows the filesystem's name, the total number of kilobytes on the disk, the number of kilobytes in use, the number of kilobytes available, the percentage of the filesystem's storage that is in use, and the mount point. It is evident that both filesystems are heavily used. In fact, the /corp filesystem appears to be overfull.

As we've noted earlier, the operating system generally holds back some amount of space in each filesystem, allocatable only by the superuser (usually 10%, although Linux uses 5% by default). A filesystem may appear to use over 100% of the available space when it has tapped into this reserve.
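Checking the capacity column by eye becomes tedious on systems with many filesystems. The following is a minimal sketch, not a definitive tool, of a filter that reads df output and prints only the filesystems at or above a given percentage. The function name full_filesystems and the 90% default are assumptions made for illustration, and the sketch assumes the six-column layout shown above (as produced, for example, by df -kP).

```shell
# Print every filesystem whose use percentage is at or above a threshold.
# Reads df-style output on standard input; the 90% default is an
# arbitrary value chosen for illustration.
full_filesystems() {
    threshold=${1:-90}
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            pct = $5                  # capacity column, e.g. "93%"
            sub(/%/, "", pct)
            if (pct + 0 >= limit)
                print $6, $5          # mount point and percentage
        }'
}

# Typical use:
#   df -kP | full_filesystems 90
```

Because the function reads standard input, it can also be run against a saved df report.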

The du -k command reports the amount of disk space used by all files and subdirectories underneath one or more specified directories, listed on a per-subdirectory basis (amounts are given in KB).

A typical du report looks like this:

$ du -k /home/chavez
50    /home/chavez/bin
114   /home/chavez/src
...
34823 /home/chavez

This report states that in the directory /home/chavez, the subdirectory bin occupies 50 KB of disk space, and the subdirectory src occupies 114 KB. Using the du command on users' home directories and on directories where ongoing development is taking place is one way to determine who is using the system's disk space.

The report from du can be inordinately long and tedious. By using the -s option, you eliminate most of the data; du -s reports the total amount of disk space that a directory and its contents occupy, but it does not report the storage requirements of each subdirectory. For example:

$ du -k -s /home/chavez
34823 /home/chavez

In many cases, this may be all the information you care about.

To generate a list of the system's directories in order of size, execute the command:

$ du -k / | sort -rn

This command starts at the root filesystem, lists the storage required for each directory, and pipes its output to sort. With the -rn options (reverse sort order, sort by numeric first field), sort orders these directories according to the amount of storage they occupy, placing the largest first.

If the directory specified as its parameter is large or has a large number of subdirectories, du can take quite a while to execute. It is thus a prime candidate for automation via scripts and after-hours execution via cron.
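As a rough sketch of such automation, the function below wraps the du and sort pipeline and trims the report to the largest few directories. The name largest_dirs, the default count of 20, and the script path and report file in the sample crontab line are all hypothetical.

```shell
# Print the N largest directories under a starting point, largest first.
# Sizes are in KB, as reported by du -k.
largest_dirs() {
    start=${1:-/}
    count=${2:-20}
    # -x keeps du on a single filesystem; error messages from
    # unreadable directories are discarded.
    du -kx "$start" 2>/dev/null | sort -rn | head -n "$count"
}

# A crontab entry along these lines (script path and report file are
# hypothetical) would produce the report at 2 A.M. every night:
#   0 2 * * * /usr/local/sbin/largest_dirs.sh /home 20 > /var/tmp/du_report
```

The first line of the output is always the starting directory itself, since its total includes everything beneath it.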

The quot command breaks down disk space usage within a single filesystem by user. This command is available on all of the systems we are considering except Linux.[28] quot has the following syntax:

[28] Linux does provide it for xfs filesystems.

# quot file-system

quot reports the number of kilobytes used by each user in the specified filesystem. It is run as root (to access the disk special files). Here's a typical example:

# quot /
/dev/sd0a (/):
6472   root
5234   bin
62     sys
2      adm

This report indicates that on the root disk, 6472 kilobytes are owned by the user root, 5234 kilobytes are owned by user bin, and so on. This command can help you spot users who are consuming excessive amounts of disk space, especially in areas other than their home directories. Like du, quot must access the entire disk and so can take an appreciable amount of time to execute.
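Where quot is not available, a similar per-owner breakdown can be approximated with find and awk. This is only a sketch under stated assumptions: the function name usage_by_owner is made up for illustration, the totals are file lengths as shown by ls -l rather than allocated disk blocks (so they will not match quot exactly), and owners are reported by numeric UID.

```shell
# Approximate a quot-style report for one directory tree: total size of
# regular files per owner, in KB, largest first. Owners are numeric UIDs,
# and sizes are file lengths rather than allocated blocks.
usage_by_owner() {
    find "$1" -xdev -type f -exec ls -ln {} + 2>/dev/null |
    awk '{ bytes[$3] += $5 }          # field 3 is the UID, field 5 the size
         END { for (uid in bytes) printf "%d %s\n", bytes[uid] / 1024, uid }' |
    sort -rn
}

# Typical use (as root, so that every directory is readable):
#   usage_by_owner /corp
```

Like du and quot, this must visit every file in the tree, so it can take a while on a large filesystem.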

15.6.2 Handling Disk Shortage Problems

The commands and scripts we've just looked at will let you know when you have a disk space shortage and where the available space went, but you'll still have to solve the problem and free up the needed space somehow. There is a large range of approaches to solving disk space problems, including the following:

  • Buy another disk. This is the ideal solution, but it's not always practical.

  • Mount a remote disk that has some free space on it. This solution assumes that such a disk is available, that mounting it on your system presents no security problems, and that adding additional data to it won't cause problems on its home system.

  • Eliminate unnecessary files. For example, in a pinch, you can remove the preformatted versions of the manual pages provided that the source files are also available on your system.

  • Compress large, infrequently accessed files.

  • Convince or cajole users into deleting unneeded files and backing up and then deleting old files they are no longer using. If you are successful, a great deal of free disk space usually results. At the same time, you should check the system for log files that can be reduced in size (discussed later in this section).

    When gentle pressure on users doesn't work, sometimes peer pressure will. The system administrator on one system I worked on used to mail a list of the top five "disk hogs" (essentially the output of the quot command) whenever disk space was short. I recommend this approach only if you have both a thick skin and a good-natured user community.

  • Some sites automatically archive and then delete user files that haven't been accessed in a certain period of time (often two or three months). If a user wants a file back, he can send a message to the system administration staff, who will restore it. This approach is the most brutal and should only be taken when absolutely necessary. It is fairly common in university environments, but rarely used elsewhere. It's also easy to circumvent by touching all your files every month, and performing system backups may also reset access times on inactive files.

These, then, are some of the alternatives.[29] In most cases, though, when you can't add any disks to the system, the most effective way to solve a disk space problem is to convince users to reduce their storage requirements by deleting old, useless, and seldom (if ever) used files (after backing them up first). Junk files abound on all systems. For example, many text editors create checkpoint and backup files as protection against a user error or a system failure. If these accumulate, they can consume a lot of disk space. In addition, users often keep many versions of files around (noticed most often in the case of program source files), frequently not even remembering what the differences are between them.

[29] There is another way to limit users' disk usage on some systems: disk quotas (discussed later in this section). However, quotas won't help you once the disks are already too full.

The system scratch directory /tmp also needs to be cleared out periodically (as well as any other directories serving a similar function). If your system doesn't get rebooted very often, you'll need to do this by hand. You should also keep an eye on the various system spooling directories under /usr/spool or /var/spool because files can often become stagnant there.

Unix itself has a number of accounting and logging files that, if left unattended, will grow without bound. As administrator, you are responsible for extracting the relevant data from these files periodically and then truncating them. We'll look at dealing with these sources of wasted space in the following sections.

Under some circumstances, a filesystem's performance can begin to degrade when it is more than 80%-90% full. Therefore, it is a good idea to take any corrective action before your filesystems reach this level, rather than waiting until they are completely full.

15.6.2.1 Using find to locate or remove wasted space

The find command may be used to locate potential candidates for archival and deletion (or just deletion) in the event of a disk space shortage. For example, the following command prints all files with names beginning with .BAK. or ending with a tilde, the formats for backup files from two popular text editors:

$ find / \( -name ".BAK.*" -o -name "*~" \) -print

As we've seen, find can also delete files automatically. For example, the following command deletes all editor backup files that have not been accessed in over a week:

# find / /bio /corp -atime +7 \( -name ".BAK.*" \
       -o -name "*~" \) -type f -xdev -exec rm -f {} \;

When using find for automatic deletion, it pays to be cautious. That is why the previous command includes the -type and -xdev options and lists each filesystem separately. With the cron facility, you can use find to produce a list of files subject to deletion nightly (or to delete them automatically).
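One cautious way to proceed is to generate the list first and review it before anything is removed. The sketch below does only the listing step; the function name stale_backups and the report filename in the usage comment are hypothetical, and the name patterns are simply the two used in the examples above.

```shell
# List (without deleting) editor backup files under each given directory
# that have not been accessed for more than the given number of days.
stale_backups() {
    days=$1
    shift
    # -xdev keeps find from crossing into other filesystems;
    # -type f restricts the match to regular files.
    find "$@" -xdev -type f -atime +"$days" \
        \( -name ".BAK.*" -o -name "*~" \) -print
}

# Typical use, reviewing the list before removing anything:
#   stale_backups 7 / /bio /corp > /var/tmp/stale_backups.list
```

Once the list has been checked, -print can be replaced with the -exec rm -f {} \; action from the earlier command.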

Another tactic is to search the filesystem for duplicate files. This will require writing a script, but you'll be amazed at how many you'll find.
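A minimal sketch of such a script follows. It treats files with the same cksum checksum and the same size as probable duplicates; this is an assumption of convenience, since matching checksums do not guarantee identical contents, so candidates should be confirmed with cmp before anything is deleted. The function name find_duplicates is hypothetical.

```shell
# Print groups of files under a directory that share the same cksum
# checksum and size. These are likely duplicates, not certain ones;
# confirm with cmp before deleting anything.
find_duplicates() {
    find "$1" -xdev -type f -exec cksum {} + 2>/dev/null |
    sort -k1,1n -k2,2n |
    awk '{
        key = $1 " " $2                   # checksum and byte count
        name = $0
        sub(/^[0-9]+ [0-9]+ /, "", name)  # the rest of the line is the path
        if (key == prev_key) {
            if (!in_group) { print prev_name; in_group = 1 }
            print name
        } else {
            if (in_group) print ""        # blank line between groups
            in_group = 0
        }
        prev_key = key
        prev_name = name
    }'
}

# Typical use:
#   find_duplicates /home/chavez
```

Reading every file to compute its checksum is slow on a large filesystem, so this is another job best left to cron.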

15.6.2.2 Limiting the growth of log files

The system administrator is responsible for reaping any data needed from log files and keeping them to a reasonable size. The major offenders include these files:

  • The various system log files in /usr/adm or /var/adm, which may include sulog, messages, and other files set up via /etc/syslog.conf.

  • Accounting files in /usr/adm or /var/adm, especially wtmp and acct (BSD) or pacct (System V). Also, under System V, the space consumed by the cumulative summary files and ASCII reports in /var/adm/acct/sum and /var/adm/acct/fiscal is worth monitoring.

  • Subsystem log files: many Unix facilities, such as cron, the mail system, and the printing system, keep their own log files.

  • Under AIX, the files smit.log and smit.script in users' home directories are appended to every time someone runs SMIT. They become large very quickly. You should watch the ones in your own and root's home directories (if you su to root, the files still go into your own home directory). Alternatively, you could run the smit command with the -l and -s options (which specify the log and script filenames respectively) and set both filenames to /dev/null. Defining an alias is the easy way to do so:

    alias smit="smit -l /dev/null -s /dev/null"       bash/ksh
    alias smit "smit -l /dev/null -s /dev/null"       csh/tcsh

There are several approaches to controlling the growth of system log files. The easiest is to truncate them by hand when they become large. This is advisable only for ASCII (text) log files. To reduce a file to zero length, use a command such as:

# cat /dev/null > /var/adm/sulog

Copying from the null device into the file is preferable to deleting the file, because in some cases the subsystem won't recreate the log file if it doesn't exist. It's also preferable to rm followed by touch because the file ownerships and permissions remain correct and also because it releases the disk space immediately.

To retain a small part of the current logging information, use tail, as in this example:

# cd /var/adm
# tail -100 sulog >tmp
# cat tmp > sulog
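The same steps can be wrapped in a small function for use from a script. This is a sketch only: the name trim_log and the default of 100 lines are assumptions, and any lines written to the log between the tail and the cat will be lost.

```shell
# Keep only the last N lines of a log file, rewriting it in place so that
# its ownership and permissions are unchanged.
trim_log() {
    log=$1
    lines=${2:-100}
    tmp=$(mktemp) || return 1
    # Copy the tail aside, then write it back over the original file.
    tail -n "$lines" "$log" > "$tmp" && cat "$tmp" > "$log"
    status=$?
    rm -f "$tmp"                      # clean up the temporary copy
    return "$status"
}

# Typical use:
#   trim_log /var/adm/sulog 100
```

Unlike the commands above, the function uses a temporary file created by mktemp and removes it afterward, so nothing is left behind in the log directory.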

A third approach is to keep several old versions of a log file on the system by periodically deleting the oldest one, renaming the current one, and then recreating it. This technique is described in Section 3.2.
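As a rough illustration of this third approach, the sketch below keeps a fixed number of numbered old versions. Note one deliberate variation from the description above: instead of renaming the current file and recreating it, it copies the file and then truncates the original, so the live log keeps its ownership and permissions. The function name rotate_log and the default of three old versions are assumptions.

```shell
# Rotate a log file, keeping a fixed number of old versions:
# log.1 is the most recent old copy, log.N the oldest.
rotate_log() {
    log=$1
    keep=${2:-3}
    rm -f "$log.$keep"                    # delete the oldest version
    i=$keep
    while [ "$i" -gt 1 ]; do              # shift log.1 to log.2, and so on
        prev=$((i - 1))
        if [ -f "$log.$prev" ]; then
            mv "$log.$prev" "$log.$i"
        fi
        i=$prev
    done
    if [ -f "$log" ]; then
        cp -p "$log" "$log.1"             # save the current contents
    fi
    : > "$log"                            # empty (or create) the current log
}

# Typical use, for example from a weekly cron job:
#   rotate_log /var/adm/messages 3
```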

AIX provides the skulker script (stored in /usr/sbin) to perform some of these filesystem cleanup operations, including the following:

  • Clearing the queueing system spooling areas of old, junk files.

  • Clearing /tmp and /var/tmp of all files over one day old.

  • Deleting old news files (over 45 days old).

  • Deleting a variety of editor backup files, core dump files, and random executables (named a.out). You may want to add to the list of file types.

The system comes set up to run skulker every day at 3 A.M. via cron, but the crontab entry is commented out. If you want to run skulker, you'll need to remove the comment character from the skulker line in root's crontab file.

15.6.3 Controlling Disk Usage with Disk Quotas

Disk space shortages are a perennial problem on all computers. For systems where direct control over how much disk space each user uses is essential, disk quotas may provide a solution.

The disk quota system allows an administrator to limit the amount of filesystem storage that any user can consume. If quotas are enabled, the operating system will maintain separate quotas for each user's disk space and inode consumption (equivalent to the total number of files he owns) on each filesystem.

There are two distinct kinds of quota: a hard limit and a soft limit. A user is never allowed to exceed his hard limit, under any circumstances. When a user reaches his hard limit, he'll get a message that he has exceeded his quota, and the operating system will refuse to allocate any more storage. A user may exceed the soft limit for a limited period of time; in such cases, he gets a warning message, and the operating system grants the request for additional storage. If his disk usage still exceeds this soft limit at the next login, the message will be repeated. He'll continue to receive warnings at each successive login until either:

  • He reduces his disk usage to below the soft limit, or

  • He's been warned a fixed number of times (or for a specified period of time, depending on the implementation). At this point, the operating system will refuse to allocate any more storage until the user deletes enough files that his disk usage again falls below his soft limit.

The disk quota system has been designed to let users have large temporary files, provided that in the long term, they obey a much stricter limit. For example, consider a user with a hard limit of 15,000 blocks and a soft limit of 10,000 blocks. If this user's storage ever exceeds 15,000 blocks, the operating system will refuse to allocate any more storage immediately; he will need to free some storage before he can save any more files. If this user's storage exceeds 10,000 blocks, he'll get a warning but requests for more disk space will still be honored. However, if this user does not reduce his storage below 10,000 blocks, the operating system will eventually refuse to allocate any additional storage until it does fall below 10,000 blocks.
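The decision just described can be modeled in a few lines of shell. This is purely an illustration of the logic, not how any kernel implements it; the function name quota_decision, its argument order, and the wording of its messages are all invented for the example. A limit of 0 is treated as "no limit".

```shell
# Decide how a request for more disk blocks would be treated.
# Arguments: blocks in use, blocks requested, soft limit, hard limit, and
# a flag (1 or 0) saying whether the soft-limit grace period has expired.
# A limit of 0 means that limit is not set.
quota_decision() {
    used=$1 request=$2 soft=$3 hard=$4 expired=$5
    total=$((used + request))
    if [ "$hard" -gt 0 ] && [ "$total" -gt "$hard" ]; then
        echo "denied: hard limit"
    elif [ "$soft" -gt 0 ] && [ "$total" -gt "$soft" ]; then
        if [ "$expired" -eq 1 ]; then
            echo "denied: grace period expired"
        else
            echo "granted with warning"
        fi
    else
        echo "granted"
    fi
}

# With the limits from the example above (soft 10,000, hard 15,000):
#   quota_decision 9000 2000 10000 15000 0     # granted with warning
#   quota_decision 14000 2000 10000 15000 0    # denied: hard limit
```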

If you decide to implement a quota system, you must determine which filesystems need quotas. In most situations, the filesystems containing user home directories are appropriate candidates for quotas. Filesystems that are reserved for public files (for example, the root filesystem) probably shouldn't use quotas. The /tmp filesystem doesn't usually have quotas because it's designed to provide temporary scratch space.

Many operating systems require quotas to be enabled in the kernel, and many kernels do not include them by default. Check your kernel configuration before attempting to use quotas.

15.6.3.1 Preparing filesystems for quotas

After deciding which filesystems will have quotas, you'll need to edit the options field of their entries in the filesystem configuration file (usually /etc/fstab) to indicate that quotas are in use, as in these examples:[30]

[30] There are two versions of the Linux disk quota facility. This discussion describes Version 1 because Version 2 is relatively new.

FreeBSD
/dev/ad1s1a      /1   ufs       rw,userquota      1 1
Linux
/dev/sdb2        /1   reiserfs  usrquota,grpquota 1 1
HP-UX
/dev/vg01/lvol3  /1   vxfs      rw,quota          0 1
Tru64
chem_domain#one  /1   advfs     rq                0 1
Solaris
/dev/dsk/c0t3d0s0 ...  /1  ufs   2   yes    rw,logging,quota

See Section 10.2 for full details on the filesystem configuration file on the various systems.

On AIX systems, add a line like the following to the filesystem's stanza in /etc/filesystems:

quota = userquota,groupquota

Include the userquota keyword for standard disk quotas and the groupquota keyword for group-based disk quotas (described in the final part of this section).

Next, make sure that there is a file named quotas in the top-level directory of each filesystem for which you want to establish quotas. If the file does not exist, create it with the touch command:[31]

[31] This is not always required by recent quota system implementations, but it won't hurt either.

# cd /chem
# touch quotas
# chmod 600 quotas

The file must be writable by root and no one else.

15.6.3.2 Setting users' quota limits

Use the edquota command to establish filesystem quotas for individual users. It can be invoked to edit the quotas for one or more users:

# edquota username(s)

When you execute this command, edquota creates a temporary file containing the hard and soft limits on each filesystem for each user. After creating the file, edquota invokes an editor so you can modify it (by default, vi; you can use the environment variable EDITOR to specify your favorite editor). Each line in this file describes one filesystem. The format varies somewhat; here is an example:

/chem: blocks in use: 13420, limits (soft=20000, hard=30000)
        inodes in use: 824, limits (soft=0, hard=0)

This entry specifies quotas for the /chem filesystem; by editing it, you can add hard and soft limits for this user's total disk space and inode space (total number of files). Setting a quota to 0 disables that quota. The example specifies a soft quota of 20,000 disk blocks, a hard quota of 30,000 disk blocks, and no quotas on inodes. Note that the entry in the temporary file does not indicate anything about the user(s) to which these quotas apply; quotas apply to the user specified when you execute the edquota command. When you list more than one user on the command line, you will edit a file for each one of them in turn.

After you save the temporary quota file and exit the editor (using whatever commands are appropriate for the editor you are using), edquota modifies the quotas files themselves. These files cannot be edited directly.

The -p option to edquota lets you copy quota settings between users. For example, the following command applies chavez's quota settings to users wang and harvey:

# edquota -p chavez wang harvey

15.6.3.3 Setting the soft limit expiration period

edquota's -t option is used to specify the system-wide time limit for soft quotas. Executing edquota -t also starts an editor session something like this one:

Time units may be: days, hours, minutes, or seconds
Grace period before enforcing soft limits for groups:
/chem: block grace period: 3 days, file grace period: 0 days

A value of zero days indicates the default value is in effect (usually seven days). You can specify the time period in other units by changing days to one of the other listed keywords. Some implementations allow you to specify the grace period in months as well, but then one would have to start to wonder what the point of using disk quotas was in the first place.

15.6.3.4 Enabling quota checking

The quotaon command is used to activate the quota system and enable quota checking:

# quotaon filesystem
# quotaon -a

The first command enables the quota system for the specified filesystem. The latter enables quotas on all filesystems listed with quotas in the filesystem configuration file. For example, the following command enables quotas for the /chem filesystem:

# quotaon /chem

Similarly, the command quotaoff disables quotas. It can be used with the -a option to disable all quotas, or with a list of filesystem names.

15.6.3.5 Quota consistency checking

The quotacheck command checks the consistency of the quotas file for the filesystem specified as its argument. It verifies that the quota files are consistent with current actual disk usage. This command should be executed after you install or modify the quota system. If used with the option -a, quotacheck checks all filesystems designated as using quotas in the filesystem configuration file.

quotacheck -a and quotaon -a also need to be run at boot time (in this order). You may need to add them to one of the system boot scripts on AIX systems. The other Unix versions run them automatically, via these boot scripts:

FreeBSD    /etc/rc (if check_quotas="yes" in /etc/rc.conf)
HP-UX      /sbin/init.d/localmount
Linux      /etc/init.d/quota (SuSE 7: if START_QUOTA="yes" in /etc/rc.config)
Solaris    /etc/init.d/MOUNTFS and ufs_quota
Tru64      /sbin/init.d/quota (if QUOTA_CONFIG="yes" in /etc/rc.config)

15.6.3.6 Disk quota reports

The repquota command reports the current quotas for one or more specified filesystem(s). Here is an example of the reports generated by repquota:

# repquota -v /chem
*** Report for user quotas on /chem (/dev/sd1d)
                     Block limits                         File limits
User          used     soft     hard     grace     used    soft    hard    grace
chavez  --   13420    20000    25000                824       0       0
chen    +-    2436     2000     3000     2days        8       0       0

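The plus sign in the entry for user chen indicates that he has exceeded his disk quota: he is using 2436 blocks against a soft limit of 2000, and the grace column shows the time remaining before that limit is enforced.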
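On a system with many users, it can be handy to pull just the flagged users out of the report. The filter below is a sketch that assumes the report layout shown above, in which the second field holds two flag characters and a plus sign marks an exceeded limit; other repquota implementations may format their output differently. The function name over_quota_users is hypothetical.

```shell
# List users flagged as over quota in repquota output read from standard
# input. Assumes the second field is a pair of flag characters in which
# a plus sign marks an exceeded limit.
over_quota_users() {
    awk '$2 ~ /^[-+][-+]$/ && $2 ~ /[+]/ { print $1 }'
}

# Typical use (as root):
#   repquota /chem | over_quota_users
```

Header lines are skipped automatically because their second field is not a pair of flag characters.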

Users can use the quota command to determine where their current disk usage falls with respect to their disk quotas.

15.6.3.7 Group-based quotas (AIX, FreeBSD, Tru64 and Linux)

AIX, FreeBSD, Tru64, and Linux extend standard disk quotas to Unix groups as well as individual users. Specifying the -g option to edquota causes names on the command line to be interpreted as group names rather than as usernames. Similarly, edquota -t -g allows you to specify the soft limit timeout period for group quotas.

By default, the quotaon, quotaoff, quotacheck, and repquota commands operate on both user and group quotas. You can specify the -u and -g options to limit their scope to only user quotas or only group quotas, respectively. Users must use the following form of the quota command to determine the current status of group quotas:

$ quota -g chem

This command reports the disk quota status for group chem. Users may query the disk quota status only for groups of which they are a member.



Essential System Administration, Third Edition
ISBN: 0596003439
Year: 2002
Pages: 162
