Troubleshooting Tips

Not that anything ever goes wrong with your system, but when something does go wrong, there are some things you can try in order to figure out what has happened.

Troubleshooting problems in a Unix system is similar in many ways to troubleshooting on any system: You start by comparing the symptoms of the problem with the patient's medical history. When did the problem start? Oh, right after you installed the system-configuration files you were up all night editing? Hmmm. Maybe that's a clue to the problem . . .

Using the system log files

The system log files (described earlier in this chapter) often have an error message related to the problem you are experiencing. Usually you won't understand the error message, but don't stop there. You can search the Web for information regarding the exact error message you are seeing.

To search the Web for an error message:

1.
Copy whatever seems to be the most descriptive part of the error message.

2.
Use your favorite Web search engine to search for the error message.

This usually means enclosing all the words in quotes, for example, "DNSAgent: dns_send_query_server - timeout"

3.
Consider adding "Mac OS X" or "Darwin" as a separate search string.

For example, using the Google search engine, "DNSAgent: dns_send_query_server - timeout" + "Mac OS X" limits the search to pages that contain both of the phrases enclosed in quotes. (We found five pages with that search.)
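Before you search the Web, it can help to pull the exact message text out of the log so that you can copy it cleanly. The following is a minimal sketch, not a definitive tool: the function name search_log is made up for illustration, and the default log path /var/log/system.log is an assumption about where your system log lives.

```shell
# search_log PATTERN [FILE]
# Print every line of FILE that contains PATTERN (case-insensitive,
# treated as a fixed string rather than a regular expression).
# The default path is an assumption; pass your own log file if it differs.
search_log() {
    pattern=$1
    file=${2:-/var/log/system.log}
    grep -iF -- "$pattern" "$file"
}

# Example (not run here):
#   search_log "dns_send_query_server - timeout"
```

Reading some log files requires administrator privileges, so you may need to run the grep command with sudo.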

Permission problems

If you are getting an error that includes the phrase "Permission denied" or something similar, it's a sign that you have a permission problem somewhere, a common problem in Unix. Permission problems crop up because a program might not be able to write to a directory or file it expects to, or because it might not be able to read a file and thus is missing some configuration information.

Tracking down permission problems, like much computer troubleshooting, requires that you think like the machine. Remember that in order to create a file, a process must have write permission for the directory containing the file (because the filename is an entry in the directory), while in order to change a file, you must have write permission on the file itself.

One quick thing to try is the "Repair Permissions" feature in the GUI application Disk Utility (located in the Utilities folder of the Applications folder). It will restore the permissions on many system files to their Apple-supplied defaults. Review Chapter 8 for details on permissions.
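The create-versus-change rule described above can be checked from the command line. Here is a rough sketch, assuming you run it as the same user that is hitting the error; the function name explain_write_access is hypothetical, and the test only looks at write permission as seen by the current user.

```shell
# explain_write_access PATH
# Report whether the current user could create files in PATH's
# parent directory and, if PATH already exists, whether it could
# change PATH itself.
explain_write_access() {
    target=$1
    parent=$(dirname "$target")

    # Creating a file needs write permission on the directory.
    if [ -w "$parent" ]; then
        echo "can create files in $parent"
    else
        echo "cannot create files in $parent"
    fi

    # Changing an existing file needs write permission on the file.
    if [ -e "$target" ]; then
        if [ -w "$target" ]; then
            echo "can change $target"
        else
            echo "cannot change $target"
        fi
    fi
}
```

If the answer surprises you, ls -ld on the file and on its directory shows the owner, group, and permission bits behind it.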

Dealing with "device full" problems

Another problem you are likely to run into sooner or later is when a disk fills up.

If you see an error message that says, "Write Error: No space left on device," you've filled up a disk volume; that is, you've used up all the available storage space. (Note that in Unix documentation the terms volume and partition are often used interchangeably, for example, in the man pages for df and diskutil.)

Although this doesn't happen every day (hopefully!), the consequences can be pretty harsh: Some programs may simply stop working. For example, a mail server cannot save incoming mail if there is no disk space left.

You can quickly see if you are running out of disk space by using the df command (described in "To see a summary of disk usage for the entire system," earlier in this chapter). Figure 11.39 shows an example of a machine with two disks, one of which has two volumes (also called partitions). In the example, volume s9 on disk0 is almost full.

Figure 11.39. Example of output from df -lk showing two disks with a total of three volumes (also called partitions). Partition s9 on disk0 is almost full.

If you see that any of the regular volumes are over 90 percent capacity, it's time to start worrying. By "regular volumes," we mean the ones where the filesystem column in df starts with /dev/disk. Remember that df displays information about various pseudovolumes that always show up at 100 percent capacity, for example, the fdesc (file descriptor) filesystem, which is used to keep track of open files.

On systems where df can report more than 100 percent, this is possible because the operating system keeps a small amount of disk space in reserve to reduce the chance of a volume's filling up. If you see that a disk volume is at 101 percent capacity, then you have a problem now.
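If you would rather not eyeball the Capacity column, the 90 percent check can be sketched as a small filter. This is only an illustration under stated assumptions: the function name full_volumes is made up, it expects the six-column POSIX layout that df -kP prints (not the exact df -lk layout shown in the figure), and it shows only the first word of a mount point that contains spaces.

```shell
# full_volumes [LIMIT]
# Read "df -kP" style output on standard input and print the device,
# capacity, and mount point of every /dev/ volume whose capacity is
# over LIMIT percent (default 90). Pseudovolumes such as fdesc are
# skipped because their device names do not start with /dev/.
full_volumes() {
    limit=${1:-90}
    awk -v limit="$limit" '
        NR > 1 && $1 ~ /^\/dev\// {
            pct = $5
            sub(/%/, "", pct)          # "95%" becomes "95"
            if (pct + 0 > limit + 0) print $1, $5, $6
        }'
}

# Example (not run here):
#   df -kP | full_volumes 90
```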

Basic steps to free up space on a disk volume:

  • Delete unneeded files on the critical volume. Remember that using the Trash does not delete files; it moves them. To actually delete the files, use rm and/or empty the Trash. See "To clear out users' Trash for them," below.

  • Move one or more directories to a different volume and put a symbolic link where the directory used to be. See "To move directories from one disk to another while keeping the original path," below.

  • Add more disks, and then move directories to the new disks. In Mac OS X, this is as easy as it has always been on the Mac. With external FireWire and USB hard drives, you don't even need to shut down. Note that Mac OS X automatically makes any disks you add show up in /Volumes.

On almost all other Unix systems, adding a disk is more complicated, but you can mount new disks on any directory. For example, you could mount a new disk on /Users (though you would still have to copy the old contents onto the new disk).
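To decide what to delete or move, it helps to know which directories are taking up the most room. The sketch below uses the standard du, sort, and head commands; the function name largest_dirs is hypothetical, and the approach is one reasonable way to do it rather than the only one.

```shell
# largest_dirs [DIR] [COUNT]
# Print the COUNT largest entries directly inside DIR (default: the
# current directory, 10 entries), biggest first, sizes in kilobytes.
largest_dirs() {
    dir=${1:-.}
    count=${2:-10}
    # Errors (for example, unreadable subdirectories) are discarded.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Example (not run here):
#   largest_dirs /Users 5
```

Run it with sudo if you need to measure directories that your own account cannot read.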

To clear out users' Trash for them:

1.
Find each user's Trash directory.

Each user has a .Trash directory in his or her home directory.

If the volume that's filling up is the one that holds users' home directories, go into each user's home directory and delete his or her .Trash directory (it is re-created when the user needs it).

You'll need to do the next step for each user.

2.
sudo rm -rf ~username/.Trash

That removes an entire .Trash directory. (In case you are wondering, the .Trash directory will be re-created when needed; also, using rm -rf ~username/.Trash/* will remove the contents only, but will miss deleting dot-files at the top level of the .Trash directory.)

If you use the Finder to trash a file that is on a different volume than your home directory, then instead of going into ~/.Trash, that file goes into a different Trash directory.

There are directories at the root level of the directory on which each volume is mounted. Huh, you say? Here is an example:

Let's say you have three disk volumes. Perhaps you have two disks, and one of them has two partitions, so you have a total of three disk volumes. Your df output might look like that shown in Figure 11.39. (Note the use of the -lk options to show only local volumes, and the sizes in kilobytes. Note too that df will show only volumes, not the Trash files themselves.)

In that case, there are three directories, each called .Trashes:

 /.Trashes
 /Volumes/partition2/.Trashes
 /Volumes/flamepit/.Trashes

Each of the directories has subdirectories for each user ID that has trashed files from that volume. So if user puffball has uid 502, then there might be

 /Users/puffball/.Trash
 /Volumes/partition2/.Trashes/502
 /Volumes/flamepit/.Trashes/502

You want to remove the .Trashes directory from the critical volume (don't worry, it will be re-created when needed).

3.
sudo rm -rf /volume-in-trouble/.Trashes

To remove the .Trashes directory for the volume mounted on /Volumes/partition2:

 sudo rm -rf /Volumes/partition2/.Trashes
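Before deleting anything, you may want to see how much space each user's Trash is actually holding. The following non-destructive sketch only reports sizes; the function name trash_usage is made up for illustration, and it assumes home directories sit directly under one parent directory such as /Users.

```shell
# trash_usage [HOME_ROOT]
# For every home directory under HOME_ROOT (default /Users), print
# the size in kilobytes of its .Trash directory, if it has one.
# Nothing is deleted.
trash_usage() {
    home_root=${1:-/Users}
    for dir in "$home_root"/*/.Trash; do
        # Skip the unexpanded pattern when no .Trash directories exist.
        [ -d "$dir" ] || continue
        du -sk "$dir"
    done
}

# Example (not run here; other users' Trash is usually unreadable
# without administrator privileges):
#   sudo du -sk /Users/*/.Trash
```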

Going Over 100 Percent

On some Unix systems, df may show you that a filesystem (or volume) is over 100 percent capacity.

On those systems, the "used" and "available" columns in df add up to something less than the column showing the total capacity. On Mac OS X the numbers add up exactly.


If you have one volume that is filling up and another with more space (perhaps you've added a second disk), you can move directories from the full volume to the spacious one and replace the original directory with a symbolic link.

To move directories from one disk to another while keeping the original path:

1.
Use ditto to copy the directory.

For example, if you want to move the /Users directory to the volume mounted on /Volumes/partition2, you can use

 sudo ditto -rsrc /Users /Volumes/partition2/Users

2.
mv olddir olddir.save

For example:

mv /Users /Users.save

You'll delete it later after making sure that everything is OK.

3.
Create a symbolic link where the old directory was, pointing to the new directory.

For example:

 ln -s /Volumes/partition2/Users /Users

So anything that accesses /Users still works.

4.
When everything seems OK, delete the old directory.

For example:

rm -rf /Users.save
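The first three steps above can be sketched as a single function. Treat this as an illustration under stated assumptions, not a finished tool: the name relocate_dir is hypothetical, and it copies with cp -Rp instead of ditto so that it also runs on systems without ditto (on Mac OS X you would probably keep ditto, as in step 1). It deliberately stops before step 4, leaving the .save copy for you to delete by hand.

```shell
# relocate_dir SRC DEST
# Copy directory SRC to DEST, rename SRC to SRC.save, and leave a
# symbolic link at SRC that points to DEST. DEST should be an
# absolute path so that the link resolves from anywhere.
relocate_dir() {
    src=$1
    dest=$2

    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "destination already exists: $dest" >&2
        return 1
    fi

    cp -Rp "$src" "$dest" || return 1     # step 1: copy (stand-in for ditto)
    mv "$src" "$src.save" || return 1     # step 2: keep the original as a backup
    ln -s "$dest" "$src"                  # step 3: link the old path to the new copy
}

# Example (not run here; system directories need administrator privileges):
#   relocate_dir /Users /Volumes/partition2/Users
```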

Problems booting up

In the unlikely and scary event that your machine won't completely boot up, you may still be able to get things working again, assuming that the machine can at least begin the boot process.

If you are able to boot into single-user mode, then you can attempt to repair file system damage using the fsck (file system check) command.

To watch all the system-startup messages:

  • Hold down both Command and V while booting.

    This is called "booting in verbose mode."

    In a normal, healthy boot-up, you see a great deal of text messages scrolling across the screen as the system goes through the boot-up process.

    It is very important to learn what a healthy boot-up looks like so that you'll notice any problems when they occur. Try it a couple of times and watch what a normal startup looks like. Booting in verbose mode won't directly fix anything, but you may be able to see what's going wrong, or at least copy a message off the screen to give to someone else to assist in troubleshooting.

The last resort

In most cases your disks will be using a journaling file system, which makes data corruption extremely unlikely. You can see a list of your volumes with

diskutil list

and see what kind of file system a volume has with

diskutil info volume

For example:

diskutil info /dev/disk0s2

If the output includes Journaled HFS+, then the volume has a journaled file system and the following tasks will probably have no effect.
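If you check several volumes, you can let the shell look for that string for you. This is a minimal sketch: the function name is_journaled is made up, and it simply tests whether the text "Journaled HFS+" appears anywhere in whatever diskutil info printed, which is the only detail of the output it relies on.

```shell
# is_journaled
# Read the output of "diskutil info VOLUME" on standard input and
# succeed (exit status 0) if it mentions "Journaled HFS+".
is_journaled() {
    # -F treats the pattern as a fixed string, so "+" is literal.
    grep -qF 'Journaled HFS+'
}

# Example (not run here; diskutil exists only on Mac OS X):
#   if diskutil info /dev/disk0s2 | is_journaled; then
#       echo "journaled"
#   else
#       echo "not journaled"
#   fi
```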

This next task is useful mainly if your disk(s) are not formatted with a journaling file system, and should be considered a last resort. Do not attempt it unless you are willing to risk losing data (maybe even all the data on your disk) and you have tried all other available approaches.

To check and repair the file system with fsck:

1.
Boot into single-user mode.

You do this by holding down both Command and S while the machine starts up.

If the system isn't too badly messed up, you end up at a prompt like this:

localhost#

Your next move is to try to check and repair the file system.

2.
/sbin/fsck -fy

This is basically the command-line version of the repair feature in Disk Utility.

See the man page for fsck to learn more about that command.

When you get back to a prompt, you can try mounting the root volume and booting the machine.

When you get back to the prompt, run fsck again to make sure that the repairs were effective.

Getting More Help

There are plenty of places to get deeper into Unix system administration. Here are a few:

The official Apple documentation for the Darwin layer of Mac OS X can be found at http://developer.apple.com/darwin. Apple offers training and certification for Mac OS X; see http://train.apple.com/.

Another useful set of documentation from Apple is oriented toward developers (programmers) but contains much information of interest to anyone wanting to dig deeper into Mac OS X: http://developer.apple.com/techpubs/macosx/Essentials/SystemOverview.

The Mac OS X Hints Web site (www.macosxhints.com) is a wonderful user-supported site run by Rob Griffiths. It is basically a big bulletin board for Mac OS X information. It's free, but you can make a donation to support it.

Two valuable books in the Unix system-administration world are Essential System Administration, Third Edition, by Æleen Frisch (O'Reilly; www.oreilly.com/catalog/esa3); and Unix System Administration Handbook, 3rd Edition, by Evi Nemeth, Garth Snyder, Scott Seebass, and Trent R. Hein (Admin.com; www.admin.com).


3.
/sbin/fsck -fy

If you get a message saying that your disk "appears to be OK," then fsck worked.

  • If it didn't work, try booting from an external FireWire drive if available, or the installation DVD or CD. You may then be able to run Terminal and possibly view the damaged volumes. You will probably need help from an experienced Unix administrator to mount the damaged volume(s) and fix them. Contacting Apple for assistance isn't a bad idea, either.

  • If it worked (or even if it didn't), go ahead and reboot the machine.

4.
reboot

Hopefully, the machine starts up and all is well.

Tip

  • In /Library/Logs you may find log files that have some record of things that have gone wrong. Look for files with names such as panic.log and a CrashReporter subdirectory. There may also be CrashReporter subdirectories in any users' own Library/Logs directories, that is, /Users/username/Library/Logs/CrashReporter/.




Unix for Mac OS X 10.4 Tiger: Visual QuickPro Guide (2nd Edition)
ISBN: 0321246683
EAN: 2147483647
Year: 2004
Pages: 161
Authors: Matisse Enzer
