Troubleshooting a Network


The following figure shows a network that has several problems: Computers on the network can't access the Internet; the domain name system (DNS) server is down; some computers can't see the other computers on the network because they are getting their Internet Protocol (IP) addresses and DNS information from an unsanctioned Dynamic Host Configuration Protocol (DHCP) server; and a Mac OS X computer (in this case, the iMac) has some services like File Transfer Protocol (FTP) and remote login turned on. In this lesson, you'll learn how to troubleshoot these issues.

Establishing a Methodology

The following flowchart, which provides a framework for the network troubleshooting process, is a condensed version of the Apple General Troubleshooting Flowchart.

Gather Information

The first step in this process is to gather information about the issue. You're trying to establish its exact nature by getting as much information as possible. For example, you may find that the symptom the end user is reporting has stopped the user but has nothing to do with the underlying issue. Initial reports may be misleading. "I can't connect to the Internet" is meaningless until you have more information.

To ensure that you have the best possible understanding of the report, ask a mix of open-ended and yes/no questions. Keep in mind that your end user may not have an understanding of networking in general and almost certainly does not know your network architecture. The following questions are useful:

  • Did things work at one point but suddenly stop working, or is this the first time you've tried to do this?

  • Do you know if anything changed recently on your system or in your settings? If so, what changed?

  • Have you installed all current updates from Software Update (particularly any security updates)?

  • When did the issue first appear?

  • Is anyone else in the area having a similar issue?

  • Is the issue constant or intermittent?

  • Does it occur in only one application?

  • Does it persist if you restart?

You should resist jumping to conclusions or making suggestions based only on the answer to one or two questions. While these suggestions might keep your users at bay for a short time or even temporarily cure the symptom, you still have not identified the cause.

When you are gathering information, don't hesitate to request logs or System Profiler reports. You can also log in to the remote computer to view relevant log entries or run System Profiler remotely.

Verify the Issue

The next step is to verify the issue. Ask yourself if you recognize the issue, log in to the remote computer, and try to reproduce it there. Walk your end user through the process and see if you can identify where the issue recurs. Use Apple Knowledge Base documents at www.apple.com/support as a reference.

When you have completed the information-gathering and verification steps, you should have enough information to try a fix. Evaluate the nature of the issue: Is it local to this machine, specific to the network, or specific to a particular server?

Note

Fixing the issue may involve network configuration on servers that you do not control, so you'll want to discuss the issue with other system administrators in your organization.


When you are ready to try a fix, start by isolating as much as possible. Eliminate possible sources. Narrow your scope from general topics ("the network is slow") to specifics ("the network is slow when browsing specific websites using specific machines"). Often the answer will reveal itself without your having to make major changes to the network. In any case, before making any changes, consult with your network architect or a senior system administrator to double-check your reasoning.

Fix the Issue

Once you have established and verified the issue and have a solution in mind, apply the appropriate fix. Evaluate the fix to see whether it resolves the issue, and pay special attention to ensure that you have not introduced network instability or new issues for other end users. Give yourself a time frame for evaluating the results: In most cases, if the issue goes away for more than 24 hours, it is resolved.

Finally, if you reach the point where you have evaluated several solutions and none of your fixes have worked, reevaluate your reasoning. If you can't find a flaw in your approach, or you don't find a fresh approach, escalate the issue to a senior system administrator or your network architect.

Troubleshooting Network Access

When a computer cannot access other computers on the network, first check the physical connection. Many network problems stem from loose or incorrectly wired cables. To thoroughly check the physical connection between two machines, you may need to check a series of switches for activation lights.

If the physical connection is active and you are working in a DHCP network environment, see whether the computer received a valid IP address and subnet mask from the DHCP server. Also check whether the computer can use Bonjour connections to servers.
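As a minimal sketch of the address check described above: when a Mac OS X computer gets no answer from a DHCP server, it assigns itself a link-local address in the 169.254.x.x range, so seeing that range usually means the DHCP exchange failed. The function name classify_ip and the interface name en0 are assumptions made for illustration.

```shell
# Classify an IPv4 address passed as the first argument.
# Prints "self-assigned" for 169.254.x.x (link-local), "none" if the
# argument is empty, and "assigned" for anything else.
classify_ip() {
    case "$1" in
        169.254.*) echo "self-assigned" ;;
        "")        echo "none" ;;
        *)         echo "assigned" ;;
    esac
}

# Example on Mac OS X (en0 is an assumed interface name):
#   classify_ip "$(ipconfig getifaddr en0)"
```

A result of "assigned" only shows that some address is present; you still need to confirm that the address and subnet mask match what your sanctioned DHCP server hands out.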

You can also use several command-line tools to troubleshoot connectivity, as illustrated in the following figure.

Here are detailed descriptions of the command-line tools you can use to troubleshoot connectivity:

  • ping: Use this command to send packets to hosts on the network and receive an answer in response. If you ping a host and don't receive packets back, ping another host that you know is up and running to determine whether the issue is limited to the host in question. When using ping, remember that firewalls might not allow the traffic that is generated by the ping command. If you suspect a firewall is causing a computer not to answer, use telnet or another command that may penetrate firewalls.

  • traceroute: Use this command to determine whether you have a routing problem. You trace the route to a particular host to determine at what point the route stops functioning. In addition, use this command to determine whether you have a routing loop that sends packets back toward the original host. You can also gauge congestion on the network by looking at how long it takes packets to get through each hop.

  • arp: Use this command to show which hosts in the local subnet are known and verify that there are no duplicate entries. The arp -a command displays a table that lists the addresses of the computers on your local subnet and their corresponding MAC addresses. If the arp output lists the other systems on your network, it means that your network card is functioning properly but you might have a routing problem.

  • netstat -nr: Use this command to display the routing table. You want to make sure that the default destination defined in your routing table points to a gateway and that the gateway is accessible.

    In the following example, the default destination points to the 192.168.1.1 gateway:

    > netstat -nr
    Routing tables
    Internet:
    Destination    Gateway        Flags    Refs    Use    Netif  Expire
    default        192.168.1.1    UGSc     9       3      en0

  • scutil and ipconfig: Use these commands to show the state of your network configuration. You can also examine NetworkInterfaces.xml to look for any discrepancies. If you suspect that the source of the issue is an unsanctioned DHCP server, you can use the ipconfig getpacket interface command (where interface is your network interface, such as en0) to display the last packet received from the DHCP server on your computer.
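The unsanctioned-DHCP check from the last item can be sketched as a small filter over the ipconfig getpacket output. This is an illustration under assumptions: it expects the packet dump to contain a line of the form "server_identifier (ip): address", and the function names, the interface en0, and the address 192.168.1.1 are hypothetical.

```shell
# Print the DHCP server address found in `ipconfig getpacket` output
# read from stdin (assumes a "server_identifier (ip): a.b.c.d" line).
dhcp_server_from_packet() {
    awk '/^server_identifier/ { print $NF; exit }'
}

# Compare the server that answered against the sanctioned server,
# given as the first argument. Reads the packet dump from stdin.
check_dhcp_server() {
    expected="$1"
    actual=$(dhcp_server_from_packet)
    if [ -z "$actual" ]; then
        echo "no DHCP server found in packet"
    elif [ "$actual" = "$expected" ]; then
        echo "sanctioned: $actual"
    else
        echo "UNSANCTIONED: $actual (expected $expected)"
    fi
}

# Example on Mac OS X (interface and address are assumptions):
#   ipconfig getpacket en0 | check_dhcp_server 192.168.1.1
```

Running the same check on several affected computers shows whether they all received their lease from the same unexpected server.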

Note

Some sites do not allow Internet Control Message Protocol (ICMP) traffic. This can hamper the troubleshooting effectiveness of ping and traceroute on those networks.
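The routing-table and ARP checks from the list above do not depend on ICMP, so they remain useful on such networks. The following sketch filters their output; it assumes the Mac OS X output layouts (a "default" row in netstat -nr, and arp -a lines of the form "host (ip) at mac on interface"), and both function names are hypothetical.

```shell
# Print the gateway of the default route from `netstat -nr` output
# read from stdin. Prints nothing if no default route is present.
default_gateway() {
    awk '$1 == "default" { print $2; exit }'
}

# Print each MAC address that is listed for more than one IP address
# in `arp -a` output read from stdin. Entries without a resolved MAC,
# such as "(incomplete)", are skipped.
duplicate_macs() {
    awk '$3 == "at" && $4 ~ /:/ { count[$4]++ }
         END { for (mac in count) if (count[mac] > 1) print mac }'
}

# Examples:
#   netstat -nr | default_gateway
#   arp -a | duplicate_macs
```

A MAC address that appears for two IP addresses is not always a fault, because one host can hold several addresses, but it is worth checking against what you expect on that subnet.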


Troubleshooting DNS and Domain Names

If you determine that you have no problems accessing other computers on the network, but you cannot connect to hosts using their domain names, it is likely that the error lies with the domain name lookup process. The following figure illustrates that the problem likely lies with the DNS server.

Make sure that the DNS server is properly set on the Network pane of System Preferences. Then use the following commands to figure out the issue:

  • host: Use this command to perform a domain name lookup. If the command fails, it means either that the host you're trying to connect to doesn't have a valid DNS entry in the DNS zone files maintained by the DNS server, or that the DNS server is down. You can use the -a or -v option to get the entire DNS record.

  • dig: This command performs a domain name lookup like host. In addition, this command displays the responses returned by the queried DNS servers. Analyzing the responses helps you determine whether the issue is the result of an error in the domain name's DNS entry or a missing entry for the domain.

  • nslookup: This command is deprecated. Use host or dig instead.
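The two failure cases described for host (a missing DNS entry versus a DNS server that is down) can usually be told apart from the command's own messages. The sketch below assumes the usual wording of those messages ("has address", "NXDOMAIN", "no servers could be reached"), which may differ between versions; the function name classify_host_output is hypothetical.

```shell
# Classify the output of the `host` command read from stdin.
# Prints one of: resolved, server unreachable, no DNS entry, unknown.
classify_host_output() {
    awk '
        /has address|has IPv6 address|is an alias for/ { found = 1 }
        /NXDOMAIN/                                     { missing = 1 }
        /no servers could be reached/                  { down = 1 }
        END {
            if (found)        print "resolved"
            else if (down)    print "server unreachable"
            else if (missing) print "no DNS entry"
            else              print "unknown"
        }'
}

# Example (www.example.com is a placeholder name):
#   host www.example.com 2>&1 | classify_host_output
```

If the result is "server unreachable" for every name you try, the DNS server itself, or the path to it, is the likely culprit rather than any single zone entry.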

Another useful tool to resolve names is lookupd -d (used with options hostWithName: or hostWithInternetAddress:). Lesson 11, "Planning and Deploying Directory Services," covers lookupd in more detail.

Note

You can also use the Lookup pane of the Network Utility tool to perform domain name lookups.


Troubleshooting Network Services

If you are running services on your computer, and other computers are having difficulty reaching your machine, as shown in the following figure, you should ensure that the services are configured properly. Check the configuration files for each process; in this case, you would first examine the /System/Library/LaunchDaemons directory. Try to connect to these services locally from your own machine, such as ssh yourusername@127.0.0.1. If the service does not allow you to connect, then there is an issue with the service running locally on your computer.

You can also use tools such as netstat, which allow you to see network statistics as well as the different sockets and ports that you have open on your machine. For example, the netstat -an command displays the state of the ports that are currently being used.

In the output, entries can contain one of the following keywords:

  • LISTEN: Indicates that the port on your computer is listening for requests.

  • ESTABLISHED: Indicates an established connection and shows the address of the connected system.

  • CLOSED: Indicates that the port is closed and might explain why other systems can't access a particular service on your computer.

  • CLOSE_WAIT: Indicates that the remote system has closed the connection and the local application has not yet closed its end. A large number of sockets stuck in this state may indicate a misbehaving service or a Denial of Service attack.

When looking at the output of the netstat -an command, check for patterns that might indicate an issue. For example, if you notice that port 22 is being used by an unknown system when you know that only systems with certain IP addresses should be using the port, it is a sign of intrusion into your system. Also, if you notice that port 139 is closed when it's supposed to be open, that would explain why Windows machines can't access your computer.
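The pattern checks described above can be sketched as two filters over netstat -an output. This is an illustration, not a definitive tool: it assumes the Mac OS X column layout, in which the state is the last column and the port follows the last dot of the address (for example, 192.168.1.10.22). Both function names are hypothetical.

```shell
# Count TCP sockets by state from `netstat -an` output on stdin.
# Prints one "STATE count" line per state, sorted by state name.
tcp_state_counts() {
    awk '$1 ~ /^tcp/ { count[$NF]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Print the foreign address of every ESTABLISHED connection to the
# given local port (first argument), reading `netstat -an` on stdin.
peers_on_port() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "ESTABLISHED" {
            n = split($4, parts, ".")      # port is the last dotted part
            if (parts[n] == port) print $5
        }'
}

# Examples:
#   netstat -an | tcp_state_counts
#   netstat -an | peers_on_port 22
```

Comparing the peers_on_port result with the addresses you expect on port 22 is one way to spot the unknown system mentioned above.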

One way to quickly check the status of ports is to use the following command:

netstat -an | grep LISTEN


Another command to use to list processes listening for Internet connections is the following:

sudo lsof -i | grep LISTEN


This command lists all open Internet files. Each entry lists the process that has opened the file and the port on which it's listening.

Another thing to check when troubleshooting access to services is the firewall. Make sure that the firewall is not preventing other machines from connecting to you. You can use sudo ipfw list to see firewall rules or use the Firewall pane of System Preferences' Sharing preference pane.

Use grep to filter output to show rules based on whether they allow access or deny it:

sudo ipfw list | grep allow
sudo ipfw list | grep deny


You should also check service-specific log files and run any service in question in the foreground or in debug mode. Most log files are found in either /var/log/ or /Library/Logs/.




Apple Training Series. Mac OS X System Administration Reference, Volume 1
ISBN: 032136984X
Year: 2005
Pages: 258
Authors: Schoun Regan
