Network problems can manifest themselves in a variety of ways. Some are very difficult to diagnose, while others become apparent very quickly. The system manager, although not normally responsible for the network, must know about network issues and how they can affect the systems for which he is responsible. Most larger installations use network management software of one form or another, such as Solstice Domain Manager (discussed in Chapter 14), so most of the problems encountered should be dealt with by this software. However, configuring a network management product requires enough knowledge to understand exactly which events are being monitored and to determine the remedial action to take when one occurs. This section looks at some of the basic troubleshooting tools used to determine the status of a system's network capability and to diagnose network-related problems.

ifconfig

The ifconfig command is probably the first command to run when diagnosing a network problem. From the information it returns, it is possible to verify that the network interface is functioning correctly, that the IP and broadcast addresses are correct, and that the network mask in use is also correct. Listing 13.2 contains the result of running the ifconfig command with the -a flag to show all network interfaces.

Listing 13.2 Sample Output from the Command ifconfig -a Showing the Status of All Connected Network Interfaces

    taurus# ifconfig -a
    lo0: flags=849<UP,LOOPBACK,RUNNING,MULTICAST> mtu 8232
            inet 127.0.0.1 netmask ffffff00
    le0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
            inet 210.127.8.3 netmask ffffff00 broadcast 210.127.8.255
            ether 8:0:20:a:a1:2a
    taurus#

ping

The ping command is another extremely popular network monitoring tool. It is used for a number of purposes, two of which are illustrated in Listing 13.3.
Listing 13.3 contains sample output from two different executions of the ping command. The first merely establishes whether the remote host is responding to requests. The second sends a fixed-size data packet and records the time taken to send it, as well as overall statistics on round-trip times and packet loss.

Listing 13.3 Two Options from the ping Command, One to Establish the Status of a Remote System and One to Determine Transmission Times and Reliability

    leo# ping taurus
    taurus is alive
    [ /export/home/john ]
    leo#
    leo# ping -s taurus
    PING taurus: 56 data bytes
    64 bytes from taurus (210.127.8.3): icmp_seq=0. time=15. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=1. time=6. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=2. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=3. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=4. time=6. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=5. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=6. time=6. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=7. time=6. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=8. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=9. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=10. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=11. time=6. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=12. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=13. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=14. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=15. time=7. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=16. time=5. ms
    64 bytes from taurus (210.127.8.3): icmp_seq=17. time=5. ms
    ^C
    ----taurus PING Statistics----
    18 packets transmitted, 18 packets received, 0% packet loss
    round-trip (ms) min/avg/max = 5/5/15
    leo#

With the ever-increasing use of firewalls, it is possible that the network administrator might disable the protocol that ping uses, namely the Internet Control Message Protocol (ICMP).
If this protocol is disabled, then any ping messages that pass through the firewall will fail, indicating (perhaps falsely) that a system is down.

netstat

The netstat command has a wide range of uses. It can be used to monitor the state of a network interface, to determine which network connections are established (or hung) with remote systems, to provide information based on specific network protocols, and to display the internal routing table.

As an example, consider Listing 13.4, which contains the first of two samples of output from the netstat command. The output is from a Sun Enterprise 250 running Solaris 2.6, where the import of an Oracle database was taking an unacceptable amount of time even though all network connections appeared to be working as expected. On running the netstat -i command with an interval of 5 seconds, a high number of input errors was apparent.

Listing 13.4 Sample Output from the Command netstat -i Showing the Abnormally High Number of Input Errors on the Network Interface

    leo# netstat -i 5
        input   hme0      output           input  (Total)    output
    packets  errs     packets errs  colls  packets  errs     packets errs  colls
    44845867 25576336 3627474 204   122543 44952869 25576336 3734476 204   122543
    108      44       28      0     0      108      44       28      0     0
    99       48       25      0     0      99       48       25      0     0
    182      77       26      0     0      182      77       26      0     0
    154      86       26      0     0      154      86       26      0     0
    179      113      25      0     0      179      113      25      0     0
    89       42       27      0     0      89       42       27      0     0
    101      47       28      0     0      101      47       28      0     0
    111      38       26      0     0      111      38       26      0     0
    136      55       25      0     0      136      55       25      0     0
    150      59       35      0     1      150      59       35      0     1
    leo#

Some searching on the SunSolve database revealed a patch for the symptoms. That patch was duly installed and fixed the problem, as displayed in Listing 13.5, which shows the same network interface following the patch installation.
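To put the error counts in Listing 13.4 into proportion, the input error rate can be computed for each sample. The following Python sketch is an illustration only and is not part of the Solaris tool set: the function name input_error_rates is hypothetical, and the code assumes the ten-column layout that netstat -i prints for a single interface plus the total, as in the listing.

```python
def input_error_rates(netstat_output):
    """Return (packets, errs, percent) for each data row of `netstat -i <interval>` output.

    Assumption: data rows hold exactly ten integer columns, with input
    packets and input errors for the interface in the first two columns.
    """
    rates = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip header rows and anything that is not a ten-column numeric row.
        if len(fields) != 10 or not all(f.isdigit() for f in fields):
            continue
        packets, errs = int(fields[0]), int(fields[1])
        percent = 100.0 * errs / packets if packets else 0.0
        rates.append((packets, errs, round(percent, 1)))
    return rates


# First three data rows of Listing 13.4, including the two header rows.
sample = """\
    input   hme0      output           input  (Total)    output
packets  errs     packets errs  colls  packets  errs     packets errs  colls
44845867 25576336 3627474 204   122543 44952869 25576336 3734476 204   122543
108      44       28      0     0      108      44       28      0     0
99       48       25      0     0      99       48       25      0     0
"""

for packets, errs, percent in input_error_rates(sample):
    print(f"{packets:>10} packets {errs:>10} errs {percent:5.1f}% input errors")
```

For these rows the sketch reports input error rates of roughly 57, 41, and 48 percent, which makes the scale of the problem on the hme0 interface much easier to see than the raw counters do.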
Listing 13.5 Sample Output from the Command netstat -i Showing That the Problem Is Resolved

    leo# netstat -i 5
        input   hme0      output           input  (Total)    output
    packets  errs     packets errs  colls  packets  errs     packets errs  colls
    8623134  0        707459  103   37491  8633461  0        717786  103   37491
    138      0        21      0     0      138      0        21      0     0
    64       0        22      0     0      64       0        22      0     0
    106      0        21      0     0      106      0        21      0     0
    92       0        21      0     0      92       0        21      0     0
    150      0        24      0     0      150      0        24      0     0
    82       0        21      0     0      82       0        21      0     0
    128      0        21      0     0      128      0        21      0     0
    114      0        21      0     0      114      0        21      0     0
    93       0        21      0     0      93       0        21      0     0
    124      0        22      0     0      126      0        24      0     0
    109      0        29      0     0      111      0        31      0     0
    leo#

This example is interesting because it did not initially appear to be a network problem at all. The symptom was a performance issue: the import took much longer to complete than expected.

traceroute

The traceroute command does exactly what you would expect: it traces the path taken to get from one host to another and displays information about each of the "hops" along the way. This command is extremely useful when trying to determine why two hosts cannot communicate because it indicates which routers along the way are not responding. Listing 13.6 shows an example of the traceroute command.

Listing 13.6 The traceroute Command Showing the Path Taken to Reach a Remote Host

    leo# traceroute taurus
    traceroute to taurus (210.127.8.3), 30 hops max, 40 byte packets
     1  leo-router (209.127.8.1)  3 ms  2 ms  2 ms
     2  bb1-gate-x (188.101.25.67)  22 ms  21 ms  18 ms
     3  bb2-gate-a (187.100.80.10)  32 ms  29 ms  17 ms
     4  bb5-area-xconn-alpha (192.150.100.68)  7 ms  5 ms  3 ms
     5  taurus-router (210.127.8.1)  6 ms  4 ms  3 ms
     6  taurus (210.127.8.3)  7 ms  5 ms  3 ms
    leo#

snoop

The snoop command is a powerful network command that captures packets on a network interface. The captured packets can be displayed on the screen as they occur or saved to a file for later analysis. The snoop command requires superuser privilege to run because it puts the network interface into promiscuous mode so that all packets can be captured.
Listing 13.7 demonstrates the type of information that can be gathered with this command. The example shows a simple Telnet connection being established by user john with his password of john1. It is worth noting that on a nonswitched network snoop captures all traffic, including traffic between other systems that has nothing to do with the problem being investigated. On a switched network, however, snoop captures only packets sent to or from the system on which the command is being run; traffic between other systems is not captured. Even so, the command is extremely useful, particularly when trying to see whether acknowledgements are being received from a remote host.

Listing 13.7 The snoop Command Can Even Capture Password Information That Is Transmitted Across the Network

    34   0.04790  aries -> taurus  TELNET C port=60321
    35   0.00006  taurus -> aries  TELNET R port=60321 73 775login:
    36   0.00045  aries -> taurus  TELNET C port=60321
    37   0.00015  taurus -> aries  TELNET R port=60321
    38   0.04937  aries -> taurus  TELNET C port=60321
    39   0.60130  aries -> taurus  TELNET C port=60321 j
    40   0.00021  taurus -> aries  TELNET R port=60321 j
    41   0.04841  aries -> taurus  TELNET C port=60321
    42   0.01563  aries -> taurus  TELNET C port=60321 o
    43   0.00019  taurus -> aries  TELNET R port=60321 o
    44   0.04418  aries -> taurus  TELNET C port=60321
    45   0.11145  aries -> taurus  TELNET C port=60321 h
    46   0.00013  taurus -> aries  TELNET R port=60321 h
    47   0.04839  aries -> taurus  TELNET C port=60321
    48   0.12778  aries -> taurus  TELNET C port=60321 n
    49   0.00012  taurus -> aries  TELNET R port=60321 n
    50   0.04208  aries -> taurus  TELNET C port=60321
    51   0.31836  aries -> taurus  TELNET C port=60321
    52   0.00021  taurus -> aries  TELNET R port=60321
    53   0.04148  aries -> taurus  TELNET C port=60321
    54   0.00004  taurus -> aries  TELNET R port=60321 Password:
    55   0.04991  aries -> taurus  TELNET C port=60321
    56   0.53745  aries -> taurus  TELNET C port=60321 j
    57   0.09865  taurus -> aries  TELNET R port=60321
    58   0.00022  aries -> taurus  TELNET C port=60321 o
    59   0.09976  taurus -> aries  TELNET R port=60321
    60   0.03078  aries -> taurus  TELNET C port=60321 h
    61   0.09923  taurus -> aries  TELNET R port=60321
    62   0.07719  aries -> taurus  TELNET C port=60321 n
    63   0.09280  taurus -> aries  TELNET R port=60321
    64   0.09994  aries -> taurus  TELNET C port=60321 1
    65   0.10005  taurus -> aries  TELNET R port=60321
    66   0.12767  aries -> taurus  TELNET C port=60321
    67   0.00041  taurus -> aries  TELNET R port=60321
    68   0.04594  aries -> taurus  TELNET C port=60321
    69   0.00007  taurus -> aries  TELNET R port=60321 Last login: Wed Jan 3 19:05:21 from aries
    70   0.04989  aries -> taurus  TELNET C port=60321
    71   0.00005  taurus -> aries  TELNET R port=60321 Sun Microsystems Inc
    72   0.04991  aries -> taurus  TELNET C port=60321
    73   0.01966  taurus -> aries  TELNET R port=60321 [ /export/home/john

lsof

All the commands mentioned previously in this section are bundled with the standard installation of the Solaris operating environment. This one, lsof, is freely available and can be downloaded from a number of sites, such as http://www.sunfreeware.com.

The lsof command displays the files that are opened by processes running on the system. It is extremely useful when trying to determine why a file system cannot be unmounted. The information provided is often sufficient for the administrator to identify the offending process and take the appropriate action. Listing 13.8 shows the files opened by user john, who has simply logged on to the system.
Listing 13.8 The lsof Command Shows How Many Files a User Opens Merely by Logging On and Running a Single Shell

    aries# lsof -u john
    COMMAND   PID USER   FD TYPE DEVICE SIZE/OFF   NODE NAME
    ksh     16282 john  cwd VDIR  32,56     1024  10176 /export/home/john
    ksh     16282 john  txt VREG   32,6   192764  36481 /usr/bin/ksh
    ksh     16282 john  txt VREG   32,6  1115940  23161 /usr/lib/libc.so.1
    ksh     16282 john  txt VREG   32,6   832236  22968 /usr/lib/libnsl.so.1
    ksh     16282 john  txt VREG   32,6    17252  18373 /usr/platform/sun4u/lib/libc_psr.so.1
    ksh     16282 john  txt VREG   32,6    19876  22890 /usr/lib/libmp.so.2
    ksh     16282 john  txt VREG   32,6    56988  22908 /usr/lib/libsocket.so.1
    ksh     16282 john  txt VREG   32,6     4600  23169 /usr/lib/libdl.so.1
    ksh     16282 john  txt VREG   32,6   183060  22768 /usr/lib/ld.so.1
    ksh     16282 john   0u VCHR   24,0 0t670181 135213 /devices/pseudo/pts@0:0->ttcompat->ldterm->ptem->pts
    ksh     16282 john   1u VCHR   24,0 0t670181 135213 /devices/pseudo/pts@0:0->ttcompat->ldterm->ptem->pts
    ksh     16282 john   2u VCHR   24,0 0t670181 135213 /devices/pseudo/pts@0:0->ttcompat->ldterm->ptem->pts
    ksh     16282 john  63u VREG  32,56     4406  10192 /export/home/john/.sh_history
    aries#
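When a file system refuses to unmount, the lsof output can be filtered to find the processes responsible. The following Python sketch is one possible approach, not a feature of lsof itself: the function name holders_of is hypothetical, and the code assumes the nine-column layout of Listing 13.8 with file names that contain no spaces.

```python
def holders_of(lsof_output, mount_point):
    """Return sorted (pid, command, user) tuples for processes that have
    files open at or below mount_point.

    Assumption: each row has at least nine whitespace-separated columns,
    with COMMAND, PID and USER first and NAME last, as in Listing 13.8.
    """
    base = mount_point.rstrip("/")
    prefix = base + "/"
    holders = set()
    for line in lsof_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 9:
            continue  # not a complete lsof row
        command, pid, user, name = fields[0], fields[1], fields[2], fields[-1]
        if name == base or name.startswith(prefix):
            holders.add((int(pid), command, user))
    return sorted(holders)


# Three rows taken from Listing 13.8, plus the header.
sample = """\
COMMAND   PID USER   FD TYPE DEVICE SIZE/OFF   NODE NAME
ksh     16282 john  cwd VDIR  32,56     1024  10176 /export/home/john
ksh     16282 john  txt VREG   32,6   192764  36481 /usr/bin/ksh
ksh     16282 john  63u VREG  32,56     4406  10192 /export/home/john/.sh_history
"""

for pid, command, user in holders_of(sample, "/export/home"):
    print(f"PID {pid} ({command}, user {user}) has files open under /export/home")
```

Here the only process reported is the ksh session belonging to john, whose current directory and history file both live under /export/home; ending that session would release the file system.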