Narrowing Down the Problem

Troubleshooting a network problem can be daunting. That’s why it’s best to start by trying to narrow down the source of the problem. You do this by checking a few key areas, beginning with the simple stuff.

Checking for the Simple Stuff

The first thing to check, as most people will tell you, is the simple stuff. There’s a saying that goes “all things being equal, the simplest explanation is probably the correct one.” For computers, it’s rather hard to categorize simple stuff because what’s simple to one person might be complex to another. I like to define simple stuff (as it relates to troubleshooting) as those items that you don’t think to check, but when it turns out that one of those items is the problem, you say, “Oh, DUH!” Almost everyone can agree on a few items that fall into this category:

  • Correct login procedure and rights

  • Link lights/collision lights

  • Power switch

  • Operator error

start sidebar
Real World Scenario: Can the Problem Be Reproduced?

The first question to ask anyone who reports a network or computer problem is “Can you show me what ‘not working’ looks like?” If you can reproduce the problem, you can identify the conditions under which it occurs. And if you can identify the conditions, you can start to determine the source.

Unfortunately, not every problem can be reproduced. The hardest problems to solve are those that can’t be reproduced, but instead appear randomly.

end sidebar

The Correct Login Procedure and Rights

To gain access to the network, users must follow the correct login procedure exactly. If they don’t, they will be denied access. Considering everything that must be done correctly and in the correct order, it’s a miracle that anyone logs in to a network correctly at all. There are so many opportunities for making a mistake.

First, a user must enter the username and password correctly. As easy as this sounds, users frequently enter this information incorrectly, don’t realize it, and report to the network administrator that the network is broken or that they can’t log in. The most common problem is accidentally typing the wrong username or password incorrectly. In some operating systems, this can happen when you accidentally leave the Caps Lock key pressed. An example of this is Unix, in which passwords are case-sensitive; the user will not be able to log in, unless his or her password is in all capital letters.

Additionally, in NetWare and Windows NT the network administrator can restrict the times and conditions under which users can log in. If a user doesn’t log in at the right time or from the right workstation, the network operating system will reject the login request, even though it might be a valid request in terms of the username and password being spelled correctly. Additionally, a network administrator might restrict how many times a user can log in to the network simultaneously. If that user tries to establish more connections than are allowed, access will be denied. Any time a user is denied access to the network, they are likely to interpret that as a problem, even though the network operating system might be doing what it should.

To test for these types of problems, first check to see if the username and password are being typed correctly and whether or not the Caps Lock key is pressed. Try the login yourself from another workstation (assuming that doesn’t violate the security policy). If it works, you might try asking the user to check to see if the Caps Lock light on the keyboard is on (indicating that the Caps Lock key has been pressed). If that doesn’t solve the problem, check the network documentation to see if the aforementioned kinds of restrictions are in place.

Tip 

If intruder detection is enabled on the network, the user’s account will be locked after a specified number of incorrect login attempts. In this case, the user cannot log in until the administrator has unlocked the account, or until a certain amount of time specified by the administrator has elapsed, after which the account is unlocked.

The Link and Collision Lights

The link light is a small light-emitting diode (LED) found on both the NIC and the hub. It is typically green and is labeled link (or some abbreviation). A link light indicates that the NIC and hub (in the case of 10BaseT) are making a logical (Data Link layer) connection. You can usually assume that the workstation and hub are communicating if the link lights are lit on both the workstation’s NIC and the hub port to which the workstation is connected.

Note 

The link lights on some NICs aren’t activated until the operating system driver is loaded for that NIC. So, if the link light isn’t on when the system is first turned on, you may have to wait until the operating system loads the NIC driver.

The collision light is also a small LED, typically amber in color. It can usually be found on both Ethernet NICs and hubs. When lit, it indicates that an Ethernet collision has occurred. It is important to know that this light will blink occasionally, because collisions are somewhat common on busy Ethernet networks. However, if this light stays on continuously, there are too many collisions happening for legitimate network traffic to get through. This can be caused by a malfunctioning network card or another malfunctioning network device.

Warning 

Be careful not to confuse the collision light with the network activity or network traffic light (usually green). The network activity light indicates that a device is transmitting. This particular light should be blinking on and off continually as the device transmits and receives data on the network.

The Power Switch

To function properly, all computer and network components must be turned on and powered up. As obvious as this is, network administrators often hear a user complain, “My computer is on, but my monitor is dark.” In this case, our response is to ask, “Is the monitor turned on?” After a pause, the voice on the other end usually says sheepishly, “Oh. Thanks.”

Most systems include a power indicator such as a Power or PWR light, and the power switch typically has a 1 or an On indicator. However, the unit could be powerless even if the power switch is in the On position. Thus, you need to check that all power cables are plugged in, including the power strip.

Tip 

Remember that every cable has two ends, and both must be plugged in to something.

When troubleshooting power problems, start with the most obvious device and work your way back to the power service panel. There could be any number of power problems between the device and the service panel, including a bad power cable, bad outlet, bad electrical wire, tripped circuit breaker, or blown fuse. Any of these items can cause power problems at the device.

Operator Error

The problem may be that the user simply doesn’t know how to perform the operation correctly; in other words, the problem may be due to OE ( operator error) . Those in the computer and networking industry have devised several colorful expressions to describe operator error:

  • EEOC (Equipment Exceeds Operator Capability)

  • PEBCAK (Problem Exists Between Chair And Keyboard)

  • ID Ten T Error (written as ID10T)

Assuming that all problems are related to operator error, however, is a mistake. Before you attribute any problem to operator error, ask the user to reproduce the problem in your presence, and pay close attention. You may find out that the user is having a problem because he or she is using an incorrect procedure—for example, flipping the power switch without following proper shutdown procedures. You may also find out that the user was trained incorrectly, in which case you might want to see if others are having the same difficulty. If the problem and solution are not obvious, try the procedure yourself, or ask someone else at another workstation to do so.

Note 

This is only a partial list of simple stuff. You’ll come up with our own expanded list over time, as you troubleshoot more and more systems.

Is Hardware or Software Causing the Problem?

A hardware problem typically manifests itself as a device in your computer that fails to operate correctly. You can usually tell that a hardware failure has occurred because you will try to use that piece of hardware, and the computer will issue an error indicating that this has happened. Some failures, such as hard-disk failures, may give warning signs—for example, a Disk I/O error or something similar. Other components may just suddenly fail. The device will be operating fine and then simply fail.

The solution to hardware problems usually involves either changing hardware settings, updating device drivers, or replacing hardware. As we have discussed in previous chapters, I/O address, IRQ (interrupt requests), and DMA (direct memory access) conflicts can cause computers (including workstations and servers) to malfunction. Change the hardware settings to solve these types of problems.

If the hardware has actually failed, however, you must get out your tools and start replacing components. If this is not one of your skills, you can send the device out for repair. In either case, because the system can be down for anywhere from an hour to several days, it’s always prudent to have backup hardware on hand.

Software problems are a little more evasive. Some problems might result in General Protection Fault messages, which indicate a Windows or Windows program error of some type. Also, a program might suddenly stop responding (hang), or the entire machine might lock up randomly. The solution to these problems generally involves a trip to the manufacturer’s support website to get software updates and patches or to search for the answer in a knowledge base.

Sometimes software will give you a precise message regarding the source of the problem, such as the software is missing a file or a file has become corrupt. In this case, you can either provide the file or, if necessary, reinstall the software. Neither solution takes long, and your computer will be up and running in a short time.

Tip 

Sometimes fragmented memory, which occurs after you open and close too many programs, is the source of the problem. The solution may be to reboot the computer, thus clearing memory. Be sure to add this to your network-troubleshooting bag of tricks.

Is It a Workstation or a Server Problem?

Troubleshooting this problem involves first determining whether one person or a group of people are affected. If only one person is affected, think workstation. If several people are affected, the server or, more generally speaking, a portion of the network is probably experiencing problems.

If a single user is affected, your first line of defense is to try to log in from another workstation within the same group of users. If you can do so, the problem is related to the user’s workstation. Look for a cabling fault, a bad NIC, or some other problem.

On the other hand, if several people in a group (such as a whole department) can’t access a server, the problem may be related to that server. Go to the server in question, and check for user connections. If everyone is logged in, the problem could be related to something else, such as individual rights or permissions. If no one can log in to that server, including the administrator, the server may have a communication problem with the rest of the network. If it has crashed, you might see messages to that effect on the server’s monitor, or the screen might be blank, indicating that the server is no longer running. These symptoms vary among network operating systems.

Which Segments of the Network Are Affected?

Making this determination can be tough. If multiple segments are affected, the problem could be a network address conflict. As you may remember from Chapter 4, “TCP/IP Utilities,” network addresses must be unique across an entire network. If two segments have the same IPX network address, for example, all the routers and NetWare servers will complain bitterly and send out error messages, hoping that it’s just a simple problem that a router can correct. This is rarely the case, however, and, thus, the administrator must find and resolve the issue. Also keep in mind that the continuous broadcasting of error messages will negatively impact network performance.

If all users of the network are experiencing the problem, it could be related to a different device, such as a server that everyone accesses. Or, a main router or hub could be down, making network transmissions impossible.

Additionally, if the network has WAN connections, you can determine if a network problem is related to the WAN connection by checking to see if stations on both sides can communicate. If they can, the problem isn’t related to the WAN. If they can’t communicate, you must check everything between the sending station and the receiving one, including the WAN hardware. Usually, the WAN devices have built-in diagnostics that can indicate whether the WAN link is functioning correctly to help you determine if the fault is related to the WAN link or to the hardware involved.

Cabling Issues

After you determine whether the problem is related to the whole network, to a single segment, or to a single workstation, you must determine whether the problem is related to network cabling. First, check to see if the cables are properly connected to the correct port. More than once, I’ve seen a wall phone cable plugged into a modem in the In jack.

Additionally, patch cables from workstation to wall jack can and do go bad, especially if they get moved or tripped over often. This problem is often characterized by connection problems. If you test the NIC and there is no link light (discussed earlier in this chapter), the problem could be related to a bad patch cable.

It is also possible to have a cabling problem in the walls where the cabling wasn’t installed correctly. If a network cable was run over a fluorescent light, for example, the workstation attached to that cable might have problems only when the lights are on. The problem is that the fluorescent lights produce a large amount of EMI and can disrupt communications in that cable. This kind of problem may manifest itself only at times when most lights need to be on.

Next, check the MDI/MDX port setting on small, workgroup hubs, a potential source of trouble that is often overlooked. This port is used to uplink, for example, to a hub on the network’s backbone. The port setting has to be set to either MDI or MDX, depending on the type of cable used for the hub-to-hub connection. A crossover cable (discussed later in this chapter) requires that the port be set to MDI; a standard network patch cable requires that the port be set to MDX (sometimes labeled MDI-X). You can usually adjust the setting via a regular switch or a DIP (Dual Inline Package) switch. Check the hub’s documentation.

Note 

Some hubs just have a port labeled MDX, since the MDI setting is really just another standard port for all intents and purposes. If you connect hubs using a standard patch cable, you must connect the MDX port to a standard port on the backbone hub.




Network+ Study Guide
Network+ Study Guide
ISBN: 470427477
EAN: N/A
Year: 2002
Pages: 151

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net