The Troubleshooter’s Resources

In the process of troubleshooting a workstation, a server, or other network component, you have many resources at your disposal. In this section, we’ll take a brief look at some of them. Those you use depend on the situation and your personal preferences. You will eventually have your own favorites.

Log Files

As you learned in Chapter 6, log files can indicate the general health of a server. Each log file format is different, but, generally speaking, the log files contain a running list of all errors and notices, the time and date they occurred, and any other pertinent information. Let’s look at a couple of the log files from the most commonly used network operating systems, NetWare 5 and Windows NT 4.

NetWare Log Files

NetWare uses three log files that can help you diagnose problems on a NetWare server:

  • The Console Log file (CONSOLE.LOG)

  • The Abend Log file (ABEND.LOG)

  • The Server Log file (SYS$LOG.ERR)

Each file has different uses in the troubleshooting process.

The CONSOLE.LOG File

The Console Log file (CONSOLE.LOG) keeps a history of all errors and information that have been displayed on the server’s console. It is located in the SYS:\ETC directory on the server and is created and maintained by the utility CONLOG.NLM that comes with NetWare versions 3.12 and later. You must load this utility manually (or place the load command in the AUTOEXEC.NCF file so that it starts automatically upon server startup) by typing the following at the console prompt:

LOAD CONLOG 

Once this utility is loaded, it erases whatever CONSOLE.LOG file currently exists and starts logging to the new file.

Note 

This command works with NetWare 3.12 and later. In NetWare 5 and later, the LOAD keyword is optional; in versions 3.12 to 4.1x, it is required.

Figure 10.2 shows a sample CONSOLE.LOG file. From this log file, we can tell that someone edited the AUTOEXEC.NCF file and then restarted the server. This indicates a major change on the server. If we were trying to troubleshoot a server that was starting to exhibit strange problems after a recent reboot, this might be a source to check.

Figure 10.2: A sample CONSOLE.LOG file

Warning 

The information in the CONSOLE.LOG file is lost every time CONLOG.NLM is unloaded and reloaded. It doesn’t keep a history of every command ever issued, only those issued since CONLOG.NLM was loaded. However, you can add the ARCHIVE=YES parameter so that CONLOG keeps a history of the previous log files. The first file is saved with a .000 extension, the next with a .001 extension, and so forth. The complete command to run at the console (or add to AUTOEXEC.NCF) is CONLOG ARCHIVE=YES.

The ABEND.LOG File

This log file registers all Abends on a NetWare server. An Abend (ABnormal END) is an error condition that can halt the proper operation of the NetWare server. Abends can be serious enough to lock the server, or they can simply force an NLM to shut down. You know an Abend has occurred when you see an error message that contains the word Abend on the console. Additionally, the server command prompt will include a number in angle brackets (for example, <1>) that indicates the number of times the server has Abended since it was brought online.

Because the server may reboot after an Abend, these error messages and what they mean can be lost. NetWare versions 4.11 and later include a routine to capture the output of the Abend both to the console and to the ABEND.LOG file. ABEND.LOG is located in the SYS:SYSTEM directory on the server.

The ABEND.LOG file contains all the information that is output to the console screen during an Abend, plus much more:

  • The exact flags and registers of the processor at the time of the Abend

  • The NLMs that were in memory, including their versions, descriptions, memory settings, and exact time and date

Here is a portion of our ABEND.LOG file.

*********************************************************

Server S1 halted Friday, February 12, 1999   2:37:03 pm
Abend 1 on P00: Server-5.00a: Page Fault Processor Exception  (Error code 00000002)

Registers:
    CS = 0008 DS = 0010 ES = 0010 FS = 0010 GS = 0010 SS = 0010
    EAX = 00000000 EBX = D0AC2238 ECX = 0697DEF0 EDX = 00000009
    ESI = D0C5C040 EDI = 00000000 EBP = 0697DED0 ESP = 0697DEC0
    EIP = D0AC2232 FLAGS = 00014246
    D0AC2232 C600CC         MOV     [EAX]=?,CC
    EIP in ABENDEMO.NLM at code start +00000232h

Running process: Abendemo Process
Created by: NetWare Application
Thread Owned by NLM: ABENDEMO.NLM
Stack pointer: 697DCE0
OS Stack limit: 697A000
Scheduling priority: 67371008
Wait state: 5050170  (Blocked on keyboard)
Stack: D0AC22C1  (ABENDEMO.NLM|MenuAction+89)
       D1FEA602  (NWSNUT.NLM|NWSShowPortalLine+3602)
       --00000008  ?
       --00000000  ?
       --0697DF20  ?
       --D0134080  ?
       --00000001  ?
       D1FEA949  (NWSNUT.NLM|NWSShowPortalLine+3949)
       --00000010  ?
       --0697DEF0  ?
       --0697DEF4  ?
       --0697DFAC  ?
       --D0C2E100  (CONNMGR.NLM|WaitForBroadcastsToClear+C90C)
       --00000003  ?
       --00000008  ?
       --00000012  ?
       --00000000  ?
       --00000019  ?
       --00000050  ?
       --000000FF  ?
       --00000001  ?
       --00000010  ?
       --00000001  ?
       --00000000  ?
       --00000011  ?
       --0697DFDC  ?
       --0000000B  ?
       --00000000  ?
       D1FEABD9  (NWSNUT.NLM|NWSShowPortalLine+3BD9)
       --0000000B  ?
       --00000000  ?
       --00000000  ?

Additional Information:
    The CPU encountered a problem executing code in ABENDEMO.NLM.  The problem may be in that module or in data passed to that module by a process owned by ABENDEMO.NLM.

Loaded Modules:
SERVER.NLM       NetWare Server Operating System
  Version 5.00    August 27, 1998
  Code Address: FC000000h  Length: 000A5000h
  Data Address: FC5A5000h  Length: 000C9000h
LOADER.EXE       NetWare OS Loader
  Code Address: 000133D0h  Length: 0001D000h
  Data Address: 000303D0h  Length: 00020C30h
CDBE.NLM         NetWare Configuration DB Engine
  Version 5.00    August 12, 1998
  Code Address: D087E000h  Length: 00007211h
  Data Address: D0887000h  Length: 0000684Ch

This information can be useful when determining the source of an Abend. For example, any time you see the words Page Fault or Stack in the output, the Abend occurred because of something having to do with memory. Usually, it’s because a program or process tried to take memory that didn’t belong to it (for example, from another program). When NetWare detects this, it shuts down the offending process and issues an Abend.
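If you have copied an ABEND.LOG file to a workstation, a short script can pull out the fields you usually look at first. The following Python sketch is only an illustration: the function name summarize_abend and the regular expressions are assumptions based on the sample entry shown in this section, and log layouts may differ between NetWare versions.

```python
import re

# Three lines copied from the sample ABEND.LOG entry in this section.
SAMPLE = """\
Server S1 halted Friday, February 12, 1999   2:37:03 pm
Abend 1 on P00: Server-5.00a: Page Fault Processor Exception  (Error code 00000002)
EIP in ABENDEMO.NLM at code start +00000232h
"""

def summarize_abend(text):
    """Return the abend number, exception text, and faulting module, if found."""
    # Hypothetical patterns, written to match the sample entry only.
    abend = re.search(r"Abend (\d+) on P\d+: [^:]+: (.+?)\s*\(Error code", text)
    module = re.search(r"EIP in (\S+) at code start", text)
    summary = {
        "abend_number": int(abend.group(1)) if abend else None,
        "exception": abend.group(2) if abend else None,
        "module": module.group(1) if module else None,
    }
    # Rule of thumb from the text: "Page Fault" or "Stack" points to memory.
    exception = summary["exception"] or ""
    summary["memory_related"] = "Page Fault" in exception or "Stack" in exception
    return summary

print(summarize_abend(SAMPLE))
```

The module name it reports (here, ABENDEMO.NLM) is a reasonable first place to look, although, as the log itself notes, the fault may lie in data passed to that module.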

The SYS$LOG.ERR File

The general Server Log file, found in the SYS:SYSTEM directory, lists any errors that occur on the server, including Abends and NDS errors and the time and date of their occurrence. An error in the SYS$LOG.ERR file might look something like this:

1-07-1999  11:51:10 am:    DS-7.9-17
  Severity = 1  Locus = 17  Class = 19
  Directory Services:  Could not open local database, error: -723

The Severity, Locus, and Class designations in the second line substitute for lengthy text descriptions of the error and can provide more information:

  • Severity indicates the seriousness of the problem.

  • Locus indicates which system component is affected by the error (for example, memory, disk, or LAN cards).

  • Class indicates the type of error.

Tables 10.1, 10.2, and 10.3 explain the codes used for Severity, Locus, and Class. Based on the information in these tables, we can determine some information about our example above. A Severity of 1 indicates a warning condition (so the problem isn’t really serious), a Locus of 17 indicates that the error relates to operating system information (which makes sense because this is a Directory Services error), and a Class of 19 indicates general information rather than a specific type of failure. These designations tell us the reported error is related to NDS and that it’s not really serious. In fact, this particular error might occur when you bring up the server and the database hasn’t yet been opened by the operating system.

Table 10.1: SYS$LOG.ERR Severity Code Descriptions

Number  Description
0       Informational. Indicates that the information is non-threatening, usually just to record some kind of entry in the SYS$LOG.ERR file.
1       Warning. Indicates a potential problem that does not cause damage.
2       Recoverable. Indicates an error condition has occurred that can be recovered by the operating system.
3       Critical. Indicates a condition that should be taken care of soon and that might cause a server failure in the near future. For example, mirrored partitions are out of sync or the Abend recovery routine is invoked.
4       Fatal. Indicates that something has occurred that will cause the imminent shutdown of the server or that a shutdown has occurred. This type of error might occur when a disk driver unloads because of a software failure.
5       Operation Aborted. Indicates that an attempted operation could not be completed because of an error. For example, a disk save could not be completed because the disk was full.
6       No NOS Unrecoverable. Indicates that the operation could not be completed, but that it will not affect the operating system. For example, a compressed file is corrupt and unrecoverable.

Table 10.2: SYS$LOG.ERR Locus Code Descriptions

Number  Description
0       Unknown
1       Memory
2       File System
3       Disks
4       LAN Boards
5       COM Stacks (Communication Protocols)
6       No Definition
7       TTS (Transaction Tracking System)
8       Bindery
9       Station
10      Router
11      Locks
12      Kernel
13      UPS
14      SFT_III
15      Resource Tracking
16      NLM
17      OS Information
18      Cache
19      Domain

Table 10.3: SYS$LOG.ERR Class Code Descriptions

Number  Description
0       Class Unknown
1       Out of Resources
2       Temporary Situation
3       Authorization Failure
4       Internal Error
5       Hardware Failure
6       System Failure
7       Request Error
8       Not Found
9       Bad Format
10      Locked
11      Media Failure
12      Item Exists
13      Station Failure
14      Limit Exceeded
15      Configuration Error
16      Limit Almost Exceeded
17      Security Audit Information
18      Disk Information
19      General Information
20      File Compressions
21      Protection Violation
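Looking up three numbers in three tables is easy to get wrong by hand, so the lookup can be sketched in a few lines of Python. The lists below are transcribed from Tables 10.1, 10.2, and 10.3 (list position equals code number); the function names lookup and decode_codes are made up for this illustration and are not part of any NetWare tool. The sketch assumes all three fields are present in the entry.

```python
import re

# Code tables transcribed from Tables 10.1, 10.2, and 10.3.
SEVERITY = ["Informational", "Warning", "Recoverable", "Critical", "Fatal",
            "Operation Aborted", "No NOS Unrecoverable"]
LOCUS = ["Unknown", "Memory", "File System", "Disks", "LAN Boards",
         "COM Stacks", "No Definition", "TTS", "Bindery", "Station",
         "Router", "Locks", "Kernel", "UPS", "SFT_III", "Resource Tracking",
         "NLM", "OS Information", "Cache", "Domain"]
ERROR_CLASS = ["Class Unknown", "Out of Resources", "Temporary Situation",
               "Authorization Failure", "Internal Error", "Hardware Failure",
               "System Failure", "Request Error", "Not Found", "Bad Format",
               "Locked", "Media Failure", "Item Exists", "Station Failure",
               "Limit Exceeded", "Configuration Error",
               "Limit Almost Exceeded", "Security Audit Information",
               "Disk Information", "General Information",
               "File Compressions", "Protection Violation"]

def lookup(table, code):
    """Return the description for a code, or a placeholder if out of range."""
    return table[code] if 0 <= code < len(table) else f"undefined ({code})"

def decode_codes(entry):
    """Translate the Severity, Locus, and Class numbers in one log entry."""
    found = dict(re.findall(r"(Severity|Locus|Class) = (\d+)", entry))
    return {
        "severity": lookup(SEVERITY, int(found["Severity"])),
        "locus": lookup(LOCUS, int(found["Locus"])),
        "class": lookup(ERROR_CLASS, int(found["Class"])),
    }

# The code line from the sample SYS$LOG.ERR entry earlier in this section.
entry = "DS-7.9-17    Severity = 1  Locus = 17  Class = 19"
print(decode_codes(entry))
```

For the sample entry, the lookup yields Warning, OS Information, and General Information.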

Windows NT 4 Log Files

Windows NT, like other network operating systems, employs comprehensive error and informational logging routines. Every program and process theoretically could have its own logging utility, but Microsoft has come up with a rather slick utility, Event Viewer, which, through log files, tracks all events on a particular Windows NT computer. Normally, though, you must be an administrator or a member of the Administrators group to have access to Event Viewer.

To use Event Viewer, follow these steps:

  1. Choose Start > Programs > Administrative Tools (Common) to open the Select Computer dialog box.

  2. In the Computer field, enter the UNC (Universal Naming Convention) name of the computer whose events you want to view.

Note 

You can also simply double-click the computer’s name in the list in the Select Computer section.

  3. If you are connected to a Windows NT network over a slower link, such as a slow WAN link or a dial-up connection, click the Low Speed Connection check box to optimize Event Viewer for running over the lower-speed connection.

  4. Click OK.

  5. To view a log file, select it from the list.

  6. To view a different log file, choose Log > Select Computer.

The first time you open Event Viewer, you will automatically be brought to the System Log. Subsequently, when you open Event Viewer, the first log you see is the one you were last viewing.

Warning 

Even though this list displays Windows 95/98 computers, you cannot view log files on those computers because their logging system isn’t designed to interface with Event Viewer.

Using Event Viewer, you can take a look at three types of files:

  • The System Log

  • The Security Log

  • The Application Log

Tip 

To view the log files of any Windows NT machine from your Windows 95/98 client, copy the Server Tools from the Windows NT Server CD to your hard disk and create a shortcut for them. The Server Tools are located in the \CLIENTS\SRVTOOLS\ directory on the Windows NT Server installation CD.

The System Log

This log file tracks just about every event that occurs on that computer. It is similar to NetWare’s SYS$LOG.ERR file. However, whereas the SYS$LOG.ERR file tracks many categories of errors, the System Log tracks only three main types of events:

  • Information (an event occurred, such as a service starting successfully)

  • Warning (an event occurred that could cause problems)

  • Error (a component has failed and needs immediate attention)

In a log file, the icon that precedes the date indicates the event’s type. Figure 10.3 shows the three types of events found in the System Log.


Figure 10.3: Sample Log event types and their associated icons

Note 

Two other types of events (Audit Success and Audit Failure) normally appear only in the Security Log (discussed later in this chapter).

Figure 10.4 shows a sample System Log. This list contains several categories of information, including the date and time the event occurred, the source of the event (which process the event came from), which user (if applicable) initiated the process, the name of the computer the event happened on, and the Event ID number (in the Event column). The Event ID number is the unique error type of a particular event. For an explanation of each Event ID number, check the Help file, or go to www.microsoft.com/technet/ and search for Event ID.

Figure 10.4: A sample System Log (note the different error types and event IDs)

If you want more detail on a specific event, double-click it. Figure 10.5 shows the event detail for the following event in Figure 10.4:

1/7/99   11:33:15 AM   Disk   None   7   N/A   S1

Figure 10.5: The Event Detail dialog box for an event listed in Figure 10.4

The note in the Description box indicates that Windows NT found a bad disk block. Even though this is an error event, it is not serious. One bad block is not a problem, unless several disk blocks start going bad at once. The Data box lists the exact data the Event Viewer received about the error condition. This may be useful in determining the source of the problem. More than likely, if you have a serious problem that you can’t fix, this is the information that you will send to the vendor (or to Microsoft) to help troubleshoot the problem.

The Security Log

This log tracks security events specified by the domain’s Audit policy. The Audit policy is set in User Manager for Domains and specifies which security items will be tracked in Event Viewer. To set the Audit policy, follow these steps:

  1. Choose Start > Programs > User Manager for Domains to open User Manager for Domains.

  2. Choose Policy > Audit to open the Audit Policy dialog box.

  3. Indicate the events that you want logged and check the Success or Failure check boxes to track the success and failure of those events. Since these are security settings, most often you’ll want to log failures.

  4. Click OK, and these events will be logged for all users and systems in the domain.

After you set the Audit policy for a domain, you can view the Security Log for any computer in that domain. Follow these steps:

  1. Choose Start > Programs > Administrative Tools (Common) to open the Select Computer dialog box.

  2. In the Computer field, enter the UNC (Universal Naming Convention) name of the computer whose events you want to view.

  3. If you are connected to a Windows NT network over a slower link, such as a slow WAN link or a dial-up connection, click the Low Speed Connection check box to optimize Event Viewer for running over the lower-speed connection.

  4. Click OK.

  5. Choose Log > Security to open the Security Log (see Figure 10.6) for that computer.

    Figure 10.6: The Security Log in Event Viewer

As you can see, this log looks similar to the System Log in most respects. The main differences are the icons and the types of events recorded here. To view the detail for an event, double-click it.

The Security Log displays two types of events:

  • Success Audit (the event passed the security audit)

  • Failure Audit (the event failed the security audit)

Figure 10.7 shows the icons associated with each of these types of events. When an item fails a security audit, something security-related failed. For example, a common entry (assuming the Logon Failure check box is checked in the Audit Policy dialog box) is a Failure Audit with a value of Logon/Logoff in the category. This means that the user failed to log on. If you look at the log shown previously in Figure 10.6, you can see that a user successfully logged on as administrator and that no failures have occurred.


Figure 10.7: The Security Log event types and their associated icons

This log is especially useful in troubleshooting when someone can’t access a resource. If your domain security policy has been set to log Failures of Use of User Rights, you can see every instance of a user not having enough rights to access a resource. The username appears in the User column of the Failure Audit event for the resource the user is trying to access.

The Application Log

This log is similar to the other two logs, except that it tracks events for network services and applications (for example, SQL Server and other BackOffice products). It uses the same event types (and their associated icons) as the System Log. Figure 10.8 shows an example of an Application Log.

Figure 10.8: A sample Application event log

To access the Application Log, in Event Viewer, choose Log > Application. The Source column indicates which service logged which event. For example, in Figure 10.8, you can see three error events that came from Microsoft SQL Server (the MSSQL entry).

Taken together, the log files present a picture of the general health of a Windows NT server. Generally speaking, if you see an error message, open Event Viewer and check the System Log. If you don’t see the event there, check the other two logs.

Manufacturers’ Troubleshooting Resources

In addition to viewing log files, you can use several types of troubleshooting tools that manufacturers make available for their network operating systems. You can use these resources to augment your own knowledge, as well as to solve those pesky problems that have no pattern or few recognizable symptoms. Each type of resource provides different information or different levels of support (some of which have been discussed in previous chapters, but their importance to troubleshooting necessitates discussing them again here). Let’s examine the most popular, including:

  • README files

  • Telephone support

  • Technical support CD-ROM

  • Technical support website

README Files

As you learned in Chapter 6, README files contain information that did not make it into the manual. The latest information released about the software can often be found in the README files. Also, they may contain tips, default settings, and installation information (so you don’t have to read the entire first chapter to install the software).

When troubleshooting application or networking software, check out the README file before you try any of the other manufacturers’ resources. It is usually found on the first installation disk or CD.

Telephone Support

Many people prefer telephone support over other forms of support. You actually get to talk to a human being from the software manufacturer about the problem. Most, if not all, software manufacturers have toll-free support numbers. The people on their end of the line can provide anything from basic how-to answers to complex, technical answers.

Unfortunately, because of their popularity, technical support phone lines are often busy. When you finally get through, you might find yourself in “voicemail hell.” We’ve all been through it: Press 1 for support for products A, B, and C. Press 2 for products D, E, and F, and so on. Most people don’t want this and hang up. They prefer to speak with a human being as soon as the call is answered. Today, phone support is often not free (the number to reach support might be, but the support itself is not); it must be purchased either through a time-limited contract or on an incident-by-incident basis. This is particularly true for network operating system software support. To solve this problem, companies have devised other methods, such as the technical support CD-ROM and website, which we will discuss next.

The Technical Support CD-ROM

With the development of CD-ROM technology, it became possible to put volumes of textual information on a readily accessible medium. The CD-ROM was, thus, a logical distribution vehicle for technical support information. In addition, the CD was portable and searchable. Introduced in the early 1990s, Novell’s Network Support Encyclopedia (NSE) CD-ROM was one of the first products of this kind. Microsoft’s TechNet came soon after. Both companies charge a nominal fee for a yearly subscription to these CDs (anywhere from $100 to $500).

To be sure, the first editions of these products (as with the first editions of most software products) left much to be desired. Search engines were often clumsy and slow, and the CDs were released only about twice a year. As these products evolved, however, their search engines became more advanced, they included more documents, and they were released more often. And, probably most important, manufacturers began to include software updates, drivers, and patches on the CD.

The Technical Support Website

The technical support CDs were great, but people started to complain (as people are wont to do) that because this information was vital to the health of their network, they should get it for free. Well, that is, in fact, what happened. The Internet proved to be the perfect medium for allowing network support personnel access to the same information that was on the technical support CD-ROMs. Additionally, websites can be instantly updated and accessed, so they provide the most up-to-date network support information. Since websites are hosted on servers that can store much more information than CD-ROMs, websites are more powerful than their CD-ROM counterparts. Because they are easy to access and use and because they are detailed and current, websites are now the most popular method for disseminating technical support information. As examples, you can view Novell’s technical support website at http://support.novell.com/ and Microsoft’s technical support website (TechNet, a monthly subscription) at http://support.microsoft.com/servicedesks/technet/.

Hardware Troubleshooting Tools

In addition to manufacturer-provided troubleshooting tools, there are a few hardware devices we can use to troubleshoot the network. These are actual devices that you can use during the troubleshooting process. Some devices have easily recognizable functions; others are more obscure. Four of the most popular hardware tools (that the Network+ exam tests you on, by the way) are:

  • A crossover cable

  • A hardware loopback

  • A tone generator

  • A tone locator

The Crossover Cable

Sometimes also called a cross cable, a crossover cable is typically used to connect two hubs, but it can also be used to test communications between two stations directly, bypassing the hub. A crossover cable is used only in Ethernet UTP installations. You can connect two workstation NICs (or a workstation and a server NIC) directly using a crossover cable.

A normal Ethernet (10BaseT) UTP cable uses four wires—two to transmit and two to receive. Figure 10.9 shows this wiring, with all wires going from pins on one side directly to the same pins on the other side.

Figure 10.9: A standard Ethernet 10BaseT cable

The standard Ethernet UTP crossover cable used in both situations has its transmit and receive wire pairs crossed so that the transmit set on one side (hooked to pins 1 and 2) is connected to the receive set (pins 3 and 6) on the other. Figure 10.10 illustrates this arrangement. Note that four of the wires are crossed as compared with the straight-through wiring of the standard 10BaseT UTP cable shown earlier in Figure 10.9.

Figure 10.10: A standard Ethernet 10BaseT crossover cable
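The difference between the two cables can also be written down as a pin map. The short Python sketch below models each cable as a dictionary from a pin on one end to the pin it reaches on the other end, using the 10BaseT pin assignments given above (transmit on pins 1 and 2, receive on pins 3 and 6). The function name links_two_nics is made up for this illustration.

```python
# Pin-to-pin wiring for the two cable types (10BaseT uses pins 1, 2, 3, and 6).
STRAIGHT_THROUGH = {1: 1, 2: 2, 3: 3, 6: 6}
CROSSOVER = {1: 3, 2: 6, 3: 1, 6: 2}

TRANSMIT_PINS = {1, 2}   # pins a NIC transmits on
RECEIVE_PINS = {3, 6}    # pins a NIC receives on

def links_two_nics(wiring):
    """True if every transmit pin on one end reaches a receive pin on the other."""
    return all(wiring[pin] in RECEIVE_PINS for pin in TRANSMIT_PINS)

print(links_two_nics(CROSSOVER))         # True
print(links_two_nics(STRAIGHT_THROUGH))  # False
```

The check fails for the straight-through cable because both NICs would transmit on the same pair, which is why two stations connected directly need the crossover.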

Tip 

Be sure to label a crossover cable as such to ensure that no one tries to use it as a workstation patch cable. If it is used as a patch cable, the workstation won’t be able to communicate with the hub and the rest of the network.

You can carry a crossover cable in the tool bag along with your laptop. If you want to ensure that a server’s NIC is functioning correctly, you can connect your laptop directly to the server’s NIC using the crossover cable. You should be able to log in to the server (assuming both NICs are configured correctly).

The Hardware Loopback

A hardware loopback is a special connector for Ethernet 10BaseT NICs. It functions similarly to a crossover cable, except that it connects the transmit pins directly to the receive pins (as shown in Figure 10.11). It is used by the NIC’s software diagnostics to test transmission and reception capabilities. You cannot completely test a NIC without one of these devices.

Figure 10.11: A hardware loopback and its connections

Usually, the hardware loopback is no bigger than a single RJ-45 connector with a few small wires on the back. If a NIC has hardware diagnostics that can use the loopback, the hardware loopback plug will be included with the NIC. To use it, simply plug the loopback into the RJ-45 connector on the back of the NIC and start the diagnostic software. Select the option in your NIC’s diagnostic software that requires the loopback, and start the diagnostic routine. You will be able to tell if the NIC can send and receive data through the use of these diagnostics.

Tone Generator and Tone Locator

This combination of devices is used most often on telephone systems to locate cables. Since telephone systems use multiple pairs of UTP, it is nearly impossible to determine which set of wires goes where. Network documentation would be extremely helpful in making this determination, but if no documentation is available, you can use a tone generator and locator.

Note 

Don’t confuse these tools with a cable tester that tests cable quality. You use the tone generator and locator only to determine which UTP cable is which.

The tone generator is a small electronic device that sends an electrical signal down one set of UTP wires. The tone locator is another device that is designed to emit a tone when it detects the signal in a particular set of wires. When you need to trace a cable, hook the generator (often called the fox) to the copper ends of the wire pair you want to find. Then move the locator (often called the hound because it chases the fox) over multiple sets of cables (you don’t have to touch the copper part of the wire pairs; this tool works by induction) until you hear the tone. A soft tone indicates that you are close to the right set of wires. Keep moving the tool until the tone is at its loudest. Bingo! You have found the wire set. Figure 10.12 shows a tone generator and locator and how they are used.

Figure 10.12: Use of a common tone generator and locator

Warning 

Never hook a tone generator to a cable that is hooked up to either a NIC or a hub! Because the tone generator sends electrical signals down the wire, it can blow a NIC or a hub. That is why tone generators are not usually used on networks. Cable testers are used more often. We’ll discuss cable testers later in this chapter.

Software Troubleshooting Tools

In addition to these hardware troubleshooting tools, you can use software programs to gain information about the current health and state of the network. These tools fall into two main categories:

  • Protocol analyzers

  • Performance-monitoring tools

We use the term network software diagnostics to refer to these tools.

Protocol Analyzer

Any software that can analyze and display the packets it receives can be considered a protocol analyzer. Protocol analyzers examine packets from protocols that operate at the lower four layers of the OSI model (including Transport, Network, Data Link, and Physical) and can display any errors they detect. Additionally, most protocol analyzers can capture packets and decode their contents. Capturing packets involves copying a series of packets from the network into memory and holding the copy so that it can be analyzed.

You could, for example, capture a series of packets and decode their contents to figure out where each packet came from, where it was going, which protocol sent it, which protocol should receive it, and so on. For example, you can find out:

  • The nature of the traffic on your network

  • Which protocol is used most often

  • If users are accessing unauthorized sites

  • If a particular network card is jabbering (sending out packets when there is no data to send)
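To give a feel for what "decoding" means, here is a minimal Python sketch that reads the first 14 bytes of an Ethernet II frame: the destination address, the source address, and the EtherType that identifies the protocol carried inside. It is not how Sniffer or LANalyzer are implemented; the function names are hypothetical, the sample frame is made up, and only three well-known EtherType values are listed.

```python
import struct

# A few well-known EtherType values; a real analyzer knows many more.
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x8137: "IPX"}

def format_mac(raw):
    """Render six raw bytes as a colon-separated hardware address."""
    return ":".join(f"{byte:02x}" for byte in raw)

def decode_ethernet_header(frame):
    """Decode the 14-byte Ethernet II header at the start of a captured frame."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    # Network byte order: 6-byte destination, 6-byte source, 2-byte EtherType.
    dest, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "destination": format_mac(dest),
        "source": format_mac(src),
        "protocol": ETHERTYPES.get(ethertype, f"unknown (0x{ethertype:04x})"),
    }

# Made-up frame: broadcast destination, invented source address, ARP payload.
frame = bytes.fromhex("ffffffffffff" "00a0c9123456" "0806") + b"\x00" * 28
print(decode_ethernet_header(frame))
```

Counting decoded frames by source address or by protocol is one simple way to answer the questions in the list above, such as which protocol is used most often.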

Two common examples of protocol analyzers are Sniffer, a Network General product, and Novell’s LANalyzer.

Performance-Monitoring Tools

In addition to protocol analyzers, many network operating systems include tools for monitoring network performance and can display statistics such as the number of packets sent and received, server processor utilization, the amount of data going in and out of the server, and so on. NetWare comes with the MONITOR.NLM utility, and Windows NT comes with Performance Monitor. Both monitor performance statistics. You can use these utilities to determine the source of the bottleneck when users complain that the network is slow.

Note 

To start the MONITOR.NLM utility in NetWare, simply type LOAD MONITOR at the console prompt. To start the Performance Monitor program in Windows NT, you must first be logged in as Administrator (or a member of the Server Operators group). Once you are logged in, choose Start > Programs > Administrative Tools > Performance Monitor.
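Raw counters from these tools are easier to interpret once they are turned into a percentage of capacity. The following Python sketch shows the arithmetic for link utilization from two readings of a byte counter. The function name, the sample numbers, and the 10 Mbps default are assumptions for illustration; the sketch does not read values from MONITOR.NLM or Performance Monitor.

```python
def link_utilization(bytes_start, bytes_end, seconds,
                     link_bits_per_second=10_000_000):
    """Percentage of link capacity used between two byte-counter readings."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    bits_moved = (bytes_end - bytes_start) * 8
    return 100.0 * bits_moved / (seconds * link_bits_per_second)

# Hypothetical readings: 45,000,000 bytes moved in 60 seconds on 10 Mbps Ethernet.
print(link_utilization(0, 45_000_000, 60))  # 60.0
```

A sustained figure this high on a shared segment would be one candidate explanation when users complain that the network is slow, alongside server processor utilization and the other statistics these tools report.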




Network+ Study Guide
ISBN: 470427477
EAN: N/A
Year: 2002
Pages: 151