Scenario 4: End-to-End Communication Problems


A typical ISDN case is one where the configuration is okay, the interfaces are configured correctly, and the router makes outgoing calls, but the connection is intermittent or drops soon after it is established. This scenario is one of the most difficult to troubleshoot for ISDN. As a reminder, end-to-end communications are based on the encapsulation protocol running on the BRI and dialer interfaces. Although Q.931 is considered to be an end-to-end protocol (as discussed in Chapter 9, "ISDN Technology Background"), after a call is made, the frames are encapsulated by PPP, which takes over and manages the point-to-point and end-to-end communication. Recall that PPP has five phases (for troubleshooting purposes, you might consider there to be four). Each phase starts after the previous one is open. Every phase, and every protocol in that phase, can create problems. Therefore, if the outgoing call does not drop in the first three seconds, focus on PPP when you're troubleshooting.

The following issues are typical for end-to-end communication problems:

  • The LEC's ISDN switch settings

  • Link Control Protocol (LCP) problems and the magic 22 seconds

  • Authentication problems

  • End-to-end routing problems

Each issue is discussed in the following sections.

The LEC's ISDN Switch Settings

One of the simplest cases is when the router calls out but does not get a response. This situation is typical for new installs. The LEC is responsible for the settings of the switch. An error in the profile can create problems. The easiest way to identify a problem is by using the following:

 804-isdn#debug isdn events 

Search for messages with prefix I: (input). If you do not have access to the user's router, work with the end user to determine if the switch passes data from the core router to the user's router (check the RxD lights on the router). If no data is passed, work with the LEC to correct the problem. If the remote user is extremely non-technical, this might make troubleshooting very difficult.

LCP Problems and the Magic 22 Seconds

One of the most complicated issues is when everything appears to work correctly and the router makes outgoing calls, but the call then disconnects. The time to disconnect points to the cause:

  • If the disconnect is immediately after call out (less than one second), there's either a wrong phone number or it is not provisioned for data. The error messages in Chapter 12 provide more explanations about the possible cause.

  • If the call disconnects in two seconds, the other party is not configured to be called from this router (CHAP refused authentication) and the call is rejected.

  • The call drops in about 22 seconds.
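The three cases above can be sketched as a small lookup on the call duration. This is only an illustration: the function name and the exact boundaries (1, 3, and 20 to 24 seconds) are assumptions chosen to match "less than one second", "two seconds", and "about 22 seconds"; they are not values defined by IOS.

```python
def classify_disconnect(call_seconds: float) -> str:
    """Map a call duration to the likely cause from the list above.

    The boundaries are illustrative assumptions, not IOS-defined values.
    """
    if call_seconds < 1:
        return "wrong number or line not provisioned for data"
    if call_seconds <= 3:
        return "call rejected by the remote side (CHAP refused)"
    if 20 <= call_seconds <= 24:
        return "LCP negotiation never completed"
    return "not one of the three typical cases"


if __name__ == "__main__":
    for seconds in (0.4, 2, 22, 600):
        print(seconds, "->", classify_disconnect(seconds))
```

The duration itself comes from the "call lasted N seconds" part of the %ISDN-6-DISCONNECT message.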

Because of its importance, the 22 second issue deserves a more detailed analysis. This is when the configuration is okay on both sides, the calling router tries to make a call, and the switch responds with call connected, but the call then drops after 22 seconds. Looking at the problem on a 77x router, you see the output that's shown in Example 13-9.

Example 13-9. The 22 Second Issue
 1. Status    01/01/1995 18:34:06
 Line Status
   Line Activated
   Terminal Identifier Assigned    SPID Accepted
   Terminal Identifier Assigned    SPID Accepted
 Port Status                                           Interface Connection Link
   Ch:  1   64K Call In Progress         20178          DATA          0      0
   Ch:  2   Waiting for call
 ! The Connection (number) =0 and Link (number) = 0 is a typical sign.
 ! These values never acquire <>0 (non-zero) value.
 776-isdn: SJB12A75> 01/01/1995 18:34:24  L12  1  Disconnected Remotely Cause 16  Normal Disconnect
 776-isdn: SJB12A75> 01/01/1995 18:34:25  L27  1  Disconnected
 776-isdn: SJB12A75> 01/01/1995 18:34:33  L05  0  20178  Outgoing Call Initiated
 776-isdn: SJB12A75> 01/01/1995 18:34:34  L08  1  20178  Call Connected
 776-isdn: SJB12A75> 01/01/1995 18:34:56  L12  1  Disconnected Remotely Cause 16  Normal Disconnect

Another important message from the output of the IOS-based ISDN router is

 Jul 30 11:01:59 PDT: %ISDN-6-DISCONNECT: Interface BRI0:1     disconnected from 4086175555 , call lasted 22 seconds. 

The symptoms are confusing, and checking the call messages does not provide any useful information. The indications from the switch are even more confusing because the switch reports that the call is disconnected and the reason is normal, but the end-to-end communication is not normal. To understand what is occurring, you must review PPP. As a second-layer protocol, PPP starts working after the Q.931 call setup is complete (the SETUP and CONNECT messages are sent and confirmed). Once the switch reports a successful connection and the call is treated as a callout, the next phase of PPP, LCP, starts. This protocol negotiates the line parameters, as described in Chapter 12, in a typical handshake manner: you can see outgoing requests, incoming requests, and confirmations. For one such case, the output of the core router, which of course reports the same situation, is shown in Example 13-10.

Example 13-10. The Output of the Core Router
 7206-isdn#debug ppp negotiation
 Jul 30 11:01:37 PDT: %LINK-3-UPDOWN: Interface Serial2/1:3, changed state to up
 Jul 30 11:01:37 PDT: Se2/1:3 PPP: Treating connection as a callin
 ! The core accepts a call and treats it as an incoming call from a remote user.
 Jul 30 11:01:37 PDT: Se2/1:3 PPP: Phase is ESTABLISHING,
     Passive Open [0 sess, 1 load]
 Jul 30 11:01:37 PDT: Se2/1:3 CHAP: Using alternate hostname SJB12A75
 ! The router recognizes the alternate hostname, configured under the user's
 ! dialer interface, or the hostname, used as a username in the remote
 ! user's configuration.
 Jul 30 11:01:37 PDT: Se2/1:3 LCP: State is Listen
 ! LCP is opened passively and waits in the Listen state
 Jul 30 11:01:39 PDT: Se2/1:3 LCP: TIMEout: State Listen
 ! LCP state is listen
 Jul 30 11:01:39 PDT: Se2/1:3 CHAP: Using alternate hostname SJB12A75
 Jul 30 11:01:39 PDT: Se2/1:3 LCP: O CONFREQ [Listen] id 37 len 30
 ! First attempt to establish a connection (sending a Configure-Request OUT)
 Jul 30 11:01:39 PDT: Se2/1:3 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:01:39 PDT: Se2/1:3 LCP:    MagicNumber 0x48430C39 (0x050648430C39)
 Jul 30 11:01:39 PDT: Se2/1:3 LCP:    MRRU 1524 (0x110405F4)
 Jul 30 11:01:39 PDT: Se2/1:3 LCP:    EndpointDisc 1 Local (0x130B01534E564143413031)
 Jul 30 11:01:41 PDT: Se2/1:3 LCP: TIMEout: State REQsent
 Jul 30 11:01:41 PDT: Se2/1:3 LCP: O CONFREQ [REQsent] id 38 len 30
 ! Second attempt to establish a connection (sending a Configure-Request OUT)
 Jul 30 11:01:41 PDT: Se2/1:3 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:01:41 PDT: Se2/1:3 LCP:    MagicNumber 0x48430C39 (0x050648430C39)
 Jul 30 11:01:41 PDT: Se2/1:3 LCP:    MRRU 1524 (0x110405F4)
 Jul 30 11:01:41 PDT: Se2/1:3 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 Jul 30 11:01:43 PDT: %ISDN-6-CONNECT: Interface Serial2/1:3 is now
     connected to 4086174270
 Jul 30 11:01:43 PDT: Se2/1:3 LCP: TIMEout: State REQsent

Skipping the exact same sequence for attempts 3 through 9, take a look at the last (10th) attempt in Example 13-11.

Example 13-11. Final Attempt
 ...
 Jul 30 11:01:57 PDT: Se2/1:3 LCP: O CONFREQ [REQsent] id 46 len 30
 Jul 30 11:01:57 PDT: Se2/1:3 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:01:57 PDT: Se2/1:3 LCP:    MagicNumber 0x48430C39 (0x050648430C39)
 Jul 30 11:01:57 PDT: Se2/1:3 LCP:    MRRU 1524 (0x110405F4)
 Jul 30 11:01:57 PDT: Se2/1:3 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 Jul 30 11:01:59 PDT: Se2/1:3 LCP: TIMEout: State REQsent
 Jul 30 11:01:59 PDT: Se2/1:3 LCP: State is Listen
 ! Last (10th) attempt to establish a connection
 ! (sending a Configure-Request OUT)
 Jul 30 11:01:59 PDT: %ISDN-6-DISCONNECT: Interface Serial2/1:3
     disconnected from 4086174270 , call lasted 22 seconds
 ! Again, the same message for "call lasted 22 seconds" and as a result
 ! "LCP: State is Closed"
 Jul 30 11:01:59 PDT: %LINK-3-UPDOWN: Interface Serial2/1:3, changed state to down
 Jul 30 11:01:59 PDT: Se2/1:3 LCP: State is Closed
 Jul 30 11:01:59 PDT: Se2/1:3 PPP: Phase is DOWN [0 sess, 1 load]

The same output can be seen from both ends of the connection, and the time intervals should match. The interval between two consecutive attempts might vary slightly, but every attempt takes about 2 seconds (see the timestamps). So, 10 attempts at 2 seconds each, plus 1 second at the beginning and 1 at the end, add up to 22 seconds. These parameters are adjustable, but there's no guarantee that the connection will work with non-default values. It's recommended that you keep the defaults and investigate further. The same case on a 77x router results in 8 attempts, and the interval is a little less than 3 seconds.
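The arithmetic can be checked against the timestamps in Examples 13-10 and 13-11, where the link comes up at 11:01:37 and the call disconnects at 11:01:59. The following is a minimal sketch; the function names and the default argument values are assumptions that simply restate the numbers in the text.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M:%S"


def call_duration(link_up: str, link_down: str) -> int:
    """Seconds between two hh:mm:ss timestamps on the same day."""
    start = datetime.strptime(link_up, TIME_FORMAT)
    end = datetime.strptime(link_down, TIME_FORMAT)
    return int((end - start).total_seconds())


def expected_duration(attempts: int = 10, interval_s: int = 2,
                      lead_in_s: int = 1, tail_s: int = 1) -> int:
    """Attempts times interval, plus one second at each end, as in the text."""
    return attempts * interval_s + lead_in_s + tail_s


if __name__ == "__main__":
    print(call_duration("11:01:37", "11:01:59"))  # measured from the debug output
    print(expected_duration())                    # predicted from the retry settings
```

Both values come out to 22. The sketch does not handle a call that spans midnight.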

In order for LCP to be successful and to report the LCP phase open, both parties expect an incoming message confirming the parameters from each other, which looks like the following:

 Jul 30 11:12:21 PDT: Se2/1:5 LCP: I CONFACK [ACKsent] id 119 len 30

The previous line is the incoming acknowledgement (CONFACK) of the proposed parameters. The complete expected output of debug ppp negotiation is shown in Example 13-12.

Example 13-12. LCP Is Successfully Opened
 Jul 30 11:12:20 PDT: %LINK-3-UPDOWN: Interface Serial2/1:5, changed state to up
 Jul 30 11:12:20 PDT: Se2/1:5 PPP: Treating connection as a callin
 Jul 30 11:12:20 PDT: Se2/1:5 PPP: Phase is ESTABLISHING, Passive Open
     [0 sess, 0 load]
 Jul 30 11:12:20 PDT: Se2/1:5 CHAP: Using alternate hostname SJB12A75
 Jul 30 11:12:20 PDT: Se2/1:5 LCP: State is Listen
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: I CONFREQ [Listen] id 1 len 36
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRU 1522 (0x010405F2)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MagicNumber 0x00119790 (0x050600119790)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRRU 1800 (0x11040708)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    EndpointDisc 3 0040.f913.abc8
     (0x1309030040F913ABC8)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    LinkDiscriminator 5578 (0x170415CA)
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: Using alternate hostname SJB12A75
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: O CONFREQ [Listen] id 118 len 30
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MagicNumber 0x484CD7F7 (0x0506484CD7F7)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRRU 1524 (0x110405F4)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: O CONFREJ [Listen] id 1 len 8
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    LinkDiscriminator 5578 (0x170415CA)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: I CONFNAK [REQsent] id 118 len 8
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRU 1522 (0x010405F2)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: O CONFREQ [REQsent] id 119 len 30
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MagicNumber 0x484CD7F7 (0x0506484CD7F7)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRRU 1524 (0x110405F4)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: I CONFREQ [REQsent] id 2 len 32
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRU 1522 (0x010405F2)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MagicNumber 0x00119790 (0x050600119790)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRRU 1800 (0x11040708)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    EndpointDisc 3 0040.f913.abc8
     (0x1309030040F913ABC8)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: O CONFACK [REQsent] id 2 len 32
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRU 1522 (0x010405F2)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MagicNumber 0x00119790 (0x050600119790)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRRU 1800 (0x11040708)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    EndpointDisc 3 0040.f913.abc8
     (0x1309030040F913ABC8)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: I CONFACK [ACKsent] id 119 len 30
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MagicNumber 0x484CD7F7 (0x0506484CD7F7)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRRU 1524 (0x110405F4)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: State is Open
 ! Finally the LCP is successfully opened.

Another variant of the same problem is when both parties try to negotiate the LCP parameters and exchange messages, but still cannot establish a connection.

If you trigger the router to make an outgoing call with the following command while debug ppp negotiation is enabled, you see that the ping is unsuccessful:

 804-isdn#ping 151.68.10.70 

The output from the debug command appears as shown in Example 13-13.

Example 13-13. debug ppp negotiation Output
 *Aug 22 09:04:21.823: %LINK-3-UPDOWN: Interface BRI0:1, changed state to up
 *Aug 22 09:04:21.843: %ISDN-6-CONNECT: Interface BRI0:1 is now connected to
     14087320178 .
 *Aug 22 09:04:21.847: BR0:1 PPP: Treating connection as a callout
 *Aug 22 09:04:21.847: BR0:1 PPP: Phase is ESTABLISHING, Active Open
 *Aug 22 09:04:21.851: BR0:1 LCP: O CONFREQ [Closed] id 12 len 35
 *Aug 22 09:04:21.851: BR0:1 LCP:    AuthProto CHAP (0x0305C22305)
 *Aug 22 09:04:21.851: BR0:1 LCP:    MagicNumber 0xB0C6897B (0x0506B0C6897B)
 *Aug 22 09:04:21.855: BR0:1 LCP:    MRRU 1524 (0x110405F4)
 *Aug 22 09:04:21.855: BR0:1 LCP:    EndpointDisc 1 Local
     (0x13100172626561756368652D6973646E)
 ! The remote user requests confirmation for the first time
 ! See the O CONFREQ message
 *Aug 22 09:04:23.623: BR0:1 LCP: I CONFREQ [REQsent] id 41 len 30
 *Aug 22 09:04:23.623: BR0:1 LCP:    AuthProto CHAP (0x0305C22305)
 *Aug 22 09:04:23.627: BR0:1 LCP:    MagicNumber 0xBE47D746 (0x0506BE47D746)
 *Aug 22 09:04:23.627: BR0:1 LCP:    MRRU 1524 (0x110405F4)
 *Aug 22 09:04:23.627: BR0:1 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 ! The core router requests confirmation for the first time
 ! See the I CONFREQ message
 *Aug 22 09:04:23.631: BR0:1 LCP: O CONFACK [REQsent] id 41 len 30
 *Aug 22 09:04:23.631: BR0:1 LCP:    AuthProto CHAP (0x0305C22305)
 *Aug 22 09:04:23.635: BR0:1 LCP:    MagicNumber 0xBE47D746 (0x0506BE47D746)
 *Aug 22 09:04:23.635: BR0:1 LCP:    MRRU 1524 (0x110405F4)
 *Aug 22 09:04:23.635: BR0:1 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 *Aug 22 09:04:23.851: BR0:1 LCP: TIMEout: State ACKsent
 ! The remote user confirms with O CONFACK, but the process times out
 ! First time, because there is no confirmation from the core router
 *Aug 22 09:04:23.851: BR0:1 LCP: O CONFREQ [ACKsent] id 13 len 35
 *Aug 22 09:04:23.851: BR0:1 LCP:    AuthProto CHAP (0x0305C22305)
 *Aug 22 09:04:23.855: BR0:1 LCP:    MagicNumber 0xB0C6897B (0x0506B0C6897B)
 *Aug 22 09:04:23.855: BR0:1 LCP:    MRRU 1524 (0x110405F4)
 *Aug 22 09:04:23.855: BR0:1 LCP:    EndpointDisc 1 Local
     (0x13100172626561756368652D6973646E)
 ! The remote user requests confirmation again
 ! Second time - see O CONFREQ message
 *Aug 22 09:04:25.623: BR0:1 LCP: I CONFREQ [ACKsent] id 42 len 30
 *Aug 22 09:04:25.623: BR0:1 LCP:    AuthProto CHAP (0x0305C22305)
 *Aug 22 09:04:25.627: BR0:1 LCP:    MagicNumber 0xBE47D746 (0x0506BE47D746)
 *Aug 22 09:04:25.627: BR0:1 LCP:    MRRU 1524 (0x110405F4)
 *Aug 22 09:04:25.627: BR0:1 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 ! The core router requests confirmation for the second time
 ! See I CONFREQ for the second time
 <output omitted>
 *Aug 22 09:04:39.863: BR0:1 LCP: O CONFREQ [ACKsent] id 21 len 35
 *Aug 22 09:04:39.863: BR0:1 LCP:    AuthProto CHAP (0x0305C22305)
 *Aug 22 09:04:39.867: BR0:1 LCP:    MagicNumber 0xB0C6897B (0x0506B0C6897B)
 *Aug 22 09:04:39.867: BR0:1 LCP:    MRRU 1524 (0x110405F4)
 *Aug 22 09:04:39.867: BR0:1 LCP:    EndpointDisc 1 Local
     (0x13100172626561756368652D6973646E)
 *Aug 22 09:04:41.623: BR0:1 LCP: I CONFREQ [ACKsent] id 50 len 30
 *Aug 22 09:04:41.623: BR0:1 LCP:    AuthProto CHAP (0x0305C22305)
 *Aug 22 09:04:41.623: BR0:1 LCP:    MagicNumber 0xBE47D746 (0x0506BE47D746)
 *Aug 22 09:04:41.627: BR0:1 LCP:    MRRU 1524 (0x110405F4)
 *Aug 22 09:04:41.627: BR0:1 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 *Aug 22 09:04:41.631: BR0:1 LCP: O CONFACK [ACKsent] id 50 len 30
 *Aug 22 09:04:41.631: BR0:1 LCP:    AuthProto CHAP (0x0305C22305)
 *Aug 22 09:04:41.635: BR0:1 LCP:    MagicNumber 0xBE47D746 (0x0506BE47D746)
 *Aug 22 09:04:41.635: BR0:1 LCP:    MRRU 1524 (0x110405F4)
 *Aug 22 09:04:41.635: BR0:1 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 *Aug 22 09:04:41.863: BR0:1 LCP: TIMEout: State ACKsent
 ! The same message "TIMEout: State ACKsent" for the 10th time
 *Aug 22 09:04:41.863: BR0:1 LCP: O CONFREQ [ACKsent] id 22 len 35
 *Aug 22 09:04:41.863: BR0:1 LCP:    AuthProto CHAP (0x0305C22305)
 *Aug 22 09:04:41.867: BR0:1 LCP:    MagicNumber 0xB0C6897B (0x0506B0C6897B)
 *Aug 22 09:04:41.867: BR0:1 LCP:    MRRU 1524 (0x110405F4)
 *Aug 22 09:04:41.867: BR0:1 LCP:    EndpointDisc 1 Local
     (0x13100172626561756368652D6973646E)
 *Aug 22 09:04:43.731: %ISDN-6-DISCONNECT: Interface BRI0:1
     disconnected from 1408735555 gateway, call lasted 22 seconds
 *Aug 22 09:04:43.731: %LINK-3-UPDOWN: Interface BRI0:1,
     changed state to down
 *Aug 22 09:04:43.751: BR0:1 LCP: State is Closed
 *Aug 22 09:04:43.751: BR0:1 PPP: Phase is DOWN
 ! The state LCP: State is Open cannot be reached.

The message "call lasted 22 seconds", together with the fact that the state "LCP: State is Open" is never reached, indicates that the LCP phase cannot be established. The missing message here is the incoming I CONFACK [ACKsent], which would confirm that the core router has agreed to the parameters that the remote user was proposing.
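When the debug output is long, it can help to summarize it rather than read every line. The following sketch counts outgoing CONFREQ messages and checks whether an incoming CONFACK and "State is Open" ever appear. The function name is hypothetical, and the matching relies only on the message text shown in Examples 13-10 and 13-12.

```python
def lcp_summary(lines):
    """Summarize LCP progress from 'debug ppp negotiation' output lines."""
    return {
        "confreq_sent": sum(1 for line in lines if "LCP: O CONFREQ" in line),
        "confack_received": any("LCP: I CONFACK" in line for line in lines),
        "lcp_open": any("LCP: State is Open" in line for line in lines),
    }


if __name__ == "__main__":
    # Lines taken from Example 13-10 (failing call).
    failing = [
        "Jul 30 11:01:39 PDT: Se2/1:3 LCP: O CONFREQ [Listen] id 37 len 30",
        "Jul 30 11:01:41 PDT: Se2/1:3 LCP: TIMEout: State REQsent",
        "Jul 30 11:01:41 PDT: Se2/1:3 LCP: O CONFREQ [REQsent] id 38 len 30",
    ]
    # Lines taken from Example 13-12 (successful call).
    working = [
        "Jul 30 11:12:21 PDT: Se2/1:5 LCP: O CONFREQ [REQsent] id 119 len 30",
        "Jul 30 11:12:21 PDT: Se2/1:5 LCP: I CONFACK [ACKsent] id 119 len 30",
        "Jul 30 11:12:21 PDT: Se2/1:5 LCP: State is Open",
    ]
    print(lcp_summary(failing))
    print(lcp_summary(working))
```

Repeated CONFREQ messages with no incoming CONFACK match the 22 second pattern described above.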

Here are two important questions to ask when troubleshooting LCP problems:

  • Is only one user affected or are multiple users affected?

  • What's causing the problem?

The answer to the first question will focus your analysis on either the remote user's set of problems or the core router and its set of issues. To answer the second question, you need to investigate at least three problem cases:

  • 56/64 speed problem

  • Cable mapping problem

  • Trunk problem

The 56/64 case is related to the 56/64 kbps settings. In general, it is recommended that you define 56 kbps as part of the dialer interface or class to see if it makes a difference. This is especially useful in rural areas or when the circuit runs through old types of switches.

The cable mapping case is related to one of the new features of IOS, Non-Facility Associated Signaling (NFAS) groups, which are a typical setup for core routers. An NFAS group consists of several T1s/PRIs grouped so that each PRI offers 24 data channels and is configured as 24B. In two of the PRIs in the group, one channel is designated as a D channel; these are called the primary and backup D channels, and those two PRIs are configured as 23B+D. The other PRIs are called members, as shown in Example 13-14.

Example 13-14. show isdn nfas group 0 Output
 7206-isdn#show isdn nfas group 0
          ISDN NFAS GROUP 0 ENTRIES:
          The primary D is Serial2/0:23.
          The backup D is Serial2/1:23.
          The NFAS member is Serial2/2:23.
          The NFAS member is Serial2/3:23.
          The NFAS member is Serial3/0:23.
          The NFAS member is Serial3/1:23.
          The NFAS member is Serial3/2:23.
          The NFAS member is Serial3/3:23.
          There are 8 total nfas members.
          There are 190 total available B channels.
          The primary D-channel is DSL 0 in state IN SERVICE.
          The backup D-channel is DSL 1 in state STANDBY.
          The current active layer 2 DSL is 0.
 7206-isdn#
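The channel count in Example 13-14 follows directly from the group layout: eight PRIs with 24 timeslots each, minus the primary and backup D channels. A minimal sketch of that arithmetic follows; the function and parameter names are illustrative only.

```python
def nfas_b_channels(pri_count: int, d_channels: int = 2,
                    timeslots_per_pri: int = 24) -> int:
    """Available B channels in an NFAS group of T1 PRIs.

    Each D channel (primary and backup by default) uses one timeslot.
    """
    if pri_count < 1 or d_channels < 0 or d_channels > pri_count:
        raise ValueError("invalid NFAS group layout")
    return pri_count * timeslots_per_pri - d_channels


if __name__ == "__main__":
    print(nfas_b_channels(8))  # the group shown in Example 13-14
```

For the eight-member group in the example, the result is 190, which matches the "total available B channels" line.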

Now, recall that the D channel uses TDM technology. One of the features of TDM is that it defines a slot for every data channel (trunk), and an empty slot cannot be used by another channel. Also, it's important to note that the calls hit a single or NFAS circuit in a particular order (for example, Serial1/0, Serial1/1, and Serial2/0). As soon as the first circuit in the switch is in state=2 (busy), the calls roll to available trunks of member circuits. You should also know the hunting order set up by the telco. If it's set up as first available, troubleshooting is easier. But sometimes, the telco uses other hunting methods, such as least busy or round robin, to help spread the calls out and give the modems less of a duty cycle.

It is important to ensure that the circuits are mapped identically in the telco switch configuration and the core router configuration. If the cable mapping is mismatched, the calls don't hit the core router in the way the ISDN switch sends them. The TDM cannot handle the D channel signaling correctly and a problem occurs. Monitoring the calls over a long period of time shows the call order, which in this case does not match the expected order. The easiest way to identify an NFAS mapping problem is to disconnect one of the circuits and determine if any of the others are affected. If the cabling is set up correctly, this should not affect any other member of the group, but if there is a mapping problem, you will see at least one other member of the group affected. The correct way of setting up the NFAS group and cable mapping is explained in Chapter 6.

The final case is the trunk misconfiguration problem. If it is on the remote user's side, you need to troubleshoot the user's service.

If the core side is affected, here is what happens. Typically, a group of users is trying to connect to the core router, but they have only a 20 percent success rate. As soon as they connect, they can exchange data, but the initial connect rate is low. Sometimes, users are disconnected in the middle of the session.

Here is a typical case for a group of remote users experiencing this problem. After discussing the problem with the end users, you might conclude that failures are more common during the early morning and that the success rate is higher later at night. It is confusing because no solid pattern or trend exists. A review of the core router information shows that all users call Serial interface 2/1. So, here is the plan of action.

The first thing to do is check Serial2/1:23, which is the D channel of this interface, as shown in Example 13-15.

Example 13-15. show interface serial2/1:23 Output
 3640-isdn#show interface serial2/1:23
 Serial2/1:23 is up, line protocol is up (spoofing)
   Hardware is DSX1
   Description: SJB12A75 72HCQA123132-001 408-732-5555
   MTU 1500 bytes, BW 64 Kbit, DLY 20000 usec,
      reliability 255/255, txload 1/255, rxload 1/255
   Encapsulation PPP, loopback not set
   DTR is pulsed for 1 seconds on reset
   Last input 00:00:25, output 00:27:47, output hang never
   Last clearing of "show interface" counters 1w2d
   Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
   Queuing strategy: weighted fair
   Output queue: 0/1000/64/0 (size/max total/threshold/drops)
      Conversations  0/1/16 (active/max active/max total)
      Reserved Conversations 0/0 (allocated/max allocated)
      Available Bandwidth 48 kilobits/sec
   5 minute input rate 0 bits/sec, 0 packets/sec
   5 minute output rate 0 bits/sec, 0 packets/sec
      93069 packets input, 847750 bytes, 0 no buffer
      Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
      9501 input errors, 9010 CRC, 400 frame, 0 overrun, 0 ignored, 31 abort
      93069 packets output, 738440 bytes, 0 underruns
      0 output errors, 0 collisions, 0 interface resets
      0 output buffer failures, 0 output buffers swapped out
      0 carrier transitions
   Timeslot(s) Used:24, Transmitter delay is 0 flags
 3640-isdn#

If the router reports a high number of errors, you might suspect that the problem is related to line quality. However, there is no significant number of aborts on the line (31), the line is not flapping (0 carrier transitions on Serial 2/1:23), and the errors are mainly CRC and frame errors (9010 CRC and 400 frame out of 9501 input errors). You need to find out what is causing the problem and why it is worse in the morning hours than in the evening hours.
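To see which counter dominates, you can break down the input error line from Example 13-15. The sketch below assumes the comma-separated "value name" layout of that one line; the function name is hypothetical, and other IOS versions may format the line differently.

```python
def input_error_breakdown(line: str) -> dict:
    """Parse a 'show interface' input error line into named counters."""
    counters = {}
    for field in line.strip().split(","):
        value, name = field.strip().split(" ", 1)
        counters[name] = int(value)
    return counters


if __name__ == "__main__":
    line = "9501 input errors, 9010 CRC, 400 frame, 0 overrun, 0 ignored, 31 abort"
    counters = input_error_breakdown(line)
    total = counters.pop("input errors")
    for name, count in sorted(counters.items(), key=lambda item: -item[1]):
        print(f"{name:8} {count:6} {count / total:6.1%}")
```

Sorting by count puts CRC first, followed by frame and abort.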

NOTE

One of the most amazing stories about troubleshooting I've ever encountered was a 3 PM ISDN problem. The connection was fine, except at 3 PM every day, when the connection went down for 15 or 20 minutes. It so happened that the local loop was built under a nearby runway and every day, on schedule, a heavy Hercules aircraft would take off from that runway, which affected the connection for 15 to 20 minutes.


The second thing to do is to find a pattern. Testing with the LEC, trunk by trunk, does not provide any resolution or clues to the nature of the problem. To find a pattern, one of the possible actions here is to take a closer look at the calls, when the remote users are trying to connect to the core router's circuit Serial 2/1. In this case, the following debug commands are recommended:

 #debug isdn events
 #debug ppp negotiation

From the output, you can see that the call failures occur on Serial2/0:0, Serial2/0:1, and Serial2/0:2. You must determine why users calling these channels are experiencing intermittent connections. Reviewing the circuit, you can see up to five users in the morning, but a fully used circuit late in the afternoon. After monitoring the calls of all users, you identify that sometimes some users are connected for a long time, and sometimes the same users are experiencing disconnects. It looks like it's not user related, but trunk (channel) related. You must determine whether channels cause the disconnects and, if so, which ones. After a period of monitoring, you determine that the disconnects always occur on the same channels, Serial2/0:0 through Serial2/0:4, and the problem is definitely not user related, but channel related. But, if this is the case, why are the disconnects typical for early hours and the success rate higher for late hours?

The explanation is simple. When the users experiencing the problem connect in the early morning, they usually land on the first channels because there are few requests. When they connect at night, more users are connected, and as soon as the monitored users land on the higher channels (trunks), they are okay.
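The per-channel monitoring described above can be partly automated by tallying short calls per interface from the %ISDN-6-DISCONNECT messages. In this sketch, the sample log lines are hypothetical (they only reuse the message format shown earlier in this section), and the 30 second cutoff is an assumption.

```python
import re
from collections import Counter

# Matches the message format shown in Example 13-11.
DISCONNECT = re.compile(
    r"%ISDN-6-DISCONNECT: Interface (\S+)\s+disconnected from "
    r".*?call lasted (\d+) seconds"
)


def short_calls_per_channel(lines, max_seconds=30):
    """Count calls of max_seconds or less, keyed by interface name."""
    counts = Counter()
    for line in lines:
        match = DISCONNECT.search(line)
        if match and int(match.group(2)) <= max_seconds:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines for illustration only.
    log = [
        "%ISDN-6-DISCONNECT: Interface Serial2/0:0 disconnected from 4086174270 , call lasted 22 seconds",
        "%ISDN-6-DISCONNECT: Interface Serial2/0:0 disconnected from 4086174270 , call lasted 22 seconds",
        "%ISDN-6-DISCONNECT: Interface Serial2/0:7 disconnected from 4086174270 , call lasted 3600 seconds",
    ]
    for channel, count in short_calls_per_channel(log).most_common():
        print(channel, count)
```

If the short calls cluster on a few channels regardless of who is calling, the problem is channel related rather than user related.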

The last step is the resolution of this case.

Obviously, the first 5 channels of this particular PRI are failing. The solution is to either redefine the circuit from 23B+D to 18B+D, or busy out the first 5 trunks while working with the LEC to correct the problem. After a few days, check Serial2/0:23 again; you should see no errors, as shown in the following fragment:

 ...
      93069 packets input, 847750 bytes, 0 no buffer
      Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
      0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
 ...

NOTE

Remember that when you are troubleshooting the core router, commands such as debug isdn events and debug ppp negotiation generate a significant amount of output. These debugs can lead to unexpected conditions on the router. You will also see more information than is relevant to your particular test plan (in the previous case, the D channel of the serial interface 2/1:23). It is more helpful to apply conditions to the debugging, such as debug condition interface serial 2/0:23. After that, you can type debug isdn events and the router reports only the ISDN events on serial 2/0:23.


The final activity is to finish your work with the LEC, fixing the trunks or replacing malfunctioning devices.

Authentication Problems

Before discussing authentication problems, it's important to follow this rule of thumb: Establish the basic IP connectivity prior to implementing any authentication. Verify that the link works, then secure it.

Authentication problems are easier to resolve in an enterprise environment. Two basic solutions are available:

  • Local authentication is used where the username (host name) and the password are defined in the particular box to which the remote user is trying to connect:

     username 804-isdn password 7 11456A0119461B02456B 

    The previous statement is necessary in the core router to locally define the user.

  • TACACS+ authentication is performed in a designated server.

A detailed description of the core router's TACACS+ configuration is provided in Chapter 6 and applies to ISDN as well. In the TACACS+ server, every user has a profile (record). A sample of a user profile in the TACACS+ server is shown in Example 13-16.

Example 13-16. Sample of a User Profile in the TACACS+ Server
 User Profile Information
 user = 804-isdn{
   profile_id = 5483
   profile_cycle = 1
   password = chap "********"
   service=ppp {
     default attribute=permit
     allow ".*" ".*" ".*"
     protocol=ip {
       set addr=10.19.28.137
       set routing=true
       set route#1="10.19.28.136 255.255.255.252 10.19.28.137"
       default attribute=permit
     }
     protocol=multilink {
       default attribute=permit
     }
     protocol=ccp {
       default attribute=permit
     }
     protocol=lcp {
       default attribute=permit
     }
   }
 }

More discussion about authentication can be found in Chapter 6 and Chapter 20. Visit Cisco.com for more detailed discussion and configuration examples.

For either of these solutions, you need to know what options are available for authentication.

NOTE

PPP authentication follows the LCP stage of PPP: you need to see LCP reach the Open state before authentication takes place.


The options for PPP authentication are as follows:

 ppp authentication chap ms-chap pap [callback] [callin] [callout] [optional] 

The following steps are recommended when troubleshooting authentication problems.

First, make sure you have the encapsulation type and authentication type defined properly under the BRI and dialer interfaces on both sides:

  • encapsulation ppp

  • ppp authentication chap

Second, use the debug command to monitor the authentication process:

 3640-isdn#debug ppp negotiation 

After the LCP is open, the expected output should look like the output shown in Example 13-17.

Example 13-17. Output After the LCP Is Open
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: I CONFACK [ACKsent] id 119 len 30
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    AuthProto CHAP (0x0305C22305)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MagicNumber 0x484CD7F7 (0x0506484CD7F7)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    MRRU 1524 (0x110405F4)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP:    EndpointDisc 1 Local
     (0x130B01534E564143413031)
 Jul 30 11:12:21 PDT: Se2/1:5 LCP: State is Open
 ! The LCP opens successfully, PPP negotiation starts
 Jul 30 11:12:21 PDT: Se2/1:5 PPP: Phase is AUTHENTICATING,
     by both [0 sess, 0 load]
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: Using alternate hostname SJB12A75
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: O CHALLENGE id 33 len 29 from "SJB12A75"
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: I CHALLENGE id 1 len 34 from "804-isdn"
 ! One pair exchanged - Input and Output
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: Waiting for peer to authenticate first
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: I RESPONSE id 33 len 34 from "804-isdn"
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: O SUCCESS id 33 len 4
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: Processing saved Challenge, id 1
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: Using alternate hostname SJB12A75
 ! Second pair exchanged - Input and Output
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: O RESPONSE id 1 len 29 from "SJB12A75"
 Jul 30 11:12:21 PDT: Se2/1:5 CHAP: I SUCCESS id 1 len 36 msg is "chap:
     User SJB12A75 authorized."
 ! Last pair exchanged - I and O
 Jul 30 11:12:21 PDT: Se2/1:5 PPP: Phase is VIRTUALIZED [0 sess, 0 load]

There are two SUCCESS messages, and the message Phase is VIRTUALIZED ends the process.
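A quick way to confirm both SUCCESS messages in a long capture is to search for them directly. The following sketch is an illustration only: the function name and the dictionary keys are hypothetical, and the matching uses just the message text shown in Example 13-17.

```python
def chap_outcome(lines):
    """Report which CHAP milestones appear in 'debug ppp negotiation' lines."""
    text = "\n".join(lines)
    return {
        # O SUCCESS: this router accepted the peer's response.
        "peer_authenticated": "CHAP: O SUCCESS" in text,
        # I SUCCESS: the peer accepted this router's response.
        "we_authenticated": "CHAP: I SUCCESS" in text,
        "virtualized": "PPP: Phase is VIRTUALIZED" in text,
    }


if __name__ == "__main__":
    # Lines taken from Example 13-17.
    capture = [
        "Jul 30 11:12:21 PDT: Se2/1:5 CHAP: O SUCCESS id 33 len 4",
        "Jul 30 11:12:21 PDT: Se2/1:5 CHAP: I SUCCESS id 1 len 36 msg is",
        "Jul 30 11:12:21 PDT: Se2/1:5 PPP: Phase is VIRTUALIZED [0 sess, 0 load]",
    ]
    print(chap_outcome(capture))
```

If only one of the two SUCCESS entries is present, the side that did not receive it is where to look for a username or password mismatch.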

Finally, check the way the passwords are configured. Two things are important here:

  • The username in the global configuration of the remote user's router must match the CHAP host name of the core router. In the example, the host name of the dialer interface of the core router is SJB12A75. This means that the global configuration of the remote user's router must have a line starting with username SJB12A75 password, and the core router must have an authentication entry for the host name of the remote user, 804-isdn.

  • The passwords on both sides must be the same. If the password on the core router is 11456A0119461B02456B (encrypted), or the word "secret" (unencrypted), you can configure the remote user's router with the following:

     804-isdn(config)#username SJB12A75 password secret 

Remember that host names and passwords are case sensitive, so when typing or comparing them, make sure that they match exactly.
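The reason an exact match matters is visible in how a CHAP response is computed. Per RFC 1994, the response to an MD5 challenge is the MD5 hash of the identifier, the shared secret, and the challenge value, so a secret that differs even in letter case produces a different response. The sketch below illustrates that calculation; the challenge bytes are a made-up value, and this is not the router's own code.

```python
import hashlib


def chap_response(identifier: int, secret: str, challenge: bytes) -> bytes:
    """CHAP MD5 response per RFC 1994: MD5(identifier + secret + challenge)."""
    return hashlib.md5(bytes([identifier]) + secret.encode() + challenge).digest()


if __name__ == "__main__":
    challenge = bytes(range(16))  # hypothetical 16-byte challenge value
    expected = chap_response(33, "secret", challenge)
    print(chap_response(33, "secret", challenge) == expected)  # same secret
    print(chap_response(33, "Secret", challenge) == expected)  # case differs
```

The first comparison is true and the second is false, which is why the authenticator rejects a response built from a secret that differs only in case.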

End-to-End Routing Problems

The routing problems are often related to configuration rules and errors.

The first type of problem arises when no default gateway is configured, or when it is misconfigured. The rule to follow is that the default gateway must point to the local router's Ethernet interface. One example is when the local DHCP is defined and part of that definition is the default gateway, as shown in Example 13-18.

Example 13-18. The Local DHCP Is Defined and Part of that Definition Is the Default Gateway
 ip dhcp pool ippool
    network 10.70.209.80 255.255.255.248
    dns-server  20.68.10.70  20.68.10.140
    netbios-name-server  20.68.235.228  20.69.2.87
    domain-name cisco.com
    default-router 10.70.209.81
    lease infinite
 !
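One check you can run on a pool like this is whether the default-router address is a usable host address inside the pool's network. This is a necessary condition only; it does not prove that the address belongs to the router's Ethernet interface. The function name in this sketch is hypothetical.

```python
import ipaddress


def default_router_in_pool(network: str, mask: str, default_router: str) -> bool:
    """True if default_router is a usable host address in network/mask."""
    pool = ipaddress.ip_network(f"{network}/{mask}")
    router = ipaddress.ip_address(default_router)
    reserved = (pool.network_address, pool.broadcast_address)
    return router in pool and router not in reserved


if __name__ == "__main__":
    # Values from Example 13-18.
    print(default_router_in_pool("10.70.209.80", "255.255.255.248", "10.70.209.81"))
    # A deliberately wrong gateway, outside the /29.
    print(default_router_in_pool("10.70.209.80", "255.255.255.248", "10.70.209.89"))
```

The address in the example passes the check, while 10.70.209.89 fails because it is outside the eight-address block.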

The second case is when there is no route to a remote network. To check if a particular IP address is in the routing table, you can use the following command:

 804-isdn#show ip route  10.20.30.40 

If the message returned is % Network not in table, you need to define a static route to the remote party's IP address. See the Cisco IOS Configuration Guide for more details.




Troubleshooting Remote Access Networks (CCIE Professional Development)
ISBN: 1587050765
Year: 2002
Pages: 235
